lightingthedark 6 days ago

Can someone explain how the claim of higher performance works here? In C, which lacks generics, an intrusive list is preferred because otherwise you end up with each node having a void pointer to the data it holds. The previous Zig version was a generic, so the data type could easily be a value type. Given that the compiler is allowed to rearrange the layout of both structs unless data is packed or extern (in which case it almost certainly won't want to have a linked list node in the middle anyway) isn't the resulting type exactly the same in memory unless you intentionally made T a pointer type?
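To make the layouts in question concrete, here is a minimal C sketch. All names (void_node, list_node, user, sum_ids) are hypothetical and not taken from Zig's standard library. It contrasts the void-pointer node with an intrusive link embedded in the payload, which is roughly what a generic node holding T by value also looks like in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Non-intrusive node, C style without generics: the payload is reached
 * through an untyped pointer, which costs an extra dereference. */
struct void_node {
    struct void_node *next;
    void *data;
};

/* Intrusive link: knows nothing about the payload. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical payload embedding the link. In memory this is comparable to
 * a generic node that stores its T by value next to the next pointer. */
struct user {
    int id;
    struct list_node node;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk the intrusive list and sum the ids of the enclosing users. */
static int sum_ids(struct list_node *head) {
    int total = 0;
    for (struct list_node *n = head; n != NULL; n = n->next) {
        total += container_of(n, struct user, node)->id;
    }
    return total;
}
```

Whether either layout is measurably faster is exactly the open question here; the sketch only shows that the embedded-link struct and a by-value generic node carry the same fields.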

ashdnazg 6 days ago

I don't understand the higher performance either. What I know as the significant advantage is that you can have one object in multiple lists.
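A rough illustration of the multiple-lists point, as a C sketch with hypothetical names (task, all_link, ready_link): the object carries one link per list it can belong to, so it can sit on both lists at once without any extra node allocation.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

/* Hypothetical object that sits on two lists at once: one link per list. */
struct task {
    int id;
    struct list_node all_link;   /* membership in the "all tasks" list */
    struct list_node ready_link; /* membership in the "ready" list */
};

/* Push a link onto the front of a list; no allocation happens here. */
static void push_front(struct list_node **head, struct list_node *n) {
    assert(n != NULL);
    n->next = *head;
    *head = n;
}

/* Count the links reachable from head. */
static size_t list_count(const struct list_node *head) {
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}
```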

messe 6 days ago

> What I know as the significant advantage is that you can have one object in multiple lists.

Another advantage is smaller code size, as the compiler doesn't need to generate code for SinglyLinkedList(i32), SinglyLinkedList(User), and SinglyLinkedList(PointCloud). This could have a performance impact by making it more likely that code remains in the cache.
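The code-size argument can be sketched in C, where the same effect falls out naturally. The payload types below (user, point_cloud) are made up for illustration. The helper only touches links, so one compiled copy serves every payload type instead of one instantiation per type.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

/* Two unrelated, hypothetical payload types sharing the same link type. */
struct user {
    int id;
    struct list_node node;
};

struct point_cloud {
    double points[64];
    struct list_node node;
};

/* Compiled once: it never looks at the payload, only at the links, so the
 * same machine code walks a list of users or a list of point clouds. */
static size_t list_length(const struct list_node *head) {
    size_t length = 0;
    for (; head != NULL; head = head->next) {
        length++;
    }
    return length;
}
```

Whether the smaller code footprint shows up as a cache win would still need measuring.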

lightingthedark 6 days ago

Technically that should be possible the other way by using a Node<T> so the type of the second list ends up being a Node<Node<T>> but I can see why an intrusive list would be preferred to that, and also the linked list API might prevent that pattern.

Usually if I have multiple lists holding something I have one that's the 'owner' and then the secondary data structures would have a non-owning pointer to it. Is that the case where the performance would be better with an intrusive list? My intuition would be that having multiple Node members would pollute the cache and not actually be a performance win but maybe it is still better off because it's all colocated? Seems like the kind of thing I'd have to benchmark to know since it would depend on the number of lists and the size of the actual data.

lightingthedark 6 days ago

Okay so in the multiple list case performance would actually be worse because you'd have a double pointer dereference. I was thinking you'd have the list nodes contiguous in memory so the first dereference would always hit cache but that's a bad assumption for a linked list.
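The double dereference can be shown side by side in a small C sketch (names such as ptr_node and sum_via_pointers are hypothetical). The secondary list of non-owning pointers has to load the node and then the item it points to, while the intrusive version reaches the data from the node's own address.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

struct item {
    int value;
    struct list_node link;
};

/* Secondary list of non-owning pointers: a separate node per entry. */
struct ptr_node {
    struct ptr_node *next;
    struct item *item;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Two dependent loads per element: the node, then the item it points to. */
static int sum_via_pointers(const struct ptr_node *head) {
    int total = 0;
    for (; head != NULL; head = head->next) {
        total += head->item->value;
    }
    return total;
}

/* The link lives inside the item, so reaching the node reaches the data. */
static int sum_intrusive(struct list_node *head) {
    int total = 0;
    for (; head != NULL; head = head->next) {
        total += container_of(head, struct item, link)->value;
    }
    return total;
}
```

Both functions compute the same sum; only the number of pointer hops per element differs.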

Since you shouldn't reach for a linked list as a default data structure on modern hardware anyway, I actually do see how this change makes sense for Zig. Neat!

codr7 4 days ago

Allocating list nodes in one block of memory is very common in the intrusive case.
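A minimal sketch of what that can look like in C, assuming a hypothetical item type and make_block helper: all items come from a single allocation and their embedded links are chained in address order, so neighbouring list elements are also neighbours in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct list_node {
    struct list_node *next;
};

struct item {
    int value;
    struct list_node link;
};

/* Allocate n items in one block and chain them in address order.
 * Returns the block (the caller frees it) and stores the head in *head.
 * Returns NULL and leaves *head NULL when n is 0 or allocation fails. */
static struct item *make_block(size_t n, struct list_node **head) {
    assert(head != NULL);
    *head = NULL;
    if (n == 0) {
        return NULL;
    }
    struct item *block = calloc(n, sizeof *block);
    if (block == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        block[i].value = (int)i;
        block[i].link.next = (i + 1 < n) ? &block[i + 1].link : NULL;
    }
    *head = &block[0].link;
    return block;
}

/* Count the links reachable from head. */
static size_t list_count(const struct list_node *head) {
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}
```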

grayhatter 6 days ago

> Can someone explain how the claim of higher performance works here? In C, which lacks generics, an intrus

I can only give a handwavey answer because I've yet to see any data, and if an engineer tells you something is better but doesn't provide any data, they're not a very good engineer. So grain of salt and all that. But the answer I got was cache performance: with code written this way, your CPU will spend less time waiting for main memory, and the branch predictor will have better luck. The argument makes sense, but like I said, I've yet to see real-world data.

> isn't the resulting type exactly the same in memory unless you intentionally made T a pointer type?

Yes and no. If I understand what you mean, the bit layout will be the 'same'. But I think your confusion is more about what a compiler means by a pointer type versus what a human means. If you pull away enough layers of abstraction, the compiler doesn't see *Type, it'll only see *anyopaque. Phrased completely differently: according to the compiler, all pointers are the same and are exactly memory_address_size() big. *Type doesn't really exist.

Writing it this way, imagine using just the LinkedList type, without a container of any kind: node to node to node, without any data. While it would be pointless, walking that list would (might) be faster, right? There are no extra data loads for the whole struct. That's what this is. Used this way it gets complicated, but translating theory to asm is always messy. Even more messy when you try to account for speculative execution.

Zambyte 6 days ago

> But I think your confusion is more about what a compiler means by a pointer type versus what a human means. If you pull away enough layers of abstraction, the compiler doesn't see *Type, it'll only see *anyopaque. Phrased completely differently: according to the compiler, all pointers are the same and are exactly memory_address_size() big. *Type doesn't really exist.

Check out the implementation of SinglyLinkedList in the latest release (before the change in the post)[0]. You'll notice the argument for SinglyLinkedList is (comptime T: type), which is used in the internal Node struct for the data field, which is of type T. Notably, the data field is not a *T.

In Zig, when you call the function SinglyLinkedList with the argument i32 (like SinglyLinkedList(i32)) to return a type for a list of integers, the i32 is used in the place of T, and a Node struct that is unique for i32 is defined and used internally in the returned type. Similarly, if you had a struct like Person with fields like name and age, and you created a list of Persons like SinglyLinkedList(Person), a new Node type would be internally defined for the new struct type returned for Person lists. This internal Node struct would instead use Person in place of T. The memory for the Node type used internally in SinglyLinkedList(Person) actually embeds the memory for the Person type, rather than just containing a pointer to a Person.

These types are very much known to the compiler, as the layout of a Node for a SinglyLinkedList(i32) is not the same as the layout of a Node for a SinglyLinkedList(Person), because the argument T is not used as a pointer. Unless, as the gp mentioned, T is explicitly made to be a pointer (like SinglyLinkedList(*Person)).
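For readers more at home in C, a rough analogue of that per-type node can be sketched with a macro standing in for Zig's comptime function. The macro and type names below are hypothetical, and this only approximates what Zig does. Each expansion yields a distinct struct whose size depends on the payload stored by value, unless the payload is explicitly chosen to be a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a generic node type: expands to a distinct struct per T,
 * with T stored by value next to the next pointer. */
#define DEFINE_VALUE_NODE(name, T) \
    struct name {                  \
        struct name *next;         \
        T data;                    \
    }

struct person {
    char name[32];
    int age;
};

/* One node type per payload, each with its own layout. */
DEFINE_VALUE_NODE(int_node, int);
DEFINE_VALUE_NODE(person_node, struct person);

/* Explicitly choosing a pointer payload, like SinglyLinkedList(*Person):
 * only now does the node hold a pointer instead of the person itself. */
DEFINE_VALUE_NODE(person_ptr_node, struct person *);
```

Here sizeof(struct person_node) is larger than sizeof(struct int_node) because the whole person is embedded, while the data field of person_ptr_node is only pointer-sized.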

[0] https://ziglang.org/documentation/0.14.0/std/#src/std/linked...

anarazel 6 days ago

Intrusive lists are often used to enqueue pre-existing structures onto lists. And often the same object can be in different lists at different times.

That's not realistically dealt with by the compiler re-organizing the struct layout.
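A small C sketch of that pattern, with hypothetical names (request, pop_front, push_front): an object that already exists is unlinked from one list and linked onto another, with no node allocated or freed along the way.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

/* Hypothetical pre-existing object; the link is just one of its fields. */
struct request {
    int id;
    struct list_node link;
};

/* Link n at the front of the list. */
static void push_front(struct list_node **head, struct list_node *n) {
    assert(n != NULL);
    n->next = *head;
    *head = n;
}

/* Unlink and return the first element, or NULL if the list is empty. */
static struct list_node *pop_front(struct list_node **head) {
    struct list_node *n = *head;
    if (n != NULL) {
        *head = n->next;
        n->next = NULL;
    }
    return n;
}
```

Moving a request from a pending list to a done list is then pop_front on one head followed by push_front on the other.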