I wonder why a language should implement a SinglyLinkedList and a DoublyLinkedList in its standard library.
I do get why you need an array-like list, a dictionary/hashmap, a stack and a queue. I got the impression that linked lists aren't used very often. Or maybe it's a different case with Zig?
Maybe because they're used elsewhere in the standard library?
SinglyLinkedList is used in the standard library in std.Thread.Pool, and std.heap.ArenaAllocator. DoublyLinkedList is used in the http client's connection pool, as well as the ObjectCache for the package manager's git client.
> I got the impression that linked lists aren't used very often.
I did some cursory research on the software I've been using today and presented the results here: https://news.ycombinator.com/item?id=43684121
I think there is a misconception around linked lists mostly stemming from the fact that they're not great for cache locality. Someone presents the idea that linked lists under common circumstances are unintuitively slow and therefore should be considered carefully for performance sensitive applications. A game of Chinese whispers immediately forms around that idea and what comes out the other end is a new idea: that linked lists are bad, shouldn't be used and aren't used. Meanwhile, they continue to be used extensively in system software.
Linked lists get a bad reputation because they wound up being a default data structure, and that's almost always a mistake on modern computers.
Back in the days of yore, small memories and slow CPUs meant that linked lists and arrays were really the only effective data structures. Allocation and computation were super expensive, so following a link or keeping an index was really the only thing that worked.
On today's computers, the CPU is so much faster than memory that it is almost always better to recompute a value than to look it up. Locality is so important that linear actions trump asymptotic complexity until you get quite a lot more elements than you might expect. Memory is so cheap that it is worth exponentially over-allocating in order to amortize the cost of growth. And so on.
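To make the over-allocation point concrete, here is a minimal sketch in C (names of my own choosing) of the usual growth strategy: capacity doubles whenever it runs out, so n pushes cost O(n) copies in total even though an individual push occasionally pays O(n).

    #include <stdlib.h>

    typedef struct {
        int   *data;
        size_t len, cap;
    } Vec;

    static int vec_push(Vec *v, int value) {
        if (v->len == v->cap) {
            /* Exponential over-allocation: waste up to half the buffer
             * to make the amortized cost of a push O(1). */
            size_t new_cap = v->cap ? v->cap * 2 : 8;
            int *p = realloc(v->data, new_cap * sizeof *p);
            if (!p) return -1;
            v->data = p;
            v->cap  = new_cap;
        }
        v->data[v->len++] = value;
        return 0;
    }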
At the end of the day with modern programming, you should almost always be reaching for something simpler than a linked list (array/vector) or you should be reaching for an abstraction that is more complex than a linked list (hash/dict, queue/deque, etc.).
With modern CPUs, if you are using a linked list for anything other than building a more complex data structure, you're probably making an incorrect choice on almost all fronts.
> Locality is so important that linear actions trump asymptotic complexity until you get quite a lot more elements than you might expect.
The reverse of this can also be true. In the following paper, the authors found that Williams' heap construction (inserting one element at a time, O(n log n) time) beat Floyd's heap construction (O(n) time) once the input grows larger than the off-chip cache.
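For context, the two constructions differ only in the direction of the sift. A rough sketch of both, for a max-heap of ints (my own generic versions, not the paper's code):

    #include <stddef.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Williams: insert elements one by one, sifting each new element up
     * toward the root; O(n log n) in total. */
    static void williams_build(int *a, size_t n) {
        for (size_t i = 1; i < n; i++)
            for (size_t c = i; c > 0 && a[(c - 1) / 2] < a[c]; c = (c - 1) / 2)
                swap(&a[(c - 1) / 2], &a[c]);
    }

    /* Floyd: sift down every internal node, from the last one up to the
     * root; O(n) in total. */
    static void sift_down(int *a, size_t n, size_t i) {
        for (;;) {
            size_t l = 2 * i + 1, largest = i;
            if (l < n && a[l] > a[largest]) largest = l;
            if (l + 1 < n && a[l + 1] > a[largest]) largest = l + 1;
            if (largest == i) return;
            swap(&a[i], &a[largest]);
            i = largest;
        }
    }

    static void floyd_build(int *a, size_t n) {
        for (size_t i = n / 2; i-- > 0;)
            sift_down(a, n, i);
    }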
> actions trump asymptotic complexity
Which operation of a linked list actually has better complexity? The only one I can think of is frequent removal of random elements in the middle, and only if you opt not to use tombstones.
Linked lists have a higher constant cost memory-wise and are much slower to iterate, and removals of a random element in the middle are effectively the same as for a vector. When used as queues, ring buffers are better for most applications.
Their prime advantage is that adding new elements lacks a terrible worst case, aside from the fact that it puts more pressure on the memory allocator, which needs to be concurrency-friendly.
Where there is full control over memory allocation (e.g. in kernels), linked lists make sense; for virtually all other cases there are better data structures.
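To illustrate the queue point, here is a minimal fixed-capacity ring buffer (capacity and names are my own, chosen as a power of two so a mask replaces the modulo):

    #include <stdbool.h>
    #include <stddef.h>

    #define QCAP 1024  /* must be a power of two */

    /* Contiguous storage, no per-element allocation, O(1) push and pop.
     * head and tail only ever increase; unsigned wraparound is harmless
     * because the indices are masked on every access. */
    typedef struct {
        int    buf[QCAP];
        size_t head, tail;  /* tail - head == number of queued elements */
    } Ring;

    static bool ring_push(Ring *q, int v) {
        if (q->tail - q->head == QCAP) return false;  /* full */
        q->buf[q->tail++ & (QCAP - 1)] = v;
        return true;
    }

    static bool ring_pop(Ring *q, int *out) {
        if (q->tail == q->head) return false;         /* empty */
        *out = q->buf[q->head++ & (QCAP - 1)];
        return true;
    }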
I'd first like to point out that yours is an argument from the perspective that performance is crucial and dependent on the use or non-use of linked lists. This is not necessarily the case. Performance may not be crucial, and even when it is, linked lists can find use in areas where they're not even nearly the bottleneck, for being the most obvious data structure that satisfies the required operations on the collection. That pointer chasing is slow is not an argument against the use of linked lists in general.
What if I only ever iterate over a list of some hundred entries in the rare case of an error? Should I be reaching for the most performant data structure or the one that most obviously and clearly maps to the operations the program needs to perform on it at some infinitesimal cost?
> Linked lists have a higher constant cost memory-wise
Than a fixed vector with a known maximum number of elements, yes. With something like a dynamic array, however, you are usually better off memory-wise using a linked list, unless you want to reallocate every few insertions or your node data is smaller than the link.
> removals of a random element in the middle are effectively the same as for a vector
You're conflating concepts. Seeking is O(n); removal is O(1). There are cases where seeking for the purpose of removal is either amortized (e.g. you're already iterating over all the nodes, executing an associated handler for each, which may or may not determine that the node should be unlinked in place) or completely unnecessary (some other data structure already tracks a reference to the node, and code associated with that structure determines if and when it is removed).
One might similarly argue that I am conflating compacting (O(n)) with removal (O(1)) in the vector case. But for linked lists the distinction is a natural consequence of the design, while in a typical vector implementation they are one and the same operation, and separating them requires an additional data structure to track the holes.
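To make the distinction concrete, here is what removal alone looks like when some other structure already holds a pointer to the node (an intrusive doubly linked list; field names are my own):

    #include <stddef.h>

    typedef struct Node {
        struct Node *prev, *next;
    } Node;

    typedef struct {
        Node *head, *tail;
    } List;

    /* O(1): touch only the neighbors; no seeking, no shifting of other
     * elements. Whoever handed us this pointer already paid for (or never
     * needed) the O(n) seek. */
    static void list_unlink(List *l, Node *n) {
        if (n->prev) n->prev->next = n->next; else l->head = n->next;
        if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
        n->prev = n->next = NULL;
    }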
Moving elements around in memory also comes with a host of other issues; for example, when there are many references to the elements, it forces you to maintain an index array and access the elements indirectly through it.
> aside from the fact that it puts more pressure on the memory allocator, which needs to be concurrency-friendly
This is again a conflation of concepts: using linked lists is one thing; making space for the nodes, or preparing an allocator for concurrent use, is another. The nodes could live on the stack for some applications and in a static pool for others.
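As a sketch of the static-pool case (sizes and names are my own): the nodes live in a fixed array, and "allocation" is just popping a free list, so no general-purpose allocator is ever involved.

    #include <stddef.h>

    typedef struct PoolNode {
        struct PoolNode *next;
        int payload;
    } PoolNode;

    static PoolNode  pool[64];   /* all nodes live here, statically */
    static PoolNode *free_list;

    static void pool_init(void) {
        for (size_t i = 0; i + 1 < 64; i++)
            pool[i].next = &pool[i + 1];
        pool[63].next = NULL;
        free_list = &pool[0];
    }

    /* Pop a node off the free list; returns NULL when the pool is empty. */
    static PoolNode *pool_alloc(void) {
        PoolNode *n = free_list;
        if (n) free_list = n->next;
        return n;
    }

    static void pool_free(PoolNode *n) {
        n->next = free_list;
        free_list = n;
    }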
What do you think is the data structure underneath a hash table, a stack, or a queue? It's likely a linked list.
So, you are completely correct. The issue is simply that you are at a different level of abstraction.
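A separate-chaining hash table is the classic example: an array of buckets where each bucket is a singly linked list. A minimal sketch (the hash function and sizes are arbitrary choices of mine):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define NBUCKETS 256  /* power of two, so the hash can be masked */

    typedef struct Entry {
        struct Entry *next;   /* each bucket is a singly linked list */
        const char   *key;
        int           value;
    } Entry;

    static Entry *buckets[NBUCKETS];

    static size_t bucket_of(const char *s) {
        uint64_t h = 0xcbf29ce484222325ULL;            /* FNV-1a */
        while (*s) { h ^= (unsigned char)*s++; h *= 0x100000001b3ULL; }
        return (size_t)(h & (NBUCKETS - 1));
    }

    /* Insert or update; colliding keys simply chain in the bucket's list. */
    static int put(const char *key, int value) {
        size_t b = bucket_of(key);
        for (Entry *e = buckets[b]; e; e = e->next)
            if (strcmp(e->key, key) == 0) { e->value = value; return 0; }
        Entry *e = malloc(sizeof *e);
        if (!e) return -1;
        e->key = key; e->value = value;
        e->next = buckets[b];                          /* push onto the chain */
        buckets[b] = e;
        return 0;
    }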