Item 43680766

rowanG077 • 6 days ago

All of these sound like grievous sources of bugs to me. Sure it's nice that it's possible but please don't do those things unless there isn't a safer way to do it.

Diggsey • 6 days ago

They are definitely more error-prone usage patterns, and coming from Rust the amount of hidden unsafety is quite unsettling. In addition to the obvious potential for user error, this parent pointer trick could even lead to miscompilations, depending on the aliasing model of your language/compiler (which I suspect Zig hasn't figured out yet).

Having said that, intrusive linked lists are quite an important data structure in code which is required to be allocation-free. Kernel code, signal handlers and code which requires consistent performance (e.g. audio processing) need to be able to move objects between containers, or have objects in multiple containers, without allocating.
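For readers unfamiliar with the "parent pointer trick", here is a minimal C++ sketch. It assumes a hypothetical `Task` type with an embedded `ListNode` hook; every name in it is illustrative. Recovering the owning object from a pointer to its hook is the same idea as the `container_of` macro in C or `@fieldParentPtr` in Zig. Whether this pointer arithmetic is fully blessed by a given language's aliasing rules is exactly the concern raised above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical intrusive hook: the links live inside the object itself.
struct ListNode {
    ListNode* prev;
    ListNode* next;
};

// Circular list with a sentinel head, so no null checks are needed.
inline void list_init(ListNode* head) { head->prev = head; head->next = head; }

inline void list_push_back(ListNode* head, ListNode* n) {
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

inline void list_unlink(ListNode* n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n;
    n->next = n;
}

// Hypothetical object that can sit in a list without any extra allocation.
struct Task {
    int id;
    ListNode run_queue;  // embedded hook
};

// The "parent pointer trick": step back from the hook to the enclosing Task.
inline Task* task_from_hook(ListNode* n) {
    return reinterpret_cast<Task*>(
        reinterpret_cast<char*>(n) - offsetof(Task, run_queue));
}

// Moves one task from a "ready" list to a "blocked" list; nothing is allocated.
inline int demo_move() {
    ListNode ready, blocked;
    list_init(&ready);
    list_init(&blocked);

    Task t{42, {}};
    list_push_back(&ready, &t.run_queue);
    list_unlink(&t.run_queue);
    list_push_back(&blocked, &t.run_queue);

    bool ready_is_empty = (ready.next == &ready);
    return ready_is_empty ? task_from_hook(blocked.next)->id : -1;
}
```

Nothing in `task_from_hook` checks that the hook really belongs to a `Task`; passing the sentinel head by mistake compiles fine and yields a bogus pointer, which is the kind of hidden unsafety being discussed.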

3 replies

rowanG077 • 6 days ago

Use a separate data structure and then index into it with a linked-list-like API. No need to do something so dangerous if you simply need an allocation-free linked list interface.

But yes there is a place for these data structures. It just shouldn't be common.
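One possible reading of this suggestion, as a hedged sketch: a fixed-capacity pool whose links are array indices rather than pointers. `IndexList`, `kNone` and the method names are assumptions made up for illustration, not an existing library API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNone = 0xFFFFFFFFu;  // "no slot" marker (assumption)

// Hypothetical fixed-capacity singly linked list; links are indices into slots_.
template <typename T, std::uint32_t N>
class IndexList {
    static_assert(N > 0, "capacity must be non-zero");

public:
    IndexList() {
        // Chain every slot into the free list: 0 -> 1 -> ... -> N-1 -> none.
        for (std::uint32_t i = 0; i < N; ++i) {
            slots_[i].next = (i + 1 < N) ? i + 1 : kNone;
        }
        free_ = 0;
    }

    // Returns the slot index used, or kNone when the pool is full. Never allocates.
    std::uint32_t push_front(const T& value) {
        if (free_ == kNone) return kNone;
        std::uint32_t i = free_;
        free_ = slots_[i].next;
        slots_[i].value = value;
        slots_[i].next = head_;
        head_ = i;
        return i;
    }

    // Pops the most recently pushed value into `out`; returns false when empty.
    bool pop_front(T& out) {
        if (head_ == kNone) return false;
        std::uint32_t i = head_;
        out = slots_[i].value;
        head_ = slots_[i].next;
        slots_[i].next = free_;  // hand the slot back to the free list
        free_ = i;
        return true;
    }

private:
    struct Slot {
        T value;
        std::uint32_t next;
    };
    std::array<Slot, N> slots_{};
    std::uint32_t head_ = kNone;
    std::uint32_t free_ = kNone;
};

inline bool demo_index_list() {
    IndexList<int, 2> list;
    bool ok = list.push_front(1) != kNone;
    ok = ok && list.push_front(2) != kNone;
    ok = ok && list.push_front(3) == kNone;  // full: fails instead of allocating
    int v = 0;
    ok = ok && list.pop_front(v) && v == 2;
    ok = ok && list.pop_front(v) && v == 1;
    return ok && !list.pop_front(v);
}
```

The capacity is fixed at compile time in this sketch, so the question of what happens when the backing storage has to grow does not arise here.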

1 reply

Diggsey • 6 days ago

If you just mean using array indexes instead of pointers, that can work in some cases, but can run into concurrency issues when the container needs to be resized. With pointers to nodes, new nodes can be allocated and old nodes freed on a separate thread without interfering with the performance sensitive thread, whereas resizing an array requires all access to stop while the backing memory is re-allocated.

I don't think you mean this, but to elaborate on why a non-intrusive linked list is not very useful: given an object, it takes linear time to find it in a non-intrusive linked list, but constant time in an intrusive one, because the links are stored in the object itself.

For example:

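    // Payload is whatever the node carries; its definition is omitted here.
    struct Node {
        Node* prev_by_last_update;  // links for the list ordered by last update
        Node* next_by_last_update;
        Node* prev_with_owner;      // links for the list of nodes sharing an owner
        Node* next_with_owner;
        Payload data;
    };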

This intrusive structure allows reassigning and updating nodes, whilst being able to efficiently find e.g. the next most recently updated node, or to iterate over nodes with the same owner.
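A rough sketch of why those operations are constant-time, reusing the field names from the struct above. The `Payload` definition, the sentinel-based circular lists and all helper names (`Link`, `unlink_from`, `insert_after`, `touch`, `reassign`) are assumptions added for illustration.

```cpp
#include <cassert>

struct Payload { int value; };  // placeholder payload (assumption)

struct Node {
    Node* prev_by_last_update;
    Node* next_by_last_update;
    Node* prev_with_owner;
    Node* next_with_owner;
    Payload data;
};

using Link = Node* Node::*;  // pointer to one of the four link fields

// Each list is circular with a sentinel Node, so there are no null checks.
inline void init_sentinel(Node& s) {
    s.prev_by_last_update = s.next_by_last_update = &s;
    s.prev_with_owner = s.next_with_owner = &s;
}

// Unlink `n` from one list, given the pair of link fields that list uses. O(1).
inline void unlink_from(Node& n, Link prev, Link next) {
    (n.*prev)->*next = n.*next;
    (n.*next)->*prev = n.*prev;
}

// Insert `n` directly after `pos` in one list. O(1).
inline void insert_after(Node& pos, Node& n, Link prev, Link next) {
    n.*prev = &pos;
    n.*next = pos.*next;
    (pos.*next)->*prev = &n;
    pos.*next = &n;
}

// Mark an already-linked node as most recently updated: no search, no allocation.
inline void touch(Node& updates, Node& n) {
    unlink_from(n, &Node::prev_by_last_update, &Node::next_by_last_update);
    insert_after(updates, n, &Node::prev_by_last_update, &Node::next_by_last_update);
}

// Move an already-linked node to another owner's list.
inline void reassign(Node& new_owner, Node& n) {
    unlink_from(n, &Node::prev_with_owner, &Node::next_with_owner);
    insert_after(new_owner, n, &Node::prev_with_owner, &Node::next_with_owner);
}

inline int demo_two_lists() {
    Node updates{}, owner_a{}, owner_b{};
    init_sentinel(updates);
    init_sentinel(owner_a);
    init_sentinel(owner_b);

    Node x{}, y{};
    x.data.value = 1;
    y.data.value = 2;
    insert_after(updates, x, &Node::prev_by_last_update, &Node::next_by_last_update);
    insert_after(updates, y, &Node::prev_by_last_update, &Node::next_by_last_update);
    insert_after(owner_a, x, &Node::prev_with_owner, &Node::next_with_owner);
    insert_after(owner_a, y, &Node::prev_with_owner, &Node::next_with_owner);

    touch(updates, x);     // x becomes the most recently updated node
    reassign(owner_b, y);  // y now belongs to owner_b; x stays with owner_a

    // Encode "most recent node" and "first node of owner_b" as two digits.
    return updates.next_by_last_update->data.value * 10 +
           owner_b.next_with_owner->data.value;
}
```

Both `touch` and `reassign` start from a pointer to the node itself, which is what removes the linear search a non-intrusive list would need.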

bananabiscuit • 6 days ago

Do you mean you will likely make an error when implementing a standard textbook linked list? Or that a standard textbook linked list has non-obvious errors in its design? If so, would be curious to learn more.

2 replies

simonask • 6 days ago

Most programmers can implement a linked list.

Extremely few can use one safely.

They're an elegant weapon for a more civilized age - namely one without multithreading, aliasing-based optimization, and a host of other nice things.

2 replies

tauoverpi • 6 days ago

Multithreading is exactly where linked lists are useful. In the implementation of MPSC, MPMC, SPMC, and similar queues, they allow one to safely construct objects out of view of other threads, then use a CAS to append the new item to the list, making it visible for other threads to consume without needing to allocate additional memory.
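A minimal sketch of that publish-by-CAS pattern, assuming C++11 atomics. `Msg`, `Inbox`, `push` and `take_all` are hypothetical names; this is a LIFO variant in which a single consumer detaches the whole chain at once, not a full FIFO queue.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical message with an intrusive link; the producer fully constructs it
// before publishing, so no other thread sees a half-built object.
struct Msg {
    int value;
    Msg* next;
};

class Inbox {
public:
    // Any thread: publish `m`. The retry loop makes this lock-free, not wait-free.
    void push(Msg* m) {
        Msg* old = head_.load(std::memory_order_relaxed);
        do {
            m->next = old;  // link to the head we currently believe in
        } while (!head_.compare_exchange_weak(old, m,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
        // On failure, compare_exchange_weak reloads `old`, so the loop retries
        // against the updated head.
    }

    // Single consumer: detach everything published so far (newest first).
    Msg* take_all() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<Msg*> head_{nullptr};
};

// Single-threaded walk-through of the API; real use would push from several threads.
inline int demo_inbox() {
    Inbox inbox;
    Msg a{1, nullptr}, b{2, nullptr}, c{3, nullptr};
    inbox.push(&a);
    inbox.push(&b);
    inbox.push(&c);

    int digits = 0;
    for (Msg* m = inbox.take_all(); m != nullptr; m = m->next) {
        digits = digits * 10 + m->value;
    }
    return digits;  // values in newest-first order
}
```

Taking the whole chain with one `exchange` sidesteps the ABA hazard that a CAS-based pop of single nodes would have to deal with; a consumer that needs FIFO order can reverse the detached chain.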

They're also pretty useful for free-list structures, which overwrite entries to be removed with linked list nodes that fit in the same space (the entries are scheduled for deletion, so the assumption here is that the object is now dead). This way you never need to allocate any additional memory, as you can reuse the existing space.
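A minimal sketch of the free-list idea, assuming a hypothetical trivially destructible `Particle` type and a fixed-size `ParticlePool`. The link to the next free slot is written into the storage of the dead object, so the free list costs no memory beyond the pool itself.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Particle { double x, y, vx, vy; };  // hypothetical pooled object

template <std::size_t N>
class ParticlePool {
public:
    ParticlePool() {
        for (std::size_t i = 0; i < N; ++i) release_slot(&slots_[i]);
    }

    // Returns nullptr when the pool is exhausted; never touches the heap.
    Particle* create(const Particle& init) {
        if (free_ == nullptr) return nullptr;
        Slot* s = free_;
        free_ = s->next_free;                    // pop the free list
        return new (&s->object) Particle(init);  // reuse the slot's storage
    }

    // The object is dead after this call; its bytes now hold a free-list link.
    void destroy(Particle* p) {
        p->~Particle();
        release_slot(reinterpret_cast<Slot*>(p));
    }

private:
    // A slot is either a live object or a link to the next free slot.
    union Slot {
        Particle object;
        Slot* next_free;
    };

    void release_slot(Slot* s) {
        s->next_free = free_;
        free_ = s;
    }

    Slot slots_[N];
    Slot* free_ = nullptr;
};

inline bool demo_pool() {
    ParticlePool<2> pool;
    Particle* a = pool.create({0.0, 0.0, 1.0, 1.0});
    Particle* b = pool.create({5.0, 5.0, 0.0, 0.0});
    bool full = (pool.create({}) == nullptr);  // both slots are in use

    void* a_address = a;
    pool.destroy(a);
    Particle* d = pool.create({9.0, 9.0, 0.0, 0.0});  // reuses a's slot

    return a_address != nullptr && b != nullptr && full &&
           static_cast<void*>(d) == a_address && d->x == 9.0;
}
```

Using a pointer after `destroy` would read free-list bytes as a `Particle`, which is the user error this design cannot prevent on its own.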

1 reply

simonask • 5 days ago

Yeah, wait-free data structures often contain linked lists. That doesn’t cancel out all of the other problems.

Anecdotally, I’ve seen wait-free structures used incorrectly many more times than I’ve seen them used correctly. Some people think they are magically faster because “acquiring a mutex lock is expensive”, but that’s far from always the case.

1 reply

mgaunard • 2 days ago

A CAS loop, as suggested in the message you're replying to, is certainly not wait-free, only lock-free.

codr7 • 4 days ago

I don't see a problem with their safety, even in C.

More the opposite, their simplicity makes them safer to deal with than the alternatives. And sometimes their unique features (mostly the intrusive variant) are exactly what you need.

Multi-threaded accesses to a vector would be just as bad?

im3w1l • 6 days ago

Implementing anything is more error prone than using something battle-tested.

throwawaymaths • 6 days ago

Maybe advocate for wrappers in the stdlib that implement a typesafe LL, a typesafe extrusive LL, etc., using this as a primitive?

tauoverpi • 6 days ago

You don't need everything to be generic to have safety: the API provided by the object using the linked list can maintain safety by making it difficult to do the wrong thing and by being correct by construction. Of course, that doesn't stop a user of a library from ignoring the safe way to handle types which use intrusive linked list nodes, but that same user is likely to shoot themselves in the foot regardless, even when given a "safer" interface.

Trying to write a safer version of the above without losing the performance / advantages takes you into the realm of ATS2 https://www.cs.bu.edu/~hwxi/atslangweb/ which quickly becomes more difficult to work with than is practical (and no, Rust can't solve this with its implementation of resource types as far as I'm aware).

So yes, doing the above makes sense when the problem demands it. Allocating memory when there's a clear way to represent the graph without it just wastes cycles on memory management you didn't need to do.

immibis • 6 days ago

bugs* ironic

It depends on whether you approach programming from a perspective of writing piles of obviously correct abstraction layers (the total effect of which is to leak so much the solution ends up being wrong anyway), or from the direct unabstracted perspective of actually making the computer do what you want it to do (which does not preclude the solution from using abstractions).

I call them the Uncle Bob vs Casey Muratori school, after Casey's lectures such as "Clean Code, Horrible Performance": https://www.youtube.com/watch?v=tD5NrevFtbU

2 replies

rowanG077 • 6 days ago

Well, Uncle Bob gives good advice, if you do exactly the opposite of what he tells you to do. So I don't think this comparison is apt at all.

1 reply

immibis • 6 days ago

Can you elaborate on why he's a hack and perhaps how this relates to the idea of forcing yourself to use highly constrained inefficient "safe" data structures?

1 reply

rowanG077 • 6 days ago

Because the excessive OOP and TDD he recommends is just bad. He also hates static typing and seems to think testing can somehow replace it.

I'm not for forcing inefficient safe data structures. You are pretending that this is a dichotomy when the vast majority of the time it isn't. You are completely misinterpreting what I said. I'm for safe data structures. And the data structures described are huge footguns. I made no point about efficiency.

1 reply

Ygg2 • 6 days ago

In programming and life, both extremes can be equally wrong. I'm not saying Uncle Bob is the shining beacon to steer your boat towards at all cost, but neither is Casey Muratori.

> He also hates static typing

I mean, in OOP there are many reasons to dislike it. It's harder to extend and test. Calls to such functions within other methods are a nuisance to remove. E.g. `Instant.now()`.

1 reply

rowanG077 • 6 days ago

I'm sorry I'm not familiar with Casey Muratori. So I have no clue what he says.

1 reply

Ygg2 • 6 days ago

He's very data and performance oriented. He gives advice on how to write games, high-performance systems. His advice is solid and great even, but not when your source of latency is a cloud DB.

1 reply

immibis • 6 days ago

I beg to differ even on that. When your source of latency is a cloud DB, learning about CPU caches won't help you, but learning to think about end-to-end performance absolutely will. "What sequence of queries achieves my goal with the shortest critical path?" is very similar to "What sequence of vector instructions achieves my goal with the shortest loop-carried dependency?"
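To make "shortest critical path" concrete, here is a small sketch that computes the critical-path length of a set of steps with dependencies. The steps could equally be queries or instructions. The `Step` type, the function names and the 80 ms latencies are all made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Step {
    int latency_ms;         // cost of this query (or instruction)
    std::vector<int> deps;  // indices of earlier steps it has to wait for
};

// Assumes steps are listed so that every dependency index is smaller than the
// index of the step that depends on it (a topological order).
inline int critical_path_ms(const std::vector<Step>& steps) {
    std::vector<int> finish(steps.size(), 0);
    int longest = 0;
    for (std::size_t i = 0; i < steps.size(); ++i) {
        int start = 0;
        for (int d : steps[i].deps) start = std::max(start, finish[d]);
        finish[i] = start + steps[i].latency_ms;
        longest = std::max(longest, finish[i]);
    }
    return longest;
}

// Three round trips issued strictly one after another.
inline int demo_chained() {
    std::vector<Step> steps = {{80, {}}, {80, {0}}, {80, {1}}};
    return critical_path_ms(steps);
}

// The first two round trips are independent and can overlap; only the third waits.
inline int demo_fan_out() {
    std::vector<Step> steps = {{80, {}}, {80, {}}, {80, {0, 1}}};
    return critical_path_ms(steps);
}
```

With these made-up numbers the chained plan has a critical path of 240 ms and the fan-out plan 160 ms, even though both do the same total amount of work.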

1 reply

Ygg2 • 5 days ago

> but learning to think about end-to-end performance absolutely will.

In a cloud DB webservice, what will save you is having a good tracing/debugging story. From it, it will be easy to see E2E performance. Sure, some things will translate, but going for a SIMD-optimized layout on a class that will have to talk to a DB across the ocean is a fool's errand.

To optimize you must have a good target and make sure you are measuring correctly.

1 reply

immibis • 4 days ago

Why do you think that making a minimum number of queries to a cloud database is anything other than conceptually similar to making a SIMD optimized layout on a class?

1 reply

Ygg2 • 4 days ago

That's not what I was talking about. Optimizing queries to DBs is nothing like optimizing SIMD layout.

Ygg2 • 6 days ago

> do what you want it to do

Until a colleague or future you, or someone else calling the code, forgets.

Which leads to "I WANT TO KILL WHOEVER WROTE THIS" debugging sessions.

Encapsulation exists for a reason.