I'd like to refute the 'channels are slow' part of this article.
If you run a microbenchmark, which seems to be what was done here, then channels look slow.
If you test contention with thousands of goroutines on a high-core-count machine, there is a significant inflection point past which channels start outperforming sync.Mutex.
The reason is that a sync.Mutex, if left waiting long enough, enters a slow code path and, if memory serves, calls out to a kernel futex. A channel will not do this, because the mutex a channel is built on lives in the Go runtime - that's the special sauce the author complains doesn't exist but didn't try hard enough to seek out.
Anecdotally, we have ~2M lines of Go and use channels extensively in a message-passing style. We do not use channels to increment a shared number, because that's ridiculous and the author's contrived example is disingenuous. No serious Go shop is using a channel for that.
Do you have any benchmarks for the pattern you described where channels are more efficient?
> sync.Mutex, if left to wait long enough will enter a slow code path and if memory serves, will call out to a kernel futex. The channel will not do this because the mutex that a channel is built with is exists in the go runtime
Do you have any more details about this? Why isn’t sync.Mutex implemented with that same mutex channels use?
> [we] use channels extensively in a message passing style. We do not use channels to increment a shared number
What is the rule of thumb your Go shop uses for when to use channels vs mutexes?
> Do you have any benchmarks for the pattern you described where channels are more efficient?
https://go.dev/play/p/qXwMJoKxylT
go test -bench=.* -run=^$ -benchtime=1x
Since my critique of the OP is that it's a contrived example, I should mention that so is this: the mutex version should use an atomic (sync/atomic's atomic.Int64) and the channel version should have one channel per goroutine if you were trying to write a performant concurrent counter; both of those alternatives would have low or zero lock contention. In production code, I would be using an atomic, of course.
On my 8c16t machine, the inflection point is around 2^14 goroutines, after which the mutex version becomes drastically slower; this is where I believe it starts frequently entering `lockSlow`. I encourage you to run it yourself.
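For reference, here is a minimal sketch of the kind of head-to-head I mean (not the playground code itself; the goroutine and increment counts are placeholders you would sweep upward to find the inflection point):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	goroutines = 1 << 10 // sweep this up toward 1<<14 and beyond
	incsEach   = 100
)

// mutexCount: every goroutine locks a shared sync.Mutex to increment.
func mutexCount() (int, time.Duration) {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	start := time.Now()
	wg.Add(goroutines)
	for i := 0; i < goroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < incsEach; j++ {
				mu.Lock()
				count++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return count, time.Since(start)
}

// chanCount: goroutines send increments over a channel to a single
// consumer goroutine that owns the counter; no memory is shared.
func chanCount() (int, time.Duration) {
	incs := make(chan struct{}, 1024)
	done := make(chan int)
	go func() {
		n := 0
		for range incs {
			n++
		}
		done <- n
	}()
	var wg sync.WaitGroup
	start := time.Now()
	wg.Add(goroutines)
	for i := 0; i < goroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < incsEach; j++ {
				incs <- struct{}{}
			}
		}()
	}
	wg.Wait()
	close(incs)
	n := <-done
	return n, time.Since(start)
}

func main() {
	mc, mt := mutexCount()
	cc, ct := chanCount()
	fmt.Printf("mutex:   count=%d took=%v\n", mc, mt)
	fmt.Printf("channel: count=%d took=%v\n", cc, ct)
}
```

Both versions should always report the same final count; only the timing diverges as you raise the goroutine count.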
> Do you have any more details about this? Why isn’t sync.Mutex implemented with that same mutex channels use?
Why? Designing and implementing concurrent runtimes has not made its way onto my CV yet; hopefully a lurking Go contributor can comment.
The channel mutex (https://go.dev/src/runtime/chan.go) is not the same mutex as a sync.Mutex (https://go.dev/src/internal/sync/mutex.go).
If I had to guess, the channel mutex may be specialised, since it only protects enqueuing onto or dequeuing from a simple buffer, while a sync.Mutex is a general construct that can protect any kind of critical region.
> What is the rule of thumb your Go shop uses for when to use channels vs mutexes?
Rule of thumb: if it feels like a Kafka use case but within the bounds of the local program, it's probably a good bet.
If the communication pattern is passing streams of work where goroutines have an acyclic communication dependency graph, then it's a no brainer: channels will be performant and a deadlock will be hard to introduce.
If you are using channels to protect shared memory, and you can squint and see a badly implemented Mutex or WaitGroup or Atomic, then you shouldn't be using channels.
Channels shine where goroutines are just pulling new work from a stream of work items. At least in my line of work, that is about 80% of the cases where a synchronization primitive is used.
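The "stream of work items" case above sketches out to something like this fan-out pool (function and job names are my own; squaring stands in for real work):

```go
package main

import (
	"fmt"
	"sync"
)

// squareSum fans n jobs out to `workers` goroutines over a channel and
// collects the results. Producer -> workers -> collector is an acyclic
// communication graph, so a deadlock is hard to introduce.
func squareSum(n, workers int) int {
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- j * j // stand-in for real work
			}
		}()
	}
	// Close results only after every worker has drained jobs.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Producer: feed the work stream, then close it.
	go func() {
		for j := 1; j <= n; j++ {
			jobs <- j
		}
		close(jobs)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(squareSum(10, 4)) // 385
}
```

The whole lifecycle is expressed by closing channels in dependency order, which is exactly what makes this pattern hard to get wrong.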
Thanks for the example! I'll play around with it.
> On my machine, the inflection point is around 10^14 goroutines - after which the mutex version becomes drastically slower;
How often are you reaching 10^14 goroutines accessing a shared resource on a single process in production? We mostly use short-lived small AWS spot instances so I never see anything like that.
> Why? Designing and implementing concurrent runtimes has not made its way onto my CV yet; hopefully a lurking Go contributor can comment.
> If I had to guess, the channel mutex may be specialised since it protects only enqueuing or dequeuing onto a simple buffer. A sync.Mutex is a general construct that can protect any kind of critical region.
Haha fair enough, I also know little about mutex implementation details. Optimized specialized tool vs generic tool feels like a reasonable first guess.
Though I wonder: if you use channels for more generic mutex purposes, are they less efficient in those cases? I guess I'll have to do some benchmarking myself.
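The pattern I'd benchmark is the capacity-1 buffered channel used as a lock (a minimal sketch, names my own): send acquires, receive releases.

```go
package main

import "fmt"

// countWithChanMutex uses a buffered channel of capacity 1 as a lock:
// sending acquires it, receiving releases it. It is correct, but it is
// exactly the "squint and see a badly implemented Mutex" shape -
// illustrative only.
func countWithChanMutex(goroutines int) int {
	sem := make(chan struct{}, 1) // capacity 1: at most one holder
	done := make(chan struct{})
	count := 0
	for i := 0; i < goroutines; i++ {
		go func() {
			sem <- struct{}{} // Lock: blocks while another holder exists
			count++
			<-sem // Unlock: frees the slot
			done <- struct{}{}
		}()
	}
	for i := 0; i < goroutines; i++ {
		<-done
	}
	return count
}

func main() {
	fmt.Println(countWithChanMutex(100)) // 100
}
```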
> If the communication pattern is passing streams of work where goroutines have an acyclic communication dependency graph, then it's a no brainer: channels will be performant and a deadlock will be hard to introduce.
I agree with your rules. I used to always use channels for single-process thread-safe queues (similar to your Kafka rule), but recently I ran into a cyclic communication pattern with a queue and eventually relented and used a Mutex. I wonder if there are other painful channel concurrency patterns lurking for me to waste time on.
> How often are you reaching 10^14 goroutines accessing a shared resource on a single process in production? We mostly use short-lived small AWS spot instances so I never see anything like that.
I apologize, that should've said 2^14, each sub-benchmark is a doubling of goroutines.
2^14 is 16,384, which for contention on a shared resource is quite a reasonable order of magnitude.
> We do not use channels to increment a shared number, because that's ridiculous and the author is disingenuous in their contrived example. No serious Go shop is using a channel for that.
Talk about knocking down strawmen: it's a stand-in for shared state, and understanding that should be a minimum bar for serious discussion.
And implying I don't understand toy examples and responding with this is apparently above the bar for serious discussion.
Dude. Criticizing a contrived example that would never be used in production code is fair game. You didn't need to go all "but actually!" here.
According to the article, channels are slow because they use mutexes under the hood. So it doesn't follow that channels are better than mutexes for large N. Or is the article wrong? Or my reasoning?
I have replied to another comment with more details: the channel mutex is not the same one that sync.Mutex is using.
The article that the OP references does not show the code for its benchmark, but I have to assume it isn't using a large number of goroutines.