> riddled with state machines
Why is this bad? Normally, state machines are easy to reason about. The set of developers who say "I want to implement this logic as a state machine" is MUCH larger than the set of developers who say "I should make sure I fully understand every possible state and edge case ahead of time before making a state machine!"
> "I should make sure I fully understand every possible state and edge case ahead of time before making a state machine!"
Attempting to understand every state and edge case before writing code is a fool's errand because it would amount to writing the entire program anyway.
State machines are a clear, concise, elegant pattern to encapsulate logic. They're dead simple to read and reason about. And, get this, writing one FORCES YOU to fully understand every possible state and edge case of the problem you're solving.
You either have an explicit state machine, or an implicit one. In my entire career I have never regretted writing one the instant I even smell ambiguity coming on. They're an indefatigable sword to cut through spaghetti that's had poorly interacting logic sprinkled into it by ten devs over ten years, bring it into the light, and make the question and answer of how to fix it instantly articulable and solvable.
I truly don't understand what grudge you could have against the state machine. Of all the patterns in software development I'd go as far as to hold it in the highest regard above all others. If our job is to make computers do what we want them to do in an unambiguous and maintainable manner then our job is to write state machines.
> And, get this, writing one FORCES YOU to fully understand every possible state and edge case of the problem you're solving.
lol? ;P Here is just one example of a bug I know of that should exist, today, in Chrome, because, in fact, state machines are extremely hard to reason about. (I have filed previous ones that were fixed, and tons of bugs in Chrome in general are in this category. This one is top of mind as they haven't even acknowledged it yet.)
https://issues.webrtc.org/issues/339131894
Now, you are going to tell me "they are doing state machines wrong as they don't have a way to discriminate on what the state even is in the first place"... and yet, that's the problem: the term "state machine" does not, in fact, mean a very narrow piece of inherent algorithmic limitations any more than "regular expressions" implies the language is regular, as this is an engineering term, not a computer science one.
In the field, state machines just kind of happen by accident when a ton of engineers all try to add their own little corner of logic, and the state is then implied by the cross-product of the state of every variable manipulated by the edges of the machine. This results in a complete mess where, in fact, it is essentially impossible to prove that you've provided edges for every possible state. Nothing in the type system saves you from this, as the state isn't reified.
This contrasts with approaches to these problems that involve more structured concurrency, wherein the compiler is able to not only deduce but even some concept of what kinds of edges are possible and where the state lies. (This, FWIW, is a reason why async/await is so much preferable to the kind of callback hell that people would find themselves in, maintaining a massive implicit state machine and hoping they covered all the cases.)
>State machines are a clear, concise, elegant pattern to encapsulate logic. They're dead simple to read and reason about. And, get this, writing one FORCES YOU to fully understand every possible state and edge case of the problem you're solving.
It doesn't force you to do that at all.
You can start piling in hacks to handle edge cases inside of certain states, for instance, instead of splitting them into their own states. Or the next dev does.
Now it's an implicit ball of mud that pretends to be something else and has a execution pattern that's different from the rest of your company's business logic but not actually strictly "correct" still or easier to reason about for the edge cases.
And that's what most people do. They don't use it as a tool to force them to make things unambiguous. They bail when it gets hard and leave crappy implementations behind.
Copy-pasting from another reply to a different comment: As a simple example of something that's often left out in a way that fucks up a lot of devs' attempts at state machines, and is super annoying to draw in a typical state diagram: the passing of time.
This just hasn't been my experience but I suppose it's possible if your team is determined enough to write bad code. I'd still wager a bungled state machine is probably fairly easier to fix than a bungled mess of branches, but I've never seen such a thing.
I actually use passage of time as a factor in state machines all the time on game dev projects. It's pretty simple, just store a start time and check against it. I don't see how "ten seconds have passed since entering state A" is a more difficult condition than any other to draw or model.
In my experience Ye Olde Web App backend services tend to be particularly bad with time because so much is generally done in the request/response model.
For business-logic reasons, where I've generally seen it fall apart is when things go from fairly simple things like "after six months of inactivity this account is considered idle" to more complex interactions of timers and activity types. "Move fast and break things", "just get to MVP" attitudes rarely have the discipline to formally draw all the distinct states out as the number of potential states starts to exceed a couple dozen.
The times I’ve bothered to write explicit state machines have created the most solid, confident and bug-free pieces of software I’ve ever built. I would send someone to the moon with them.
Couldn't this be said about any alternative solution? I fail to see how this is specific to state machines.
What do you suggest instead of a state machine?
The "riddled with state machines" from the post I was replying to, while sounding negative, is at least better than the "single state machine" which is probably combinatorially huge and would be impossible to maintain.
My rough rule of thumb based on experience is that if the state machine being a state machine is visible outside of it's internal implementation (compared to just an interface with operational methods that don't hint at how things are managed behind the scenes) it's probably too leaky and/or incomplete.
I would trust code with extensive state-transition testing (regardless of internal implementation) - I wouldn't trust code that claimed to implement a state machine and didn't have that testing, or extensive documentation of edge cases and what was left out of the state machine.
As a simple example of something that's often left out in a way that fucks up state machines: the passing of time.
Like properly model a domain in domain terms?
And that won't be a state machine with the states having more fancy names?
It will be, but the idea of having an overview over the states is gone then. There is just modules-> objects with the transitions being method calls. Nobody will have to know all the things about all the state transitions, resulting in another problem (dys)solved by architecture obscurity.
If needs be the state-machine can be reconstructed on a whiteboard by a team of five.
A state machine makes the actual program state first class and easy to reason about. One does not even need mutable state to model one. Whereas you appear to be advocating mutable objects. The state space then becomes a combinatorial explosion of all the hidden mutable state "encapsulated" inside the objects. Object oriented programming is not the only way and often leads to a poor domain model. Some OOP evangelists even model a bank account with a mutable balance field and methods for making deposits. This is a absolutely not a faithful model of the domain (ledgers have been used for hundreds/thousands of years). In summary, yes a state machine can absolutely be a good domain model.
I don't follow the connection you're making.
State machines often are implemented with mutable objects.
And one does not need mutable objects to make "modules-> objects with the transitions being method calls". Every method call could return a fresh, immutable object, nothing requires mutation there.
I'd see a method like:
`TransitionTo(newState)`
as a major smell compared to an explicit
`TransistionToNewState`
and I think OOO can be helpful (hardly required, of course) in that one neat way of streamlining usage of your code is that if you're implementing objects then the object for something in "State A" might not even have "TransitionToStateC" if that's not a valid operation.
(No, you don't HAVE to write state machine code that allows you to ask for invalid operations, but it's a common pattern I've seen in real code and in online discussion/blogs/stack overflow.)
Objects and "method calls" generally implies mutable state to me, but yes the parent was not explicit about this. I assumed mutable (implicit) state was being argued in favour of an explicit state representation. Perhaps I misunderstood.
For a state machine, I would expect a function such as:
transition : (old_state, event) -> new_state
Or if we use immutable objects, and one method per simple event, then something like:
transition_event1 : () -> new_state
Which I think is similar to what you hace. So I think we are in agreement here.
> I assumed mutable (implicit) state was being argued in favour of an explicit state representation.
I definitely was not: I would argue for structured logic rather than implicit state. The idea you are discussing seems to be more about imperative vs. functional design, and that would also be a lot better... but these are Google engineers managing a million interacting state machines via a giant pile of global (mutable) state, resulting in tons of bugs such as this one that isn't even fixed yet:
https://issues.webrtc.org/issues/339131894
A reply I just left elsewhere on this thread, noting that "state machine" doesn't imply the clean perfectly-factored concept you'd learn in a computer science class: these are ad hoc state machines that result from having a number of developers who don't really care and who think they can "dead reckon" the state of the system if they just add enough transition functions.
> these are Google engineers managing a million interacting state machines via a giant pile of global (mutable) state
Yeah that doesn't sound good. I understand the point you are making now and agree.
The problem is, that the state transition information is usually a complex set of events unleashed upon the little world build into objects. Its basically a small reality simulation, parametrized by all sides, similar to a physics simulation with the external world filtered being input.
Now, if we said to someone to model a weather prediction via state-machine- that would be obvious madness. But if we take a small object (like a cubic meter of air) and modelled that part to handle inputs and transfer forces, that generic statemachine will do the job- because the atmospheric ocean of statemachines knows more about the whole system, than a single statemachine.
My point is- there is state in a system, that is not explicitly modeled.
It's interesting to know about what state machines you talk. From my experience most of the time it's an entity with state property with finger countable cardinality. And state is assumed to be changed directly. And it's not easy to reason because author only heard about state machines and state transitions are spread over all code base.
I was talking in the most general sense. I am sure there are state machine implementations that are terrible to reason about, especially any that emerge from a codegen tool. But hopefully they are the exception and not the rule.
Woah, this caught my eye:
> especially any that emerge from a codegen tool
Can you give an example? Implement as a state machine? But. Your program exists as a set of transforms upon memory. Your program is a state machine! You just need to define the proper morpisms to map your problem domain to the computer domain.
Transformations are separable by principle, it's a fundamental property of them that state machines have as an afterthought that is even hard to represent.
It doesn't matter if they have equivalent power. One of those representations fundamentally allows your software to have an architecture, the other doesn't.
How much of software architecture is required because of the architecture? If your program has types that are the possible states, and functions to transform between those states, what architecture is needed beyond that? A grouping of related types, perhaps?
Yeah, just one layer of functions is enough for everybody.
Let's look next at that "compiler" thing and high-level languages. The hardware-native one suffices, no need for all that bloat.
I have a coding problem.
I'll use a state machine!
Now, I have two problems :-(
I've never understood this claim. I find state machines very hard to follow because there's no easy way to tell what paths lead to a given state; they're like using goto instead of functions (indeed they're often implemented that way).
Please describe "normally". State machines can turn into nightmares, just like any design pattern used poorly.
State machines don't have syntax for "transition here when event is encountered no matter what state you are in" so the whole diagram becomes a spaghetti mess if you have a lot of those escape hatches.
> State machines don't have syntax for "transition here when event is encountered no matter what state you are in" so the whole diagram becomes a spaghetti mess if you have a lot of those escape hatches.
I place a note at the top of my diagrams stating what the default state would be on receipt of an unexpected event. There is no such thing as "event silently gets swallowed because no transition exists", because, in implementation, the state machine `switch` statement always has a `default` clause which triggers all the alarm bells.
Works very well in practice; I used to write hard real-time munitions control software for blowing shit up. Never had a problem.
> hard real-time munitions control software for blowing shit up. Never had a problem.
Ha, Ha, Ha! The juxtaposition of these two phrases is really funny. I would like to apply for a position on the Testing team :-)
> Ha, Ha, Ha! The juxtaposition of these two phrases is really funny. I would like to apply for a position on the Testing team :-)
It had its moments: used to go to a range where we'd set off detonators. Once or twice in production on site where we'd set off actual explosives.
State machines don't have a native syntax in C++ at all, so you can structure them however you want. It's easy to structure a state machine, if needed, so that all (or some) states can handle the same event in the same way.
I always thought this framework was neat: https://doc.qt.io/qt-5/statemachine-api.html
Downside of course is now you have a dependency on qt.
The downside is that you're now heap allocating at least one object for every state, and I'm willing to bet that each QState has an associated std::vector-style list of actions, and that each action is also its own object on the heap.
If you can afford to do things like this you can most likely use something other than C++ and save yourself a lot of headaches.
> If you can afford to do things like this you can most likely use something other than C++ and save yourself a lot of headaches.
Surely you can understand that, despite the recent c++ hate, my job doesn't give a fuck and we aren't migrating our massive codebase from c++ to... anything.
Switch + goto is very close to being a native syntax for state machines. It's also very efficient.
I believe HSMs can model this, but don't quote me. :)
Yes, of course in theory nested state machines should be able to model this. I feel like adding more complexity and bending the rules is a bit of a concession.
Back in the days we implemented HSM helper classes in about 500 LoC and generated them from Enterprise Architect. No need to write a GUI yourself, but better to have a visual for documentation and review. Worked very well until we replaced EA with docs-as-code, now I miss that there is no nice simulator and Modeler for that workflow.
They can be. Or they can be... less easy.
Imagine you have an informally-specified, undocumented, at-least-somewhat-incomplete state machine. Imagine that it interacts with several other similar state machines. Still easy to reason about?
Now add multithreading. Still easy?
Now add locking. Still easy?
Cleanly-done state machines can be the cleanest way to describe a problem, and the simplest way to implement it. But badly-done state machines can be a total mess.
Alas, I think that the last time I waded in such waters, what I left behind was pretty much on the "mess" side of the scale. It worked, it worked mostly solidly, and it did so for more than a decade. But it was still rather messy.
> Imagine you have an informally-specified, undocumented, at-least-somewhat-incomplete state machine. Imagine that it interacts with several other similar state machines. Still easy to reason about?
You think that developers that wrote an informally-specified, undocumented, at-least-somewhat-incomplete state-machine would have written that logic as a non-state-machine in a formally-specified, documented and at-least-somewhat-complete codebase?
State-machines are exceptionally easy to reason about because you can at least reverse-engineer a state-diagram from the state-machine code.
Almost-a-state-machine-but-not-quite are exceptionally difficult to reason about because you can not easily reverse-engineer the state-diagram from the state-machine code.
In fact state machines are great for documentation even if the code is not explicitly written as a state machine!
Yes, and it's much better than having a dozen or more `bool` values that indicate some event occurred and put it into some "mode" (e.g. "unhealthy", "input exhausted", etc) and you have to infer what the "hidden state machine is" based on all of those bool values.
Want to add another "bool state"? Hello exponential growth...
But that is just true of any problem-solving/programming technique.
In general, state/event machine transition table and decision table techniques of structuring code are easier to comprehend than adhoc and even worse, poorly understood pattern-based techniques are.
> Now add multithreading. Still easy?
> Now add locking. Still easy?
Don't do that then.
Or rather, either manipulate the state machine from only a single thread at a time; or explicitly turn the multithreading into more states. If you need to wait for something instead of having "do X" you transition into state "doing X". C#-style async does this in a state machine behind the scenes.