Item 42241276

ozim • 2 days ago

How do you find gimmicks from Bob Martin like (d + e*g) which in theory are great but to use it in practice would take loads of coaching?

WalterBright • 2 days ago

I'm not familiar with that gimmick.

One thing I learned, for example, is do not access global immutable state from within a function. All inputs come through the parameters, all outputs through the parameters or the return value.

3 replies

gpderetta • 2 days ago

Global immutable or global mutable? I vehemently agree with the latter, but while I could definitely make a case for the former [1], I think it is a bit too extreme, especially without language support.

Would you access a global M_PI constant? Or another function name? Or would you require every dependency to be passed through?

[1] i.e. a total capability based system.

1 reply

xmcqdpt2 • 1 day ago

Global mutable state is to be avoided at all costs of course, but IMO global immutable state is to be avoided... at some costs.

The main issue comes in when you change (in the code! not as mutation!) the global immutable state and now you have to track down a bunch of usages. If it wasn't global, you could change it only in some local areas and not others.

You aren't likely to change M_PI to a new value (int 3 for performance?) so for pure constants, fine, global immutable state works. However many usages of global state are things like singletons, loggers and string messages that often eventually benefit from being passed in (i18n, testability etc.)

As to ergonomics, you can stuff all that global state into a single instance and have one more parameter that is passed around. It will still allow callers to e.g. change logging for their downstream functions much more easily than with a singleton configuration.
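A minimal sketch of the "single instance passed around" idea, in Python. The names `Context`, `greet`, and `greet_quietly` are hypothetical and chosen only for illustration; the point is that a caller can swap the logger or the message strings for its own downstream calls without touching any global.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Context:
    """Hypothetical bundle of what would otherwise be global state."""
    log: Callable[[str], object]
    greeting: str = "Hello"


def greet(ctx: Context, name: str) -> str:
    # Everything this function depends on arrives through its parameters.
    ctx.log(f"greet called for {name}")
    return f"{ctx.greeting}, {name}!"


def greet_quietly(ctx: Context, name: str) -> str:
    # Override logging for this downstream call only; ctx itself is unchanged.
    quiet = replace(ctx, log=lambda _msg: None)
    return greet(quiet, name)


lines = []
ctx = Context(log=lines.append)

greet(ctx, "Ada")           # logs one line into `lines`
greet_quietly(ctx, "Bob")   # logs nothing
french = replace(ctx, greeting="Bonjour")  # local change, e.g. for i18n
```

Because `Context` is frozen, "changing" it means building a modified copy, so the change stays local to whoever made it.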

pizza-wizard • 2 days ago

As someone without a lot of experience (in my first dev job now), would you care to expand on this? Does this mean that you wouldn’t have a function fn() that manipulates a global variable VAR, but rather you’d pass VAR like fn(VAR)?

3 replies

WalterBright • 2 days ago

To expand on the other reply, some related things:

1. don't do console I/O in leaf functions. Instead, pass a parameter that's a "sink" for output, and let the caller decide what to do with it. This helps a lot when converting a command line program to a gui program. It also makes it practical to unit test the function.

2. don't allocate storage in a leaf function if the result is to be returned. Try to have storage allocated and free'd in the same function. It's a lot easier to keep track of it that way. Another use of sinks, output ranges, etc.

3. separate functions that do a read-only gathering of data, from functions that mutate the data

Give these a try. I bet you'll like the results!
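A rough Python sketch of points 1 and 3 above. All names here (`write_report`, `total`, `add_row`) are made up for the example. Python is garbage collected, so point 2 only shows up loosely: the list that collects the output is created and owned by the caller, not by the leaf function.

```python
import sys
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, int]


def write_report(rows: Iterable[Row], sink: Callable[[str], object]) -> None:
    """Leaf function: formats each row and hands it to the caller's sink."""
    for name, amount in rows:
        sink(f"{name}: {amount}\n")


def total(rows: Iterable[Row]) -> int:
    """Read-only gathering of data: never modifies rows."""
    return sum(amount for _name, amount in rows)


def add_row(rows: List[Row], name: str, amount: int) -> None:
    """Mutation, kept separate from the read-only functions."""
    rows.append((name, amount))


rows = [("apples", 3), ("pears", 5)]
add_row(rows, "plums", 2)

# A command-line caller sends the output to the console...
write_report(rows, sys.stdout.write)

# ...while a test (or a GUI) collects it in storage it owns.
captured = []
write_report(rows, captured.append)
```

The leaf function never learns where its output goes, which is what makes it easy to unit test.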

2 replies

samatman • 1 day ago

I heartily agree with #2 if the language isn't Zig. Which actually supports your point: allocating in leaf functions is idiomatic in Zig, and it works out fine, because there's no allocation without an Allocator, and even if that's passed in implicitly as part of a struct argument, error{OutOfMemory} will be part of the function signature. So there's no losing track of what allocates and what doesn't.

This actually supports your broader point about always passing state to functions, and never accessing it implicitly. Although I don't know that I agree with extending that to constants, but maybe with another several decades of experience under my belt I might come to.

Zig also makes it easy for 'constants' to change based on build-specific parameters, so a different value for testing, or providing an override value in the build script. I've found that to eliminate any problems I've had in the past with global constants. Sometimes, of course, it turns out you want those values to be runtime configurable, but as refactorings go that's a relatively straightforward one.

1 reply

WalterBright • 1 day ago

> So there's no losing track of what allocates and what doesn't.

Having an allocator implicitly passed in with a struct argument is not quite what I meant. D once had allocators as member functions, but that wound up being deprecated because the allocation strategy is only rarely tied to the struct.

1 reply

samatman • 1 day ago

There are some meaningful differences between Zig and D in this specific area, specifically, D uses exceptions and has garbage collection as the default memory strategy. That will surely result in different approaches to the leaf-allocation question being better for the one than for the other.

chipdart • 2 days ago

> Give these a try. I bet you'll like the results!

It sounds like too many words to refer to plain old inversion of control and CQRS. They're both tried and true techniques.

maxbond • 2 days ago

You've got the gist of it. By decoupling your function from the state of your application, you can test that function in isolation.

For instance, you might be tempted to write a function that opens an HTTP connection, performs an API call, parses the result, and returns it. But you'll have a really hard time testing that function. If you decompose it into several tiny functions (one that opens a connection, one that accepts an open connection and performs the call, and one that parses the result), you'll have a much easier time testing it.

(This clicked for me when I wrote code as I've described, wrote tests for it, and later found several bugs. I realized my tests did nothing and failed to catch my bugs, because the code I'd written was impossible to test. In general, side effects and global state are the enemies of testability.)

You end up with functions that take a lot of arguments (10+), which can feel wrong at first, but it's worth it, and IDEs help enormously.

This pattern is called dependency injection.

https://en.wikipedia.org/wiki/Dependency_injection

See also, the "functional core, imperative shell" pattern.

https://www.youtube.com/watch?v=yTkzNHF6rMs
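The decomposition described above might look something like this in Python. This is a sketch under assumptions: the `/users/<id>` path, the JSON shape, and the function names are all invented for illustration. The transport is passed in as a plain function, so the test needs no network at all.

```python
import json
from typing import Callable


def parse_user(body: str) -> dict:
    """Pure function: turns a response body into the fields we care about."""
    data = json.loads(body)
    return {"id": data["id"], "name": data["name"].strip()}


def fetch_user(get: Callable[[str], str], user_id: int) -> dict:
    """Performs the call through whatever `get` the caller injects."""
    body = get(f"/users/{user_id}")
    return parse_user(body)


def fake_get(path: str) -> str:
    """Test double standing in for a real HTTP client."""
    assert path == "/users/7"
    return '{"id": 7, "name": " Ada "}'


user = fetch_user(fake_get, 7)  # {"id": 7, "name": "Ada"}
```

In production the caller would pass a function that wraps a real HTTP client. Only that thin wrapper touches the network, and `parse_user` can be tested with plain strings.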

pjc50 • 2 days ago

Yes. Global variables or singletons are deeply miserable when it comes to testing, because you have to explicitly reset them between tests and they cause problems if you multithread your tests.

A global variable is a hidden extra parameter to every function that uses it. It's much easier if the set of things you have to care about is just those in the declared parameters, not the hidden globals.
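To make the "hidden extra parameter" point concrete, here is a small Python sketch (the function names are hypothetical). The global version gives different answers to the same call, while the explicit version depends only on what is passed in.

```python
_counter = 0  # global mutable state: a hidden input to next_id_global


def next_id_global() -> int:
    global _counter
    _counter += 1
    return _counter


def next_id(last_id: int) -> int:
    """Same logic, but the state is a declared parameter."""
    return last_id + 1


first = next_id_global()   # 1
second = next_id_global()  # 2: same call, different result

# The explicit version needs no reset between tests.
a = next_id(0)  # 1
b = next_id(0)  # 1
```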

ozim • 2 days ago

Cool, I am just confirming my own bias against much of the „clean code” teachings. It might make the order of operations a bit easier to read, but no one uses it, so it doesn’t matter.

1 reply

WalterBright • 1 day ago

There are lots of things that look like great methods, but experience with them often leads to disillusionment. For another example, Hungarian notation is a really great idea, heavily adopted by Microsoft Windows, and just does not deliver on its promises.

For example, types can have long names, but that doesn't work with HN. Changing a declaration to have a different type then means you've got endless cascading identifiers that need to be redone. And so on.

1 reply

zozbot234 • 1 day ago

> Changing a declaration to have a different type then means you've got endless cascading identifiers that need to be redone.

This is actually a good thing, every mention of that identifier is a place that you might need to adapt for the new type. Hungarian notation is an excellent coping mechanism when you have to use compilers that don't do their own type checking - which used to be a huge issue when Hungarian notation was current.

1 reply

WalterBright • 22 hours ago

On balance, it isn't a good thing. Having high refactoring costs means:

1. you become reluctant to do it

2. lots of diffs cluttering up your git history. I like my git history to be fairly narrowly targeted.

I don't use languages that don't do type checking. Microsoft uses Hungarian notation on their C interface and example code.

pjc50 • 2 days ago

Could someone explain what this is since that expression is unsearchable?

1 reply

ozim • 2 days ago

So (d + e*g) is an example of using spacing to show precedence: you put spaces around the lower-precedence operators and no spaces around the higher-precedence ones. This way you can grasp a bit faster which operation happens first. In (2 + 3*4) you can see that 3*4 is evaluated first, giving 12, and then 2 is added, giving 14. With variable names instead of numbers, the spacing helps you see the evaluation order more quickly.

But no one has time to craft such details in the code.
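For reference, a tiny Python check of the example above, using the values d = 2, e = 3, g = 4. The spacing is purely a hint to the reader. The parser ignores it, so even misleading spacing produces the same result.

```python
d, e, g = 2, 3, 4

spaced_by_precedence = d + e*g    # 14: spacing matches precedence
misleading_spacing = d+e * g      # still 14: whitespace changes nothing
explicit_parens = d + (e * g)     # 14: grouping stated outright
forced_addition = (d + e) * g     # 20: only parentheses change the order
```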

3 replies

tinco • 2 days ago

I only have 20 years of development experience, so I'll defer to Walter here, but if I were to write that equation it would look like `d + (e * g)`. I don't trust my own or anyone else's understanding of operator precedence. Just look at how ridiculously hard to read their implementations in parsers are.

Specifically d+e*g I might make an exception for in a code review (and allow it), since it's such a widely known precedence in mathematics you can expect the reader and writer to know the way it goes, but any more complex and I'd reject it in the review for lack of parentheses.

WalterBright • 2 days ago

Operator precedence is so deeply burned into my brain I would never think of adding parens for it or modify the spacing.

I will use parens, however, for << and a couple other cases. It would be a shame to use lack of spacing to imply precedence, and yet get it wrong. Oops!

I also like to line up things to make vertical formatting of similar expressions, something a formatting program doesn't do. Hence I don't use formatters.
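The shift case is a common place where intuition and the grammar disagree. The sketch below uses Python's rules, which for this pair of operators match C's: `+` binds tighter than `<<`. The variable names are only for illustration.

```python
# Spacing suggests "(1 << 2) + 3", but + binds tighter than <<.
looks_like_seven = 1<<2 + 3      # parsed as 1 << (2 + 3) == 32
with_parens = (1 << 2) + 3       # 7: the grouping the spacing implied
```

This is the failure mode of relying on spacing alone: the layout implies one grouping while the parser applies another.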

1 reply

ozim • 1 day ago

Parens were not the main part - main part is having multiplication without spaces and addition with spaces.

I would say it is a neat detail but if no one cares or uses it - it is pretty much "feel good about yourself" use and not practical one.

pjc50 • 2 days ago

.. that seems like a strange optimization when there's a tool to indicate to both reader and compiler which operations will be performed first: brackets!