blibble 13 hours ago

I don't think it is a bad analogy

given how complicated the boot process is ([1]), and it occurs once a month, I'd rather it was as deterministic as possible

vs. shaving 1% off the boot time

[1]: distros continue to ship subtly broken unit files, because the model is too complicated

Aurornis 13 hours ago

Most systems do not have 5 minute POST times. That’s an extreme outlier.

Linux runs all over, including embedded systems where boot time is important.

Optimizing for edge cases on outliers isn’t a priority. If you need specific boot ordering, configure it that way. It doesn’t make sense for the entire Linux world to sacrifice boot speed.

timcobb 12 hours ago

I don't even think my Pentium 166 took 5 minutes to POST. Did computers ever take that long to POST??

yjftsjthsd-h 11 hours ago

Old machines probably didn't, no, but I have absolutely seen machines (Enterprise™ Servers) that took longer than that to get to the bootloader. IIRC it was mostly a combination of hardware RAID controllers and RAM... something. Testing?

lazide 10 hours ago

It takes a while to enumerate a couple TB worth of RAM DIMMs and 20+ disks.

yjftsjthsd-h 10 hours ago

Yeah, it was somewhat understandable. I also suspect the firmware was... let's say underoptimized, but I agree that the task is truly not trivial.

lazide 9 hours ago

One thing I ran across when trying to figure this out previously: while some firmware is undoubtedly dumb, a decent amount of the slowness was because it was doing a lot more than typical PC firmware.

For instance, the slow RAM check POST I was experiencing is because it was also doing a quick single pass memory test. Consumer firmware goes ‘meh, whatever’.

As for disk spin-up, it was also staggering the disk power-ups so that it didn’t kill the PSU. Not a concern if you have 3-4 drives, but definitely a concern if you have 20.

Also, the RAID controller was running basic SMART tests and the like, which consumer stuff typically doesn’t.

Now, how much any of this is worthwhile depends on the use case, of course. In ‘farm of cheap PCs’ type cloud hosting environments, most of these types of conditions get handled by software, and it doesn’t matter much if any single box is half broken.

If you have one big box serving a bunch of key infra, and reboot it periodically as part of ‘scheduled maintenance’ (aka old school on prem), then it does.

BobbyTables2 12 hours ago

Look at enterprise servers.

Completing POST in under 2 minutes is not guaranteed.

Especially the 4 socket beasts with lots of DIMMs.

Twirrim 10 hours ago

Physical servers do. It's always astounding to me how long it takes to initialise all that hardware.

kcexn 12 hours ago

Oh? What's an example of a common way for unit files to be subtly broken?

juped 4 hours ago

See: the comment above and its folkloric concept of systemd as some kind of constraint solver

Unfortunately no one has actually bothered to write down how systemd really works; the closest to a real writeup out there is https://blog.darknedgy.net/technology/2020/05/02/0/