I’d love to learn more about the pretty algorithm and the optimizations that have been tried out so far.
It seems like a pretty straightforward machine learning regression problem: given a sequence of word widths, find a line length which, when applied to the sequence, satisfies the constraint of being “pretty”.
Using a model would allow computing the answer in constant time.
Machine learning is not magic -- no algorithm, machine learning or otherwise, can process an arbitrary-length sequence in constant time.
The actual problem is also more complex than fixed word widths due to hyphenation and justification - from what I recall, Knuth's paper (IIRC there are two, and the second is the one to read) on TeX's layout gives a good overview of the problem and an algorithm that's still state of the art. I think the Typst people might have a blog post about their modern take on it, but I'm not sure.
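Not ML, but for context: the core of the Knuth-style approach is dynamic programming over possible break points. Here's a minimal sketch where badness is the cubed slack at the end of each line - a common textbook simplification of TeX's actual glue/stretch/penalty model (it ignores hyphenation and assumes no single word is wider than the line):

```python
def break_lines(word_widths, line_width, space=1):
    """Return lines as lists of word indices, minimizing total badness.

    Badness of a non-final line = (unused space)**3; the last line is free.
    Simplified model: no hyphenation, no stretch/shrink, and every word
    is assumed to fit on a line by itself.
    """
    n = len(word_widths)
    INF = float("inf")
    best = [INF] * (n + 1)  # best[i]: min badness for words i..n-1
    nxt = [n] * (n + 1)     # nxt[i]: index starting the line after i's line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        width = -space  # first word on the line gets no leading space
        for j in range(i, n):
            width += space + word_widths[j]
            if width > line_width:
                break
            slack = line_width - width
            badness = 0 if j == n - 1 else slack ** 3
            if badness + best[j + 1] < best[i]:
                best[i] = badness + best[j + 1]
                nxt[i] = j + 1
    # Reconstruct the chosen break points.
    lines, i = [], 0
    while i < n:
        lines.append(list(range(i, nxt[i])))
        i = nxt[i]
    return lines


# Words of width 3 with single-space gaps on a width-7 line:
# "xxx xxx" fills the line exactly, "xxx" goes on the (free) last line.
print(break_lines([3, 3, 3], 7))  # [[0, 1], [2]]
```

This runs in O(n · L) time, where L is the max number of words per line, which is why the DP is fast in practice even though it's not constant time.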
Could it be linear time? I thought larger inputs take longer for models to process. I suppose it depends on the length of each line and the number of lines.
It can't be constant time, and even O(log n) would require storing all the possible inputs in a lookup table.