neerajsi 3 days ago

Very interesting project!

I wonder if there's a way to make this set of techniques less brittle and more applicable to any language. I guess you're looking at a new backend or some enhancements to one of the parser generator tools.

adev_ 3 days ago

I have applied a subset of these techniques in a C++ tokenizer for a language syntactically similar to Swift: no inline assembly, no intrinsics, and no SWAR, but reduced branching, cache optimization, and SIMD parsing with explicit vectorization.

I get:

- ~4 MLOC/sec/core on a laptop

- ~8-9 MLOC/sec/core on a modern AMD server-grade CPU with AVX512.

So yes, it is definitely possible.
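
To give a rough idea of what "reduced branching" can mean in portable C++, here is a minimal sketch. It is not the tokenizer described above: the three character classes, `make_class_table`, `kClassTable`, and `count_tokens` are hypothetical names invented for illustration, and the sketch only counts token starts rather than producing tokens. It assumes C++17, treats letters, digits, and `_` as one "word" class, and treats every other non-whitespace byte as single-character punctuation. There is no SIMD here; the point is only that a 256-entry lookup table plus integer arithmetic can replace a chain of per-character `if` tests.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical character classes, chosen only for this illustration.
enum CharClass : std::uint8_t { kSpace = 0, kWord = 1, kPunct = 2 };

// Build a 256-entry classification table at compile time.
// Anything that is not whitespace or a word character defaults to kPunct.
constexpr std::array<std::uint8_t, 256> make_class_table() {
    std::array<std::uint8_t, 256> t{};
    for (int c = 0; c < 256; ++c) t[c] = kPunct;
    for (int c = 'a'; c <= 'z'; ++c) t[c] = kWord;
    for (int c = 'A'; c <= 'Z'; ++c) t[c] = kWord;
    for (int c = '0'; c <= '9'; ++c) t[c] = kWord;
    t['_'] = kWord;
    t[' '] = kSpace;
    t['\t'] = kSpace;
    t['\n'] = kSpace;
    t['\r'] = kSpace;
    return t;
}

constexpr auto kClassTable = make_class_table();

// Count token starts without a data-dependent branch in the loop body.
// A token starts at every punctuation byte, and at every word byte
// whose predecessor belongs to a different class.
std::size_t count_tokens(std::string_view src) {
    std::size_t count = 0;
    std::uint8_t prev = kSpace;
    for (unsigned char c : src) {
        const std::uint8_t cur = kClassTable[c];
        const unsigned not_space = cur != kSpace;
        const unsigned changed = cur != prev;
        const unsigned is_punct = cur == kPunct;
        count += not_space & (changed | is_punct);
        prev = cur;
    }
    return count;
}
```

With these assumptions, `count_tokens("let x = 42;")` counts `let`, `x`, `=`, `42`, and `;`. The table fits in four cache lines, and a loop body like this is the kind of code a compiler has a reasonable chance of auto-vectorizing, though whether it does depends on the compiler and flags.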