I’ve enjoyed making it go faster by finding quirks, but at this point it’s mostly become “OK, what else can I offload to C?”
I should really learn Rust. Or Zig. I tried Nim (best of both worlds, Python-esque code that compiles to C!), but it wasn’t nearly as fast as my Python + C for my specific use case.
What exactly do you write where your Python+C is faster than Nim, which compiles to optimized C?
Generating millions of rows of synthetic data for testing RDBMS.
Tbf, I didn't spend much time trying to optimize the Nim code once I got a working PoC, so it's entirely possible that I could've made it faster.
You may no longer be interested in this kind of thing, but if you are, there might be some ideas of note over at https://github.com/c-blake/nio/blob/main/db-bench.md (in particular the demo/gbyGen.nim program).
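For anyone wondering what this kind of workload looks like, here is a rough pure-Python sketch of streaming synthetic rows out as CSV. It is not the code from the comments above or from the linked repo. The schema (`id`, `name`, `amount`) and the helper names `generate_rows` and `write_csv` are made up for illustration. The per-row loop is presumably the sort of hot path one would later offload to C.

```python
import csv
import io
import random
import string


def generate_rows(n, seed=0):
    """Yield n synthetic (id, name, amount) rows; the schema is hypothetical."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    letters = string.ascii_lowercase
    for i in range(1, n + 1):
        name = "".join(rng.choices(letters, k=8))  # random 8-letter string
        amount = round(rng.uniform(0, 1000), 2)    # float rounded to 2 places
        yield (i, name, amount)


def write_csv(out, n, seed=0):
    """Stream rows to a file-like object without building a list in memory."""
    writer = csv.writer(out)
    writer.writerow(("id", "name", "amount"))
    writer.writerows(generate_rows(n, seed))


if __name__ == "__main__":
    buf = io.StringIO()
    write_csv(buf, 5)
    print(buf.getvalue())
```

Because `generate_rows` is a generator, memory use stays flat whether you ask for five rows or five million. The output could then be bulk-loaded with whatever import mechanism the target database provides.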