I made this NES emulator with Claude last week [0]. I'd say it was a pretty non-trivial task. It involved throwing a lot of NESDev docs, Disch mapper docs, test ROM output, and assembly source code at the model for it to figure out.
I am considering training a custom LoRA on Atari ROMs to see if I could get a working game out of it. The thinking here is that Atari, NES, SNES, etc. ROMs are a lot smaller than a program that runs natively on whatever OS. Fewer lines of code for the LLM to write means less chance of a screw-up. Take the ROM, convert it to assembly, write very detailed captions for it, and train. If this works, it would enable anyone to create games with one prompt that are a lot higher quality than the stuff being made now, and with less complexity. If you made an emulator with the help of an LLM, that means it understands assembly well enough, so I think there might be hope for this idea.
Well, the assembly I put into it was written by humans who intended it to be well understood by anyone reading it. In contrast, many NES games abuse quirks specific to the NES that you can't translate to any other system. Understanding what that assembly code is doing also requires a complete understanding of those quirks, which LLMs don't seem to have yet. (My Mapper 4 implementation still has some bugs because my IRQ handling isn't perfect, and many games rely on precise IRQ timing.)
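For readers who haven't met the Mapper 4 (MMC3) IRQ: the mapper counts rising edges on PPU address line A12 and raises an IRQ when its counter reaches zero. Below is a minimal sketch of that counter as commonly documented. It is not the code from the emulator above. The class and method names are made up for illustration, and it assumes the caller already detects (and filters) the A12 edges, which is the part that needs exact timing.

```python
class Mmc3IrqCounter:
    """Illustrative MMC3-style scanline counter (names are hypothetical)."""

    def __init__(self):
        self.latch = 0               # reload value, written via $C000
        self.counter = 0
        self.reload_pending = False  # set by a write to $C001
        self.enabled = False         # $E001 enables, $E000 disables
        self.irq_line = False        # True while the mapper asserts IRQ

    def write_register(self, addr, value):
        # Registers are selected by the top address bits plus bit 0.
        reg = addr & 0xE001
        if reg == 0xC000:
            self.latch = value & 0xFF
        elif reg == 0xC001:
            self.counter = 0
            self.reload_pending = True
        elif reg == 0xE000:
            # Disabling also acknowledges a pending IRQ.
            self.enabled = False
            self.irq_line = False
        elif reg == 0xE001:
            self.enabled = True

    def clock_a12_rising_edge(self):
        # Assumption: the caller invokes this once per filtered A12 rise.
        if self.counter == 0 or self.reload_pending:
            self.counter = self.latch
            self.reload_pending = False
        else:
            self.counter -= 1
        if self.counter == 0 and self.enabled:
            self.irq_line = True
```

The counter logic itself is small. Bugs like the ones described above usually hinge on when `clock_a12_rising_edge` gets called relative to the CPU, and this sketch leaves that out entirely.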
How would you characterize the overall structural complexity of the project, and degree of novelty compared to other NES emulators Claude may have seen during training?
I'd be a bit suspicious of an LLM getting an emulator right when all it has to go on is docs and no ability to test (since the pass criterion is "behaves the same as something you don't have access to")... Did you check the degree to which it may have been copying other NES emulators?
> How would you characterize the overall structural complexity of the project, and degree of novelty compared to other NES emulators Claude may have seen during training?
Highly complex, fairly novel.
Emulators themselves, for any chipset or system, have a very learnable structure: there are some modules, each with its own registers and ways of moving data between those registers, and perhaps ways to send interrupts between those modules. That's oversimplifying a bit, but if you've built an emulator once, you generally won't be blindsided when it comes to building another one. The bulk of the work lies in dissecting the hardware, which has already been done for the NES, and more open architectures typically have their entire pinouts and processes available online. All that to say: I don't think Claude would have difficulty implementing most emulators. It's good enough at programming and parsing assembly that as long as the underlying microprocessor architecture is known, it can implement it.
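To make that "modules, registers, interrupts" shape concrete, here is a toy sketch. Everything in it (`Bus`, `Timer`, `Cpu`, the `0x4000` address) is hypothetical and not taken from any real system or from this emulator. It only shows the pattern: a bus routes reads and writes to whichever module owns the address, and a module signals the CPU through an interrupt callback.

```python
class Cpu:
    """Stub CPU: only tracks whether an interrupt is waiting."""

    def __init__(self):
        self.irq_pending = False

    def request_irq(self):
        self.irq_pending = True


class Timer:
    """Hypothetical peripheral with one register; interrupts on underflow."""

    def __init__(self, raise_irq):
        self.reload = 0
        self.count = 0
        self.raise_irq = raise_irq  # callback into the CPU

    def write(self, addr, value):
        self.reload = value & 0xFF
        self.count = self.reload

    def read(self, addr):
        return self.count

    def tick(self):
        if self.count == 0:
            self.count = self.reload
            self.raise_irq()
        else:
            self.count -= 1


class Bus:
    """Routes each access to the module that owns the address."""

    def __init__(self):
        self.regions = []  # (start, end, device) tuples

    def attach(self, start, end, device):
        self.regions.append((start, end, device))

    def _find(self, addr):
        for start, end, device in self.regions:
            if start <= addr <= end:
                return device
        raise ValueError(f"unmapped address {addr:#06x}")

    def read(self, addr):
        return self._find(addr).read(addr)

    def write(self, addr, value):
        self._find(addr).write(addr, value)
```

A real system has more modules and a stricter clocking order between them, but in this pattern each new chip is mostly another class attached to the bus.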
As far as other NES emulators go, this project does many things in non-standard ways. For instance, I use per-pixel rendering, whereas many emulators use scanline rendering. I use an AudioWorklet with various mixing effects for audio, whereas other emulators use something much simpler or don't even bother fully implementing the APU. I can comfortably say there's no NES emulator out there written the way this one is written.
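A grossly simplified sketch of the per-pixel versus scanline distinction, for anyone unfamiliar with it. The `scroll_at` and `sample` callbacks are hypothetical stand-ins for PPU state and background fetching, and real scroll handling is considerably more involved. The point is only that a scanline renderer latches state once per line, while a per-pixel renderer sees changes made in the middle of a line.

```python
WIDTH, HEIGHT = 256, 240  # visible NES picture size


def render_per_scanline(scroll_at, sample):
    """Latch the scroll value once per line, then draw the whole line."""
    frame = []
    for y in range(HEIGHT):
        scroll = scroll_at(y, 0)  # state as of the start of the line
        frame.append([sample(x + scroll, y) for x in range(WIDTH)])
    return frame


def render_per_pixel(scroll_at, sample):
    """Re-read the scroll value at every dot, so mid-line writes show up."""
    frame = []
    for y in range(HEIGHT):
        frame.append([sample(x + scroll_at(y, x), y) for x in range(WIDTH)])
    return frame
```

If a game changes the scroll value halfway across a line, the two functions produce different right halves for that line. That is the kind of effect a scanline renderer has to special-case and a per-pixel renderer handles without special-casing, at the cost of doing more work per frame.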
> I'd be a bit suspicious of an LLM getting an emulator right when all it has to go on is docs and no ability to test (since the pass criterion is "behaves the same as something you don't have access to")... Did you check the degree to which it may have been copying other NES emulators?
Purely JavaScript-based NES emulators are few in number, and those that implement all aspects of the system even fewer, so I can comfortably say it doesn't copy any of the ones I've seen. I would be surprised if it did, since I came up with most of the abstractions myself and guided Claude heavily. While Claude can't get docs on its own, I can. I put all the relevant documentation in the context window myself, along with the test ROM output and source code. I'm still commanding the LLM myself; it's not like I told Claude to build an emulator and left it alone for 3 days.
Interesting - thanks!
Even with your own expert guidance, it does seem impressive that Claude was able to complete a project like this without getting bogged down in the complexity.