This is the most insightful article on the intersection of LLMs and software development I have read to date. There is zero fluff here -- every point is key, every observation relevant. In a time of paradigm shift, this is a fantastic guide on how to stay in the driver's seat, and most effectively leverage these tools. The inevitable shift here is upwards, away from the gritty detail of code.
Documentation (as in "design doc", not "API reference") is the absolute initial entry point: iterating on the problem statement, stakeholder requirements, business constraints, etc., until a coherent plan emerges, then organizing it at a high level. Combining this with "deep research" mode can yield fantastic results, as it draws on existing solutions and best practices across a vast body of knowledge.
The trick then is a sliding scope context window: with a high-level design doc in context, iterate to produce an architecture document. Once that is reviewed and hand-tuned, you can use it in turn to produce more detailed technical designs for various components of the system. And so on and so forth down the scale of granularity, until you're working with code. The important part is to never try to hold the entire thing in scope; instead, balance the context and granularity so that there's enough information to guide the LLM, and enough space to grow the next tier of the solution. Work in a manner that creates natural interfaces where artifacts can be decoupled. Piecemeal, not all at once.
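Schematically, something like this (just a sketch: `complete`, the file names, and the "auth component" are placeholders for whatever model, API, and repo layout you actually use):

```python
# Minimal sketch of the sliding-scope idea: each tier is generated with only
# its parent document in context, reviewed and hand-tuned, then used to ground
# the next tier. Never the whole system in one prompt.
from pathlib import Path

def complete(prompt: str) -> str:
    """Stand-in for whatever LLM client you actually use."""
    raise NotImplementedError("wrap your model/API of choice here")

def next_tier(parent_doc: Path, child_doc: Path, instruction: str) -> None:
    """Generate the next, more granular document from the tier above it."""
    if child_doc.exists():  # keep hand-tuned edits; don't blindly regenerate
        return
    prompt = (
        f"{instruction}\n\n"
        "Ground your answer strictly in the following document:\n\n"
        f"{parent_doc.read_text()}"
    )
    child_doc.write_text(complete(prompt))

# Design doc -> architecture doc -> per-component technical designs,
# each step reviewed before it grounds the next one.
next_tier(Path("docs/design.md"), Path("docs/architecture.md"),
          "Produce an architecture document for this design.")
next_tier(Path("docs/architecture.md"), Path("docs/auth-component.md"),
          "Produce a detailed technical design for the auth component.")
```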
The test aspect is also incredibly relevant: as you're able to work across a vastly larger codebase, moving much more quickly, tests become truly invaluable. And they can be squared against the original design documentation, to gauge how well the produced artifacts fulfill the original intent.
I'll acknowledge that this is most relevant in the context of greenfield projects; but LLMs' ability to ingest and summarize code makes them useful tools for dealing with legacy solutions too. The point about documentation stands: adding features or fixing issues in existing codebases is the bottom of the pyramid; with these tools you can now steer things at the PM level, and better shape both the understanding of problems and the approaches to solving them.
It's a very exciting time, it feels like having worked by hand for decades, only to now have access to power tools and heavy machinery.
Thank you!
> The trick then is a sliding scope context window: with a high-level design doc in context, iterate to produce an architecture document.
Absolutely, I will be stealing this!
> It's a very exciting time, it feels like having worked by hand for decades, only to now have access to power tools and heavy machinery.
Very well put, captures my feeling precisely.
Of course there's fluff! The LLM part is fluff. It's so much fluff it smothers everything else.
A rule of thumb I’ve started using is: “if your function name and arguments aren’t good enough to have Copilot tab completion make a cogent attempt at implementing the full behavior, you need more comments/docstrings and/or you need to create utility methods that break down the complexity.”
Alternatively: “if you tab complete a docstring and it doesn’t match what you expect, your code can be clearer and you should add comments and rename variables accordingly.”
This isn’t hard and fast. Sometimes it risks yak shaving. But if an LLM can’t understand your intent, there’s a good chance a colleague or even your future self might similarly struggle.
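A made-up example of what I mean (names and behaviour invented for illustration):

```python
# A vague signature gives Copilot (or a colleague) almost nothing to go on:
def process(data, flag):
    ...

# A descriptive name, typed arguments, and a docstring usually make the
# tab-completed body land close to the intended behaviour:
def filter_orders_by_status(orders: list[dict], status: str) -> list[dict]:
    """Return only the orders whose 'status' field equals `status`."""
    return [order for order in orders if order.get("status") == status]
```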
This only really makes sense for boilerplate code. If Copilot can consistently tab-complete everything you were about to write, maybe you should be working on something new, not rebuilding things that probably already exist as a library.
This is only true for the most generic code. Copilot isn't going to autocomplete my recursive JSON2Neo4j import function.
My strategy is generally to have a back and forth on the requirements with the LLM for 3-4 prompts, then get it to write a summary, and then a plan. Then I get it to convert the plan into a low-level todo list and write it to TODO.md.
Then I get it to go through each section of the todo list and check each item off as it completes it. This generally results in completed tasks that stay on track, but it also means I can stop halfway through and come back to the tasks without having to prompt from the start again.
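For reference, the resulting TODO.md ends up looking roughly like this (the feature and items here are invented for illustration):

```markdown
# TODO: CSV export feature

- [x] Add an export-format option and plumb it through report settings
- [x] Implement `export_csv()` with streaming writes
- [ ] Wire the new format into the download endpoint
- [ ] Add unit tests for quoting/escaping edge cases
- [ ] Update the user-facing docs
```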
This article describes a method for an LLM-assisted coding process but doesn’t provide anything of substance to back it up. It’s unclear whether the suggestions and techniques mentioned in the article came from personal experience or have otherwise been verified or experimented with on a real team and a real project.
> It’s unclear whether the suggestions and techniques mentioned in the article came from personal experience or have otherwise been verified or experimented with on a real team and a real project.
Since it's not a New Yorker article, I was hoping to spare the audience a long personal life story and deliver a somewhat succinct list of suggestions that others might find useful.
However, the question is valid, and yes, this is the result of personal experience following and incorporating AI tools into my own development over the last couple of years, as well as watching my colleagues of various experience levels (in a team of 10 engineers) do the same. These are the practices that we collected, adopted, and are trying to codify and develop further.
Ehm, I’m kind of in a weird position now. Responding with “I didn’t ask for a personal story” sounds rude, and I have no way of asking for more details without this. Like, if I was writing an article about optimisation of an algorithm, I’d include information on the problem before and after, as well as some details on how it was measured (the most interesting part). Otherwise it’s hard to discuss anything.
> Since it's not a New Yorker article, I was hoping to spare the audience a long personal life story
The New Yorker out here catching strays. Spare us, "maga," your excuses and weird insults! You didn't need to share your whole life story to include some useful context.
Did you, "maga"?
Apologies, no offence to New Yorkers (or the New Yorker) meant.
If you find my handle, "maga", interesting... I'll have you know that it's been around since long before it was appropriated by some movements in the US, and has nothing to do with them ;)
Just try it then.
I upvoted it because it aligns with my own findings working on real projects. Especially the bits about needing to “ground” the LLM in appropriate context, and being mindful of the sliding context window.
Can’t use AI at my day job, unfortunately. I tried LLMs for code reviews and “opinions” in personal projects; it does pick up things I didn’t know about that I then explore myself, but these are small projects where I practice specific things rather than product development.
"Just try it then" is an irresponsible suggestion.
Of course, I can try it. But trying it does not prove anything. It must be tested. Testing is a much higher standard than "trying" and a lot harder to do.
Regardless, are there any good examples of projects generated by LLMs? There was a game like Angry Birds, but that was a long time ago. I did some simple 'games'. If it's easy, there should be a lot of open source projects, right?
Who says it's easy? It's deceptively easy, but LLM-assisted coding, especially agentic coding, actually has a fairly involved learning curve and many pitfalls and footguns.
Documentation explains how an app works currently. But part of the context that the LLM lacks is how I imagine the app should work. This is difficult to integrate into version control.
I’ve landed on a few similar techniques and have been using unit tests quite a bit as a guardrail for LLMs. One thing that’s useful when using aider is alternating between the /add and /read-only contexts so that it can only edit the tests or the code but “see” both.
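For anyone who hasn't used aider, the alternation looks roughly like this (file names and prompts are made up; the point is which files are editable versus visible but read-only):

```text
# Tests are editable, the implementation is visible but locked:
> /read-only src/orders.py
> /add tests/test_orders.py
> Write unit tests covering the discount rounding behaviour in orders.py

# Then reverse the roles when you want the code changed against frozen tests:
> /read-only tests/test_orders.py
> /add src/orders.py
> Make the implementation pass these tests without changing them
```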