blixt 23 hours ago

> Enabled from the insight from our heavily-used Windsurf Editor, we got to work building a completely new data model (the shared timeline) and a training recipe that encapsulates incomplete states, long-running tasks, and multiple surfaces.

This data is very valuable if you're trying to create fully automated SWEs; most foundation model providers have probably been scraping together second-hand data to simulate long-horizon engineering work. Cursor probably has way more of this data, and I wonder how Microsoft's own Copilot is doing (and how they share this data with the foundation model providers)...

whywhywhywhy 19 hours ago

There is a world where the wrapper makers surpass the current model makers in their area of focus. Cursor/Windsurf have all the data on when people got so frustrated with Claude they switched to Gemini/GPT and also all the data of when the problem was actually solved and when it wasn't.

lemming 20 hours ago

The company that is best placed to collect tons of high-quality data of this type is undoubtedly Google. They’ve had publications talking about how they capture data from their in-house SWE tools and use it to improve their tooling.

blixt 18 hours ago

They certainly can automate their own SWE, but I wonder if that’s as good as getting full computer-use logs (terminal, web browsing, code acceptance/rejection, etc., as claimed in the linked article) from millions of individuals and thousands of companies, all with their quirky technology setups.

throwaway314155 7 hours ago

This summarizes Google's approach to software engineering well; just pretend the outside world doesn't exist and the "Google way" is the only way.

figassis 20 hours ago

And is probably why OpenAI paid $$$ to acquire