jdross 3 days ago

The pace of notable releases across the industry right now is unlike any time I remember since I started doing this in the early 2000s. And it feels like it's accelerating.

achierius 3 days ago

How is this a notable release? It's strictly worse than Gemini 2.5 on coding &c, and only an iterative improvement over their own models. The only thing that struck me as particularly interesting was the native visual reasoning.

og_kalu 3 days ago

It's not worse on coding. SWE-bench, Aider, and LiveBench coding all show noticeably better results.

qoez 3 days ago

Lots of releases but very little actual performance increase.

int_19h 3 days ago

Sonnet and Gemini saw fairly substantial perf increases recently.

mchusma 3 days ago

Love Sonnet, but 3.7 is not obviously an improvement over 3.5 in my real-world usage. Gemini 2.5 Pro is great and has replaced most others for me (I use Grok for things that require real-time answers).

int_19h 3 days ago

Are you comparing it with or without thinking? I'd say it's a fairly big improvement in long thinking mode.

BriggyDwiggs42 3 days ago

It does a lot better on philosophy questions.

emp17344 3 days ago

Not really. We’re definitely in the incremental improvement stage at this point. Certainly no indication that progress is “accelerating”.

Workaccount2 3 days ago

Integration is accelerating rapidly. Even if model development froze today, we would still probably have ~5 years of adoption and integration before it started to level off.

littlestymaar 3 days ago

You are both correct. It feels like the tech itself is kinda plateauing but it's still massively under-used. It will take a decade or more before the deployment starts slowing down.

nwienert 3 days ago

ChatGPT 3 : iPhone 1

A bunch of models later, we're about on the iPhone 4-5 now. Feels about right.

int_19h 3 days ago

It's more like GPT-3 is the Manchester Baby, and we're somewhere around IBM 700 series right now. Still a long way to go to iPhone, as much as the industry likes to pretend otherwise.

nwienert 2 days ago

Both were big consumer commercial breakouts and far better than predecessors. And several years later both see only iterative improvements.

Neither applies to your analogy.

adncors 3 days ago

But we're seeing incremental improvements every two months, so...