ozgune 9 days ago

I feel the article presents the data selectively in some places. Two examples:

* The article compares Gemini 2.5 Pro Experimental to DeepSeek-R1 in accuracy benchmarks. Then, when the comparison becomes about cost, it compares Gemini 2.0 Flash to DeepSeek-R1.

* In throughput numbers, DeepSeek-R1 is quoted at 24 tok/s. There are half a dozen providers who will easily give you 100+ tok/s, and at scale.

There's no doubt that Gemini 2.5 Pro Experimental is a state of the art model. I just think it's very hard to win on every AI front these days.

JKCalhoun 8 days ago

Orthogonal — the remarkable thing about DeepSeek-R1, it seems to me, is that it shows how easy it in fact is to create an LLM. A quantitative hedge fund was able to throw money at the problem and develop a competitive LLM. Maybe that somewhat reveals that it's just a "man behind the curtain."

yalok 8 days ago

But they also compare reasoning and non-reasoning models - e.g. Meta's Llama 4