amazingamazing 3 days ago

In before people post contradictory anecdotes.

It would be more helpful if people posted the prompt and the entire context, or better yet the whole conversation, so we can all judge for ourselves.

pcwelder 3 days ago

Gemini 2.5 Pro hasn't been as good as Sonnet for me.

The prompt I've tried repeatedly is to create a React + Vite todo app.

It doesn't figure out Tailwind-related issues. Real chats:

Gemini: https://github.com/rusiaaman/chat.md/blob/main/samples/vite-...

Sonnet 3.7: https://github.com/rusiaaman/chat.md/blob/main/samples/vite-...

Exact same settings in both: an MCP server for tool calling, through the OpenAI API interface.

PS: the formatting is off, but '#%%' starts a new block; view it in raw.
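
For reference, the harness was roughly this shape (a sketch only; the model names, base URLs, and sampling values below are placeholders rather than my exact config, and the MCP tool-calling layer is omitted):

    # Sketch: both models driven through the same OpenAI-compatible chat
    # interface with an identical prompt and sampling settings; only the
    # model name and base_url differ. Names/URLs are placeholders.
    from openai import OpenAI

    PROMPT = "Create a react-vite todo app styled with Tailwind."

    def run(base_url: str, api_key: str, model: str) -> str:
        client = OpenAI(base_url=base_url, api_key=api_key)
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
            temperature=0.7,  # kept identical across both runs
            max_tokens=4096,
        )
        return resp.choices[0].message.content

    # Same harness, two backends (the providers' OpenAI-compatible endpoints):
    # gemini_out = run("https://generativelanguage.googleapis.com/v1beta/openai/", KEY1, "gemini-2.5-pro")
    # sonnet_out = run("https://api.anthropic.com/v1/", KEY2, "claude-3-7-sonnet-20250219")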

amazingamazing 3 days ago

Your links don't work.

pcwelder 3 days ago

The repo was private; it's updated now. Thanks!!

genewitch 2 days ago

you have to dump a csv from the microsoft website. i linked the relevant parts below. I spent ~8 hours with copilot making a react "app" to someone else's spec, and most of it was moving things around and editing CSS back and forth, because copilot has an idea of how things ought to be that didn't comport with what I was seeing on my screen.

However, the MVP went live and everyone was happy. The code is on my github, "EMD"; the conversation isn't. https://github.com/genewitch/emd

i'd link the site but i think it's still in "dev" mode and i don't really feel like restoring from a snapshot today.

note: i don't know javascript. At all. It looks like boilerplate and line noise to me. I know enough about programming to be able to fix things like "the icons were moving the wrong way", but i had to napkin it out (twice!) and then consult with someone else to make sure that i understood the "math". I implemented the math correctly and copilot did not, probably because i prompted it in a way that made its decision make more sense. see lines 2163-2185 in the link below for how i "prompt" in general.

note 2: http://projectftm.com/#I7bSTOGXsuW_5WZ8ZoLSPw is the conversation, as best i can tell. It's in reverse chronological order (#2944 - 2025-12-14 was the actual first message about this project, the last on 2025-12-15)

note 3: if you do visit the live site and there's an error (red on black), just hit escape. I imagine the entire system has been tampered with by this point, since it is a public server running port 443 wide open.

Workaccount2 3 days ago

This is also compounded by the fact that LLMs are not deterministic: every response is different for the same prompt. And people tend to judge on one-off experiences.

otabdeveloper4 3 days ago

> LLMs are not deterministic

They can be. Cloud-hosted LLMs add a gratuitous randomization step to make the output seem more human. (In keeping with the moronic idea of selling LLMs as sci-fi human-like assistants.)

But you don't have to add that randomization. Not much is lost if you don't. (Output from my self-hosted LLMs is deterministic.)
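
To illustrate, here's roughly what no-sampling (greedy) decoding looks like on a self-hosted model with Hugging Face transformers; the model name is just an example, not my setup:

    # Sketch: greedy decoding, no sampling. Repeated runs on the same
    # hardware/software stack reproduce the same output token for token.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-7B-Instruct"  # example model
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    inputs = tok("Write a haiku about Tailwind.", return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=False,  # always take the argmax token; no temperature/top-p
    )
    print(tok.decode(out[0], skip_special_tokens=True))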

CharlesW 3 days ago

Even at temperature = 0, LLM output is not guaranteed to be deterministic. https://www.vincentschmalbach.com/does-temperature-0-guarant...
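
It's easy to check empirically. A minimal sketch (model and prompt are arbitrary, and it assumes an OpenAI-style endpoint with the key in the environment):

    # Sketch: send the exact same request several times at temperature=0 and
    # count distinct completions. On many hosted APIs this can exceed 1.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    outputs = set()
    for _ in range(5):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "List three uses of a paperclip."}],
            temperature=0,
        )
        outputs.add(resp.choices[0].message.content)

    print(f"{len(outputs)} distinct completion(s) from 5 identical requests")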

deeth_starr_v 2 days ago

This is the issue with these kinds of discussions on HN: "it worked great for me" or "it sucked for me" without enough context. You just need to try it yourself to see if it'll work for your use case.