adastra22 9 days ago

I can ask Claude 3.7 to write me a program that does SAT solving, theorem proving, or scheduling, and it generally gets it right on the first try.

3
blacklion 9 days ago

You could ask me, and I'll copy'n'paste for you Z3 solver, for example, stripping copyrights & rearranging code. Without any understanding how this thing work.

It will be impressive, if Claude was trained on scientific literature about SAT solvers and tutorials about programming language in question, without access to any real SAT solver code. But it is not the case.

Why do you need LLM-generated code when you can take original, which was consumed by LLM?

Or, I could ask another question: Could Claude give you SAT solver which will be 1% more effective than state-of-art in the area? We don't need another mediocre SAT solver.

adastra22 9 days ago

I have asked Claude to solve scientific problems which I absolutely know are not in its training data, and it has done so successfully. I am not sure why people think it is only regurgitating training data. Don't be lazy and learn how it works--LLMs do generate generalized models, and employ them to solve previously unseen problems.

mepian 9 days ago

Would you actually use this program for real-world applications of theorem proving, e.g. validating an integrated circuit design before spending millions on its manufacturing?

adastra22 9 days ago

I do, yes.

YeGoblynQueenne 9 days ago

Demonstrate.

adastra22 9 days ago

It would take you all of 5 seconds to try in Claude yourself. I do this work on a daily basis; I know its value.

YeGoblynQueenne 9 days ago

Do you mean you create SAT solvers with Claude on a daily basis? What is the use case for that?

adastra22 8 days ago

I ask Claude to solve problems of similar complexity on a daily basis. A SAT solver specifically is maybe a once a week thing.

Use cases are anything, really. Determine resource allocation for a large project, or do Monte Carlo simulation of various financial and risk models. Looking at a problem that has a bunch of solutions with various trade-offs, pick the best strategy given various input constraints.

There are specialized tools out there that you can pay an arm and a leg for a license to do this, or you can have Claude one-off a project that gets the same result for $0.50 of AI credits. We live in an age of unprecedented intelligence abundance, and people are not used to this. I can have Claude implement something that would take a team of engineers months or years to do, and use it once then throw it away.

I say Claude specifically because in my experience none of the other models are really able to handle tasks like this.

Edit: an example prompt I put here: https://news.ycombinator.com/item?id=43639320

YeGoblynQueenne 8 days ago

Talk is cheap. The bottom line is that I don't see any SAT solvers that you generated with Claude.

adastra22 8 days ago

It’s not my job to make one for you.