1zael 3 days ago

This is wild. Our company has like 10 data scientists writing SQL queries on our DB for business questions. I can deploy pg-mcp for my organization so everyone can use Claude to answer whatever is on their mind? (e.g. "show me the top 5 customers by total sales")
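For reference, this is roughly the query the model would have to produce for that question (toy schema, purely illustrative — real table and column names will differ):

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES (1, 500.0), (1, 250.0), (2, 900.0), (3, 100.0);
""")

# The kind of SQL an LLM would need to generate for that question:
top5 = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total_sales
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total_sales DESC
    LIMIT 5
""").fetchall()
print(top5)  # [('Globex', 900.0), ('Acme', 750.0), ('Initech', 100.0)]
```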

sidenote: I'm scared of what's going to happen to those roles!

8
clusterfook 3 days ago

Yep gonna be easy

Q: show me the top 5 customers by total sales

A: System.Data.Odbc.OdbcException (0x80131937): ERROR [57014] ERROR: canceling statement due to statement timeout;

Q: Why do I get this error

A: Looks like it needs an index, let me create that for you. Done. Rerunning query.

could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device

Q: Why this error

A: 429 Too Many Requests

Rubs hands... going to be a great next 10 years to be a backend dev.

curious_cat_163 3 days ago

That’s a good example of a worst case scenario. This is why we would still need humans loitering about.

The question is: do they still need 10? Or would 2 suffice? How about 5?

This does not need to be a debate about the absolutes.

clusterfook 2 days ago

You will need 2. BUT, and here is the rub. https://en.wikipedia.org/wiki/Jevons_paradox

I am hoping Jevon will keep employing me. He has been a great boss for the last 25 years TBH.

brulard 2 days ago

I have to say I've had very good results creating and optimizing quite complex queries with Sonnet. But letting an LLM run them on its own in production is quite a different beast.

fullstackchris 3 days ago

and the next 10 after that, and the next 10 after that, and...

otabdeveloper4 3 days ago

Probably nothing. "Expose the database to the pointy-haired boss directly, as a service" is an idea as old as computing itself. Even SQL itself was originally an iteration of that idea. Every BI system (including PowerBI and Tableau) was supposed to be that.

It doesn't work because the PHB doesn't have the domain knowledge and doesn't know which questions to ask. (No, it's never as simple as group-by and top-5.)

jaccola 3 days ago

I would say SQL still is that! My wife had to learn some SQL to pull reports in some non-tech finance job 10 years ago. (I think she still believes this is what I do all day…)

I suppose this could be useful in that it prevents everyone in the company having to learn even the basics of SQL which is some barrier, however minimal.

Also the LLM will presumably be able to see all the tables/fields and ‘understand’ them (with the big assumption that they are even remotely reasonably named) so English language queries will be much more feasible now. Basically what LLMs have over all those older attempts is REALLY good fuzziness.
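"Seeing" the tables/fields mostly amounts to enumerating the catalog and handing the names to the model. A minimal sketch (sqlite shown for brevity; a Postgres-facing server like pg-mcp would presumably read the system catalogs / information_schema instead — that's an assumption about its internals):

```python
import sqlite3

# Toy database standing in for the real thing.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
""")

# Build a {table: [column, ...]} map to include in the model's context.
schema = {}
for (table,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
):
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    schema[table] = [col[1] for col in cols]  # col[1] is the column name

print(schema)  # {'customers': ['id', 'name'], 'orders': ['customer_id', 'amount']}
```

Whether English-language queries work well then hinges entirely on those names being "remotely reasonably named", as you say.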

I see this being useful for some subset of questions.

pclmulqdq 3 days ago

A family friend maintains a SQL database of her knitting projects that she does as a hobby. The PHB can easily learn SQL if they want.

conradfr 3 days ago

But he doesn't.

The project manager also won't learn behat and write tests.

Your client also won't use the CMS to update their website.

spennant 3 days ago

It won’t be that easy. First off, most databases in the wild are not well documented. LLMs benefit from context, and if your tables/columns have non-intuitive or non-descriptive names, the SQL may not even work. Second, you might benefit from an LLM fine-tuned on writing code and/or an intelligent Agent that checks for relevancy and ambiguity in user input prior to attempting to answer the question. It would also help if the agent executed the query to see how it answered the user’s question. In other words “reasoning”… pg-mcp simply exposes the required context for Agents to do that kind of reasoning.
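The generate-execute-check loop described above can be sketched in a few lines. `ask_llm` here is a hypothetical stub, not pg-mcp's API or any real model call — the point is only the shape of the retry loop, where errors get fed back into the prompt:

```python
import sqlite3

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "SELECT name FROM customers ORDER BY total_sales DESC LIMIT 5"

def answer(question: str, conn, max_retries: int = 3):
    # Generate SQL, execute it, and on failure feed the error back and retry.
    prompt = question
    for _ in range(max_retries):
        sql = ask_llm(prompt)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as e:
            prompt = f"{question}\nPrevious SQL failed: {e}. Try again."
    raise RuntimeError("gave up")

# Demo with toy data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, total_sales REAL);
    INSERT INTO customers VALUES ('Acme', 750.0), ('Globex', 900.0);
""")
print(answer("top 5 customers by total sales", conn))  # [('Globex',), ('Acme',)]
```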

nickdichev 3 days ago

The COMMENT command will finally be useful :)

dinfinity 3 days ago

Then let the AI first complete the documentation by looking at the existing documentation, querying the DB (with pg-mcp), etc.

Do human reviewing and correcting of the updated documentation. Then ensure that the AI knows that the documentation might still contain errors and ask it to do the 'actual' work.

moltar 3 days ago

There are LLM SQL benchmarks. [1] And the state-of-the-art solution is still only at 77% accuracy. Would you trust that?

[1] https://bird-bench.github.io/

flappyeagle 3 days ago

Yes. Ask it to do it 10 times and pick the right answer
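"Pick the right answer" in practice usually means majority voting over repeated samples (self-consistency) — a sketch, noting that this only helps when the failures are uncorrelated:

```python
from collections import Counter

def majority_vote(answers):
    """Pick the most common answer across repeated samples."""
    winner, count = Counter(answers).most_common(1)[0]
    return winner

# e.g. 10 samples from a model for the same question
samples = ["42", "42", "41", "42", "42", "17", "42", "42", "41", "42"]
print(majority_vote(samples))  # prints 42
```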

pclmulqdq 3 days ago

That only works if you assume the fail cases are uncorrelated. Spoiler alert: they are not.

flappyeagle 3 days ago

Ask 10 different models then

pclmulqdq 3 days ago

Same problem: The models are also correlated on what they can and can't solve.

To give you an extreme example, I can ask 1000000 different models for a counterexample to the 3n + 1 problem, and all will get it wrong.

flappyeagle 3 days ago

No. What a bizarre example to choose. This is so easy to demonstrate. They will all come back with the exact same correct answer

pclmulqdq 3 days ago

If it's so easy, go do it. You can publish the result in any math journal you like with just a title and a number, because this is one of the hardest problems in mathematics.

For reference: https://en.wikipedia.org/wiki/Collatz_conjecture

flappyeagle 3 days ago

My guy, every LLM has read Wikipedia

pclmulqdq 3 days ago

I don't know if you're purposely being dense. The first sentence of Wikipedia is that this is a famous unsolved problem.

So no, sampling 1000000 LLMs will not get you a solution to it. I guarantee you that.

flappyeagle 2 days ago

It will get you the correct answer, not a solution. Once again it’s a terrible example, I don’t know why you used it. It’s certainly not a gotcha

pclmulqdq 2 days ago

The reason I used it is that the correct answer to the actual problem is unknown and nobody has any idea how to solve it. No amount of sampling an LLM will give you a correct answer. It will give you the best known answer today, but it won't give you a correct answer. This is an example where LLMs all give correlated answers that do not solve the problem.

If you want to scale back, many programming problems are going to be like this, too. Failure points of different models are correlated as much as failure points during sampling are correlated. You only gain information from repeated trials when those trials are uncorrelated, and sampling multiple LLMs is still correlated.

flappyeagle 1 day ago

the correct answer is "the solution is unknown"

pclmulqdq 10 hours ago

That's not what I asked the LLM for. I asked it for a counterexample, not whether a counterexample is currently known to humans.

Is that the correct answer to "write a lock-free MPMC queue"? That is a coding problem that literally every LLM gets wrong, but has several well-known solutions.

There's merit to "I don't know" as a solution, but a lot of the knowledge encoded in LLMs is correlated with other LLMs, so more sampling isn't going to get rid of all the "I don't knows."

risyachka 3 days ago

So you will ask "What is our churn?", get a random result, and then turn your whole marketing strategy around a wrong number?

That's cute.

Kiro 3 days ago

There are hundreds of text-to-SQL companies and integrations already. What's different about this that makes you react like that?

romanovcode 3 days ago

Those companies will be dead once this goes mainstream. Why pay a 3rd-party company when you can ask an LLM to create graphs and analysis of whatever you want? Pair it with scheduled tasks and I really don't see any value in those SaaS products.

slt2021 3 days ago

there are a lot of nuances in Business Analytics. You maybe can get away with GenAI for naive questions like "Who are my top 5 customers?", but that's not the type of insight usually needed. Most companies already know their top 5 customers by heart and these don't change a lot.

Nuanced BI analytics can have a lot of toggles and filters and drilldowns, like compare sales of product A in category B subcategory C, but only for stores in regions X,Y and that one city Z during time periods T1, T2. and out of these sales, look at sales of private brand vs national brand, and only retail customers, but exclude purchases via business credit card or invoiced.

with every feature in a DB (of which there could be thousands), the number of permutations and dimensions grows very quickly.

what's probably going to happen is that simple questions could be self-served by GenAI, but more advanced usage will still need intervention by a specialist. So we would see some improvement in productivity, but people will not lose jobs. Perhaps the number of jobs could even increase due to increased demand for analytics, as often happens with increased efficiency/productivity (Jevons paradox)

Kiro 3 days ago

Those companies and integrations are already using LLMs. That's the whole point. I'm only talking about LLM products, many of which are free and open source. This has been mainstream for years.

a-dub 3 days ago

is that true? i'd like that, but i get the sense that this mcp stuff is more oriented around programming assistant and agent applications.

i suppose the desktop app can use it, but how good is it for this general purpose "chat with the database for lightweight analytics" use case? is it worth the trouble of dealing with some electron app to make it work?

sshine 3 days ago

> i get the sense that this mcp stuff is more oriented around programming assistant and agent applications

Agents will become a ubiquitous part of the user interface that is currently the chat.

So if you bother with a website or an electron app now, MCP will just add more capabilities to what you can control using agents.

a-dub 3 days ago

yeah, i understand the premise. my question revolves around how well it actually works today for bi style applications. specifically, how close is it to being something that you can just drop in as a smart query and plotting interface rather than a bi stack that is built around something like tableau.

when i've read through documentation for mcp servers, it seems like the use cases they've mostly been focused on are improving effectiveness of programming assistants by letting them look at databases associated with codebases they're looking to modify.

i understand that these things are meant to be generic in nature, but you never really know if something is fit for purpose until it's been used for that purpose. (at least until agi, i suppose)

slt2021 3 days ago

didn't Tableau (and some other BI solutions) have this feature out of the box?