This is wild. Our company has like 10 data scientists writing SQL queries on our DB for business questions. I can deploy pg-mcp for my organization so everyone can use Claude to answer whatever is on their mind? (e.g. "show me the top 5 customers by total sales")
sidenote: I'm scared of what's going to happen to those roles!
Yep gonna be easy
Q: show me the top 5 customers by total sales
A: System.Data.Odbc.OdbcException (0x80131937): ERROR [57014] ERROR: canceling statement due to statement timeout;
Q: Why do I get this error
A: Looks like it needs an index, let me create that for you. Done. Rerunning query.
could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device
Q: Why this error
A: 429 Too Many Requests
Rubs hands... great next 10 years to be a backend dev.
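For reference, the happy-path query itself is the easy part. Here is roughly what an LLM would generate for "show me the top 5 customers by total sales", run on a toy sqlite schema (the table and column names here are hypothetical):

```python
import sqlite3

# Toy schema; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1,'Acme'), (2,'Globex'), (3,'Initech');
INSERT INTO orders VALUES (1,1,100.0), (2,1,250.0), (3,2,400.0), (4,3,50.0);
""")

# Without an index on orders.customer_id this join scans the whole table,
# which is roughly how you end up at the statement-timeout error above.
top5 = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total_sales
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total_sales DESC
    LIMIT 5
""").fetchall()
print(top5)  # [('Globex', 400.0), ('Acme', 350.0), ('Initech', 50.0)]
```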
That’s a good example of a worst case scenario. This is why we would still need humans loitering about.
The question is, do they still need 10? Or would 2 suffice? How about 5?
This does not need to be a debate about the absolutes.
You will need 2. BUT, and here is the rub. https://en.wikipedia.org/wiki/Jevons_paradox
I am hoping Jevon will keep employing me. He has been a great boss for the last 25 years TBH.
I have to say I had very good results creating and optimizing quite complex queries with Sonnet. But letting an LLM run them on its own in production is quite a different beast.
Probably nothing. "Expose the database to the pointy-haired boss directly, as a service" is an idea as old as computing itself. Even SQL itself was originally an iteration of that idea. Every BI system (including PowerBI and Tableau) was supposed to be that.
It doesn't work because the PHB doesn't have the domain knowledge and doesn't know which questions to ask. (No, it's never as simple as group-by and top-5.)
I would say SQL still is that! My wife had to learn some SQL to pull reports in some non-tech finance job 10 years ago. (I think she still believes this is what I do all day…)
I suppose this could be useful in that it prevents everyone in the company having to learn even the basics of SQL which is some barrier, however minimal.
Also the LLM will presumably be able to see all the tables/fields and ‘understand’ them (with the big assumption that they are even remotely reasonably named) so English language queries will be much more feasible now. Basically what LLMs have over all those older attempts is REALLY good fuzziness.
I see this being useful for some subset of questions.
A family friend maintains a SQL database of her knitting projects that she does as a hobby. The PHB can easily learn SQL if they want.
But he doesn't.
The project manager also won't learn behat and write tests.
Your client also won't use the CMS to update their website.
It won’t be that easy. First off, most databases in the wild are not well documented. LLMs benefit from context, and if your tables/columns have non-intuitive or non-descriptive names, the SQL may not even work. Second, you might benefit from an LLM fine-tuned on writing code and/or an intelligent Agent that checks for relevancy and ambiguity in user input prior to attempting to answer the question. It would also help if the agent executed the query to see how it answered the user’s question. In other words “reasoning”… pg-mcp simply exposes the required context for Agents to do that kind of reasoning.
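The execute-and-retry loop described above can be sketched in a few lines. Here the "LLM" is faked with a hardcoded list of candidate queries (the first one hallucinates a column name); a real agent would feed the database error back into the model's context before retrying:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

# Stand-in for the LLM: a queue of candidate queries. The first one
# references a non-existent column, the second is the corrected retry.
candidates = iter([
    "SELECT nome FROM customers",
    "SELECT name FROM customers",
])

def run_with_retry(conn, candidates, max_attempts=3):
    # Execute, inspect the error, retry: the minimal "reasoning" loop.
    last_error = None
    for _ in range(max_attempts):
        sql = next(candidates)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.OperationalError as err:
            last_error = err  # a real agent feeds this back to the model
    raise last_error

result = run_with_retry(conn, candidates)
print(result)  # [] -- the corrected query ran; the table is just empty
```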
Then let the AI first complete the documentation by looking at the existing documentation, querying the DB (with pg-mcp), etc.
Do human reviewing and correcting of the updated documentation. Then ensure that the AI knows that the documentation might still contain errors and ask it to do the 'actual' work.
There are LLM SQL benchmarks. [1] And the state-of-the-art solution is still only at 77% accuracy. Would you trust that?
Yes. Ask it to do it 10 times and pick the right answer
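The "sample many times and pick the winner" idea amounts to a majority vote over repeated samples, which is trivial to sketch (the sample answers below are made up for illustration):

```python
from collections import Counter

def majority_vote(answers):
    # Return the most common answer across repeated samples.
    # This only adds information if the errors are independent.
    return Counter(answers).most_common(1)[0][0]

# 10 hypothetical samples from the same model: 7 agree, 3 stray.
samples = ["42", "42", "41", "42", "42", "17", "42", "42", "41", "42"]
best = majority_vote(samples)
print(best)  # 42
```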
That only works if you assume the failure cases are uncorrelated. Spoiler alert: they are not.
Ask 10 different models then
Same problem: The models are also correlated on what they can and can't solve.
To give you an extreme example, I can ask 1000000 different models for a counterexample to the 3n + 1 problem, and all will get it wrong.
No. What a bizarre example to choose. This is so easy to demonstrate. They will all come back with the exact same correct answer
If it's so easy, go do it. You can publish the result in any math journal you like with just a title and a number, because this is one of the hardest problems in mathematics.
For reference: https://en.wikipedia.org/wiki/Collatz_conjecture
My guy, every LLM has read Wikipedia
I don't know if you're purposely being dense. The first sentence of Wikipedia is that this is a famous unsolved problem.
So no, sampling 1000000 LLMs will not get you a solution to it. I guarantee you that.
It will get you the correct answer, not a solution. Once again it’s a terrible example, I don’t know why you used it. It’s certainly not a gotcha
The reason I used it is that the correct answer to the actual problem is unknown and nobody has any idea how to solve it. No amount of sampling an LLM will give you a correct answer. It will give you the best known answer today, but it won't give you a correct answer. This is an example where LLMs all give correlated answers that do not solve the problem.
If you want to scale back, many programming problems are going to be like this, too. Failure points of different models are correlated as much as failure points during sampling are correlated. You only gain information from repeated trials when those trials are uncorrelated, and sampling multiple LLMs is still correlated.
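The correlation point can be made concrete with a toy simulation: majority-vote 11 models that are each right 77% of the time. If their errors are independent, voting helps a lot; if they all fail on the same inputs (fully correlated, the worst case), voting buys nothing:

```python
import random

def vote_accuracy(n_models, fail_prob, correlated, trials=10_000):
    """Fraction of problems where a majority of n_models is correct."""
    rng = random.Random(0)  # fixed seed for reproducibility
    wins = 0
    for _ in range(trials):
        if correlated:
            # Worst case: one draw decides for every model at once.
            correct = [rng.random() > fail_prob] * n_models
        else:
            correct = [rng.random() > fail_prob for _ in range(n_models)]
        wins += sum(correct) > n_models / 2
    return wins / trials

indep = vote_accuracy(11, 0.23, correlated=False)  # errors independent
corr = vote_accuracy(11, 0.23, correlated=True)    # errors fully shared
print(indep, corr)  # independent voting beats ~0.77; correlated stays ~0.77
```

Real models sit somewhere between the two extremes, which is why extra samples help less than the independent case suggests.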
the correct answer is "the solution is unknown"
That's not what I asked the LLM for. I asked it for a counterexample, not whether a counterexample is currently known to humans.
Is that the correct answer to "write a lock-free MPMC queue"? That is a coding problem that literally every LLM gets wrong, but has several well-known solutions.
There's merit to "I don't know" as a solution, but a lot of the knowledge encoded in LLMs is correlated with other LLMs, so more sampling isn't going to get rid of all the "I don't knows."
So you will ask "What is our churn?", get a random result, and then turn your whole marketing strategy around a wrong number?
That's cute.
There are hundreds of text-to-SQL companies and integrations already. What's different about this that makes you react like that?
Those companies will be dead once this goes mainstream. Why pay a 3rd-party company when you can ask an LLM to create graphs and analysis of whatever you want? Pair it with scheduled tasks and I really don't see any value in those SaaS products.
there are a lot of nuances in Business Analytics. you can maybe get away with GenAI for naive questions like "Who are my top 5 customers?", but that's not the type of insight usually needed. Most companies already know their top 5 customers by heart, and these don't change much.
Nuanced BI analytics can have a lot of toggles and filters and drilldowns, like compare sales of product A in category B subcategory C, but only for stores in regions X,Y and that one city Z during time periods T1, T2. and out of these sales, look at sales of private brand vs national brand, and only retail customers, but exclude purchases via business credit card or invoiced.
with every feature in a DB (of which there could be thousands), the number of permutations and dimensions grows very quickly.
what's probably going to happen is that simple questions could be self-served by GenAI, but more advanced usage will still need intervention by a specialist. So we would see some improvement in productivity, but people will not lose jobs. Perhaps the number of jobs could even increase due to increased demand for analytics, as often happens with increased efficiency/productivity (Jevons paradox).
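The kind of multi-toggle drilldown described above, written out against a deliberately flattened, hypothetical sales table (sqlite3 here just to make it runnable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, deliberately flattened schema; real warehouses are messier.
conn.executescript("""
CREATE TABLE sales (
  product TEXT, category TEXT, subcategory TEXT,
  region TEXT, city TEXT,
  brand_type TEXT,       -- 'private' or 'national'
  customer_type TEXT,    -- 'retail' or 'business'
  payment TEXT, period TEXT, amount REAL
);
INSERT INTO sales VALUES
 ('A','B','C','X','Springfield','private','retail','cash','T1',120.0),
 ('A','B','C','Y','Shelbyville','national','retail','card','T2',80.0),
 ('A','B','C','X','Springfield','private','retail','business_credit','T1',50.0),
 ('A','B','C','Z','CityZ','private','business','invoice','T1',999.0);
""")

# Product A, category B, subcategory C; regions X,Y plus that one city Z;
# periods T1,T2; retail only; exclude business credit card and invoiced;
# split private vs national brand.
rows = conn.execute("""
    SELECT brand_type, SUM(amount) AS total
    FROM sales
    WHERE product = 'A' AND category = 'B' AND subcategory = 'C'
      AND (region IN ('X', 'Y') OR city = 'CityZ')
      AND period IN ('T1', 'T2')
      AND customer_type = 'retail'
      AND payment NOT IN ('business_credit', 'invoice')
    GROUP BY brand_type
    ORDER BY brand_type
""").fetchall()
print(rows)  # [('national', 80.0), ('private', 120.0)]
```

Every clause above is a toggle the business user has to specify correctly in English; mapping an ambiguous sentence onto exactly this WHERE clause is where it gets hard.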
Those companies and integrations are already using LLMs. That's the whole point. I'm only talking about LLM products, many of which are free and open source. This has been mainstream for years.
is that true? i'd like that, but i get the sense that this mcp stuff is more oriented around programming assistant and agent applications.
i suppose the desktop app can use it, but how good is it for this general-purpose "chat with the database for lightweight analytics" use case? is it worth the trouble of dealing with some electron app to make it work?
> i get the sense that this mcp stuff is more oriented around programming assistant and agent applications
Agents will become a ubiquitous part of what is currently the chat user interface.
So if you bother with a website or an electron app now, MCP will just add more capabilities to what you can control using agents.
yeah, i understand the premise. my question revolves around how well it actually works today for bi style applications. specifically, how close is it to being something that you can just drop in as a smart query and plotting interface rather than a bi stack that is built around something like tableau.
when i've read through documentation for mcp servers, it seems like the use cases they've mostly been focused on are improving effectiveness of programming assistants by letting them look at databases associated with codebases they're looking to modify.
i understand that these things are meant to be generic in nature, but you never really know if something is fit for purpose until it's been used for that purpose. (at least until agi, i suppose)