Item 43535694

pera • 3 days ago

Do you really need to provide the docs? I would have imagined that those docs are included in their training sets. There is even a guide on how to migrate from GTK3 to GTK4, so this seems to be a low-hanging fruit job for an LLM iff they are okay for coding.

dagw • 3 days ago

Feeding them the docs makes a huge difference in my experience. The docs might be somewhere in the training set, but telling the LLM explicitly "Use these docs before anything else" solves a lot of problems with the LLM mixing up different versions of a library or confusing two different libraries with a similar API.
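
A minimal sketch of what that can look like, assuming you assemble the prompt yourself as plain text. The `build_prompt` helper, the `<docs>` delimiters, and the placeholder doc string are all made up for illustration and are not part of any library:

```python
def build_prompt(docs: str, question: str) -> str:
    """Put the documentation ahead of the task, with an explicit instruction."""
    return (
        "Use these docs before anything else. "
        "If they conflict with what you remember, trust the docs.\n\n"
        "<docs>\n"                      # hypothetical delimiter, any clear marker works
        f"{docs.strip()}\n"
        "</docs>\n\n"
        f"Task: {question.strip()}"
    )


# Placeholder text (assumption); in practice this would be the real doc sections.
docs = "GTK 4 migration guide: (paste the relevant sections here)"
prompt = build_prompt(docs, "Port this GTK3 window setup code to GTK4.")
print(prompt)
```

The resulting string can go to whatever model or client you use; the point is only that the docs and the instruction to prefer them come before the task.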

Workaccount2 • 3 days ago

LLMs are not data archives. They are god-awful at storing data, and even calling them a lossy compression tool is a stretch, because it implies they are a compression tool for data.

LLMs will always benefit from in-context learning because they don't have a huge archive of data to draw on (and even when they do, they are not the best at selecting data to incorporate).

iamjackg • 3 days ago

You might not need to, but LLMs don't have perfect recall -- they're (variably) lossy by nature. Providing documentation is a pretty much universally accepted way to drastically improve their output.

baq • 3 days ago

It moves the model from 'sorta-kinda-maybe-know-something-about-this' to being grounded in the context itself. Huge difference for anything underrepresented (not only obscure packages and not-Python not-JS languages).

nurettin • 2 days ago

Docs make them hallucinate a lot less. Unfortunately, all those docs will eat up the context window. Claude has "projects" for uploading them and Gemini 2.5+ just has a very large window, so maybe that's ok.
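
One rough way to keep the docs from eating the whole window is to cap them at a token budget. This is only a sketch: the 4-characters-per-token estimate is a crude assumption (real tokenizers differ), the helper names are made up, and it assumes the sections are already ordered with the most relevant first:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token for English text.
    return max(1, len(text) // 4)


def select_sections(sections: list[str], budget: int) -> list[str]:
    """Keep sections in order until the estimated token budget is spent."""
    chosen: list[str] = []
    used = 0
    for section in sections:
        cost = estimate_tokens(section)
        if used + cost > budget:
            break  # stop at the first section that would exceed the budget
        chosen.append(section)
        used += cost
    return chosen


# Three dummy sections of about 100 estimated tokens each.
sections = ["a" * 400, "b" * 400, "c" * 400]
picked = select_sections(sections, budget=250)
print(len(picked))  # prints 2
```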

jchw • 3 days ago

In my experience even feeding it the docs probably won't get it there, but it usually helps. It actually seems to work better if the document you're feeding it is also in the training data, but I'm not an expert.

vasergen • 3 days ago

The training set is huge and the model "forgets" some of the stuff, so providing docs in context makes sense. Plus, the docs can be more up to date than the training set.