<< there should be an app like an 'auto-RAG' that scrapes RSS feeds and URLs,
I am not aware that it exists yet, but the challenge I see with it is rather simple: you get overwhelmed with information really quickly. In other words, you would still need a human somewhere in the process to review those scrapes, and the quality of sources varies widely. For example, even on HN it is not a given that a link will be pure gold (you still want to check whether it fits your use case).
That said, as ideas go, it sounds like a fun weekend project.
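If someone does build it, the intake loop is small. A minimal sketch in Python, assuming feedparser is installed (pip install feedparser); the feed list and the review-queue file are placeholders, and the human-review step is exactly the part I said you can't skip:

    # Auto-RAG intake sketch: pull RSS entries and park them in a
    # human-review queue instead of indexing them blindly.
    # Assumptions: FEEDS and QUEUE_PATH are placeholders.
    import json
    import feedparser

    FEEDS = [
        "https://news.ycombinator.com/rss",  # example feed
    ]
    QUEUE_PATH = "review_queue.json"  # a human skims this before indexing

    def collect(feeds):
        items = []
        for url in feeds:
            parsed = feedparser.parse(url)
            for entry in parsed.entries:
                items.append({
                    "title": entry.get("title", ""),
                    "link": entry.get("link", ""),
                    "summary": entry.get("summary", ""),
                })
        return items

    if __name__ == "__main__":
        queue = collect(FEEDS)
        with open(QUEUE_PATH, "w") as f:
            json.dump(queue, f, indent=2)
        print(f"queued {len(queue)} items for review in {QUEUE_PATH}")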
I do exactly this with Hoarder. I passively build tagged knowledge bases from the archived pages and then feed them to a RAG setup.
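Roughly, the retrieval half can be this simple. A toy sketch, assuming the snapshots are exported as HTML files into a local archive/ directory; the path and the keyword-overlap scoring are illustrative assumptions, not Hoarder's actual API:

    # Toy retrieval over archived pages: strip HTML, chunk, rank chunks
    # by keyword overlap with the query. Stands in for an embedding index.
    # Assumptions: snapshots exported to ./archive/*.html; scoring is naive.
    import re
    from pathlib import Path
    from html.parser import HTMLParser

    class TextExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.parts = []
        def handle_data(self, data):
            self.parts.append(data)

    def page_text(path):
        extractor = TextExtractor()
        extractor.feed(path.read_text(errors="ignore"))
        return " ".join(extractor.parts)

    def chunks(text, size=500):
        words = text.split()
        for i in range(0, len(words), size):
            yield " ".join(words[i:i + size])

    def search(query, archive_dir="archive", top_k=3):
        terms = set(re.findall(r"\w+", query.lower()))
        scored = []
        for path in Path(archive_dir).glob("*.html"):
            for chunk in chunks(page_text(path)):
                overlap = len(terms & set(re.findall(r"\w+", chunk.lower())))
                scored.append((overlap, path.name, chunk[:200]))
        return sorted(scored, reverse=True)[:top_k]

    for score, name, preview in search("vector databases"):
        print(score, name, preview)

In practice you'd swap the keyword overlap for embeddings, but the shape of the pipeline is the same: archive, extract, chunk, retrieve.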
Cool. Hoarder looks interesting, thanks for the tip. How is it working out for you? Are you using the feature for auto-hoarding RSS feeds?
I am! It works great and it’s reasonably easy to snapshot sites without RSS on a cron.
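For the no-RSS sites, the cron job is basically just a fetch script. A sketch with placeholder URLs and output directory; this is generic Python, not Hoarder's built-in crawler:

    # Snapshot pages that have no RSS feed; run from cron, e.g.:
    #   0 2 * * * /usr/bin/python3 /opt/snapshot.py
    # Assumptions: SITES and OUT_DIR are placeholders; raw HTML is enough.
    import datetime
    import pathlib
    import urllib.request

    SITES = ["https://example.com/changelog"]  # pages worth re-checking
    OUT_DIR = pathlib.Path("snapshots")

    OUT_DIR.mkdir(exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M")
    for url in SITES:
        safe = url.replace("https://", "").replace("/", "_")
        with urllib.request.urlopen(url, timeout=30) as resp:
            body = resp.read()
        (OUT_DIR / f"{safe}-{stamp}.html").write_bytes(body)
        print(f"saved {url}")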