Just change arxiv.org to arxiv-txt.org in the URL to get the paper info in Markdown.
Example:
Original URL: https://arxiv.org/abs/1706.03762
Change to: https://arxiv-txt.org/abs/1706.03762
To fetch the raw text directly, use https://arxiv-txt.org/raw/abs/1706.03762. This is particularly useful for APIs and agents.
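For scripts and agents, the URL swap can be sketched in a few lines of Python. This is only a rough illustration, not an official client: the helper names `to_arxiv_txt` and `fetch_raw_text` are made up here, it only handles the `/abs/` URLs shown in the example, and it assumes the raw endpoint returns UTF-8 text.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

ARXIV_HOSTS = {"arxiv.org", "www.arxiv.org"}
TXT_HOST = "arxiv-txt.org"


def to_arxiv_txt(url: str, raw: bool = False) -> str:
    """Rewrite an arxiv.org /abs/ URL to its arxiv-txt.org counterpart.

    Hypothetical helper: only the /abs/ form from the example is handled.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() not in ARXIV_HOSTS:
        raise ValueError(f"not an arxiv.org URL: {url}")
    if not parts.path.startswith("/abs/"):
        raise ValueError(f"only /abs/ URLs are handled: {url}")
    # The raw-text variant prefixes the path with /raw, as in the example.
    path = "/raw" + parts.path if raw else parts.path
    return urlunsplit((parts.scheme, TXT_HOST, path, parts.query, parts.fragment))


def fetch_raw_text(url: str, timeout: float = 10.0) -> str:
    """Download the raw text for an arxiv.org /abs/ URL.

    Assumes the response body is UTF-8; undecodable bytes are replaced.
    """
    with urlopen(to_arxiv_txt(url, raw=True), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `to_arxiv_txt("https://arxiv.org/abs/1706.03762", raw=True)` gives the raw URL from the example above; `fetch_raw_text` does the same rewrite and then makes the network request.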
Does it just extract the abstracts?
This would be awesome wrapped in an MCP server/tool call :)
If you train an LLM on only formally verified code, it should not be expected to generate formally verified code.
Similarly, if you train an LLM only on published ScholarlyArticles (or just their abstracts), it should not be expected to generate publishable or true text.
Traceability for Retraction would be necessary to prevent lossy feedback.