I'm biased as an employee, but who knows PDFs better than Adobe? Use their PDF text extraction API.
As someone who's been using KDE's Okular PDF reader for nearly twenty years, and who also has to use Adobe's products, I can confidently say that at least one answer to your question is 'The developers of KDE's Okular'.