One of the persistent challenges I run into in this area is that any sort of up-front filtering/routing requires you to know in advance which logs are going to be important when an issue happens. Which is sort of impossible. And nobody wants to be the guy who filtered out some logs because they looked useless, only to realize later that they would have been instrumental in getting back up and running quickly.
One of the biggest problems we hear about from CISOs is 'they don't know what they don't know' - meaning they need a way to catch all the data. This plays pretty directly into your comment: there's a need to want everything, but a penalty for having everything - slower queries, higher costs, more false positives, slower time to resolution.
A common middle ground is blob storage plus rehydration: you send everything into low-cost storage like S3 while still peeling off the high-value data into the SIEM / Datadog / etc. Then if you notice something is amiss, you can rehydrate just the time window you care about.
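For a rough picture of what that routing can look like, here's a minimal Python sketch using boto3. The bucket name, the `is_high_value` rule, and `forward_to_siem` are all made-up placeholders, not anyone's actual pipeline; the real point is the shape: every event lands in time-partitioned blob storage, only the interesting ones go to the expensive tier, and a prefix listing pulls a window back later.

```python
import json
import time
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "log-archive"  # hypothetical bucket name


def forward_to_siem(event: dict) -> None:
    # Stand-in for a real SIEM / Datadog ingest call.
    print("SIEM:", event)


def is_high_value(event: dict) -> bool:
    # Hypothetical routing rule: only errors and auth failures go to
    # the (expensive) SIEM; everything else is archive-only.
    return event.get("level") == "ERROR" or event.get("type") == "auth_failure"


def ingest(event: dict) -> None:
    # Everything lands in cheap blob storage, keyed by time so that a
    # window can be rehydrated later with a simple prefix listing.
    ts = datetime.now(timezone.utc)
    key = f"logs/{ts:%Y/%m/%d/%H}/{time.time_ns()}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(event))
    if is_high_value(event):
        forward_to_siem(event)


def rehydrate(hour_prefix: str):
    # Pull back one hour of raw logs, e.g. "logs/2024/06/01/13/",
    # once an incident makes that window interesting after the fact.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=hour_prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"]
            yield json.loads(body.read())
```

The nice property of time-keyed storage is that rehydration cost is proportional to the window you pull back, not to everything you ever collected, so "keep everything" stops being at odds with "query fast".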