> That said it is one thing for data to be cached in the short-term
Cached data isn’t necessarily available for data retention to apply in the first place. Just because an ISP has parts of a message in some buffer doesn’t mean that counts as a recording of that data. If Google never stores queries beyond what’s needed to serve a response, then it likely wouldn’t qualify.
Also, it’s on the entity providing data for the discovery process to do redaction as appropriate. The only way it ends up at the other end is if it gets sent in the first place. There can be a lot of back and forth here. As for evidence that the courts care: https://www.law.cornell.edu/rules/frcp/rule_5.2
That is helpful, thanks, but I don’t think it is practical to redact LLM request information beyond the GDPR standard for personally identifiable information without just deleting everything. My (admittedly quick) read of those rules is that their ‘redacted’ information would still be readily identifiable anyway (not directly, but using basic data analysis). Their redaction standards for CC# and SIN are downright pathetic, and allow for easy recovery with modern techniques.