I disagree, because the information has already been recorded, and users have no say in who is “authorized” to view it, whether that’s someone at the company or some random third party the company sells the data to.
It’s the collection itself that’s the problem, not how soon the data is deleted once it becomes economically worthless.
> with no guarantees of security, no audits, and few safeguards.
The courts pay far more attention to that stuff than profit-maximizing entities like OpenAI do.
I agree that your assessment of the legal state of play is likely accurate. That said, it is one thing for data to be cached in the short term, and another entirely for it to be permanently stored and then sent to parties with which the user has only a distant and likely adversarial relationship.
There are many situations in which the deletion or destruction of ‘worthless’ data is treated as a security protection. One that comes to mind is how some countries destroy fingerprint data after it has been used to create a biometric passport. Do you really think this is a futile act?
> The courts pay far more attention to that stuff than profit-maximizing entities like OpenAI do.
I would be interested to see evidence of this. The courts claim to value data security, but I have never seen an audit of discovery-related data storage, and I suspect there are substantial vulnerabilities in the legal system, including the law firms. Can a user hold the court or opposing law firm financially accountable if they fail to safeguard this data? I’ve never seen this happen.
> That said, it is one thing for data to be cached in the short term
Cached data isn’t necessarily subject to data-retention requirements in the first place. Just because an ISP has parts of a message in some buffer doesn’t mean that counts as a recording of the data. If Google never stored queries beyond what’s needed to serve a response, it likely wouldn’t qualify.
Also, it’s on the entity producing data for the discovery process to redact as appropriate. The only way data ends up at the other end is if it gets sent in the first place. There can be a lot of back and forth here, and as for evidence that the courts care: https://www.law.cornell.edu/rules/frcp/rule_5.2
That is helpful, thanks, but I think it is not practical to redact LLM request information beyond the GDPR’s personal-data standards without just deleting everything. My (admittedly quick) read of those rules is that the ‘redacted’ information would still be readily identifiable anyway (not directly, but via basic data analysis). Their redaction standards for credit card and Social Security numbers are downright pathetic, and allow for easy recovery with modern techniques.
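To make that last point concrete, here is a rough sketch (in Python, with made-up example digits; the BIN “411111” and last four “1234” are not real) of why the Rule 5.2 style of redaction, which keeps the last four digits of an account number, leaves so little protection. If an attacker knows or guesses the issuer’s six-digit BIN, only the middle six digits of a 16-digit card number are unknown, and the Luhn checksum prunes those ~10^6 possibilities to ~10^5:

```python
# Rough sketch, not a real attack tool: counts how many 16-digit card
# numbers remain consistent with "last four digits" redaction when the
# issuer's 6-digit BIN is known or guessable. The BIN "411111" and the
# last four "1234" below are made-up example values.

def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def surviving_candidates(bin_prefix: str, last_four: str) -> int:
    """Count 16-digit numbers with the known prefix/suffix that pass Luhn."""
    missing = 16 - len(bin_prefix) - len(last_four)  # six unknown middle digits
    count = 0
    for n in range(10 ** missing):
        middle = str(n).zfill(missing)
        if luhn_valid(bin_prefix + middle + last_four):
            count += 1
    return count

if __name__ == "__main__":
    # ~1,000,000 raw possibilities; the Luhn check prunes them to 100,000.
    print(surviving_candidates("411111", "1234"))  # prints 100000
```

A space of roughly 100,000 candidates is trivially searchable by machine, and any extra context in the same document (issuer, expiry date, cardholder name) shrinks it further.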