> It’s not copyright.
The legal debate in the United States, where most of the relevant models are being trained, revolves around whether the use of training data is fair use.
That’s to say, it may be a legally permitted form of copyright violation, or it may be a legally prohibited form of copyright violation.
Either way, it is legally an issue of copyright. On a secondary level we can talk about trade dress and moral rights, but that simply muddies the waters. The legal discussion centered on copyright is concrete and unavoidable.
This is all orthogonal to moral/ethical/societal concerns.
> That’s to say, it may be a legally permitted form of copyright violation, or it may be a legally prohibited form of copyright violation.
That's arguably not accurate. Statutory fair use is a direct codification of what the Supreme Court found to be a Constitutional limit, based on the First Amendment, on the copyright power. That is also why it is written in a less straightforward fashion than most of the rest of copyright law.
Fair use is not a “legally permitted form of copyright violation”; it is the space where the federal government has no power under the Constitution to create exclusivity as part of copyright.
> That’s to say, it may be a legally permitted form of copyright violation, or it may be a legally prohibited form of copyright violation.
Where exactly is the copyright violation supposed to be occurring? Is it distinct from the 'copyright violation' that happens in the brain any time someone learns what a Ghibli frame looks like? What if I study Ghibli in an art class and write notes to myself about the art style, to jog my memory later?
If the problem is that model training is considered a weird kind of copying, what's the legal line between that (allegedly infringing) copy and taking a JPEG and re-encoding it as a PNG?