If the impact is really so minimal that the appropriate compensation is zero, then surely your AI would also be just fine without training on those works?
I understand what you are saying, but there are different kinds of learning. If your model learns how many fingers are on a hand, it does not violate copyright in that regard. Every copyrighted work relies on facts, abstractions and styles its author doesn't own. Models don't have the capacity to remember the exact expression from each source; only generalities are learned and later recombined. Blocking AI from training on grounds of copyright is a maximalist take on what copyright protects.
Let's say we change copyright to cover abstractions, then any human creator would be in deep trouble. Incidental similarities would make the creative act too risky.
Copyright as a concept is prescriptive, not descriptive. It doesn't arise from observing the laws of reality; it's law designed to produce certain outcomes, so what copyright applies to is a question of what legislative bodies decide to do.
It's entirely valid to consider the act of training via ML a right granted by the creator like reproduction or performing, simply on the basis of protecting human art. The comparison with human learning can be made irrelevant (and IMO it's not the same fundamentally).
This is currently a discussion about plagiarism (the ethics) and what the outcomes are from unrestricted GenAI. How copyright applies to GenAI is a question for later, informed by the discussion by society at large (and lobbyists).
Setting aside questions like why the AI company is entitled to use these materials for free to develop commercial products, why would you even want to use art to learn factual information like how many fingers are on a hand? And how can you claim that this is what they pull from that art, right after we've just been through a wave of unmistakable AI Studio Ghibli copies?
I am more fearful that Ghibli will demand to own their style, forbidding anyone from using it without permission. If everyone stakes a claim on some style or abstraction, then we are in a situation where nothing new can be created without automatically becoming infringing.
Wait, what happened to "it's just learning hands have five fingers"? A few comments ago you seemed to be saying that copying one artist's style on purpose might be bad, but that with billions of training images, no one source will be recognizable in the output, and under that condition the impact on the artist is minimal. (and yet...[1]). Now you're taking the position that even using their work without permission to train AI for the purpose of directly and unmistakably replicating that artist's signature style is OK? Which is it?
And are you not fearful of a world in which nobody ever develops a new Ghibli-tier art style, because they'd starve in the process while competing against low-cost AI interpolations of prior art and AI companies make a fortune selling their efforts?
I'm fearful of a world in which anyone can print "Huggies" on counterfeit diapers and pass them off on Amazon as the real deal, so I hope you at least have room in your philosophy for trademarks, if not trademark art styles.
But either way, the good news is that you yourself can put in the work to hand-paint 100,000 frames of imitation Ghibli-style artwork and then train an AI on your portfolio and give it to whoever you want, or even master a style of your own; the bad news is that this will take sixty years of your life and career, during which time you'll have to feed yourself and your staff, and when you're done you might not feel so generous about giving it away for other people to use to sell greeting cards on Etsy.
But there's more good news, which is that Miyazaki is at least generous enough to allow at least one other studio to use his art style with his blessing, as long as the result lives up to the following standard: "every film you make, you’ll have to realize that has to be a film that is worthy to show to children".[2]
And personally, I do think he's earned the right to ask that much.
[1] https://theaiunderwriter.substack.com/p/an-image-of-an-arche...
Nothing happened. Judge by the outputs: if they are infringing, the model learned expression. If not, it learned abstractions and styles. Why do you think there were only 2-3 lawsuits focusing on output infringement? If regurgitation were a big issue, copyright holders would raise hell. But they focus on training data instead, which means the outputs are ok.
Anyway, focusing on training data is misguided unless you also restrict in-context learning by policing what users can paste into the model.