Are you sure you mean "stealing"? As in deprive him of his own recordings?
I am curious if anyone read Harry Potter in bootleg form from an LLM. I mean, LLMs are the worst tools for infringing - they are approximate, expensive and slow, while copying is instant, perfect and free. You can apply the same logic to other modalities.
Moreover, who's got the time to see someone else's AI shit when they can generate their own, perfectly customized to their liking? I personally generated a song about my cat and kid. It had zero commercial value but was fun for 2-3 people to listen to.
I can steal a company's codebase without depriving them of their code.
I can steal an invention without wiping the inventor's memory.
These are other kinds of stealing, which deprive the creator not of the art itself, but of the other benefits of having created it.
The logic "gen AI is theft" is pretty careless. Let's say I use gen-AI to identify a skin sore, and seek appropriate treatment. Whose copyright was being violated? How about if I ask it to make a story where my kid is a protagonist? In fact, the more I put into a prompt, the less it looks like anything in the training set; the longer the discussion, the greater the divergence.
I'm saying that training gen-AI on Benn Jordan's art against his will, without permission, and without remuneration, is theft. You can train on a suitably licensed database of medical images if you want.
It depends: is it a LoRA focused on Benn? Is it intended to replicate his style? Or is Benn one of the billion authors in the training set? In that case his impact will be minimal, and the generated images/texts will be very different from any training example, mostly reflecting the prompt.
If the impact is really so minimal that the appropriate compensation is zero, then surely your AI would also be just fine without training on those works?
I understand what you are saying, but there are different kinds of learning. If your model learns how many fingers are on a hand, it does not violate copyright in that regard. Every copyrighted work relies on facts, abstractions and styles they don't own. Models don't have the necessary capacity to remember exact expression from each source, only generalities are being learned and later recombined. Blocking AI from training on grounds of copyright is a maximalist take on what copyright protects.
Let's say we change copyright to cover abstractions, then any human creator would be in deep trouble. Incidental similarities would make the creative act too risky.
Copyright as a concept is prescriptive, not descriptive. It doesn't appear from an observation of the laws of reality; it's law written to produce certain outcomes, so what copyright applies to is a question of what the legislative bodies will decide to do.
It's entirely valid to consider the act of training via ML a right granted by the creator like reproduction or performing, simply on the basis of protecting human art. The comparison with human learning can be made irrelevant (and IMO it's not the same fundamentally).
This is currently a discussion about plagiarism (the ethics) and what the outcomes are from unrestricted GenAI. How copyright applies to GenAI is a question for later, informed by the discussion by society at large (and lobbyists).
Setting aside questions like why the AI company is entitled to use these materials for free to develop commercial products, why would you even want to use art to learn factual information like how many fingers are on a hand? And how can you claim that this is what they pull from that art, right after we've just been through a wave of unmistakable AI Studio Ghibli copies?
I am more fearful that Ghibli will demand to own their style, forbidding anyone from using it without permission. If everyone stakes a claim on some style or abstraction, then we are in a situation where nothing new can be created without automatically becoming infringing.
Wait, what happened to "it's just learning hands have five fingers"? A few comments ago you seemed to be saying that copying one artist's style on purpose might be bad, but that with billions of training images, no one source will be recognizable in the output, and under that condition the impact to the artist is minimal. (and yet...[1]). Now you're taking the position that even using their work without permission to train AI for the purpose of directly and unmistakably replicating that artist's signature style is OK? Which is it?
And are you not fearful of a world in which nobody ever develops a new Ghibli-tier art style, because they'd starve in the process while competing against low-cost AI interpolations of prior art and AI companies make a fortune selling their efforts?
I'm fearful of a world in which anyone can print "Huggies" on counterfeit diapers and pass them off on Amazon as the real deal, so I hope you at least have room in your philosophy for trademarks, if not trademark art styles.
But either way, the good news is that you yourself can put in the work to hand-paint 100,000 frames of imitation Ghibli-style artwork and then train an AI on your portfolio and give it to whoever you want, or even master a style of your own; the bad news is that this will take sixty years of your life and career, during which time you'll have to feed yourself and your staff, and when you're done you might not feel so generous about giving it away for other people to use to sell greeting cards on Etsy.
But there's more good news, which is that Miyazaki is at least generous enough to allow at least one other studio to use his art style with his blessing, as long as the result lives up to the following standard: "every film you make, you’ll have to realize that has to be a film that is worthy to show to children".[2]
And personally, I do think he's earned the right to ask that much.
[1] https://theaiunderwriter.substack.com/p/an-image-of-an-arche...
Nothing happened. Judge by the outputs, if they are infringing, the model learned expression. If not, it learned abstractions and styles. Why do you think there were only 2-3 lawsuits focusing on output infringement? If regurgitation was a big issue copyright holders would raise hell. But they focus on training data instead, which means the outputs are ok.
Anyway, focusing on training data is misguided unless you also restrict in-context learning by policing what users can paste into the model.
So leave him out of the training set… the impact will be minimal
That's the entire issue and the point of his attempts to poison his music against AI training: you can't opt out of having your work used to train GenAI. I'd encourage you to watch his video.
GP clearly understands this. The comment makes perfect sense in the context of the thread. It is directed not at the artist, but at those who believe that AI companies training on art without permission aren't stealing anything.
How about you write a story where your kid is the protagonist, instead? I'm sure they'll look back on it with far more appreciation than something an LLM shit out.
> The logic "gen AI is theft" is pretty careless.
No, it isn't. It is a widely understood, if not explicitly stated, and frankly so easy to understand concept that I'm forced to conclude people like yourself who refuse point blank to understand it are merely doing so out of sheer determination.
When you post things online, be it art, be it music, be it photographs, be it the idle thoughts of your mind: you do this to share it with other people. That's the point. Why you want to share it can certainly change and explanations range from desiring an online following/clout to making money to simply the fact that you are driven to create and share whatever it is you do. This is the entire basis of the proto-Internet, to a degree where it predates the whole notion of extracting wealth from others via the Internet, and to a large degree, also predates the notion of a following.
If you desire to counteract this premise, then I would suggest you must somehow explain the existence of BBSes, newsgroups, anonymous web forums, blogs, etc. that are some of the oldest, dustiest pillars of the Internet, that our modern Internet would not exist without, and moreover, tons of these existed when analytics, statistics, etc. were barely conceived of. Some websites had a guest book, and some had visitor counters, and that was literally it. If you published on a blog like I did in the 90's, you had no fucking clue if anyone at all was actually reading it.
Or to put it shortly: The Internet and its constituent communities existed long before there was a financial motive to publish, and absent that, what possible explanation could there be to publish apart from having your thing seen by people?
Ergo, to then take those intellectual products, shared in good faith for the enjoyment of others, run them through an industrial shredder, identify the most common patterns in their underlying structure, and sell it back to the public to whom its existence simply must be credited, is perverse. It is self-expression without the self; it is merely expression, for the sake of expression. Noise. And we all kind of know this: prior to LLMs, the most common experience of this phenomenon was spam. Because if you simply posted links to whatever thing you were hawking it would be deleted immediately, spammers had to get clever. They would write what were ostensibly "blog posts" but they were not sincere expressions of anything apart from avaricious greed and a desire to camouflage it. Websites rose that were nothing but endless pages of this shit, designed to be optimally crawled by Google and other engines, linking back to sites seeking favor in search engines. Web forums at that time had to regularly moderate away content not for being hateful, but for being openly fucking stupid, irrelevant to topic, and simply seeking to drive traffic to other sites. Anywhere there was an unprotected text input whose contents may, eventually, arrive at a page was, to a spammer, a perfectly valid place to park their shit that nobody wanted, and park it they did.
And this is exactly what LLMs make. Soulless nothing whose primary reason to exist is to occupy space. Image generators are the same; their outputs exist because an image, no matter how generic or mediocre, would look better in this spot than a blank color. Music, without motivation or drive behind it, as a convenient alternative to silence. Video because an image would be boring. Functional, purposeless, driveless, motivation-free content to occupy the otherwise empty void around whatever thing you're actually trying to get eyes on: Text to cloak your links, mediocre art to cloak your sales pitch, etc.
And the reason the business parasite class is so thrilled about its existence is that it finally allows people who have not nurtured creativity within themselves, for whatever reason you'd care to assign, access to creative outputs. They have sneered derisively at people who feel the drive to create and learn the ability to do so because fundamentally they do not believe creative products have value, that the attention, the adoration, and more than anything the money that said creatives are given so freely by their fans would be far better in their pocket instead. You can hear this in the way they talk: "I'm making a movie with AI!" said as though it's a revolutionary statement, as though great directors announce "I've started production on my next picture using Red Digital Cameras" like anyone from the public would give an ounce of a shit how a thing was made. They don't even consider "will someone want to see this movie?" because other people seeing it isn't the point, their elevation is the point. The promotion of their brand is the point. They have nothing at all to say, and it will doom anything they make to abject failure.
AI permits capital to access skill without requiring at all that skill be permitted to access capital.
Yeah, but when money becomes information then everyone has money. When music becomes information then everyone has music. When food, medicine and narcotics become information with genetic engineering, then everyone has food, medicine and narcotics. When weapons become information, everyone has weapons. When skyscrapers become information, then everyone has skyscrapers.
Also information wants to be sorted and compressed. See for example personal wikis using org-roam, personal databases of text files using org-mode in emacs and so on.
What good are 10 songs to me if each one has only 10 seconds of a part I like, and I am indifferent to the rest of the song? I take these 100 seconds in total and compress them down to a new song which is totally to my liking.
> Yeah, but when money becomes information then everyone has money.
Categorically, no. When everyone has money, in fact, no one has money. If you don't believe me ask someone from Zimbabwe.
> When food, medicine and narcotics become information with genetic engineering, then everyone has food, medicine and narcotics.
The barrier to everyone having food, medicine and narcotics is not money, it's the restriction of those important goods to people who have money; because food is not grown because people are hungry, nor are medicines made because they are sick; both of these things are done for profit, and people without money are unprofitable to the producers and distributors of those goods.
We have enough food for everyone almost twice over. The problem is the majority of it is pumped to overweight westerners who have disposable income.
> What good are 10 songs to me if each one has only 10 seconds of a part I like, and I am indifferent to the rest of the song? I take these 100 seconds in total and compress them down to a new song which is totally to my liking.
Oh that's so nice for you. And all it took was the bankrupting of every music artist.
Your views on this are so myopic that I genuinely wonder if you're a parody account. I didn't think it was possible for someone to be this short-sighted and selfish but here we are.
Ok, let's focus on food then instead of being all over the place. I personally produce on average 1 ton of olive oil a year. A little bit less, 800 kilos or something, but let's say it's 1 ton. I keep some for myself and the rest is sold for profit.
Let's say someone in Zimbabwe wants to buy olive oil. He has to buy from the top producers in the world, i.e. Spain, Italy and Greece. I will sell olive oil to Zimbabweans only for profit, if they don't have money they will not have olive oil.
But let's say a smart Zimbabwean figures out a way to genetically modify Chlorella or Spirulina, which only need sun and no particular climate or fertile land, and he produces so much olive oil that its price collapses to nothing. Not only will that bankrupt me, but such an innovation in food production will put me at a disadvantage even if I adopt it myself, for a simple reason: I don't have even half the sun Zimbabwe has.
But even if my income stream of olive oil goes to zero, I can sell something else. I can even learn to program computer code, and sell that. Or learn genetic engineering myself and create some new innovation. There is never any shortage of problems to solve, and things to produce.
> But let's say a smart Zimbabwean figures out a way to genetically modify Chlorella or Spirulina, which only need sun and no particular climate or fertile land, and he produces so much olive oil that its price collapses to nothing.
That's a hell of an industrious Zimbabwean, firstly.
Secondly though, while I understand where your metaphor is aiming, it just doesn't really work. There has never been a "business with a competitive advantage" on the scale of human creative output vs. AI. I would argue even this example you've cooked up which already strains belief on several points doesn't truly capture it.
AI in your example wouldn't be one person in one place that can make a product at unbelievable scale; I think it would be more analogous to everyone on Earth suddenly having a machine on their countertop that produces Olive Oil by itself, of low quality, for free, forever. And I think the effects would be similar: any producer of less than fantastic olive oil would go bankrupt, because they're effectively competing with a free product (not unlike the Netscape Navigator vs Internet Explorer problem) and most consumers who are fine with a low-grade oil have zero reason to buy more when they can just buy access to the infinite olive oil maker. You would also then have producers of food both industrial and restaurants who would begin advertising that they use real proper olive oil, many of which would doubtlessly be using the shitty stuff from our imaginary machine or perhaps a blend of both to cut costs, but nevertheless, this hurts the producers of the good stuff too, because again, they are competing with a free-at-point-of-use product.
And like, on balance, all of this is just a net negative for everyone:
* A lot of the olive oil is worse now, for everyone, seemingly permanently absent regulations
* People are now losing a lot of counter space to the machine
* We've obliterated an entire agricultural sector, and farmers who have done nothing but this their entire lives have to retrain for other jobs and/or their land has to be used to grow something else
* One company that made the infinite olive oil machine now has billions and billions of dollars, that they have earned by making the other three things happen
And just... why? Why is this a good thing for humanity? Put aside the stupid game with the made up money and line go up and justify to me why we've now done this, because it seems like everyone has lost a little bit, some have lost a lot, and a vanishingly tiny minority have won big, explicitly at the expense of everyone else.
And before you start with "well the market chose-" no, it didn't. The market followed incentives established by the market and those who regulate it. That's not a choice anyone or anything made, any more than it's a choice if a river floods its banks and destroys a bridge.
See the AI-generated music channels on YouTube. They get lots of views, and a significant share of those views would otherwise have been streams of actual songs. So yeah, they're taking money away from the artists with content learned from the artists.