Keeping in mind that the purpose of IP law is to promote human progress, it's hard to see how legacy copyright interests should win a fight with AI training and development.
100 years from now, nobody will GAF about the New York Times.
IP law was meant to promote human progress by giving people a financial incentive to create IP, knowing it was protected and that they could make money off it.
We will all make a lot more money and a lot more progress by storing, organizing, presenting, and processing knowledge as effectively as possible.
Copyright is not a natural right by any measure; it's something we pulled out of our asses a couple hundred years ago in response to a need that existed at the time. To the extent copyright interferes with progress, as it appears to have sworn to do, it has to go.
Sorry. Don't shoot the messenger.
Why would you expect NYT or any other news organization to report accurate data to feed into your AI models if they can't make any money off of it?
It's not just about profits, it's about paying reporters to do honest work and not cut corners in their reporting and data collection.
If you think the data is valuable, then you should be prepared to pay the people who collect it, same as you pay for the service that collates it (ChatGPT).
I wish I knew what the eventual business model will look like, but I don't. One guess is to consider what MSNBC was, or was supposed to be -- a joint venture between Microsoft and NBC network news, where the idea was to take advantage of the emerging WWW to get a head start on everyone else. The pie-in-the-sky synergies that were promised never materialized, so the outcome just amounted to a new name for an old-media player. As it turned out, the business of gathering and delivering news and editorial content didn't change much at all. It just migrated from paper and screens to, well, screens.
Now, as you point out, companies like OpenAI have a problem, and so do the rest of us. Fair compensation for journalists and editors requires attribution before anything else can even be negotiated, and AI literally transforms its input into something that is usually (but obviously not always) untraceable. For the big AI players, the solution to that problem might involve starting or acquiring news and content networks of their own. The synergies that Microsoft and NBC were hoping for could actually be feasible now.
So to answer your question, maybe ChatGPT will end up paying journalists directly.
Again, I don't know how plausible that kind of scenario might turn out to be. But I am absolutely certain that countries that allow their legacy rightsholders to impede progress in AI are going to be outcompeted by those with less to lose.
Copyright is the thing that allows software companies to sell their products and make money. It’s not just about “knowledge”.
I sometimes wonder if people commenting on this topic on HN really understand how fundamental copyright as a concept is to the entire tech industry. And indeed even to capitalism itself.
>We will all make a lot more money and a lot more progress by storing, organizing, presenting, and processing knowledge as effectively as possible.
That's a huge assumption in the first place. It's an even bigger leap to tie that general proposition to what's happening here.
But the main point is the human progress here. If there is an obvious case where it seriously gets in the way of human progress, then that's a problem, and I hope we can correct it through any means necessary.