Submitted by Tooskee t3_10oif8i in technology
AuthorNathanHGreen t1_j6gg5hf wrote
Reply to comment by SeaweedSorcerer in Microsoft, GitHub, and OpenAI ask court to throw out AI copyright lawsuit by Tooskee
When I posted a story online for free I did so because I thought real humans could read it, and perhaps decide they wanted to buy my longer works if they liked it. I understood that someone might read it and not like it, like it but be too cheap to buy paid work, or perhaps read it and use it to study writing techniques I used. I did not however post it thinking an AI might be training itself (with no hope of me getting compensation out of the deal) so that it could further dilute the market for writing.
Don't I have a right that my content not be used in a manner I couldn't anticipate or prevent?
CallFromMargin t1_j6glixm wrote
In that specific case, no. Fair use covers it, and Google v. Authors Guild settled that specific question in court. Using your work falls under fair use, just like a human reading your work and incorporating its ideas into their own.
That said, if you wrote shit on the internet, let me assure you, it's almost useless for training a writing AI. Believe me, I tried it on a dataset of /r/writingprompts. The thing is, most of the writing there just sucks. That isn't a bad thing in itself, since the only way to learn to write is by writing, which means putting bad work on the internet. It doesn't change the fact that it objectively sucks.
If I wanted to build an actual writing AI, I would use a collection of classical works, works that have stood the test of time. Frankly, the difference between those and what gets posted on the internet is often in how scenes and characters are fleshed out.
Ronny_Jotten t1_j6hrjni wrote
> In that specific case, no. Fair use covers it, and Google v. Authors Guild settled that specific question in court. Using your work falls under fair use, just like a human reading your work and incorporating its ideas into their own.
That's completely false. The Google case was found to be fair use precisely because it did not "dilute the market for writing"; market effect is one of the four factors in the legal test for fair use. The judge found that the scanning did not produce anything that competed economically in the market with the books that were scanned; on the contrary, it might increase their sales. Whether such scanning is fair use is determined on a case-by-case basis. If AIs are used to produce "new" works that are sold commercially and undercut the authors of the originals they're based on, it will be much more difficult to prove fair use.
Furthermore, the Copilot product creates a loophole where a business can incorporate code released under e.g. the GPL, a license that requires the business to release its derivative works under the same open-source terms, and make them closed-source instead. That can also create an unfair economic advantage in the market. These questions are far from "solved".
Doingitwronf t1_j6gpxd7 wrote
I wonder what happens now that AIs can be instructed to produce works in the specific style of any author or artist whose works were supplied to the training set?
CallFromMargin t1_j6gwqup wrote
What used to happen when you asked a painter for a painting in the style of X? The same thing is happening with AI art. It's literally the same thing.
Ronny_Jotten t1_j6hspu6 wrote
It's literally not the same thing though, at least legally speaking. It's already accepted that a human looking at an artwork is not "making a copy" as defined in copyright law. As long as they don't produce a "substantially similar" work, there's no copyright violation. The same can't be said for scanning or digitally copying a work into a computer; that is "making a copy" covered by copyright law. In some cases, that can come under the "fair use" exemption. But not in all cases. It's evaluated on a case-by-case basis, in the US according to the four-factor fair use test. For example, if it's found that the generated works have a negative economic impact on the value of the original works, there's a substantial chance it won't be found to be fair use.
CallFromMargin t1_j6hvui0 wrote
The computer is not storing a copy of the original work in the trained model. It looks at a picture, learns stuff from it, and stores only what it learns.
Your argument is based either on a fundamental misconception on your part, or a flat-out lie from you. Neither one casts you in a good light.
Ronny_Jotten t1_j6hzcpu wrote
> The computer is not storing a copy of the original work in the trained model. It looks at a picture, learns stuff from it, and stores only what it learns.
Just because you anthropomorphize the computer as "looking at" and "learning stuff" doesn't mean it's not digitally copying and storing enough of the original work, in highly compressed form within the neural network, to violate copyright by producing something "substantially similar": Image-generating AI can copy and paste from training data, raising IP concerns | TechCrunch
But regardless of whether it produces a "substantially similar" work as output, making a copy of the original copyrighted work into the computer in the first place is a required step in training the AI network. Doing so is only legally allowed if it's fair use. That was the question in the Google books case - it was found that the scanning of books was fair use, because Google didn't use it to create new books or otherwise economically damage the authors or the market for the original books. But that's not necessarily the case with all instances of making digital copies of copyrighted works.
> Your argument is based either on fundamental misconception on your part, or a flat out lie from you. Neither one casts you in good light
Well, you can fuck off with that, dude. There's no call for that kind of personal attack.
CallFromMargin t1_j6i4o2a wrote
No. The fact that it's mathematically impossible to store that many images, and that a compression algorithm capable of it would violate the laws of physics, means that it is not storing images.
It is impossible to compress 380 TB of data down to 0.04 TB.
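Do the arithmetic yourself. The 380 TB and 0.04 TB figures are the ones from this thread; the ~2 billion image count is an assumption, in the rough ballpark of the LAION-2B set Stable Diffusion was trained on:

```python
# Rough capacity check: could a trained model literally store its training set?
# Figures below are ballpark assumptions, not exact measurements.
training_data_tb = 380        # claimed size of the image training data, in TB
model_size_tb = 0.04          # size of the trained model checkpoint, in TB (~40 GB claimed; SD's is ~4 GB)
num_images = 2_000_000_000    # assumed number of training images (LAION-2B scale)

compression_ratio = training_data_tb / model_size_tb
bytes_per_image = model_size_tb * 1e12 / num_images

print(f"required compression ratio: {compression_ratio:.0f}x")   # 9500x
print(f"parameter budget per image: {bytes_per_image:.1f} bytes")  # 20.0 bytes
```

Twenty-ish bytes per image obviously can't hold a recoverable copy of every image. (It doesn't by itself rule out near-verbatim memorization of a small, heavily duplicated subset of images, but that's a different claim.)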
Ronny_Jotten t1_j6i68gn wrote
And yet, the citation I gave shows Stable Diffusion obviously replicating copyrighted images from the LAION training set, despite your musings about thermodynamics. It may not store reproducible representations of all images, I don't know - but it unquestionably does store some.
In any case, it doesn't change the fact that copying images into the computer in the first place, in order to train the model, would need to come under a fair use exemption. For example, research generally does - but not in every case, especially if it causes economic damage to the original authors. In many countries, authors also have moral rights, to attribution, to preservation of the integrity of their work against alteration that damages their reputation, etc., which may come into play.