I had just heard the expression: "It helps the general effort." It was in a Discworld audiobook. It made me think: "Yeah, I could use some help right now." That was still rattling around in my head when I had to come up with a username. I don't remember what I was having trouble with, so I guess that worked out.
"They" is the copyright industry. The same people who are suing AI companies for money also want the Internet Archive gone, for more money.
I share the fear that the copyrightists will reach a happy compromise with the bigger AI companies and monopolize knowledge. But for now, AI companies are fighting for Fair Use. The Internet Archive is already benefiting from those precedents.
In the US, copyright is limited by Fair Use. It is still IP. Eventually, you'd just be changing how Fair Use works. Not all for the better, I think.
Maybe one could compare it to a right of way over someone's physical property. The public may use it for a certain purpose, in a limited way, which lowers its value. But what value it has, belongs to the owner.
It's a bit of a split among libertarians. Some very notable figures like Ayn Rand were strong believers in IP. In fact, Ayn Rand's dogmas very much align with what is falsely represented as left-wing thought in the context of AI.
It's really irritating for me how much conservative capitalist ideals are passed off as left-wing. Like, attitudes on corporations channel Adam Smith. I think of myself as pragmatic and find that Smith or even Hayek had some good points (not Rand, though). But it's absolutely grating how uneducated that all is. Worst of all, it makes me realize that for all the anti-capitalist rhetoric, the favored policies are all about making everything worse.
For fastest inference, you want to fit the entire model in VRAM. Plus, you need a few GB extra for context.
Context means the text (+images, etc) it works on. That's the chat log, in the case of a chatbot, plus any texts you might want summarized/translated/ask questions about.
Models can be quantized, which is a kind of lossy compression. They get smaller but also dumber. As with JPGs, the quality loss is insignificant at first and absolutely worth it.
Inference can be split between GPU and CPU, substituting normal RAM for VRAM. That makes it slower, but it will probably still feel smooth.
Basically, it's all trade-offs between quality, context size, and speed.
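To make those trade-offs concrete, here is a rough sketch of the sizing arithmetic. The numbers are rule-of-thumb assumptions, not exact figures for any particular model or runtime:

```python
# Back-of-the-envelope VRAM estimate for running a local LLM.
# Rule of thumb: 1 billion parameters at 8 bits per weight ~ 1 GB.

def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return params_billion * bits_per_weight / 8

# A hypothetical 13B-parameter model at different quantization levels:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{model_size_gb(13, bits):.1f} GB")

# On top of the weights, budget a few extra GB for the context
# (the KV cache), which grows with context length and model size.
```

So quantizing that 13B model from 16-bit down to 4-bit cuts the weights from roughly 26 GB to roughly 6.5 GB, which is the difference between needing a workstation GPU and fitting on a common consumer card.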
I had a look at what tumblr says and it's probably a good option. It's not likely that they will try to find sneaky ways around the settings. The liability risk is out of all proportion to the potential gain from selling that data. Under EU law, such opt-outs must be respected when training AI. For now, the major US companies can be expected to abide by that. In the future, we may see special models for the EU. A few open source models by Chinese companies already exclude use in the EU.
Reducing scraping takes skill and a major effort, which tumblr can bring. A catch is that there is a conflict between serving images to lots of people but not to scrapers. Sufficiently determined large scale scraping operations will still succeed, but maybe no one will feel that it's worth the effort anymore. It's impossible to prevent individuals from saving images. So AI hobbyists or small artists could still use your images for training and share the products. When fans re-upload your images, they may become part of large scale datasets after all.
It depends on what exactly you want to achieve. If you want money, then upload your work to Adobe Stock, Shutterstock, and the like. That's the best offer you can expect. No one will offer you big bucks for an AI license just because you have protected your drawings.
If you want to have your own site, then you could rely on Cloudflare for handling the technical and legal side of preventing scraping/AI use.
But I guess the main worry for any artist is other artists who use AI. That's where Glaze and Nightshade come in. They've already been suggested, but you should know how much to expect from them.
These tools target the original Stable Diffusion 1.5. IIRC they also work on SD 2.0 because they reused some components. I am not sure what other versions, if any, could be affected. Certainly not the newer ones.
It goes without saying that the major companies were never affected and could not be affected. Since no one mentions it, I guess it's self-evident, but I want to spell it out for the uninitiated.
I think these early models are still used partly because they have lower hardware demands and partly because they are less professionally censored (ie more suitable for porn).
Anyway, the effectiveness against hobbyists, your competitors, or other small-scale AI users is also limited. They may not use a susceptible model, especially if they make SFW images. If their model is susceptible, then these tools may waste a few hours of their time and maybe a bit of money. But it won't get rid of the competition or even significantly harm them.
It's also funny how Lemmy is buying into this narrative.
The entire US economy is currently being propped up by growth in the AI/tech sector.
What's happening is that Dementia Don is curb-stomping the US economy. AI investments, mainly in data centers, are the only thing that still seems promising. When you are on a trek and someone leads you through Death Valley, while pouring out all the water, you shouldn't blame the last horse that still keeps going.
Putting the blame in the right place would certainly help, with a view toward the mid-terms.
Financially: Diversify. Make sure that you are not completely dependent on what happens in the US. But mind that Europe comes with its own imponderable risks (ie Putin). Same with China. Maybe some old leader dies and the new crew runs everything into the ground; they go to war with Taiwan, that sort of thing.
This is a rare useful (and frugal/practical) niche for blockchain. Immutable verification is a core principle.
You're right on everything else, but this is just no. You never need blockchain.
One just needs someone who makes it credible that the hash and timestamp were not tampered with. Even posting the hash on Reddit would do it for most people. Reddit isn't going to commit fraud for some random person. And that random person is probably not able to hack the database undetected.
Recomputing lots of hashes isn't difficult, so a blockchain doesn't add any trustworthiness on its own.
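The whole scheme fits in a few lines. A sketch of what you'd actually post (the file content here is a made-up placeholder):

```python
import hashlib
from datetime import datetime, timezone

# The work you want to timestamp (placeholder content for illustration).
data = b"my original artwork, manuscript, etc."

# Post this digest plus a date anywhere a third party records it and
# won't alter it for you. Later, revealing the original file proves
# it existed by that date, because it hashes to the same digest.
digest = hashlib.sha256(data).hexdigest()
timestamp = datetime.now(timezone.utc).isoformat()

print(f"{timestamp}  sha256:{digest}")
```

Anyone can verify the claim by rehashing the file themselves. The credibility comes entirely from the third party holding the record, not from any chain of hashes.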
Ethical meaning: "private", "anonymous", "not training with your data", "not censored", "open source"…
Yes. You have to be careful with the meaning of "ethical". Most often, people write about "ethical AI" to demand money for copyright owners.
Case in point: Some people say that AI is only open source if the training data can also be shared freely. That means the training data has to be public domain or that permission by the copyright owner was obtained. If that's what you mean by "open source", then your options are extremely limited. EG some offerings from AllenAI.
Uncensored is also tricky. Many say that ethical AI does not output bad content. Of course, what bad content is depends very much on who you ask. The EU or China have strict legal requirements but not the same, of course. In any case, when you train an AI, you steer it to generate a certain kind of output. Respectable businesses don't want NSFW stuff. Some horny individuals out there want exactly that. So it depends on what you want.
Check out the SillyTavernAI subreddit (and also LocalLlama). There you'll find people who value private, uncensored LLMs, though not necessarily copyright. It's also where the above-mentioned horny individuals hang out, for related reasons.
Duckduckgo offers free, anonymous access to major Chatbots. Maybe worth checking out.