Your website can now opt out of training Google's Bard and future AIs
Submitted 1 year ago by original_reader@lemm.ee to technology@lemmy.world
Comments
vimdiesel@lemmy.world 1 year ago
lol I’m sure the AI profiteers will honor this
Mongostein@lemmy.ca 1 year ago
So since what’s available now isn’t actually AI, what do we call it when we do get real AI? Will it be like what happened with HDMI? With True AI™ followed by Ultra AI™, AI4K™, and so on until we just call them master?
pjhenry1216@kbin.social 1 year ago
I've seen AGI thrown around. Artificial General Intelligence.
imperator3733@lemmy.world 1 year ago
“Artificial General Intelligence” (AGI) seems to be the new term for what used to be considered AI.
I’m sure they’ll move the goalposts once again whenever “AI” stops bringing in the money and the VCs/Wall Street get ridiculously focused on “AGI” startups and scammers.
chameleon@kbin.social 1 year ago
AGI (artificial general intelligence) is the current term for "The Concept Formerly Known As AI". Not really a new term, but it's only recently that companies decided that any algorithm can qualify as regular "AI" if they consider it good enough.
lloram239@feddit.de 1 year ago
Artificial Intelligence never meant AGI. AI was simply the attempt to build software that can solve problems computers traditionally couldn’t handle, but humans can. That includes stuff like chess, image recognition, language, etc. Exactly the kind of things AI has been getting really good at over the last decade.
AGI on the other hand is an autonomous AI system that can solve all the problems a human can solve, not just some. That’s frankly quite a blurry concept, since what humans can do is largely just an artifact of evolution and our environment, and it also varies quite a bit between humans. AGI is not some magical point that has any significance in the underlying science, and it’s unlikely we’ll ever land on exactly that point, as capabilities are just so different between humans and machines and often far superior once the basics are figured out (e.g. Stable Diffusion can paint images 1000x faster than a human, ChatGPT has more knowledge than any human ever had).
Also, AGI still doesn’t imply sentience, since that’s not really needed to replace a human.
what do we call it when we do get real AI?
The “real AI” is in the underlying algorithms, e.g. backpropagation. Those are the foundation of modern AI systems, and those algorithms are what allow you to find patterns in the data. They are the reason the field seems to be making such rapid progress: we can throw data at them and get good results. Our actual understanding of why it all works is still somewhat limited, since the algorithms, not us, are what extract the patterns from the data.
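The idea can be sketched with a toy example: one weight trained by gradient descent, the core move that backpropagation repeats across every layer of a real network. This is a hypothetical illustration I put together, not how any production model is actually built:

```python
# Toy gradient descent: learn w so that w*x fits y = 2x.
# Purely illustrative -- real networks chain this update through many layers.

def train(xs, ys, lr=0.1, steps=200):
    w = 0.0  # single weight
    for _ in range(steps):
        # gradient of the mean squared error (w*x - y)^2 with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # the "learning" step
    return w

w = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(w, 3))  # converges to 2.0
```

The point is that the pattern (here, "multiply by 2") is never written by the programmer; the update rule extracts it from the data.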
ChatGPT on the other hand is just an application built with those algorithms and a lot of data. It’s interesting because we can play with it today, but AI models get thrown away and new ones trained from scratch all the time.
FlyingSquid@lemmy.world 1 year ago
Until some AI is trained to ignore that.
radix@lemm.ee 1 year ago
You don’t even need to train the AI to ignore it. You just need to not specifically tell it to pay attention to it.
Asudox@lemmy.world 1 year ago
Until I block that user agent from accessing my website, the same way I’d block an IP.
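For what it’s worth, a server-side block would look something like this minimal nginx sketch. Big caveat: this assumes the crawler actually presents "Google-Extended" in its User-Agent header, which is an assumption on my part (Google-Extended is documented as a robots.txt token, and the crawling may still be done by regular Googlebot):

```nginx
# Refuse requests whose User-Agent mentions Google-Extended.
# Only effective if the crawler identifies itself honestly.
if ($http_user_agent ~* "Google-Extended") {
    return 403;
}
```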
TwoGems@lemmy.world 1 year ago
Is it technically possible to prevent AI scraping on your website?
lloram239@feddit.de 1 year ago
Yes, pull the plug that connects the machine to the Internet.
TwoGems@lemmy.world 1 year ago
lol
dangblingus@lemmy.world 1 year ago
No. There is nothing a website admin can do to prevent it. Every single tool to flag an AI would be circumvented by the AI learning what tools are being used.
dangblingus@lemmy.world 1 year ago
*AI puts on ski mask* Alright, I promise I’m not an AI scraping your website.
Psythik@lemm.ee 1 year ago
How do I opt in?
j4k3@lemmy.world 1 year ago
People fundamentally fail to understand what AI is useful for and what it is doing. It is nothing like an Artificial General Intelligence. It is more like a better way to search for information and interface with it. Just use open source offline AI, not the proprietary crap. The real issue is not what the AI can create. This is no different from what a person is capable of when they are aware of the same content, whether code, art, music, etc. Just because I am inspired by something does not give the original source a right to my thoughts or products. AI works at the same level. It is an aggregate of all content, but it contains none of the original works, any more than a person who knows the paintings of an artist and tries to paint something in a similar style.
The real issue that people fail to talk about is that AI can synthesize an enormous amount of data about a person after prolonged engagement. This is like open port access directly into your subconscious brain and there are plenty of levers and switches it can twist and toggle. Giving this kind of interpersonal access to a proprietary stalkerware system where parts of humans are whored out to the highest bidder for exploitation, that is totally insane. This type of data can manipulate people in a way that will sound like science fiction until it normalizes. Proprietary AI is criminal in its potential to manipulate and exploit especially in the political sphere.
pjhenry1216@kbin.social 1 year ago
It's not the same as an artist being inspired. It's more like an artist painting something in the style of someone else. AI can't generate anything truly new, and it doesn't transform things in its own way. It just copies and melds together. Nothing about it is really its own. It's just a biased algorithm putting things together. Moreover, an artist could forget what a painting looks like and still be inspired by it, whereas if you erase something from the LLM, its output changes. It's basically constant copying.
That analogy is what a bunch of people who want to sell AI art try to pitch. It's the difference between content and art.
j4k3@lemmy.world 1 year ago
It is possible to do more of what I would call inspired. Models are not just restricted to “in the style of”; unrelated abstract ideas can be mixed to create something altogether new. It takes a good model and training, but this is just from 15 minutes of messing around in Stable Diffusion trying to make Van Gogh do his best impression of Bob Ross. I’m adding all kinds of inspirational concepts, all the way to emotions, contrasting them, and doing this in layers of refinement across a series of images. I’m not very practiced at this. I would call this an artist’s tool. Yes, it changes the paradigm, but people need to get over their resistance to this evolution: adapt or die. [image]
I used tricks like image-to-image, and this was not my best result as far as the Van Gogh:Bob Ross blend goes, but I like it the most of the 150 images I made.
Positive: texture, (in the style of Vincent van Gogh:Bob Ross), [nasa], swirl, spiral, foreground tree, mountain drive, kindness, love, masterclass, (abstract:1.8), painting, dark, silhouette, swirls, texture, branches, ocean waves, anger, lonely
Negative: red, (signature), multiple moons, buildings, modern, structures, guard rail, snow, realism, yellow, orange, detailed mountains, left side line, stretchy stars, brake lights, forest
Seed: 1053938996 Model: Absolute Reality V1.6525
mojo@lemm.ee 1 year ago
There is nothing in LLMs that is able to verify the truth. They should not be used for accurate information unless we make some sort of technological breakthrough on that front. It’s really good at generating plausible text though.
j4k3@lemmy.world 1 year ago
People, and the internet, are no different. The vast majority of information that exists is incomplete or wrong at some level. Skepticism is always required, but assessing any medium by its performance, without premeditated bias, is the only intelligent approach that can grow with improving technology. Very few people are running the larger models (65B or larger) in an environment where they fully control every aspect of the LLM. I have such a setup running offline on my own hardware. On its own, my system is ~95% accurate on the tasks I use it for, and more accurate at these than results I find when searching the internet.
There are already open source offline models specifically designed to work on scientific white paper archives where every result cites the source from its database.
Agents are a class of AI systems built from multiple models, where one model can route the prompt to more specialized models, or to a series of models equipped to check and verify a response and do things like cite sources or verify against a database.
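The routing idea can be sketched in a few lines. Everything here is made up for illustration: the classifier is a keyword match standing in for a routing model, and the "backends" are stubs standing in for real specialized models:

```python
# Hypothetical sketch of an agent router: a front model classifies the
# prompt, then hands it to a more specialized backend model.

def classify(prompt: str) -> str:
    # Stand-in for a routing model: crude keyword match instead of an LLM.
    if "cite" in prompt or "paper" in prompt:
        return "scientific"
    return "general"

def answer(prompt: str) -> str:
    # Stubs standing in for real specialized models.
    backends = {
        "scientific": lambda p: f"[scientific model, with citations] {p}",
        "general": lambda p: f"[general model] {p}",
    }
    return backends[classify(prompt)](prompt)

print(answer("cite a paper on backpropagation"))
```

A real system would replace both the classifier and the backends with model calls, but the control flow is the same.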
autotldr@lemmings.world [bot] 1 year ago
This is the best summary I could come up with:
Large language models are trained on all kinds of data, most of which it seems was collected without anyone’s knowledge or consent.
Now you have a choice whether to allow your web content to be used by Google as material to feed its Bard AI and any future models it decides to make.
It’s as simple as disallowing “User-Agent: Google-Extended” in your site’s robots.txt, the document that tells automated web crawlers what content they’re able to access.
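Per Google’s announcement, the robots.txt entry for the Google-Extended token looks like this (blocking the whole site; narrower paths work the same way):

```
User-agent: Google-Extended
Disallow: /
```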
“We’ve also heard from web publishers that they want greater choice and control over how their content is used for emerging generative AI use cases,” the company’s VP of Trust, Danielle Romain, writes in a blog post, as if this came as a surprise.
On one hand that is perhaps the best way to present this question, since consent is an important part of this equation and a positive choice to contribute is exactly what Google should be asking for.
On the other, the fact that Bard and its other models have already been trained on truly enormous amounts of data culled from users without their consent robs this framing of any authenticity.
The original article contains 381 words, the summary contains 190 words. Saved 50%. I’m a bot and I’m open source!
cheese_greater@lemmy.world 1 year ago
Good robot
mojo@lemm.ee 1 year ago
Just like Google honors “do not track”, right?