Allegedly
Warl0k3@lemmy.world 11 hours ago
For clarity, it’s only being summarized for the users who wrote it; their content isn’t being leaked to everyone. A comedically inept bug to allow though, holy shit.
horn_e4_beaver@discuss.tchncs.de 6 hours ago
Warl0k3@lemmy.world 6 hours ago
In this case there’s no evidence showing that it’s being spread widely - the bug reports are entirely about users being shown their own content. If you have something to dispute that, I’m all ears.
horn_e4_beaver@discuss.tchncs.de 6 hours ago
I was being a bit difficult tbh.
But it is absolutely true that we can’t know for sure that it isn’t being leaked elsewhere.
Reygle@lemmy.world 11 hours ago
AITA for understanding that as meaning that, in order to “summarize” the data, the AI read it entirely and will never be instructed to “forget” that data?
TRBoom@lemmy.zip 10 hours ago
Unless someone has released something new while I haven’t been paying attention, all the gen AIs are essentially frozen. Your use of them can’t impact the actual weights inside of the model.
If it seems like it’s remembering things, that’s because the actual input to the LLM is larger than the input you will usually give it.
For instance, let’s say the max input for a particular LLM is 9096 tokens. The first part of that will be instructions from the owners of the LLM to prevent their model from being used for things they don’t like - let’s say the first 2000 tokens. That leaves 7k or so for a conversation that will be ‘remembered’.
Now if someone was really savvy, they’d have the model generate summaries of the conversation and stick them into another chunk of memory, maybe another 2000 tokens’ worth, so that it will seem to remember more than just the current thread. That would leave you with 5000 tokens for a running conversation.
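A rough sketch of that budgeting, using the made-up numbers from above (count_tokens stands in for whatever tokenizer the model actually uses - none of this is any real vendor’s code):

```python
# Hypothetical token budget for one request to a context-window-limited model.
MAX_TOKENS = 9096          # total context window (made-up figure from above)
SYSTEM_TOKENS = 2000       # owner's instructions, always prepended
SUMMARY_TOKENS = 2000      # rolling summary of older conversation
CONVERSATION_TOKENS = MAX_TOKENS - SYSTEM_TOKENS - SUMMARY_TOKENS  # ~5000 left

def build_prompt(system: str, summary: str, history: list[str], count_tokens) -> str:
    """Assemble one request: drop the oldest turns until the recent
    history fits in its slice of the window."""
    while history and sum(count_tokens(t) for t in history) > CONVERSATION_TOKENS:
        history = history[1:]   # oldest turn falls out of 'memory'
    return "\n".join([system, summary, *history])
```

Anything outside that window simply isn’t seen by the model on the next request, which is why it “forgets” older turns unless something summarizes them back in.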
dgdft@lemmy.world 10 hours ago
Microsoft is almost certainly recording these summarization requests for QA and future training runs; that’s where the leakage would happen.
TRBoom@lemmy.zip 9 hours ago
100% agree. At this point I am assuming everything sent through their servers is actively being collected for LLM training.
SirHaxalot@nord.pub 8 hours ago
That is kind of assuming the worst-case scenario though. You wouldn’t assume that QA can read every email you send through their mail servers “just because”.
This article sounds a bit like engagement bait based on the idea that any use of LLMs is inherently a privacy violation. I don’t see how pushing the text through a specific class of software is worse than storing confidential data in the mailbox though.
That is assuming they don’t leak the data for training, but the article doesn’t mention that.
fuckwit_mcbumcrumble@lemmy.dbzer0.com 9 hours ago
Why would that make you an asshole?
Reygle@lemmy.world 9 hours ago
I’ve noticed growing opposition to critical thoughts about the sick and twisted nature of AI and the people who are in the cult.
VeganCheesecake@lemmy.blahaj.zone 7 hours ago
LLMs are stateless. The model itself stays the same. Doesn’t mean they’re not saving the data elsewhere, but the LLM does not retain interactions.
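A minimal sketch of what that means in practice (hypothetical names, not any real API): the model call is just frozen weights plus whatever text you send it, and any appearance of memory comes from the caller resending the transcript.

```python
# Hypothetical illustration: the model is a pure function of its frozen
# weights and the prompt; nothing you send ever updates the weights.
FROZEN_WEIGHTS = object()   # stands in for the trained model, never modified

def generate(weights, prompt: str) -> str:
    """Stateless placeholder; a real model would run inference here."""
    return f"<reply to {len(prompt)} chars of prompt>"

# The appearance of memory: the caller keeps the transcript and resends it.
history: list[str] = []
for user_msg in ["hi", "what did I just say?"]:
    history.append(f"User: {user_msg}")
    reply = generate(FROZEN_WEIGHTS, "\n".join(history))
    history.append(f"Assistant: {reply}")
# Whether that transcript also gets logged server-side is a separate question.
```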