Comment on “Lutris now being built with Claude AI, developer decides to hide it after backlash”
Senal@programming.dev 1 day ago
Think of it like a jeweller suddenly announcing they were going to start mixing blood diamonds in with their usual diamonds: “good luck finding them”.
Functionally, blood diamonds aren’t different.
Leaving aside that you might not want blood diamonds, are you really going to trust someone who essentially says “Fuck you, I’m going to hide them because you’re complaining”?
If you don’t know what blood diamonds are, it’s easily searchable.
I’ll go on record as saying the aesthetic diamond industry is inflationist monopolist bullshit, but that doesn’t alter the analogy.
Secondly, it seems you don’t really understand why LLM-generated code can be problematic. I’m not going to go into it fully here, but here’s a relevant outline.
- LLM-generated code can (and usually does) look fine, but still not do what it’s supposed to do (see the sketch below).
- This becomes more of an issue the larger the codebase.
- The amount of effort needed to find this reasonable-looking, but flawed, code is significantly higher than just reading a new dev’s version.
- Hiding where this code is makes it **even harder** to find.
- Hiding the parts where you really should want additional scrutiny is stupid and self-defeating.
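To make the first point concrete, here’s a contrived Python sketch (made up for illustration, nothing to do with Lutris) of the kind of code that reads fine line by line but doesn’t honour its own contract:

```python
import time

def fetch_with_retry(fetch, retries=3, backoff=1.0):
    """Fetch a resource, retrying up to `retries` times on failure."""
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            # Looks like polite backoff, but the first retry sleeps
            # 0 seconds because `attempt` starts at 0.
            time.sleep(backoff * attempt)
    # Falls through silently: callers get None instead of an exception,
    # and the original error is lost. Every individual line reads fine.
```

A reviewer skimming for style sees a textbook retry loop; the bugs only show up when every attempt fails.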
pheelicks@lemmy.zip 1 day ago
Thanks, I think your first point is a really valid one. AI technology is far from clean, especially in a political context.
To your second point: I see that, but on the other hand it comes across as if human code were free of such errors. I would not put human code on an (implied) pedestal (especially not mine), but maybe I’m missing your point. I think being suspicious of AI code is good, but the same goes for human code. To me it sounds like nobody should ever trust AI code because there can or will be mistakes you can’t see, which is reasonably careful at best and paranoid at worst. At some point there is no difference anymore between “it looks fine” and “it is fine”.
Senal@programming.dev 1 day ago
Let’s assume we’re skipping the ethical and moral concerns about LLM usage and just discuss the technical.
Nobody who knows anything about coding is claiming human code is error-free; that’s why code reviews, testing and all the other aspects of the software development lifecycle exist.
Nobody should trust any code unless it can be verified that it does what is required consistently and predictably.
This is a known thing; paranoia doesn’t really apply here, only appropriate levels of caution.
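Concretely, by “verified” I mean behaviour pinned down by tests, something like this toy pytest sketch (the import path is hypothetical, and it targets the retry example from my earlier comment):

```python
import pytest
from myapp.net import fetch_with_retry  # hypothetical module path

def test_raises_after_retries_exhausted():
    def always_fails():
        raise OSError("boom")

    # The contrived retry sketch above fails this test: it returns
    # None instead of surfacing the error once retries run out.
    with pytest.raises(OSError):
        fetch_with_retry(always_fails, retries=3)

def test_returns_first_success():
    calls = {"n": 0}

    def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            raise OSError("boom")
        return "payload"

    assert fetch_with_retry(flaky, retries=3) == "payload"
```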
Also, it’s not that they can’t be seen; it’s just that the effort required to spot them is greater and the likelihood of missing something is higher.
Whether or not these problems can be overcome (or mitigated) remains to be seen, but at the moment it still requires additional effort around the LLM parts, which is why hiding them is counterproductive.
> At some point there is no difference anymore between “it looks fine” and “it is fine”.

This is important because it’s true, but it’s only true if you can verify it.
This whole issue should theoretically be negated by comprehensive acceptance criteria and testing, but if that were the case we’d never have any bugs in human code either.
Personally I think the “uncanny valley code” issue is an inherent part of the way LLMs work and there is no “solution” to it; the only option is to mitigate as best we can.
I also really, really dislike the non-declarative nature of generated code, which for me at least fundamentally rules it out as a reliable end-to-end system tool unless we can get those fully comprehensive tests up to scratch.
pheelicks@lemmy.zip 1 day ago
Thanks for taking the time to reply.
> the effort required to spot them is greater and the likelihood of missing something is higher

Greater compared to human code? Not sure about that, but I’m not disagreeing either. Greater compared to verifiably able programmers, sure, but in general?
> the non-declarative nature of generated code

I don’t think I’m getting your point here. Do you mean that the code basically lacks focus on an end goal? Or are you talking about the fuzziness and randomization of the output?
Senal@programming.dev 1 day ago
Both.
The reasons are quite hard to describe, which is why it’s such a trap, but if you spend some time reviewing LLM code you’ll see what I mean.
One reason is that it isn’t coding for logical correctness, it’s coding for linguistic passability.
Internally there are mechanisms for mitigating this somewhat, but they’re not an actual fix, so problems slip through.
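A contrived example of what I mean (made up, not real model output): every word reads as correct, which is exactly what a quick review checks, while the logic does the opposite:

```python
def newest_first(events):
    """Return events ordered newest first."""
    # Linguistically passable: the name and docstring both say "newest
    # first". Logically wrong: an ascending sort puts the oldest first;
    # it needs reverse=True. The prose is what your eye verifies.
    return sorted(events, key=lambda e: e["timestamp"])
```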
The latter: if you give it the exact same input under the exact same conditions, it’s not guaranteed to give you the same output.
The fact that it’s sometimes close to the same actually makes it worse, because then you can’t tell at a glance what has changed.
It also isn’t as simple as using a diff tool, at least for anything non-trivial, because its variations can be in logical progression as well as language, meaning you need to track these differences across the whole contextual area.
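A contrived illustration (hypothetical outputs, not a real transcript), two runs of the same prompt:

```python
# Run 1 (hypothetical output): preserves the input order.
def dedupe(items):
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out

# Run 2 (hypothetical output, same prompt): reads like a tidier rewrite
# of the same thing, but set() does not preserve order, so any caller
# relying on order silently breaks.
def dedupe(items):
    return list(set(items))
```

A line diff flags every line as changed, so it tells you nothing about the one difference that matters; you end up re-verifying the behaviour, not the text.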
As I said, there are mitigations, but they aren’t fixes.