Yeah, it could easily have added a couple of lines of code that send everything to North Korean hackers because it found that pattern in a bunch of repositories, or that log passwords to public logs, or that do other things an experienced developer would never do. “AI” only replicates what it sees most often, and as more spam and junk repos are added to its training data because “AI” companies are too concerned with profit to teach it properly, it could do tons of random stuff. It’s like training a developer by giving them random examples from the internet rather than specific ones. Of course they pick up bad habits. Even if it “works”, it is almost never efficient or secure.
Comment on New ntfy.sh v2.18.0 was written by AI
henfredemars@infosec.pub 1 day ago
Definitely share your concern. Without strong review processes to ensure that every line of code follows the intent of the human developer, there’s no way of knowing what exactly is in there and the implications for the human users. And I’m not just talking about bugs. How do you know there isn’t malware?
They say it’s reviewed, but the temptation to blindly trust is there.
irotsoma@piefed.blahaj.zone 1 day ago
Slotos@feddit.nl 1 day ago
The size of that changeset means that it’s inherently unreviewable.
The commit history is the kind I’ve only seen in PRs that even the most dysfunctional companies would send back for a rewrite.
Also, a 2-3 week review? PostgreSQL support could be added in that time without the need for a damn “vibe check”. Hell, it would probably take less time than that.
MirrorGiraffe@piefed.social 1 day ago
To be fair they would have needed to spend time testing the manual implementation as well.
The main problem I see is that even if this rolls out perfectly, the erratic and changing nature of LLMs still makes it pointless as a proof of concept. Next time Claude might fuck up in a fringe way that’s not covered by unit tests and is missed by manual tests.
On the other hand, I guess I’ve been guilty on numerous occasions of introducing fringe bugs into production code myself, but at least I learn from it.
Slotos@feddit.nl 1 day ago
I made my statement as a BDD/TDD practitioner.
The core goal of software engineering is not to deliver said code, but to deliver it in a framework that lets others (and consequently me in a week’s time) contribute easily. This makes both future improvements and bug fixes easier.
Dumping a ~25,000-line changeset with a git history that’s almost designed to confuse is antithetical to both engineering and open source.