Comment on: Do you actually audit open source projects you download?
Ahardyfellow@lemmynsfw.com 1 week ago
I know Lemmy hates AI, but auditing open source code seems like something it could be pretty good at. Maybe that’s something that will start happening more.
tburkhol@lemmy.world 1 week ago
Daniel Stenberg claims that the curl bug reporting system is effectively DDoSed by AI wrongly reporting various issues. Doesn’t seem like a good feature in a code auditor.
treadful@lemmy.zip 1 week ago
I’ve been on the receiving end of these. It’s such a monumental time waster. All the reports look legit until you get into the details and realize it’s complete bullshit.
But if you don’t look into it, you might be ignoring a real report…
notabot@lemm.ee 1 week ago
‘AI’, as we currently know it, is terrible at this sort of task. It’s not capable of understanding the flow of the code in any meaningful way, and tends to raise entirely spurious issues (see the problems the curl author has had with being overwhelmed, for example). It also won’t spot actually malicious code that’s been included with any care, nor will it find intentional behaviour that would be harmful or counterproductive in the particular scenario in which you want to use the program.
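To illustrate, here’s a contrived snippet of my own (not from any real project): it reads as routine code and nothing in it pattern-matches as malicious:
```python
import hashlib

def verify_token(token: str, expected_digest: str) -> bool:
    digest = hashlib.sha256(token.encode()).hexdigest()
    # The malicious part: startswith() on a truncated prefix instead of a
    # full, constant-time comparison. Any token whose hash matches the first
    # 8 hex characters (32 bits, easily brute-forced) is accepted.
    return digest.startswith(expected_digest[:8])
```
At a glance it looks like an ordinary hash check, which is exactly why a pattern-matching review sails past it.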
semperverus@lemmy.world 1 week ago
Having actually worked with AI in this context alongside GitHub/Azure DevOps Advanced Security, I can tell you that this is wrong. As much as we hate the AI, and as much as people like to (validly) point out issues with hallucinations, overall it’s been very on-point.
notabot@lemm.ee 1 week ago
Could you let me know what sort of models you’re using? Everything I’ve tried has basically been so bad that it was quicker and more reliable to do the job myself. Most of the models can barely write boilerplate code accurately and securely, let alone anything even moderately complex.
I’ve tried to get them to analyse code too, and that’s hit and miss at best, even with small programs. I’d have no faith at all that they could handle anything larger; the answers they give would be confident and wrong, which is easy to spot with something small, but much harder to catch in a large, multi-process system spread over a network. It’s hard enough for humans, with actual context, understanding and domain knowledge, to do it well, and I personally haven’t seen any evidence that an LLM (which is what I’m assuming you’re referring to) could do anywhere near as well. I don’t doubt that they flag some issues, but without a comprehensive human review of the system architecture, implementation and code, you can’t be sure what they’ve missed, and if you’re going to do that anyway, you’ve done the job yourself!
Having said that, I’ve no doubt that things will improve. Programming languages have well-defined syntaxes, so they should be some of the easiest types of text for an LLM to parse and build a context from. If that can be combined with enough domain knowledge, a description of the deployment environment, and a model that’s actually trained and tuned for code analysis and security auditing, it might be possible to get results similar to a human’s.
semperverus@lemmy.world 1 week ago
It’s just whatever is built into Copilot.
You can do a quick and dirty test by opening Copilot chat and asking it something like “Outline the vulnerabilities found in the following code, with the vulnerabilities listed underneath it. Outline any other issues you notice that are not listed here.”, then pasting in the code and the discovered vulns.
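If you want a known answer key, a deliberately vulnerable throwaway snippet works as the test input. Something contrived like this (my own, nothing real) has two planted issues to check against:
```python
import sqlite3

def get_user(db: sqlite3.Connection, username: str):
    # Planted vuln 1: SQL injection via f-string interpolation; the fix is
    # a parameterised query: db.execute("... WHERE name = ?", (username,))
    return db.execute(
        f"SELECT * FROM users WHERE name = '{username}'"
    ).fetchone()

def check_password(stored: str, supplied: str) -> bool:
    # Planted vuln 2: == short-circuits on the first mismatched character,
    # leaking a timing side channel; hmac.compare_digest(stored, supplied)
    # is the constant-time alternative
    return stored == supplied
```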
eksb@programming.dev 1 week ago
Lots of things seem like they would work until you try them.
mobotsar@sh.itjust.works 1 week ago
I’m writing a paper on this, actually. Basically, it’s okay-ish at it, but has definite blind spots. The most promising route is to have AI use a traditional static analysis tool, rather than evaluate the code directly.
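Roughly this shape, as a minimal sketch: bandit here is just a stand-in for whichever analyzer you prefer, and triage_with_llm is a hypothetical placeholder, not a real API:
```python
import json
import subprocess

def run_bandit(path: str) -> list[dict]:
    # bandit -f json emits machine-readable findings we can hand to a model
    proc = subprocess.run(
        ["bandit", "-r", path, "-f", "json"],
        capture_output=True, text=True,
    )
    return json.loads(proc.stdout).get("results", [])

def triage_with_llm(finding: dict, source: str) -> str:
    # Hypothetical model call: the model only judges a concrete,
    # tool-located finding rather than hunting for bugs in raw source.
    prompt = (
        "Static analysis flagged this. True positive or false positive?\n"
        f"{finding['issue_text']} (line {finding['line_number']})\n\n"
        f"{source}"
    )
    return prompt  # replace with your model API call

for finding in run_bandit("./project"):
    with open(finding["filename"]) as src:
        print(triage_with_llm(finding, src.read()))
```
The point is that the traditional tool does the locating and the model only does the triage, which keeps it away from the blind spots that show up in open-ended review.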
semperverus@lemmy.world 1 week ago
That seems to be the direction the industry is headed in. GHAzDO and competitors all seem to be converging on using AI as a force-multiplier on top of the existing solutions, and it works surprisingly well.
Solumbran@lemmy.world 1 week ago
It wouldn’t be good at it; at most it would be a little patch for unaudited code.
In the end it would just be an AI-powered antivirus.
AA5B@lemmy.world 1 week ago
I’m actually planning to evaluate an AI code review tool to see what it can do. I’m somewhat optimistic that it could do this better than it can code.
I really want to sic it on this one junior programmer who doesn’t understand that you can’t just commit AI-generated slop and expect it to work. On this last code review, after over 60 pieces of feedback, I gave up on the rest and left it at: he needs to understand when AI-generated slop needs help.
ilinamorato@lemmy.world 1 week ago
This is one of the few things that AI could genuinely turn out to be good at. Aside from the few people on Lemmy who are entirely anti-AI, most people just don’t want AI jammed willy-nilly into everything.
cm0002@lemmy.world 1 week ago
Those are silly folks lmao
Exactly, fuck corporate greed!
ilinamorato@lemmy.world 1 week ago
Eh, I kind of get it. OpenAI’s malfeasance with regard to energy usage, data theft, and the aforementioned rampant shoe-horning (maybe “misapplication” is a better word) of the technology has sort of poisoned the entire AI well for them, and it doesn’t feel (and honestly isn’t) necessary enough that it’s worth considering ways that it might be done ethically.
I don’t agree with them entirely, but I do get where they’re coming from. Personally, I think once the hype dies down and the corporate money (and VC money) gets out of it, it can finally settle into a more reasonable steady state, and the money can actually go into truly useful implementations of it.
cm0002@lemmy.world 1 week ago
I mean, that’s why I call them silly folks: that’s all still attributable to the corporate greed we all hate. But I’ve also seen them shit on research work and papers just because “AI”. Soo yea lol
wise_pancake@lemmy.ca 1 week ago
I don’t hate AI; I hate how it was created, how it’s foisted on us, the promises that it can do things it really can’t, and the corporate governance of it.
But I acknowledge these tools exist, and I do use them because they genuinely help and I can’t undo all the stuff I hate about them.
If I had millions of dollars to spend, sure I would try and improve things, but I don’t.