I cannot understand and debug code written by AI. But I also cannot understand and debug code written by me.
Let’s just call it even.
Submitted 17 hours ago by AutistoMephisto@lemmy.world to technology@lemmy.world
https://open.substack.com/pub/leadershiplighthouse/p/i-went-all-in-on-ai-the-mit-study
So there’s actual developers who could tell you from the start that LLMs are useless for coding, and then there’s this moron & similar people who first have to fuck up an ecosystem before believing the obvious. Thanks fuckhead for driving RAM prices through the ceiling… And for wasting energy and water.
I can at least kinda appreciate this guy’s approach. If we assume that AI is a magic bullet, then it’s not crazy to assume we, the existing programmers, would resist it just to save our own jobs. Or we’d complain because it doesn’t do things our way, but we’re the old way and this is the new way. So maybe we’re just being whiny and can be ignored.
So he tested it to see for himself, and what he found was that he agreed with us, that it’s not worth it.
Ignoring experts is annoying, but doing some of your own science and getting first-hand experience isn’t always a bad idea.
And not only did he see for himself, he wrote up and published his results.
100% this. The guy was literally a consultant and a developer. It’d just be bad business for him to outright dismiss AI without having actual hands on experience with said product. Clients want that type of experience and knowledge when paying a business to give them advice and develop a product for them.
Problem is that statistical word prediction has fuck-all to do with AI. It’s not and will never be. By “giving it a try” you contribute to the spread of this snake oil. And even if someone came up with actual AI, if it used enough resources to impact our ecosystem, instead of being a net positive, and if it was in the greedy hands of billionaires, then using it is equivalent to selling your executioner an axe.
They are useful for doing the kind of boilerplate boring stuff that any good dev should have largely optimized and automated already. If it’s 1) dead simple and 2) extremely common, then yeah an LLM can code for you, but ask yourself why you don’t have a time-saving solution for those common tasks already in place? As with anything LLM, it’s decent at replicating how humans in general have responded to a given problem, if the problem is not too complex and not too rare, and not much else.
That’s exactly what I so often find myself saying when people show off some neat thing that a code bot “wrote” for them in x minutes after only y minutes of “prompt engineering”. I’ll say, yeah I could also do that in y minutes of (bash scripting/vim macroing/system architecting/whatever), but the difference is that afterwards I have a reusable solution that: I understand, is automated, is robust, and didn’t consume a ton of resources. And as a bonus I got marginally better as a developer.
It’s funny that if you stick them in an RPG and give them an ability to “kill any level 1-x enemy instantly, but don’t gain any xp for it” they’d all see it as the trap it is, but can’t see how that’s what AI so often is.
As you said, “boilerplate” code can be script generated - and there are IDEs that already do this, but in a deterministic way, so that you don’t have to proof-read every single line to avoid catastrophic security or crash flaws.
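A minimal sketch of what "script generated, in a deterministic way" can look like, assuming a plain data class is the boilerplate in question. The template and the `generate_class` helper are hypothetical names made up for illustration, not any IDE's actual generator. The point is only that the same inputs always render the same source text, so the template gets reviewed once rather than every output.

```python
from string import Template

# Hypothetical template for a plain data class; the names are illustrative only.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $args):\n"
    "$assignments\n"
)


def generate_class(name: str, fields: list[str]) -> str:
    """Render the same source text every time for the same inputs."""
    args = ", ".join(fields)
    assignments = "\n".join(f"        self.{field} = {field}" for field in fields)
    return CLASS_TEMPLATE.substitute(name=name, args=args, assignments=assignments)


if __name__ == "__main__":
    print(generate_class("Point", ["x", "y"]))
```

Calling `generate_class("Point", ["x", "y"])` twice yields byte-identical output, which is the property a sampled LLM completion does not guarantee.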
And then there are actual good developers who could or would tell you that LLMs can be useful for coding, in the right context and if used intelligently. No harm, for example, in having LLMs build out some of your more mundane code like unit/integration tests, have it help you update your deployment pipeline, generate boilerplate code that’s not already covered by your framework, etc. That it’s not able to completely write 100% of your codebase perfectly from the get-go does not mean it’s entirely useless.
Other than that it’s work that junior coders could be doing, to develop the next generation of actual good developers.
And then there are actual good developers who could or would tell you that LLMs can be useful for coding
The only people who believe that are managers and bad developers.
Maybe they’ll listen to one of their own?
The kind of useful article I would expect then is one explaining why word prediction != AI
I really have not found AI to be useless for coding. I have found it extremely useful and it has saved me hundreds of hours. It is not without its faults or frustrations, but it really is a tool I would not want to be without.
That’s because you are not a proper developer, as proven by your comment. And you create tech legacy that will have a net cost in terms of maintenance or downtime.
Don’t worry. The people on LinkedIn and tech executives tell us it will transform anything soon!
Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive.
And all they’ll hear is “not failure, metrics great, ship faster, productive” and go against your advice because who cares about three months later, that’s next quarter, line must go up now. I also found this bit funny:
I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me… I was proud of what I’d created.
Well you didn’t create it, you said so yourself, not sure why you’d be proud, it’s almost like the conclusion should’ve been blindingly obvious right there.
The top comment on the article points that out.
It’s an example of a far older phenomenon: Once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired. I’ll have to find it but there’s a story about a modern fighter jet pilot not being able to handle a WWII era Lancaster bomber. They don’t know how to do the stuff that modern warplanes do automatically.
It’s more like the ancient phenomenon of spaghetti code. You can throw enough code at something until it works, but the moment you need to make a non-trivial change, you’re doomed. You might as well throw away the entire code base and start over.
And if you want an exact parallel, I’ve said this from the beginning, but LLM coding at this point is the same as offshore coding was 20 years ago. You make a request, get a product that seems to work, but maintaining it, even by the same people who created it in the first place, is almost impossible.
The thing about this perspective is that I think it’s actually overly positive about LLMs, as it frames them as just the latest in a long line of automations.
Not all automations are created equal. For example, compare using a typewriter to using a text editor. Besides a few details about the ink ribbon and movement mechanisms you really haven’t lost much in the transition. This is despite the fact that the text editor can be highly automated with scripts and hot keys, allowing you to manipulate even thousands of pages of text at once in certain ways. Using a text editor certainly won’t make you forget how to write like using ChatGPT will.
I think the difference lies in the relationship between the person and the machine. To paraphrase Cathode Ray Dude, people who are good at using computers deduce the internal state of the machine, mirror (a subset of) that state as a mental model, and use that to plan out their actions to get the desired result. People that aren’t good at using computers generally don’t do this, and might not even know how you would start trying to.
For years ‘user friendly’ software design has catered to that second group, as they are both the largest contingent of users and the ones that needed the most help. To do this software vendors have generally done two things: try to move the necessary mental processes from the user’s brain into the computer and hide the computer’s internal state (so that it’s not implied that the user has to understand it, so that a user that doesn’t know what they’re doing won’t do something they’ll regret, etc). Unfortunately this drives that first group of people up the wall. Not only does hiding the internal state of the computer make it harder to deduce, every “smart” feature they add to try to move this mental process into the computer itself only makes the internal state more complex and harder to model.
Many people assume that if this is the way you think about software you are just an elitist gatekeeper, and you only want your group to be able to use the computer. Or you might even be accused of ableism. But the real reason is what I described above, even if it’s not usually articulated in that way.
Now, I am of the opinion that the ‘mirroring the internal state’ method of thinking is the superior way to interact with the machine, and the approach to user friendliness I described has actually done a lot of harm to our relationship with computers at a societal level. (This is an opinion I suspect many people here would agree with.) And yet that does not mean that I think computers should be difficult to use. Quite the opposite, I think that modern computers are too complicated, and that in an ideal world their internal states and abstractions would be much simpler and more elegant, but no less powerful. (But elaborating on that would make this comment even longer.) Nor do I think that computers shouldn’t be accessible to people with different levels of ability. But just as a random person in a store shouldn’t grab a wheelchair user’s chair handles and start pushing them around, neither should Windows (for example) start changing your settings on updates without asking.
Anyway, all of this is to say that I think LLMs are basically the ultimate in that approach to ‘user friendliness’. They try to move more of your thought process into the machine than ever before, their internal state is more complex than ever before, and it is also more opaque than ever before. They also reflect certain values endemic to the corporate system that produced them: that the appearance of activity is more important than the correctness or efficacy of that activity. But that is, again, a whole other comment.
I agree with you, though proponents will tell you that’s by design. Supposedly, it’s like with high-level languages. You don’t need to know the actual instructions in assembly anymore to write a program with them. I think the difference is that high-level language instructions are still (mostly) deterministic, while an LLM prompt certainly isn’t.
Once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired.
Well, to be fair, different skills are acquired. You’ve learned how to create automated systems, that’s definitely a skill. In one of my IT jobs there were a lot of people who did things manually, updated computers, installed software one machine at a time. But when someone figures out how to automate that, push the update to all machines in the room simultaneously, that’s valuable and not everyone in that department knew how to do it.
So yeah, I guess my point is, you can forget how to do things the old way, but that’s not always bad. Like, so you don’t really know how to use a scythe, that’s fine if you have a tractor, and trust me, you aren’t missing much.
I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me… I was proud of what I’d created.
Well you didn’t create it, you said so yourself, not sure why you’d be proud, it’s almost like the conclusion should’ve been blindingly obvious right there.
Does a director create the movie? They don’t usually edit it, they don’t have to act in it, nor do all directors write movies. Yet the person giving directions is seen as the author.
The idea is that vibe coding is like being a director or architect. I mean that’s the idea. In reality it seems it doesn’t really pan out.
You can vibe write and vibe edit a movie now too. They also turn out shit.
The issue is that an LLM isn’t a person with skills and knowledge. It’s a complex guessing box that gets things kinda right, but not actually right, and it absolutely can’t tell what’s right or not. It has no actual skills or experience or humanity that a director can expect a writer or editor to have.
I think this kinda points to why AI is pretty decent for short videos, photos, and texts. It produces outputs that one applies meaning to, and humans are meaning making animals. A computer can’t overlook or rationalize a coding error the same way.
We’re about to face a crisis nobody’s talking about. In 10 years, who’s going to mentor the next generation? The developers who’ve been using AI since day one won’t have the architectural understanding to teach. The product managers who’ve always relied on AI for decisions won’t have the judgment to pass on. The leaders who’ve abdicated to algorithms won’t have the wisdom to share.
Except we are talking about that, and the tech bro response is “in 10 years we’ll have AGI and it will do all these things all the time permanently.” In their roadmap, there won’t be a next generation of software developers, product managers, or mid-level leaders, because AGI will do all those things faster and better than humans. There will just be CEOs, the capital they control, and AI.
What’s most absurd is that, if that were all true, that would lead to a crisis much larger than just a generational knowledge problem in a specific industry. It would cut regular workers entirely out of the economy, and regular workers form the foundation of the economy, so the entire economy would collapse.
“Yes, the planet got destroyed. But for a beautiful moment in time we created a lot of value for shareholders.”
That’s why they’re all-in on authoritarianism.
According to a study, the lower top 10% accounts for something like 68% of cash flow in the economy. Us plebs are being cut out altogether.
That being said, I think if people can’t afford to eat, things might get bad. We will probably end up a kept population in these ghouls’ fever dreams.
Once Boston Dynamics-style dogs and androids can operate over a number of days independently, I’d say all bets are off that we would be kept around as pets.
I’m fairly certain your Musks and Altmans would be content with a much smaller human population existing to only maintain their little bubble and damn everything else.
We’re all idiots. Even the titans of industry-- sit down with them-- they are idiots at best, narcissists and criminals at worst. At least you and I lack the power to be that awful with our idiocy.
What does lower top 10% mean?
Yep, and now you know why all the tech companies suddenly became VERY politically active. This future isn’t compatible with democracy. Once these companies no longer provide employment their benefit to society becomes a big fat question mark.
Also, even if we make it through a wave of bullshit and all these companies fail in 10 years, the next wave will be ready and waiting, spouting the same crap - until it’s actually true (or close enough to be bearable financially). We can’t wait any longer to get this shit under control.
Great article, brave and correct. Good luck getting through to the same leaders who blindly believe in a magical trend for this or next quarter’s numbers; they don’t care about things a year away, let alone 10.
I work in HR and was struck by the parallel with management jobs being gutted by major corps starting in the 80s and 90s during “downsizing”, who either never replaced them or offshored them. They had the Big 4 telling them it was the future of business. Know who is now providing consultation to them on why they have poor ops, processes, high turnover, etc? Take $ on the way in, and the way out. AI is just the next in a long line of smart people pretending they know your business while you abdicate knowing your business or employees.
Hope leaders can be a bit braver and wiser this go 'round so we don’t get to a cliff’s edge in software.
Tbh I think the true leaders are high on coke.
Exactly. The problem isn’t moving part of production to some other facility or buying a part that you used to make in-house. It’s abdicating an entire process that you need to be involved in if you’re going to stay on top of the game long-term.
Claude Code is awesome but if you let it do even 30% of the things it offers to do, then it’s not going to be your code in the end.
I’m trying
Much appreciated 🫡
Fractional CTO: Some small companies benefit from the senior experience of these kinds of executives but don’t have the money or the need to hire one full time. A fraction of the time they are C suite for various companies.
The developers can’t debug code they didn’t write.
This is a bit of a stretch.
agreed. 50% of my job is debugging code I didn’t write.
Vibe coders can’t debug code because they didn’t write
Vibe coders can’t debug code because they can’t write code
I mean I was trying to solve a problem t’other day (hobbyist) - it told me to create a
function foo(bar): await object.foo(bar)
then in object
function foo(bar): _foo(bar)
function _foo(bar): original_object.foo(bar)
like literally passing a variable between three wrapper functions in two objects that did nothing except pass the variable back to the original function in an infinite loop
add some layers and complexity and it’d be very easy to get lost
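Roughly what that suggestion amounts to, as a sketch in Python. The class names `Original` and `Wrapper` are made up here, and the `await` is dropped to keep the example synchronous; the original snippet's language isn't stated. Every method only forwards its argument, and the last hop lands back on the first one.

```python
class Wrapper:
    """Hypothetical wrapper object: each method only forwards its argument."""

    def __init__(self, original):
        self.original = original

    def foo(self, bar):
        return self._foo(bar)

    def _foo(self, bar):
        # Hands the value straight back to the caller's own method.
        return self.original.foo(bar)


class Original:
    def __init__(self):
        self.wrapper = Wrapper(self)

    def foo(self, bar):
        # No work and no base case: just another hop in the cycle.
        return self.wrapper.foo(bar)


def call_terminates(value) -> bool:
    """Return False if the call chain never bottoms out."""
    try:
        Original().foo(value)
    except RecursionError:
        return False
    return True
```

With only three hops the cycle is easy to spot; spread across a few more files and it would take a stack trace to see it.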
The few times I’ve used LLMs for coding help, usually because I’m curious if they’ve gotten better, they let me down. Last time it was insistent that its solution would work as expected. When I gave it an example that wouldn’t work, it even broke down each step of the function giving me the value of its variables at each step to demonstrate that it worked… but at the step where it had fucked up, it swapped the value in the variable to one that would make the final answer correct. It made me wonder how much water and energy it cost me to be gaslit into a bad solution.
How do people vibe code with this shit?
As a learning process it’s absolutely fine.
You make a mess, you suffer, you debug, you learn.
But you don’t call yourself a developer (at least I hope) on your CV.
I think it highly depends on the skill and experience of the dev. A lot of the people flocking to the vibe coding hype are not necessarily people who know about coding practices (including code review etc …), nor are they experienced in directing AI agents to achieve such goals. The result is what the MIT study predicted. Although, this will start to change soon.
Some can’t because they never acquired the skill to read code. But most did and can.
If you’ve never had to debug code. Are you really a developer?
There is zero chance you have never written a bug so… Who is fixing them?
Unless you just leave them because you work for Infosys or worse but then I ask again - are you really a developer?
AI is hot garbage and anyone using it is a skillless hack. This will never not be true.
While this is a popular sentiment, it is not true, nor will it ever be true.
AI (LLMs & agents in the coding context, in this case) can serve as both a tool and a crutch. Those who learn to master the tools will gain benefit from them, without detracting from their own skill. Those who use them as a crutch will lose (or never gain) their own skills.
Some skills will in turn become irrelevant in day-to-day life (as is always the case with new tech), and we will adapt in turn.
LLMs exist so that skill-less hacks can pretend to be skilled artists. It’s a shortcut to success.
Wait so I should just be manually folding all these proteins?
Do you not know the difference between an automated process and machine learning?
I’ve heard that these tools aren’t 100% accurate, but your last point is valid.
GPTZero is 99% accurate.
I agree but look at that third paragraph, it has the dash that nobody ever uses. Telltale signs right there
Aren’t these LLM detectors super inaccurate?
I’ve tested lots and lots of different ones. GPTZero is really good.
If you read the article again, with a critical perspective, I think it will be obvious.
Yes, but also the opposite. Don’t discount a valid point just because it was formulated using an LLM.
The story was invented so people would subscribe to his substack, which exists to promote his company.
We’re being manipulated into sharing made-up rage-bait in order to put money in his pocket.
Something any (real, trained, educated) developer who has even touched AI in their career could have told you.
What’s funny is this guy has 25 years of experience as a software developer. But three months was all it took to make it worthless.
As someone who has been shoved in the direction of using AI for coding by my superiors, that’s been my experience as well. It’s fine at cranking out stackoverflow-level code regurgitation and mostly connecting things in a sane way if the concept is simple enough. The real breakthrough would be if the corrections you make would persist longer than a turn or two. As soon as your “fix-it prompt” is out of the context window, you’re effectively back to square one. If you’re expecting it to “learn” you’re gonna have a bad time. If you’re not constantly double checking its output, you’re gonna have a bad time.
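A toy model of why a correction stops sticking, assuming the simplest possible scheme: only the most recent messages that fit a fixed budget are visible. The `visible_context` helper is hypothetical, word count stands in for tokens, and real tools may summarize or compact history instead of simply dropping it.

```python
from collections import deque


def visible_context(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined word count fits the budget.

    Word count stands in for tokens here; real tokenizers count differently.
    """
    kept: deque[str] = deque()
    used = 0
    for message in reversed(messages):
        cost = len(message.split())
        if used + cost > budget:
            break
        kept.appendleft(message)
        used += cost
    return list(kept)
```

Once enough later messages pile up, the early "fix-it" message is no longer in the returned list, so nothing downstream ever sees it again.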
It’s still useful to have an actual “study” (I’d rather call it a POC) with hard data you can point to, rather than just “trust me bro”.
I was in charge of an AI pilot project two years back at my company. That was my conclusion, among others.
Untrained dev here, but the trend I’m seeing is spec-driven development where AI generates the specs with a human, then implements the specs. Humans can modify the specs, and AI can modify the implementation.
This approach seems like it can get us to 99%, maybe.
Trained dev with a decade of professional experience, humans routinely fail to get me workable specs without hours of back and forth meetings. I’d say a solid 25% of my work day is spent understanding what the stakeholders are asking for and how to contort the requirements to fit into the system.
If these humans can’t be explicit enough with me, a living thinking human that understands my architecture better than any LLM, what chance does an LLM have?
@AutistoMephisto@lemmy.world @technology@lemmy.world
I've been dealing with programming since I was 9 y.o., with my professional career in DevOps starting several years later, in 2013. I've dealt with lots of others' code, legacy code, very shitty code (especially done by my "managers" who cosplayed as programmers), and tons of technical debt.
Even though I'm quite an LLM power-user (because I'm a person devoid of other humans in my daily existence), I never relied on LLMs to "create" my code: rather, what I did a lot was tinkering with different LLMs to "analyze" my own code that I wrote myself, both to experiment with their limits (e.g.: I wrote a lot of cryptic, code-golf one-liners and fed them to the LLMs in order to test their ability to "connect the dots" on whatever was happening behind the cryptic syntax) and to try and use them as a pair of external eyes beyond mine (due to their ability to "connect the dots", and by that I mean their ability, as fancy Markov chains, to relate tokens to other tokens with similar semantic proximity).
I did test them (especially Claude/Sonnet) for their "ability" to output code, not intending to use the code because I'm better off writing my own thing, but you likely know the maxim, one can't criticize what they don't know. And I tried to know them so I could criticize them. To me, the code is.. pretty readable. Definitely awful code, but readable nonetheless.
So, when the person says...
The developers can’t debug code they didn’t write.
...even though they argue they have more than 25 years of experience, it feels to me like they don't.
An LLM can generate code like an intern getting ahead of their skis. If you let it generate enough code, it will do some gnarly stuff.
Another facet is the nature of mistakes it makes. After years of reviewing human code, I have this tendency to take some things for granted, certain sorts of things a human would just obviously get right and I tend not to think about it. AI mistakes are frequently in areas my brain has learned to gloss over and take on faith that the developer probably didn’t screw that part up.
AI generally generates the same sorts of code that I hate to encounter when humans write it, and debugging it is a slog. Lots of repeated code, not well factored. You would assume that if the exact same thing is needed in many places, you’d have a common function with common behavior, but no, AI repeated itself and didn’t always get consistent behavior out of identical requirements.
His statement is perhaps an oversimplification, but I get it. Fixing code like that is sometimes more trouble than just doing it yourself from the outset.
Now I can see the value in generating code in digestible pieces, discarding when the LLM gets oddly verbose for a simple function, or when it gets it wrong, or if you can tell by looking you’d hate to debug that code. But the code generation can just be a huge mess and if you did a large project exclusively through prompting, I could see the end result being just a hopeless mess. I’m frankly surprised he could even declare an initial “success”, but it was probably “tutorial ware” which would be ripe fodder for the code generators.
Computers are too powerful and too cheap. Bring back COBOL, painfully expensive CPU time, and some sort of basic knowledge of what’s actually going on.
Pain for everyone!
“fractional CTO” (no clue what that means, don’t ask me)
For those who were also interested to find out what this means: Consultant and advisor in a part time role, paid to make decisions that would usually fall under the scope of a CTO, but for smaller companies who can’t afford a full-time experienced CTO
Just sell it to AI customers for AI cash.
My big fear with this stuff is security. It just seems so “easy”, without knowledgeable people, for AI to write a product that functions from a user perspective but is wide open to attack.
I work at a company that is all-in on selling AI and we are trying desperately to use this AI ourselves. We’ve concluded internally that AI can only be trusted with small use cases that are easily validated by humans, or for fast prototyping work… hack day stuff to validate a possibility but not an actual high-quality, safe and scalable implementation, or in writing tests of existing code, to increase test coverage (yes, I know that’s a bad idea but QA blessed the result… so uh … cool).
The use case we zeroed in on is writing well schema'd configs in YAML or JSON. Even then, a good percentage of the time the AI will miss very significant mandatory sections, or add hallucinations that are unrelated to the task at hand. We then can use AI to test AI's work, several times using several AIs. And to a degree, it'll catch a lot of the issues, but not all. So we then code review and lint with code we wrote that AI never touched, and send all the erroring configs to a human. It does work, but can't be used for mission critical applications. And nothing about the AI or the process of using it is free. It's also disturbingly not idempotent. Did it fail? Run it again a few times and it'll pass. We think it still saves money when done at scale, but not as much as we promise external AI consumers. The senior leadership know it's currently overhyped trash and pressure us to use it anyway on expectations it'll improve in the future, so we give the mandatory crisp salute of alignment and we're off.
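A rough sketch of the "lint with code we wrote, send the erroring configs to a human" step, under stated assumptions: the configs are JSON strings, and the required section names in `REQUIRED_SECTIONS` are invented for illustration. `check_config` and `triage` are hypothetical helpers, not the commenter's actual tooling. Unlike the generator, this check returns the same verdict for the same input every time.

```python
import json

# Hypothetical required top-level sections; a real schema would be richer.
REQUIRED_SECTIONS = {"name", "version", "resources"}


def check_config(raw: str) -> list[str]:
    """Return a list of problems found in one generated JSON config."""
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top level is not an object"]
    keys = set(config)
    problems = [f"missing section: {key}" for key in sorted(REQUIRED_SECTIONS - keys)]
    problems += [f"unexpected section: {key}" for key in sorted(keys - REQUIRED_SECTIONS)]
    return problems


def triage(raw_configs: list[str]) -> tuple[list[str], list[tuple[str, list[str]]]]:
    """Split configs into accepted ones and ones queued for human review."""
    accepted, needs_human = [], []
    for raw in raw_configs:
        problems = check_config(raw)
        if problems:
            needs_human.append((raw, problems))
        else:
            accepted.append(raw)
    return accepted, needs_human
```

Missing mandatory sections and unrelated extras both land in the human queue along with the reason, so a rerun of the generator can't quietly "pass" a config that the check would reject.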
I will say it's great for writing reviews. It adds nonsense and doesn't get the whole review correct, but it writes very flowery stuff so managers don't have to. So we use it for first drafts and then remove a lot of the true BS out of it. If it gets stuff wrong, oh well, human perception is flawed.
No shit
ask your ai pal for help
This has not been my experience at all. I have a top rated VR app and use AI to code everything and change things all the time. It is not hard to understand the code and then prompt the AI to change this or that and then test to see if it got it right. If it did not, just prompt again to address. Maybe this does not work for the author or others, but it has saved me hundreds of hours in my small app.
and in order for ai to do that, it has to employ strategy and resource management. Good luck
I did see someone write a post about Chat Oriented Programming; to me that appeared successful, but not without cost and extra care. checkeagle.com/…/a-month-of-chat-oriented-program… (Original Link, Discussion Thread)
Successful in that it wrote code faster and its output stuck to conventions better than the author would. But they had to watch it like a hawk and with the discipline of a senior developer putting full attention over a junior, stop and swear at it every time it ignored the rules that they give at the beginning of each session, terminate the session when it starts doing an autocompactification routine that wastes your money and makes Claude forget everything. And you try to dump what it has completed each time. One of the costs seems to be the sanity of the developer, so I really question if it’s a sustainable way of doing things from both the model side and from developers. To be actually successful you need to know what you’re doing, otherwise it’s easy to fall in a trap like the CTO, trusting the AI’s assertions that everything is hunky-dory.
They shipped a product in 3 months? What the fuck was it? New “under construction” page?
I needed to make a small change and realized I wasn’t confident I could do it.
Wouldn’t the point be to use AI to make the change, if you’re trying to do it 100% with AI? Who is really saying 100% AI adoption is a good idea though? All I hear about from everyone is how it’s not a good idea, just like this post.
Wasn’t this obvious? He didn’t need to go “all-in on ai” ’cause there are hundreds of thousands of people who tried the same thing already, and every one of them could tell him that’s not what ai can do.
Suffa@lemmy.wtf 2 hours ago
AI is really great for small apps. I’ve saved so many hours over weekends that would otherwise be spent coding a small thing I need a few times whereas now I can get an AI to spit it out for me.
But anything big and it’s fucking stupid, it cannot track large projects at all.
victorz@lemmy.world 1 hour ago
What kind of small things have you vibed out that you needed?
MrScottyTay@sh.itjust.works 1 hour ago
Encryption, login systems and pricing algorithms. Just the small annoying things /s