Comment on Why can't code be uncompiled?
KoboldCoterie@pawb.social 1 year ago
The main issue is that to make code human-readable, we include a lot of conventions that computers don’t need. We use specific formatting, naming conventions, code structure, comments, etc. to help someone look at the code and understand its function. None of that is needed to run the program, so the compiler is free to throw it away.
Let’s say I write code with a function named ‘findUserName’ that takes a variable ‘text’, checks it against a global variable ‘userName’ to see whether the user name is contained in the text, and returns ‘true’ if so. If I compile and decompile that, the result will be (for example) a function named ‘function_002’ that takes a variable ‘var_local_000’ and checks it against ‘var_global_115’. My comments will also be gone, and finding where the function was called from will be difficult. Yes, you could look at that code and puzzle out what it’s doing, but you wouldn’t know that var_global_115 is a user name. You’d have to find where that variable was set, work out where its value came from, and follow that rabbit hole backwards until you eventually reach a request for user input, whose purpose you’d then have to infer from context clues.
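To make that concrete, here’s a rough sketch of the two versions side by side. It’s in Python purely for illustration; real decompiler output depends on the language and the tool, and the "alice" value is made up for the example.

```python
# What I wrote: names and comments explain the intent.
userName = "alice"  # hypothetical value, just for the example

def findUserName(text):
    # Return True if the current user's name appears in the text.
    return userName in text

# Roughly what might come back after compiling and decompiling:
# identical logic, but the names and comments are gone.
var_global_115 = "alice"

def function_002(var_local_000):
    return var_global_115 in var_local_000
```

Both functions return the same result for any input; only the second one makes you go hunting to learn what var_global_115 actually holds.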
It’s not that the code you get back from a decompiler is incorrect or inefficient; it’s that it’s very much not human-readable without a lot of extra investigative work.
Hotzilla@sopuli.xyz 1 year ago
This might change relatively fast now that large language models can process code. You could give a decompiled function to an LLM and ask it to suggest a meaningful name, then iterate over the rest of the code and rename all the functions and variables the same way.
Of course, this won’t reproduce the exact original code, but it makes one really heavy part of reconstructing human-readable code much lighter.
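A minimal sketch of the renaming step, assuming the suggested names have already been collected somehow (from an LLM or by hand). The `rename_identifiers` helper and the `suggested` mapping are hypothetical, and no real LLM API is called here.

```python
import re

def rename_identifiers(source, suggestions):
    """Replace whole-word identifiers in source using the suggestions mapping."""
    if not suggestions:
        return source
    # \b keeps 'var_local_000' from matching inside a longer name.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, suggestions)) + r")\b")
    return pattern.sub(lambda m: suggestions[m.group(1)], source)

decompiled = (
    "def function_002(var_local_000):\n"
    "    return var_global_115 in var_local_000\n"
)

# Hypothetical suggestions; in practice these would be guesses to verify.
suggested = {
    "function_002": "findUserName",
    "var_local_000": "text",
    "var_global_115": "userName",
}

renamed = rename_identifiers(decompiled, suggested)
print(renamed)
```

A plain regex like this would also rename matches inside string literals, so a real tool would more likely work on the parsed syntax tree. The suggested names are still only guesses about what the original author meant.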