Yeah! And integers do way too many things as well. Counters, indexes, number of orange slices in an orange, there’s just no end to the wacky things people try to make integers do, and it’s impossible to keep track of it all when looking at code. And floats? Don’t get me started on floats. Angles, probabilities, weights, heights, degrees of separation from Kevin Bacon … I’m getting dizzy just listing all these different things that floats do.
It’s a big problem, because there isn’t an easy way to fix it in every programming language known to man, and someone needs to write more articles about this to get more hits for their sites.
fubo@lemmy.world 10 months ago
Any time you’re turning a string of input into something else, what you are doing is parsing.
Even if the word “parser” never appears in your code, the act of interpreting a string as structured data is parsing, and the code that does parsing is a parser.
Programmers write parsers quite a lot, and many of the parsers they write are ad-hoc, ill-specified, bug-ridden, and can’t tell you why your input didn’t parse right.
Writing a parser without realizing you’re writing a parser usually leads to writing a bad parser. Bad parsers do things like accepting malformed input that causes security holes. When bad parsers do reject malformed input, they rarely emit useful error messages about why it’s malformed. Bad parsers are often written using regex and duct tape.
Try not to write bad parsers. If you need to parse something, consider writing a grammar and using a parser library. (If you’re very ambitious, try a parser combinator library.) But at least try to recall something about parsers you learned once way back in a CS class, before throwing regex at the problem and calling it a day.
(And now the word “parser” no longer makes sense, because of semantic satiation.)
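To make the "write a grammar" advice above concrete, here is a minimal sketch in Python of a hand-written recursive-descent parser that reports where and why the input failed. The grammar, the `ParseError` class, and the `parse` function are all made up for illustration; a real project would more likely lean on a parser library.

```python
DIGITS = "0123456789"


class ParseError(Exception):
    """Carries a description of what was expected and where."""

    def __init__(self, message, pos):
        super().__init__(f"{message} at position {pos}")
        self.pos = pos


class Parser:
    # Hypothetical grammar, chosen only for illustration:
    #   expr := term (('+' | '-') term)*
    #   term := NUMBER | '(' expr ')'
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Skip spaces, then return the next character (None at end of input).
        while self.pos < len(self.text) and self.text[self.pos] == " ":
            self.pos += 1
        return self.text[self.pos] if self.pos < len(self.text) else None

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.text[self.pos]
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        ch = self.peek()
        if ch is None:
            raise ParseError("expected a number or '(' but input ended", self.pos)
        if ch == "(":
            self.pos += 1
            value = self.expr()
            if self.peek() != ")":
                raise ParseError("expected ')'", self.pos)
            self.pos += 1
            return value
        if ch in DIGITS:
            start = self.pos
            while self.pos < len(self.text) and self.text[self.pos] in DIGITS:
                self.pos += 1
            return int(self.text[start:self.pos])
        raise ParseError(f"expected a number or '(' but found {ch!r}", self.pos)


def parse(text):
    """Evaluate the whole input, rejecting any trailing garbage."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise ParseError(f"unexpected {parser.peek()!r}", parser.pos)
    return value
```

The point is less the arithmetic than the failure mode: malformed input such as "(1 + 2" is rejected with a position and an expectation instead of being silently accepted or producing a bare "no match".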
ono@lemmy.ca 10 months ago
Amen.
Spoiler alert: Few of them are good, and those that are good are so simple that you might as well not use a library.
The only way to validate an email address is to send a message to it, and verify that it arrived.
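A minimal sketch of that send-and-confirm idea, in Python. Everything here is hypothetical: `EmailVerifier` is a made-up class, `send_message(address, body)` is a callable the caller must supply (for example a wrapper around whatever actually hands mail to an MTA), and a real system would also want token expiry and persistent storage.

```python
import secrets


class EmailVerifier:
    def __init__(self, send_message):
        # send_message(address, body) is an assumed, caller-supplied callable.
        self._send = send_message
        self._pending = {}  # token -> address awaiting confirmation

    def start(self, address):
        """Send a one-time token to the address; nothing is validated up front."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = address
        self._send(address, f"Your confirmation code: {token}")

    def confirm(self, token):
        """Return the verified address, or None if the token is unknown or already used."""
        return self._pending.pop(token, None)
```

The address counts as valid only once the user presents a token that could only have reached them through that mailbox.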
Jesus_666@feddit.de 10 months ago
You can use a regex to do basic validation. That regex is: .+@.+
Anything beyond that is a waste of time.
fubo@lemmy.world 10 months ago
If you’re accepting email addresses as user input (e.g. from a web form), it might be nice to check that what’s to the right of the rightmost @ sign is a domain name with an MX or A record. That way, if a user enters a typo’d address, you have some chance of telling them that instead of handing an email to user#example.net to your MTA.
But the validity of the local-part (left of the rightmost @) is up to the receiving server.
grue@lemmy.world 10 months ago
Speaking of things you can’t parse with regex…
derpgon@programming.dev 10 months ago
The best validation for a valid email address is always sending a verification mail. I’ve rejected countless MRs that contained ad-hoc regex email validation copied from the internet. I’d allow a check for it to contain an @.
MinekPo1@lemmygrad.ml 10 months ago
Honestly, I feel like PHP regex may be able to parse HTML, but I’m not entirely sure.
abhibeckert@lemmy.world 10 months ago
Speak for yourself. I’ve done it exactly once. It didn’t work, and never shipped.
oscar@programming.dev 10 months ago
So you have never iterated over command line arguments and tried to identify options? Or taken a string input field?
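For illustration, this is roughly what that kind of loop looks like in Python. The option names (`--verbose`, `--output`) and the `parse_args` function are invented for the example; the point is that even this small loop is a parser, with its own grammar and its own error cases.

```python
def parse_args(argv):
    """Split argv into (options, positionals) for two hypothetical options."""
    options = {"verbose": False, "output": None}
    positionals = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--":
            # Conventional end-of-options marker: everything after is positional.
            positionals.extend(argv[i + 1:])
            break
        if arg == "--verbose":
            options["verbose"] = True
        elif arg == "--output":
            # This option consumes the following argument as its value.
            if i + 1 >= len(argv):
                raise ValueError("--output requires a value")
            i += 1
            options["output"] = argv[i]
        elif arg.startswith("--"):
            raise ValueError(f"unknown option {arg!r}")
        else:
            positionals.append(arg)
        i += 1
    return options, positionals
```

In practice Python's standard argparse module covers this ground, but the hand-rolled version shows how quickly "just iterating over arguments" turns into deciding what counts as malformed input.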