A comprehensive guide to the dangers of Regular Expressions in JavaScript
Submitted 1 year ago by philnash@programming.dev to programming@programming.dev
https://www.sonarsource.com/blog/vulnerable-regular-expressions-javascript/
Comments
Alexstarfire@lemmy.world 1 year ago
Am I the only one shocked to learn that, to find something at the end of a string, the regex engine starts at the beginning? Perhaps it’s because of the simplicity of the example, but I expected it to start at the end.
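On the end-of-string question, one way to picture it: a backtracking engine tries the pattern at each start position from left to right, so with something like `/\s+$/` (an assumed example pattern) every run of whitespace gets rescanned. The sketch below is a hypothetical step counter; `countSteps` is a made-up helper and not how V8 or any real engine is implemented. It only mimics that left-to-right behaviour to show the roughly quadratic growth when the whitespace is followed by a non-space character.

```javascript
// Hypothetical model of a backtracking engine trying /\s+$/ at every start
// position. It only counts character checks; real engines are more complex.
function countSteps(input) {
  let steps = 0;
  for (let start = 0; start < input.length; start++) {
    let i = start;
    // \s+ : greedily consume whitespace starting at `start`.
    while (i < input.length) {
      steps++;
      if (!/\s/.test(input[i])) break;
      i++;
    }
    // $ : succeed only if at least one whitespace was consumed up to the end.
    if (i > start && i === input.length) {
      return { matched: true, index: start, steps };
    }
  }
  return { matched: false, index: -1, steps };
}

const small = countSteps(" ".repeat(10) + "x");
const large = countSteps(" ".repeat(20) + "x");
console.log(small.steps, large.steps); // 66 231
```

Doubling the whitespace more than triples the step count in this model, even though neither input matches.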
sebsch@discuss.tchncs.de 1 year ago
Is there one thing not screwed up in this language? I mean it’s regex, there are so many good implementations for it.
philnash@programming.dev 1 year ago
JavaScript’s regex engine isn’t the only one to have these problems. There certainly are other implementations, like RE2 and Rust’s implementation, that don’t have this issue. But they lack some of the features of the JS implementation.
sebsch@discuss.tchncs.de 1 year ago
Ok thanks for the clarification.
I would argue that the gold standard of regex is perlre, or even re from Python. I have never heard anyone discourage using them. Do you know something I don’t?
recursive_recursion@programming.dev 1 year ago
Although I haven’t fully read this article, you can definitely crosspost in:
philnash@programming.dev 1 year ago
Ah, I didn’t realise there was a regex channel here. Thanks!
jeffhykin@lemm.ee 1 year ago
This is why we need regex licenses: regexlicensing.org
/s
philnash@programming.dev 1 year ago
That’s brilliant!
Turun@feddit.de 1 year ago
The visualization was great! The double loops jump out immediately and make it easy to recognize problematic expressions.
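To make the "double loop" shape concrete, here is a small sketch. The patterns `/^(a+)+$/` and `/^a+$/` are illustrative assumptions, not taken from the article: the first nests one quantifier inside another, the second is a flat rewrite that accepts the same strings. Inputs are kept tiny on purpose, since the nested form can take exponentially long in a backtracking engine on a long input that almost matches.

```javascript
// Illustrative patterns (assumed for this sketch, not from the article).
const nested = /^(a+)+$/; // a loop inside a loop: prone to catastrophic backtracking
const flat = /^a+$/;      // a single loop that accepts the same set of strings

// Keep inputs short: the nested pattern's backtracking grows quickly
// on long inputs that almost match, such as many "a"s followed by "!".
const samples = ["a", "aaaa", "", "aaab", "aaaa!"];

// Both patterns should give the same answer on every sample.
const agree = samples.every((s) => nested.test(s) === flat.test(s));
console.log(agree); // true
```

If the two forms agree on what they accept, the flat one is usually the safer choice, because there is only one way to split the input across the loop.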
MonkderZweite@feddit.ch 1 year ago
Guide to the dangers of JavaScript, no?
philnash@programming.dev 1 year ago
While this article is about JavaScript specifically, these issues certainly exist in other regex engines too.
lastunusedusername2@sh.itjust.works 1 year ago
No
cgtjsiwy@programming.dev 1 year ago
Regular expressions are great and can always be matched in linear time with respect to the input string length.
The problem is that JS standard library RegExps aren’t actually regular expressions, but rather a much broader language, one with no known efficient matching algorithm in the general case. If RegExp switched to proper regular expressions, it would match much faster, but supporting backreferences like /(.*)x\1/ would be impossible.
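As a small illustration of the backreference point, the sketch below uses an anchored variant of the pattern above, `/^(.*)x\1$/`. The anchors are an assumption added here so that the whole string must have the form "some text, then x, then the same text again". That "same text twice" language is not regular, which is why a finite automaton cannot match it.

```javascript
// Anchored variant of /(.*)x\1/ (the anchors are an assumption for this sketch).
const repeated = /^(.*)x\1$/;

console.log(repeated.test("abcxabc"));    // true: "abc", "x", then "abc" again
console.log(repeated.test("abcxabd"));    // false: the two halves differ
console.log(repeated.exec("abcxabc")[1]); // "abc" (the captured group)
```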
Turun@feddit.de 1 year ago
If you insist on the definition as it is in formal language theory.
In practice regex is widely used to mean the pattern matching thing that also supports back references.
Wikipedia suggests using the term “regular expressions” for the language theory thing and “regex” for the programming language (PCRE) thing. I agree and would even go further and say that any time one wants to refer to the concept as it is used in formal language theory they should explicitly specify that they are talking about the theoretical concept, not the regex implementation as it is found in most programming languages.