Regular Expression Matching can be Ugly and Slow (stevehanov.ca)
42 points by bdfh42 on March 14, 2010 | 9 comments


Although this might not be applicable given the topic, someone might as well say it:

> Some people, when confronted with a problem, think "I know, I’ll use regular expressions." Now they have two problems. --Jamie Zawinski

As an unwritten rule of thumb, I avoid writing regex scripts like the plague and let my macros do all the globbing. I'm sure writing bash or C on a regular basis would trump this rule altogether but, then again, I avoid writing or reading anything in bash or C like the plague as well. *sighs*


The original quote is: “If you have a problem and you think awk(1) is the solution, then you have two problems.” -David Tilbrook (http://regex.info/blog/2006-09-15/247)


What if my only other choice was sed or perl?


Well, there are problems for which regular expressions are a perfect solution.

The trouble starts when you try to use regular expressions to match HTML. If you know the least bit of formal grammar theory, you'll know how ill-suited they are: HTML elements nest to arbitrary depth, and a regular language cannot keep track of nesting depth.
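
A quick way to see the nesting problem, as a minimal Python sketch (the sample markup and the variable names are made up for illustration): a non-greedy pattern stops at the first closing tag, so the outer element gets cut short.

```python
import re

# Hypothetical sample: one <div> nested inside another.
html = "<div>outer <div>inner</div> tail</div>"

# The non-greedy group ends at the FIRST "</div>", which belongs to the inner element.
matches = re.findall(r"<div>(.*?)</div>", html)
print(matches)  # ['outer <div>inner']
```

Making the quantifier greedy only moves the problem: it then runs to the last closing tag and swallows sibling elements instead.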


To be fair, regexes can still be used when parsing a CFG.


To be less fair, they can also be used to parse some strictly written XHTML.


Really? XML tags seem to me to be essentially balanced parentheses.


(I shan't confess this is partly an exercise to see how deep HN will nest comments)

Lua is interesting in that instead of using traditional regex it comes with its own "pattern" matching - the implementation is much cleaner than most regex implementations. See http://www.lua.org/pil/20.1.html for a bit more info.

I'm replying to this comment because Lua patterns actually have an operator to match balanced parentheses (or any pair of characters): %b()
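
For anyone who doesn't have Lua handy, here is a rough Python sketch of what a %b() match does, assuming the two delimiter characters are distinct. The function name `find_balanced` is hypothetical and this is not how Lua implements it internally; it just scans with a depth counter and returns the first balanced span.

```python
def find_balanced(text, open_ch="(", close_ch=")"):
    """Return the first substring that starts with open_ch and ends at its
    matching close_ch, or None if no balanced span exists."""
    for start, ch in enumerate(text):
        if ch != open_ch:
            continue
        depth = 0
        for end in range(start, len(text)):
            if text[end] == open_ch:
                depth += 1
            elif text[end] == close_ch:
                depth -= 1
                if depth == 0:
                    # Found the close that balances the opener at `start`.
                    return text[start:end + 1]
        # No balanced span from this opener; try the next one.
    return None


print(find_balanced("f(a(b)c) + g(d)"))  # (a(b)c)
```

The depth counter is the piece a classical regular expression has no way to express, which is why Lua offers it as a dedicated pattern item.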

PEGs (parsing expression grammars) are also worth a look if you're bored of regex, which leads me nicely on to LPeg: http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
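
To give a flavour of the PEG idea without LPeg itself, here is a hand-written Python sketch of a single PEG-style rule, S <- ( '(' S ')' / [^()] )*, as a recursive function. The names `parse_balanced` and `is_balanced` are my own for illustration and have nothing to do with LPeg's API.

```python
def parse_balanced(text, pos=0):
    """PEG-style rule  S <- ( '(' S ')' / [^()] )*
    Returns the position just past the longest prefix matched from pos."""
    while pos < len(text):
        if text[pos] == "(":
            # First alternative: '(' S ')'
            inner_end = parse_balanced(text, pos + 1)
            if inner_end < len(text) and text[inner_end] == ")":
                pos = inner_end + 1
            else:
                break  # alternative failed, so the repetition stops here
        elif text[pos] == ")":
            break  # neither alternative can consume a bare ')'
        else:
            pos += 1  # second alternative: any character except parentheses
    return pos


def is_balanced(text):
    """True when the rule consumes the whole input."""
    return parse_balanced(text) == len(text)


print(is_balanced("(a(b))"), is_balanced("(()"))  # True False
```

The rule refers to itself, which is what lets it follow nesting to any depth.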

(and finally yes, everything I do or speak about does tend to revolve around Lua)


In perl you can use http://search.cpan.org/~abigail/Regexp-Common-2010010201/lib...

The entire Regexp::Common namespace is basically a cookbook of the "right" way to use regular expressions. Worth stealing and porting to other languages...



