On Getting Started With Regular Expressions
Dr. Drang, regarding Jason Snell’s tale of using BBEdit and Excel to create a working RSS feed for an old podcast, “Don’t Fear the Regex”:

Although I do often write short programs for text munging, I typically resort to that only if the problem requires more than just large-scale text editing or if I expect to be repeating the process several times. And even then, I usually start out by playing around in BBEdit to see what searches, replacements, and rearrangements need to be done. It’s a convenient environment for getting immediate feedback on each transformation step. (And if you expect to do a series of text transformations often and really don’t want to get into writing scripts in Perl or Python or Ruby or whatever, BBEdit’s Text Factories allow you to string together any number of individual munging steps.)

After I linked to Snell’s piece, a reader emailed to ask why I didn’t think this would’ve been better solved by writing a script in Perl/Python/Ruby or any other language with good regex support. Why use Excel for date transformations when scripting languages all have extensive date libraries?

What Drang describes above is my process too. If the task at hand is something I only need to do once or twice, right now, it’s simply easier to just do it in BBEdit. I’m only going to make a proper script if it’s something I know or suspect I’ll reuse. But even when I do write a script to automate some sort of text munging, it inevitably starts with me working out the regex transformations step-by-step in BBEdit. Instant visual feedback with undo support: I’ve worked with text this way since 1992.

Drang:

Even worse, people who are thinking they should start using regular expressions often hear about this great book on the topic and have a natural reaction when they see it: A 500+ page book to learn how to search for text? No thanks.

This is too bad, because while Friedl’s book is great, it’s called Mastering Regular Expressions for a reason, and that reason is not because it’s a tutorial. My recommendation for a tutorial is the one I learned from over 20 years ago: the “Searching with Grep” chapter in the BBEdit User Manual. I believe it was largely written by a young guy named John Gruber.

As for the Grep chapter in BBEdit’s user manual: I did write a significant part of it, but I can’t take and shouldn’t get credit for all of it.

Long story short, until BBEdit 6.5, BBEdit used a rather basic regex engine. If I recall correctly, it was a highly customized version of Henry Spencer’s classic library, which supported only the classic features of regular expression syntax. I pushed for BBEdit to switch to Philip Hazel’s excellent PCRE (Perl Compatible Regular Expressions) library, which supports just about every advanced bit of regex syntax anyone could want, and it’s fast, supports Unicode, written in good clean cross-platform C, [read more]
Daring Fireball, Jan 11 8:02