If you know how to use GREP, you love it; if not, you keep wondering what the fuss is all about. This post is about what the fuss is all about.
I’ll also try to give you some examples in the context of InDesign, and point to some excellent resources that passionate users have created to make GREP easier to understand. The first few topics are general to Regular Expressions, because I think it makes sense to understand the basics. Let’s start at the beginning.
A bit of history
A regular expression, also referred to as regex or regexp, is a concise and flexible method for matching strings of text, such as particular characters, words, or patterns of characters. A regular expression is written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.
grep is a command-line text search utility originally written for Unix. The name comes from the “ed” editor command, g/re/p (global / regular expression / print). The grep command searches files or standard input globally for lines matching a given regular expression, and prints them to the program’s standard output.
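To make the "search globally for matching lines and print them" idea concrete, here is a minimal sketch in Python. It is not how grep itself is implemented; the function name `grep_lines` and the sample lines are made up for illustration, and it uses Python's `re` module rather than grep's own regex engine.

```python
import re
from typing import Iterable, Iterator


def grep_lines(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield every line that contains a match for the regular expression."""
    compiled = re.compile(pattern)
    for line in lines:
        # re.search looks for a match anywhere in the line.
        if compiled.search(line):
            yield line


# Hypothetical input standing in for a file or standard input.
sample = [
    "InDesign supports GREP searches",
    "Wildcards are simpler",
    "grep came from the ed editor",
]

# Print each matching line, much as grep writes matches to standard output.
for hit in grep_lines(r"[Gg][Rr][Ee][Pp]", sample):
    print(hit)
```

Only the first and third sample lines are printed, because the second contains no match for the pattern.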
Regular Expressions quickly became, and still remain, the fastest way to manipulate huge amounts of text. Regular Expression support in Perl popularized their use and, combined with CGI, helped make the web dynamic.
In most software, InDesign included, you can search using Wildcards and Regular Expressions (grep). A wildcard character can be used to substitute for any other character or characters in a string. Regular Expressions are like wildcards on steroids! They can make you experience joy, pain, frustration, exhilaration, and some emotions that you didn’t even know existed. But, believe me, it’s worth it!
A Regular Expression that works is a piece of art. Sometimes you want to put it up on the wall.
– Vikrant Rai, circa 2011.
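To see what "wildcards on steroids" means in practice, here is a small Python sketch comparing the two. The word list is invented for the example, and Python's `fnmatch` and `re` modules stand in for a wildcard search and a GREP search; InDesign's own wildcard and GREP syntax differs in places.

```python
import fnmatch
import re

# Hypothetical word list for the comparison.
words = ["color", "colour", "colonizer", "collar"]

# Wildcard: "*" stands for any run of characters, so "colo*r" is the closest
# a wildcard gets to "color or colour", and it lets other words through.
wildcard_hits = [w for w in words if fnmatch.fnmatchcase(w, "colo*r")]

# Regular expression: "u?" means "an optional u", nothing more.
regex_hits = [w for w in words if re.fullmatch(r"colou?r", w)]

print(wildcard_hits)  # ['color', 'colour', 'colonizer']
print(regex_hits)     # ['color', 'colour']
```

The wildcard can only say "anything goes here", while the regular expression can say exactly what is allowed and how many times.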
Why would you use Regular Expressions
A regular expression describes a set of strings. Regular expressions are usually used to define a set without having to list all of its elements. More simply put: they let you find (and replace) text when you don’t know exactly what to find. Let me try to explain:
You need to find and replace all the dates in your document, but you’re not sure where they are or what format they are in. Regular expressions make the most sense in situations such as these.
A simple pattern looks for strings that match [2 digits] / [2 digits] / [4 digits] (dd/mm/yyyy or mm/dd/yyyy).
A stricter pattern finds only valid dates in the dd/mm/yyyy format.
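The two date searches described here could be sketched as follows. These patterns are my own illustration in Python's `re` syntax, not necessarily the exact expressions this post originally used, and the sample text is invented. The stricter pattern only limits the day to 01-31 and the month to 01-12, so it would still accept an impossible date such as 31/02/2011.

```python
import re

# Hypothetical sample text containing good, bad, and short-form dates.
text = "Meet on 03/11/2011, not 45/13/2011; see ref 1/2/11 and 31/12/2010."

# Shape only: [2 digits]/[2 digits]/[4 digits].
# \b stops a match from starting or ending inside a longer number.
DATE_SHAPE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

# Stricter dd/mm/yyyy: day 01-31, month 01-12, any four-digit year.
# (?:...) groups alternatives without capturing them.
DATE_DDMMYYYY = re.compile(
    r"\b(?:0[1-9]|[12]\d|3[01])/(?:0[1-9]|1[0-2])/\d{4}\b"
)

shape_hits = DATE_SHAPE.findall(text)
valid_hits = DATE_DDMMYYYY.findall(text)

print(shape_hits)  # ['03/11/2011', '45/13/2011', '31/12/2010']
print(valid_hits)  # ['03/11/2011', '31/12/2010']
```

The first pattern happily matches 45/13/2011 because it only checks the shape; the second rejects it because 45 is not a possible day and 13 is not a possible month.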
Use GREP in InDesign
Before you can harness the power of grep in InDesign, you’ll need to write a regular expression. I found a few resources that help me in writing regular expressions.
- What the Grep! by Theunis de Jong (aka Jongware) is a script to help you understand regular expressions. Give this script a RegEx and it tells you what the RegEx means. It is especially useful for documenting a RegEx or understanding one written by someone else.
- Grep Query Manager by Peter Kahrel
Articles and Cheat sheets
- Grep In InDesign by Peter Kahrel, published by O’Reilly Media, is a useful resource.
- Regular Expression Cheat Sheet by David Child. Though this is not aimed at InDesign, it is still an excellent resource. In fact, I have a copy stuck on my desk.
- InDesign Secrets videocast
- GREP resources on InDesignSecrets
- Regex Tester is an online utility that I use to help write, rewrite, and rewrite Regular Expressions.
Give regular expressions and GREP a test drive, and let me know what you think.
-Added links to a few more resources.