GREP and InDesign


If you know how to use GREP, you love it; if not, you keep wondering what the fuss is all about. This post is about what the fuss is all about.

I’ll also try to give you some examples in the context of InDesign, and point to some excellent resources that passionate users have created to make GREP easier to understand. The first few topics are general to Regular Expressions, because I think it makes sense to understand the basics. Let’s start at the beginning.

A bit of history

A regular expression, also referred to as regex or regexp, is a concise and flexible method for matching strings of text, such as particular characters, words, or patterns of characters. A regular expression is written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.

grep is a command-line text search utility originally written for Unix. The name comes from the “ed” editor command g/re/p (global / regular expression / print). The grep command searches files or standard input globally for lines matching a given regular expression, and prints them to the program’s standard output.
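To make the "global / regular expression / print" idea concrete, here is a minimal sketch in Python. It is not how grep itself is implemented; the function name `grep` and the sample lines are made up for illustration, and Python's `re` module stands in for grep's own regex engine.

```python
import re

def grep(pattern, lines):
    """Return every line that contains a match for the regular expression."""
    regex = re.compile(pattern)
    # "global": look at every line; "print": keep the ones that match.
    return [line for line in lines if regex.search(line)]

# Hypothetical sample input, standing in for a file or standard input.
sample = [
    "InDesign supports GREP in Find/Change.",
    "Wildcards are simpler.",
    "grep was written for Unix.",
]

# Match "grep" in any mix of upper and lower case.
for line in grep(r"[Gg][Rr][Ee][Pp]", sample):
    print(line)
```

The first and third sample lines are printed; the second has no match and is skipped.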

Regular Expressions quickly became, and still remain, one of the fastest ways to manipulate huge amounts of text. Regular Expression support in Perl popularized their use, and combined with CGI helped make the web dynamic.

In most software, InDesign included, you can search using Wildcards and Regular Expressions (grep). A wildcard character can be used to substitute for any other character or characters in a string. Regular Expressions are like wildcards on steroids! They can make you experience joy, pain, frustration, exhilaration, and some emotions that you didn’t even know existed. But, believe me, it’s worth it!
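A small comparison may show the difference. This is only a sketch in Python, not InDesign: the standard `fnmatch` module plays the role of a wildcard search, `re` plays the role of GREP, and the word list is invented for the example.

```python
import fnmatch
import re

# Hypothetical word list for the comparison.
words = ["color", "colour", "colr", "colours"]

# Wildcard: * stands for any run of characters, including none at all,
# so it cannot say "only an optional u".
wild = [w for w in words if fnmatch.fnmatchcase(w, "col*r")]

# Regular expression: "u?" makes exactly one letter optional.
regex = [w for w in words if re.fullmatch(r"colou?r", w)]

print(wild)   # ['color', 'colour', 'colr']
print(regex)  # ['color', 'colour']
```

The wildcard also lets the misspelling "colr" through, while the regular expression accepts only the two spellings you actually want.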

A Regular Expression that works is a piece of art. Sometimes you want to put it up on the wall.

- Vikrant Rai, circa 2011.

Why would you use Regular Expressions?

A regular expression describes a set of strings. It is usually used to define that set without having to list all of its elements. More simply put: to find (and replace) text when you don’t know exactly what you are looking for. Let me try and explain:

You need to find and replace all the dates in your document, but you’re not sure where they are or what format they might be in. Regular expressions make the most sense in situations such as these.

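  • \d{1,2}\/\d{1,2}\/\d{4}
    Looks for strings that match the pattern [1 or 2 digits] / [1 or 2 digits] / [4 digits] (d/m/yyyy, dd/mm/yyyy or mm/dd/yyyy). It does not check that the numbers make a real date, so 45/13/2011 also matches.
  • (0[1-9]|[12][0-9]|3[01])[./-](0[1-9]|1[0-2])[./-](19|20)\d\d
    Finds plausible dates in the dd/mm/yyyy format: a day from 01 to 31, a month from 01 to 12, a year from 1900 to 2099, separated by a period, slash, or hyphen. It still cannot tell that 31/02/2011 does not exist.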
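To see how a loose and a strict date pattern behave on the same text, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for InDesign's GREP engine (the two agree on the syntax used here); the names `LOOSE` and `STRICT` and the sample sentence are made up for illustration. `LOOSE` is the first pattern above, and `STRICT` is a day-month-year pattern in the spirit of the second that limits days to 01-31 and months to 01-12.

```python
import re

# The slash needs no backslash in Python; "\/" and "/" mean the same thing.
LOOSE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")

# Day 01-31, month 01-12, year 1900-2099, separated by ".", "/" or "-".
STRICT = re.compile(
    r"(0[1-9]|[12][0-9]|3[01])[./-](0[1-9]|1[0-2])[./-](19|20)\d\d"
)

# Hypothetical sample text with four date-like strings.
text = "Signed 5/7/2011, revised 25-12-2012, expires 31/10/2013, typo 45/13/2011."

loose_hits = [m.group(0) for m in LOOSE.finditer(text)]
strict_hits = [m.group(0) for m in STRICT.finditer(text)]

print(loose_hits)   # ['5/7/2011', '31/10/2013', '45/13/2011']
print(strict_hits)  # ['25-12-2012', '31/10/2013']
```

Neither pattern is right on its own: the loose one catches single-digit days but also the typo, while the strict one rejects the typo but misses 5/7/2011. Which trade-off you want depends on how clean your document is.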

Use GREP in InDesign

Before you can harness the power of grep in InDesign, you’ll need to write a regular expression. I found a few resources that help me write them.

Scripts

  • What the Grep! by Theunis de Jong (aka Jongware) is a script to help you understand regular expressions. Give this script a regex and it tells you what the regex means. It is especially useful for documenting a regex or understanding one written by someone else.
  • Grep Query Manager by Peter Kahrel

Articles and Cheat sheets

Give regular expressions and GREP a test drive, and let me know what you think.

7-Dec-2012
-Added links to a few more resources.

 


  1. #1 by indigrep on June 25, 2011 - 2:37 am

    Hi
    A small correction : What the Grep is a great script by Theunis de Jonk (aka Jongware).
    Best
    Laurent Tournier

  2. #2 by indigrep on June 25, 2011 - 2:38 am

    Theunis de Jong, sorry

  3. #3 by Martinho da Gloria on June 29, 2011 - 7:35 am

    ….Not to mention Multi-Find/Change Plug-in for InDesign and InCopy
    http://www.automatication.com/index.php?id=12

  4. #4 by Vikrant on July 13, 2011 - 11:35 am

  5. #5 by Jaime Zuniga on May 23, 2013 - 12:13 am

    I receive InDesign files that are formatted like the following:

    English text (with its own formatting)
    Spanish translation (with a different formatting than above)

    What I am looking to do is this:
    English text (with its own formatting)
    English text (with a different formatting than above)

    That is, I want to replace the existing formatted Spanish text with a copy of the English text, but maintaining the formatting of the Spanish text.

    I am trying to do a search and replace in InDesign, using GREP, as follows:
    Search for: (.+)\r(.+)(?m)
    Replace with: $1\r$2

    Which essentially finds every instance of the bilingual text, but replaces both English instances with the formatting from the Spanish text.

    Any help in the right direction would be appreciated,
    Jaime

  6. #6 by Jim M on June 1, 2014 - 8:16 pm

    One cannot overestimate the power of grep use in UNIX scripts. Not sure though how helpful it could be with InDesign. We use it for analysis of big data on a regular basis.
