Chapter 2: Working with Regular Expressions

Example 1:

Add a period to the end of the 500 field if it is missing

Find: (=500.*[^\W])$ Replace: $1.

Why does this work?

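The key part of this expression is found at the end: the [^\W]$.  The $ anchors the match to the end of the string, so the character class in front of it is evaluated against the last character.  \W matches any non-word character, and the ^ inside the brackets negates it, so [^\W] is a double negative that matches a word character (a letter, digit, or underscore).  The expression therefore evaluates as true if, and only if, the last character in the string is a word character, that is, when the field does not already end in a period or other punctuation.  The replacement, $1., writes the matched field back and adds the period.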
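
The behavior can be checked outside MarcEdit with a short Python sketch. This is only an approximation of what MarcEdit does: the $1 in the Replace box is written \1 in Python's re module, and the sample field text and the function name add_final_period are made up for illustration.

```python
import re

# Same pattern as the Find box; [^\W] is a double negative equal to \w.
PATTERN = re.compile(r"(=500.*[^\W])$")

def add_final_period(field: str) -> str:
    """Append a period when a 500 field ends in a word character."""
    # \1 is Python's spelling of the $1 back-reference used in the Replace box.
    return PATTERN.sub(r"\1.", field)

# Hypothetical sample fields in mnemonic format.
print(add_final_period(r"=500  \\$aIncludes index"))   # period is added
print(add_final_period(r"=500  \\$aIncludes index."))  # left unchanged
```

A field that already ends in punctuation fails the [^\W]$ test, so the substitution leaves it as it was.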

Example 2:

Delete all non-LC Subjects using the Add/Delete Field Function

Field: 6xx Field Data: ^=6[0-9]{2}.{3}[^7]

Why this works:

The Field value is not part of the regular expression.  In the Add/Delete Field function, fields can be referenced as groups (here, 6xx).  The expression is what follows in the Field Data.  When working with the Add/Delete Field function, all of a field's data is exposed to the regular expression engine.  So, the expression sets an anchor (^) that says the field must begin with an equal-sign “=”, and then =6[0-9]{2} matches any 6xx field.  Next, we use the .{3} syntax to match the next three characters, which are the two spaces required by MarcEdit’s mnemonic format and the first indicator.  The last value, [^7], tells the expression engine to match any field that does not have a second indicator of 7.
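
To see which fields the Field Data expression selects, here is a minimal Python sketch. It only tests the pattern against sample lines; it does not reproduce the Add/Delete Field function itself. The sample fields and the helper name matches_field_data are hypothetical.

```python
import re

# The Field Data expression from Example 2, unchanged.
FIELD_DATA = re.compile(r"^=6[0-9]{2}.{3}[^7]")

def matches_field_data(field: str) -> bool:
    """Return True when the expression matches the full field line."""
    return FIELD_DATA.search(field) is not None

# Hypothetical fields in mnemonic format: tag, two spaces, two indicators.
sample_fields = [
    r"=245  10$aA sample title.",      # not a 6xx field
    r"=600  10$aSmith, John.",         # 6xx, second indicator 0
    r"=650  \0$aLibraries.",           # 6xx, second indicator 0
    r"=650  \7$aLibraries.$2fast",     # 6xx, second indicator 7
]

for field in sample_fields:
    print(matches_field_data(field), field)
```

Only the 6xx lines whose second indicator is something other than 7 report a match.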

Scoping

Within MarcEdit, how much data is exposed for regular expression evaluation will depend on the global editing function being utilized.  Generally, the global editing functions are scoped as follows:

  • Find/Replace: The expression engine can read all record data. By default, expressions are scoped to a single field, but a multi-line expression can be written to widen the scope from a single field to the entire record.
  • Add/Delete Field Function: The expression engine can see all data in a field, including the equal-sign at the start of the field.
  • Edit Field Function: The expression engine can see subfield data only.  No field or indicator data is visible to the Edit Field Function.  If you need to evaluate indicator data as part of an expression, you should use the Find/Replace function.
  • Edit Subfield Function: The Expression engine can only see the selected subfield, including the actual subfield code.
  • Copy Field Data: The Expression engine can see all field data.
  • Build New Field: Regular Expressions are limited to the specific parts of a record being used to build a new field.
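
The practical effect of scoping is that the same expression can match in one function and fail in another, because each function hands the engine a different slice of the record. The Python sketch below illustrates the idea. The three strings are assumptions based on the descriptions in the list above, not the exact text MarcEdit passes to its engine, and scopes_matching is a hypothetical helper.

```python
import re

# Assumed views of one sample field under three scopes (illustrative only).
views = {
    "Add/Delete Field": r"=650  \0$aLibraries$xHistory.",  # whole field, with "="
    "Edit Field": r"$aLibraries$xHistory.",                # subfield data only
    "Edit Subfield": r"$xHistory.",                        # one subfield, with code
}

def scopes_matching(pattern: str) -> list:
    """List the scopes whose visible text the pattern matches."""
    return [name for name, text in views.items() if re.search(pattern, text)]

print(scopes_matching(r"^=650"))       # needs the field tag to be visible
print(scopes_matching(r"^\$x"))        # needs the text to start at subfield x
print(scopes_matching(r"History\.$"))  # anchored only to the end of the text
```

An expression anchored on the tag or indicators can only succeed where that data is exposed, which is why the choice of function matters as much as the expression.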