Build New Field

The Build New Field function provides a flexible method for pulling data from multiple data points in a record and building one or more new fields from it. In many ways, this function complements the Swap Field function, which copies whole subfield elements to new fields. The Build New Field function instead captures the actual field data and builds a new field based on a defined pattern.

The Build New Field function is found in the MarcEditor, in the Tools menu.

Patterns

The Build New Field function works by extracting data and creating new fields through patterns. A pattern can look like the following:

=099 \\$aTitle Data: {245$a} : Publication Data: {264$c}

Patterns reference source data using the field/subfield mnemonic structure, where ### is the field tag and the letter after $ is the subfield code. The following structures are valid:

  • {###$a}
  • {###}
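As a rough illustration of the two placeholder shapes above, the following Python sketch locates placeholders in a pattern and substitutes values for them. It is not MarcEdit's implementation: the regular expression, the function names list_placeholders and fill_pattern, and the sample values are all hypothetical and exist only to show the idea.

```python
import re

# Hypothetical regex for the two placeholder shapes described above:
# {###$a} (field tag plus subfield code) and {###} (field tag only).
PLACEHOLDER = re.compile(r"\{(\d{3})(?:\$([a-z0-9]))?\}")

def list_placeholders(pattern):
    """Return (tag, subfield) pairs; subfield is None for a whole-field reference."""
    return [(m.group(1), m.group(2)) for m in PLACEHOLDER.finditer(pattern)]

def fill_pattern(pattern, values):
    """Replace each placeholder with values[(tag, subfield)], or '' if missing."""
    return PLACEHOLDER.sub(
        lambda m: values.get((m.group(1), m.group(2)), ""), pattern
    )

pattern = r"=099 \\$aTitle Data: {245$a} : Publication Data: {264$c}"
print(list_placeholders(pattern))

# Made-up sample values, used only to show the substitution step.
sample = {("245", "a"): "Dune", ("264", "c"): "1965"}
print(fill_pattern(pattern, sample))
```

The literal text around each placeholder is carried into the new field unchanged; only the parts in braces are replaced.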

Functions

In addition to capturing data from a subfield or field, you can use functions as part of the data capture. The processing functions currently available within a pattern are:

  • find
    Example: {090$a.find('terry')}
  • replace(find,replace)
    Example: {090$a.replace('find','replace')}
  • regex(pattern,replace)
    Example: {090$a.regex('pattern','replace')}
  • trim(chars)
    Example: {090$a.trim(' :.')}
  • trimend(chars)
    Example: {090$a.trimend('. :,')}
  • trimstart(chars)
    Example: {090$a.trimstart('. :,')}
  • substring(start,length)
    Example: {090$a.substring(0,3)}

Functions can be stacked and mixed, with one exception: find can be used only once in a chain and, when used, must be the first function. Example:

{090$a.find('terry').replace('terry','tr').trim('?. :')}

In this example, the function finds an 090$a containing 'terry', replaces 'terry' with 'tr', and then trims the listed punctuation and spaces from the start and end of the string.
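The stacking behavior can be approximated in Python as shown below. This is only a sketch of the chain described above, not MarcEdit's code. It assumes that find is a case-sensitive substring test that rejects non-matching data, that substring uses a zero-based start, and that Python's regular expression syntax stands in for whatever engine MarcEdit uses. The apply_chain helper and its (name, args) step format are hypothetical.

```python
import re

def apply_chain(value, steps):
    """Apply (name, args) steps in order; return None when find does not match."""
    for i, (name, args) in enumerate(steps):
        if name == "find":
            # find may appear only once and only as the first step.
            if i != 0:
                raise ValueError("find must be the first function")
            if args[0] not in value:
                return None
        elif name == "replace":
            value = value.replace(args[0], args[1])
        elif name == "regex":
            value = re.sub(args[0], args[1], value)
        elif name == "trim":
            value = value.strip(args[0])    # both ends
        elif name == "trimend":
            value = value.rstrip(args[0])   # end only
        elif name == "trimstart":
            value = value.lstrip(args[0])   # start only
        elif name == "substring":
            start, length = args
            value = value[start:start + length]
        else:
            raise ValueError(f"unknown function: {name}")
    return value

# Mirrors {090$a.find('terry').replace('terry','tr').trim('?. :')}
steps = [("find", ("terry",)), ("replace", ("terry", "tr")), ("trim", ("?. :",))]
print(apply_chain("terry reese :.", steps))  # tr reese
print(apply_chain("TK1005", steps))          # None: no 'terry' in the data
```

Each step receives the output of the previous one, which is why the order of the functions in a pattern matters.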

Creating Multiple Fields

Whether the function creates one field or several depends on the presence or absence of a special element in the pattern: a variable marker, [x], that MarcEdit uses internally as a tracking token.

Example:

=040 \$aMiU$cMiU
=040 \$aBDS$beng$cBDS$dOCLCQ$dABCU
=041 \$aengrusger
=043 \$ae-gx---$ae-uk---$an-us---
=090 \$aTK1005$b(INTERNET) $c[UK.]

Say we have these fields, and I want to create a new 999 field for each 040$a, with the 090$a also included in each new field.

The new pattern would look like this:

=999 \$a{040$a[x]} : {090$a}

This pattern would generate the following results:

=999 \$aMiU : TK1005
=999 \$aBDS : TK1005

If I changed the pattern to:

=999 \$a{040$a} : {090$a}

The program falls back to the existing behavior, and only one field is created.

Please note that you cannot ask for a specific 040 to be used (other than by using the find/regex functions inside the pattern). The value inside the [x] is not an integer you can set. It is a marker that tells MarcEdit that the subfield should be tracked and that multiple fields are desired.

The [x] syntax works either after the subfield or after the field number, with the data scoped according to the location of the [x]. Any value other than [x] will likely produce inconsistent results. The [x] bracket is a reserved element within the pattern that indicates multiple field generation is desired and tells the program to tokenize the marked data.

Finally, the tool places data according to its index position among the new fields being generated. Consider this example:

=040 \$aMiU$cMiU
=040 \$aBDS$beng$cBDS$dOCLCQ$dABCU
=041 \$aengrusger
=043 \$ae-gx---$ae-uk---$an-us---
=090 \$aTK1005$b(INTERNET) $c[UK.]

If I used the following pattern:

=999 \$a{040$a[x]} : {090$a[x]}

The expected results would be:

=999 \$aMiU : TK1005
=999 \$aBDS :

Why? Because the tool slots values marked with the multi-field marker [x] into the same field groups by position. Since only one 090$a exists, the tool only fills the field group to which it belongs. However, if I had the following data:

=040 \$aMiU$cMiU
=040 \$aBDS$beng$cBDS$dOCLCQ$dABCU
=041 \$aengrusger
=043 \$ae-gx---$ae-uk---$an-us---
=090 \$aTK1005$b(INTERNET) $c[UK.]
=090 \$aG24211$b(INTERNET)

And used this pattern:

=999 \$a{040$a[x]} : {090$a[x]}

I would expect the following result:

=999 \$aMiU : TK1005
=999 \$aBDS : G24211

Again, internally MarcEdit creates tokens from the data marked with [x] and places them within the same scope. The tool then creates the new fields, placing data that shares a scope into the same new field.
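The slotting behavior described in this section can be modeled with a short Python sketch. This is an illustration under stated assumptions, not MarcEdit's internal logic: the record is represented as a hypothetical list of (tag, subfields) pairs, only the first matching subfield in each field is used, placeholders without [x] take the first matching value, and build_fields and values_for are made-up names. Trailing whitespace is stripped only so the output lines up with the examples above.

```python
import re

# Matches {###$a} with an optional [x] multi-field marker.
TOKEN = re.compile(r"\{(\d{3})\$([a-z0-9])(\[x\])?\}")

def values_for(record, tag, code):
    """First `code` subfield value from each `tag` field, in record order."""
    out = []
    for field_tag, subfields in record:
        if field_tag == tag:
            for sub_code, value in subfields:
                if sub_code == code:
                    out.append(value)
                    break
    return out

def build_fields(pattern, record):
    """Build one field per index slot of the [x]-marked placeholders."""
    tracked = [m for m in TOKEN.finditer(pattern) if m.group(3)]
    # Without any [x] marker, exactly one field is built.
    count = max(
        (len(values_for(record, m.group(1), m.group(2))) for m in tracked),
        default=1,
    )
    results = []
    for i in range(count):
        def fill(m):
            vals = values_for(record, m.group(1), m.group(2))
            if m.group(3):
                # Tracked value: use slot i, or nothing if the slot is empty.
                return vals[i] if i < len(vals) else ""
            return vals[0] if vals else ""
        results.append(TOKEN.sub(fill, pattern).rstrip())
    return results

record = [
    ("040", [("a", "MiU"), ("c", "MiU")]),
    ("040", [("a", "BDS"), ("b", "eng"), ("c", "BDS"), ("d", "OCLCQ"), ("d", "ABCU")]),
    ("090", [("a", "TK1005"), ("b", "(INTERNET) "), ("c", "[UK.]")]),
]

for line in build_fields(r"=999 \$a{040$a[x]} : {090$a[x]}", record):
    print(line)
```

With a single 090 in the record, the second generated field has an empty 090$a slot. Appending a second 090 to the record fills that slot, and removing the [x] from the 090$a placeholder repeats the first 090$a in every generated field.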