[table] [attr style=”width:90px”], “More discussion on how MarcEdit works with XML data can be found in the Chapter: Working with XML Data.”[/table]
Where do I find things?
Looking at Figure 1, it may appear that the MARC Tools component only provides access to tools and functions used in translating library metadata. That, however, would be an incorrect assumption. The following functions can be accessed from the MARC Tools component:
- MARCSplit: A tool that splits a single MARC file into multiple files.
- MARCJoin: A tool that joins multiple MARC files into a single MARC file.
- Character Conversions: A tool that translates MARC records between a wide variety of supported character encodings.
- Batch Record Processing: A tool that processes multiple files at one time.
- MARCValidator: A tool for validating MARC records.
These tools can be accessed via the Tools menu icon:
[table] [attr style=”width:90px”], “Access to most of these tools can also be found from the Main MarcEdit Window under either the Tools menu or Add-ins menu. Also, many tools like the Join, Split, and Merge tools can be added to the MarcEdit home screen through the preferences.”[/table]
While many of these functions will be discussed in later chapters, a couple of items are important to highlight.
- Edit XML Function List: Users can edit or modify how MarcEdit’s XML Transformations work by making changes to a specific XSLT’s configuration values. These values are set by editing the XML Function List options. This option is also how users add or delete XML functions from the application.
- Character Conversion Tools: Given the international nature of library metadata, MarcEdit provides a tool specifically to move data from local character sets into UTF-8 or MARC-8 and then back into a local character set. The tool provides access to the character sets most commonly used by the MarcEdit community, but users can select any character set supported by their operating system.
- MARCValidator: The MARCValidator serves a number of roles within the MarcEdit application. The tool provides a rules file that lets a user “validate” records against AACR2 assumptions. Within the MARC Tools window, however, its purpose is to test the structural validity of a set of records and to isolate invalid data for later processing.
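To make “structural validity” concrete, here is a minimal Python sketch of the kind of checks that can be run against binary MARC data. It is not MarcEdit's implementation. It assumes only the standard ISO 2709 layout (a 24-byte leader, 12-byte directory entries, and the 0x1E and 0x1D terminators), and the function names `split_records`, `structural_problems`, and `partition` are hypothetical.

```python
from typing import List, Tuple

RECORD_TERMINATOR = b"\x1d"  # closes every record (ISO 2709)
FIELD_TERMINATOR = b"\x1e"   # closes the directory and every field


def split_records(data: bytes) -> List[bytes]:
    """Cut a byte stream into records, keeping each record terminator."""
    parts = data.split(RECORD_TERMINATOR)
    records = [part + RECORD_TERMINATOR for part in parts[:-1]]
    if parts[-1].strip():
        # Trailing bytes with no terminator are kept so they can be reported.
        records.append(parts[-1])
    return records


def structural_problems(record: bytes) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    if len(record) < 25:
        return ["record is shorter than a leader plus terminator"]
    problems = []
    if not record.endswith(RECORD_TERMINATOR):
        problems.append("missing record terminator")
    declared = record[0:5]  # leader positions 00-04: record length
    if not declared.isdigit():
        problems.append("leader record length is not numeric")
    elif int(declared) != len(record):
        problems.append("leader record length does not match actual length")
    base = record[12:17]  # leader positions 12-16: base address of data
    if not base.isdigit() or not 25 <= int(base) <= len(record):
        problems.append("base address of data is invalid")
    else:
        directory_end = int(base) - 1
        if (directory_end - 24) % 12 != 0:
            problems.append("directory is not a whole number of 12-byte entries")
        if record[directory_end:directory_end + 1] != FIELD_TERMINATOR:
            problems.append("directory is not closed by a field terminator")
    return problems


def partition(data: bytes) -> Tuple[List[bytes], List[bytes]]:
    """Separate structurally sound records from the ones needing attention."""
    valid, invalid = [], []
    for record in split_records(data):
        (invalid if structural_problems(record) else valid).append(record)
    return valid, invalid
```

Calling `partition` on the contents of a file returns two lists, so the invalid records can be written to a separate file and repaired later.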
Processing MARC Data
Many people come to MarcEdit with a file of data and a need to get that data into a format that can be edited before loading it into their local library information system. Essentially, this is what the MARC Tools function does. It exposes MarcEdit’s MARCEngine functionality to the user, allowing users to “break” their binary MARC data into a more user-friendly mnemonic format that can then be edited in MarcEdit’s MarcEditor or in any other text editing tool, such as Notepad++ or UltraEdit.
Mnemonic Format
The mnemonic format MarcEdit utilizes is very specific. Within the format, some characters are reserved (primarily the “$” which stands in for the delimiter character) and some characters have special meaning when placed at the front of lines. Here’s a list of the current rules that govern MarcEdit’s mnemonic format:
- Each field line starts with an “=” sign. For example: =245  \\$aThis is a title.
- Fields use the following format: ={fieldnumber}{2 blank spaces}{ind0-9}. The number of indicators is defined within the MARC leader. The most important values in this mnemonic string are the equal sign, which designates a new field, and the two blanks between the field value and the first indicator (or the start of control data).
- Control field format (fields between 000-009): ={field}{2 blank spaces}[start of control data]
- Variable field format (fields between 010-999; aaa-ZZZ): ={field}{2 blank spaces}{ind0-9}[variable data with delimiters]
- Comments can be embedded in the mnemonic file format by placing a pound sign at the beginning of a line. For example: #this is a comment.
- New lines designate the end of a field.
- A blank line indicates the end of a record.
- All records must have a Leader designated in field =LDR or =000.
- The following values are reserved by the mnemonic format: “$”, “{”, “}”. To use them as literal values, write the corresponding mnemonic: {dollar} = “$”, {lcub} = “{”, {rcub} = “}”.
- MARC records have field and record limits (9,999 bytes for fields, 99,999 bytes for records). The mnemonic format does not enforce these limits while editing, but when compiling back to MARC, the tool will truncate data if a record exceeds the field or record length limits.
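The rules above can be sketched as a small Python reader. This is an illustration, not MarcEdit's parser: the function names are hypothetical, a line that is not a field, a comment, or a blank line is treated as an error, and variable fields are assumed to carry two indicators unless the caller passes a different count.

```python
import re
from typing import List, Tuple

Field = Tuple[str, str]  # (tag, raw data that follows the two blanks)

# Mnemonics for the three reserved characters.
_RESERVED = {"dollar": "$", "lcub": "{", "rcub": "}"}


def parse_mnemonic(text: str) -> List[List[Field]]:
    """Split mnemonic text into records, each a list of (tag, data) pairs."""
    records: List[List[Field]] = []
    current: List[Field] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = []
        elif line.startswith("#"):
            continue  # comment line
        elif line.startswith("=") and line[4:6] == "  ":
            current.append((line[1:4], line[6:]))
        else:
            raise ValueError(f"line {number}: not a field, comment, or blank line")
    if current:
        records.append(current)
    return records


def is_control_field(tag: str) -> bool:
    """The leader (LDR or 000) and fields 001-009 carry no indicators."""
    return tag == "LDR" or (tag.isdigit() and int(tag) < 10)


def split_variable_field(data: str, indicator_count: int = 2):
    """Return (indicators, [(subfield code, value), ...]) for a variable field."""
    indicators = data[:indicator_count]
    subfields = []
    for piece in data[indicator_count:].split("$")[1:]:
        if piece:
            # Decode only the reserved mnemonics; others (e.g. {copy}) are left as-is.
            value = re.sub(r"\{(dollar|lcub|rcub)\}",
                           lambda m: _RESERVED[m.group(1)], piece[1:])
            subfields.append((piece[0], value))
    return indicators, subfields
```

Decoding the reserved mnemonics in a single regular-expression pass avoids accidentally re-expanding text that one replacement produced.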
Mnemonic Record Example:
=LDR  05658cam  2200949 a 4500
=001  ocm70844414\
=007  cr\|||||||||||
=008  060807s2006\\\\maua\\\\of\\\\001\0\eng\c
=020  \\$a1597490768
=072  \7$aCOM$x053000$2bisacsh
=082  04$a005.8/068$222
=049  \\$aTFWW
=100  1\$aSnedaker, Susan.
=245  10$aSyngress IT security project management handbook$h[electronic resource] /$cSusan Snedaker ; Russ Rogers, technical editor.
=246  30$aIT security project management handbook
=260  \\$aRockland, MA :$bSyngress Pub.,$c{copy}2006.
=300  \\$a1 online resource (xxvi, 612 pages) :$billustrations
=336  \\$atext$btxt$2rdacontent
=337  \\$acomputer$bc$2rdamedia
=338  \\$aonline resource$bcr$2rdacarrier
=500  \\$aIncludes index.
=650  \0$aComputer security$xManagement$vHandbooks, manuals, etc.
=776  08$iPrint version:$aSnedaker, Susan.$tSyngress IT security project management handbook.$dRockland, MA : Syngress, c2006$z1597490768$z9781597490764$w(OCoLC)72763213
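The leader in this example is a 24-character fixed-position string, and it is where the number of indicators is recorded. As an illustration, the Python sketch below reads a few of those positions using the standard MARC 21 leader layout. The function name `describe_leader` and the dictionary keys are hypothetical, and the leader is written with the two blanks at positions 08-09 spelled out.

```python
from typing import Dict, Union


def describe_leader(leader: str) -> Dict[str, Union[int, str]]:
    """Read selected fixed positions from a 24-character MARC leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    return {
        "record_length": int(leader[0:5]),          # positions 00-04
        "record_status": leader[5],                 # position 05
        "type_of_record": leader[6],                # position 06
        "bibliographic_level": leader[7],           # position 07
        "indicator_count": int(leader[10]),         # position 10
        "subfield_code_length": int(leader[11]),    # position 11
        "base_address_of_data": int(leader[12:17]), # positions 12-16
    }


info = describe_leader("05658cam  2200949 a 4500")
print(info["record_length"], info["indicator_count"], info["base_address_of_data"])
# prints: 5658 2 949
```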
Breaking MARC Records
To “break” a MARC record into MarcEdit’s mnemonic format for editing, select the MarcBreaker option, then enter the file to be processed in the Input textbox and the file to save in the Output textbox. By default, MarcEdit associates the .mrc file extension with the MarcBreaker and the .mrk file extension with the MarcEditor. When selecting the input and output files, the tool defaults to looking for a .mrc file as input and creating a .mrk file as output. These are the extensions MarcEdit has registered with the program, but users are free to use whatever extension they would like to represent their data.
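Conceptually, breaking a record means walking its directory and writing each field as one mnemonic line. The Python sketch below shows the idea for a single record. It is not the MarcBreaker's code: it assumes a structurally valid ISO 2709 record with two indicators, UTF-8 data unless another encoding is passed, and blanks in indicators and control fields written as backslashes, as in the example record. The name `break_record` is hypothetical.

```python
import re

FIELD_TERMINATOR = b"\x1e"
SUBFIELD_DELIMITER = "\x1f"

# Reserved characters must be written as mnemonics in the output.
_ESCAPES = {"$": "{dollar}", "{": "{lcub}", "}": "{rcub}"}


def _escape(text: str) -> str:
    """Replace literal reserved characters with their mnemonics."""
    return re.sub(r"[${}]", lambda m: _ESCAPES[m.group(0)], text)


def break_record(record: bytes, encoding: str = "utf-8") -> str:
    """Convert one binary MARC record into mnemonic text."""
    leader = record[:24].decode("ascii")
    base = int(leader[12:17])            # base address of data
    directory = record[24:base - 1]      # drop the directory's terminator
    lines = ["=LDR  " + leader]
    for offset in range(0, len(directory), 12):
        entry = directory[offset:offset + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        raw = record[base + start:base + start + length]
        if raw.endswith(FIELD_TERMINATOR):
            raw = raw[:-1]
        text = raw.decode(encoding)
        if tag.isdigit() and int(tag) < 10:
            # Control field: no indicators or subfields.
            body = _escape(text).replace(" ", "\\")
        else:
            indicators = text[:2].replace(" ", "\\")
            # Escape literal "$" first, then turn delimiters into "$".
            body = indicators + _escape(text[2:]).replace(SUBFIELD_DELIMITER, "$")
        lines.append(f"={tag}  {body}")
    # The trailing blank line marks the end of the record.
    return "\n".join(lines) + "\n\n"
```

Reading a .mrc file in binary mode, splitting it on the record terminator (0x1D), and writing the result of `break_record` for each record to a .mrk file would approximate the workflow described above.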