Troubleshooting the Delimited Text Translator

When working with the Delimited Text Translator, there are essentially 3 common errors that occur.  This knowledge-base article documents how you can over come each of them.

Error:

  1. Is there a data limit because only part of my abstract is showing up in the generated file?
  2. My ISBN data is all wonky!  Help!

These two errors are pretty common, and they are related to how Excel data types its content.  When MarcEdit reads data from Excel, it utilizes Excel’s OBDC components.  Since MarcEdit doesn’t know how the data has been typed in the Excel sheet, it uses the component to determine the data typing, and the retrieves the data.  But here’s the problem.  Unless the user specifically specifies the data type for all the data in the file (and very few people do that), then Excel utilizes predictive typing to determine how the data should be treated.  This means Excel looks at the first 6 lines, and based on the data in those lines, decides how the data should be typed.  So why does this matter?  Well, if in the first 6 rows, all your abstracts were less than 255 characters, the Excel will type the data as a VARCHAR, which has a data limit of 255 characters.  In Excel, you will see all the data, but for indexing and API purposes, only the first 255 characters are visible.  In fact, the ISBN wonkiness is rooted in the same problem.  If in the first 6 lines, your ISBN data looks like scientific notation, or another number type, Excel will type all your data that way and this is how MarcEdit will see it.

So how do you fix this and tell Excel not to do this?  Well, you can’t.  The only way to turn this feature off is to:

  1. Change a registry key, which makes Excel unstable
  2. Define your data types when you import your data.

For obvious reasons, the second option is the preferred, even if it is time consuming.  So how do you do that — see: Correct ISBNs converted to scientific notation in Excel

Error:

  1. I’m getting an error when I try to process Excel data (either .xsl or xslx files)?

Generally, this problem is accompanied with the following error:

Why you get this error, this is related to a very specific problem — the OBDC connector (the Excel API) isn’t visible to MarcEdit.  Fortunately, fixing this problem is pretty easy once you understand why it is happening.

Scenario 1:

Office is installed utilizing Office 365 (or business cloud offering) rather than physical disks.  Office 365 sandboxes it’s components, and the only way to expose them for use by 3rd-parties is to expose them at the time of installation (which no one ever thinks about) or you have to install the Microsoft Access Database Engine 2010 Redistributable (http://www.microsoft.com/en-us/download/details.aspx?id=13255).

Scenario 2:

MarcEdit can work with either 32 or 64 bit versions of Office, but the user must tell MarcEdit which version is being used.  By default, MarcEdit will assume that the 32-bit version of Office has been installed.  To denote Office version, open the Preferences, and select the Other tab.