The assignment was as follows:
After skimming each article below, select one to serve as the focal point for a discussion (~3 - 5 paragraphs posted to your blog or emailed to me) of your impressions of typical metadata quality problems that can occur. (NOTE: I'm more interested in your discussion of the kinds of problems rather than in the specific solutions proposed in these articles):
- Achieving and Maintaining Metadata Quality: Toward a Sustainable Workflow for the IDEALS Institutional Repository (2017)
- Fixing Yesterday’s Solutions: Data Cleanup in Serials Solutions 360 Core (2019)
- E-Data Quality: How Publishers and Libraries are Working Together to Improve Data Quality (2016)
- A worked example of fixing problem MARC data (Parts 1-5) (2015)
As for the first problem noted above, the bibliographic records that libraries receive with e-resources are often too incomplete to meet the library’s needs, or they describe the resource inaccurately. One way this happens is when data suppliers send print identifiers with e-resources. Print identifiers are inadequate for e-resources in several ways. One of the most glaring problems is that a MARC record built from a print identifier lacks the 856 field (Electronic Location and Access). Without the 856 field, the system has no location or access information to give the user. Other metadata that would be missing includes the format of the digital file, the name of the host where the resource is housed, the size of the file, and the Uniform Resource Identifier (URI).
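To make the missing 856 problem concrete, here is a minimal sketch of a check for it. It assumes a deliberately simplified record, a list of (tag, subfields) pairs, rather than a real MARC parser, and the function name and sample records are hypothetical. The subfield codes used ($u for the URI, $a for host name, $q for electronic format type, $s for file size) follow the MARC 21 definition of field 856.

```python
# Hypothetical simplified record: a list of (tag, subfields) pairs.
# This is an illustration only, not a real MARC parser.
ACCESS_SUBFIELDS = {
    "u": "URI",
    "a": "host name",
    "q": "electronic format type",
    "s": "file size",
}


def access_gaps(record):
    """List the access details a record is missing for an e-resource."""
    fields_856 = [subfields for tag, subfields in record if tag == "856"]
    if not fields_856:
        return ["no 856 field"]
    gaps = []
    for code, label in ACCESS_SUBFIELDS.items():
        # A detail counts as present if any 856 field carries the subfield.
        if not any(code in subfields for subfields in fields_856):
            gaps.append(f"856 ${code} ({label}) missing")
    return gaps


# Made-up sample data: a record derived from the print version,
# and the same record with a partial 856 field added.
print_derived = [
    ("020", {"a": "9780000000002"}),
    ("245", {"a": "Example title"}),
]
ebook = print_derived + [
    ("856", {"u": "https://example.org/ebook/1", "q": "application/pdf"}),
]

print(access_gaps(print_derived))
print(access_gaps(ebook))
```

A check like this only flags the gap. Someone still has to find the correct URL and file details, which is where the cleanup time goes.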
Related to problem number one, even when the bibliographic metadata is available, the holdings data can be inaccurate. For example, if a library subscribes to an e-book through a data supplier, the subscription the library chooses may limit the number of digital copies that users can check out at any given time. If the subscription allows only 2 digital copies of a book to be in use at a time, a user who searches for the e-book in the integrated library system (ILS) could be told that additional downloads are available when the limit has already been reached, or that no downloads are available even though none of the digital copies have been checked out. Holdings data matters even for e-resources.
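The mismatch described above can be sketched as a simple comparison between what the license allows and what the catalog displays. This is only an illustration: the function and parameter names are hypothetical, and a real ILS would expose these numbers in its own way.

```python
def availability_mismatch(license_limit, active_loans, displayed_available):
    """Compare displayed availability with what the license implies.

    Returns None when the two agree, otherwise a small report.
    All parameter names are assumptions made for this sketch.
    """
    expected = max(license_limit - active_loans, 0)
    if expected == displayed_available:
        return None
    return {"expected": expected, "displayed": displayed_available}


# A 2-copy license with nothing checked out, but the catalog shows 0 available.
print(availability_mismatch(license_limit=2, active_loans=0, displayed_available=0))
```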
Problem number two is that bibliographic metadata and holdings data are often not synchronized. When a library subscribes to a new e-resource, data suppliers will often send all of the bibliographic metadata for the new subscription first. Then, depending on the terms of the contract, the holdings data arrives separately. In the meantime, the holdings data for those e-resources can be inaccurate. This is a problem for the user because if the system believes the holdings limit has been reached for a particular resource, the resource might not appear in the user’s search.
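One way to picture the synchronization gap is as two lists of record identifiers that do not match. The sketch below assumes each load can be reduced to a list of IDs; the function name and the sample IDs are hypothetical.

```python
def sync_report(bib_ids, holdings_ids):
    """Find records that arrived in one load but not the other."""
    bib, holdings = set(bib_ids), set(holdings_ids)
    return {
        "bib_without_holdings": sorted(bib - holdings),
        "holdings_without_bib": sorted(holdings - bib),
    }


# Made-up IDs: the bibliographic load arrived first,
# and the later holdings load covers a different set of titles.
report = sync_report(
    bib_ids=["t1", "t2", "t3"],
    holdings_ids=["t2", "t3", "t4"],
)
print(report)
```

Anything in either list is a title whose availability the system cannot state reliably until the two loads agree.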
Inconsistent bibliographic records and separately delivered holdings data lead into the final problem: libraries often receive data in multiple formats. Not all data suppliers adhere to a standard format when providing bibliographic metadata, so both the file formats and the information inside the files can differ greatly from what the library actually needs. When the metadata that arrives is inaccurate, librarians must spend a significant amount of time correcting, reformatting, and completing it so that the knowledge base has the most correct and up-to-date information available. That cost in time and resources runs through all of the problems outlined in the article. And when librarians correct the MARC records by hand, a new problem can arise: the possibility of a localized error.
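As a rough illustration of the reformatting work, the sketch below maps two differently labeled supplier files onto one set of local field names. The supplier names, column names, and target fields are all assumptions made for the example; real supplier files vary far more than this.

```python
# Hypothetical column layouts for two suppliers, mapped to local field names.
FIELD_MAPS = {
    "supplier_a": {"Title": "title", "ISSN_online": "eissn", "Link": "url"},
    "supplier_b": {
        "publication_title": "title",
        "online_identifier": "eissn",
        "title_url": "url",
    },
}
TARGET_FIELDS = ("title", "eissn", "url")


def normalize(row, supplier):
    """Map one supplier row onto the local fields; missing values become None."""
    mapping = FIELD_MAPS[supplier]
    out = {target: None for target in TARGET_FIELDS}
    for source_key, target in mapping.items():
        value = (row.get(source_key) or "").strip()
        if value:
            out[target] = value
    return out


row_a = {"Title": " Journal X ", "Link": "https://example.org/x"}
row_b = {"publication_title": "Journal X", "online_identifier": "1234-5678"}

print(normalize(row_a, "supplier_a"))
print(normalize(row_b, "supplier_b"))
```

Even in this toy case each supplier leaves a different field empty, so mapping the columns does not remove the need for someone to fill in what is missing.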
While one might assume that metadata records should be more accurate and easier to incorporate into an ILS than ever, that is certainly not the case. The authors of this study propose that libraries, service providers, and data suppliers all work more closely together to ensure that bibliographic records are as accurate as they can be when they arrive at the library.