Encode ALL the Things: When Your Metadata Dreams Meet Reality

When I start researching a new metadata standard, I tend to get really excited thinking about all the possibilities that the different elements offer. Think of all the things you could do! The beautiful indexing that could result from your perfectly polished metadata! Take the Text Encoding Initiative (TEI), for example. I worked with TEI as a graduate assistant during library school. TEI is famous (or infamous, depending on who you ask) for having an incredibly extensive set of elements. I would look at those elements and think, “I will encode every single syllable in this document! Scholars will do groundbreaking research about the syllables in this thing!”

Phase One: Looking at a metadata standard for the first time.

That’s right about when you look at the pile of materials you need to encode using this metadata standard. There are probably isolated projects that have the time and resources to utilize close to every element of TEI. For most of us, even getting institutions to buy into using metadata schemas can be a trial, let alone attempting that level of granularity. It can be discouraging to feel like you’re losing all the shiny possibilities you imagined when you first looked at this standard.

Phase Two: Realizing exactly how many things you have to encode.

The best solution I’ve found for this is to think about your potential use cases for the metadata you’re looking to add to your collection. This is the phase I am at currently in my NDSR project. We’ve gotten very close to selecting the final format of our digitization process history metadata, but in order to design the XML records themselves, we need to know exactly how we want to use this information in the future. Knowing this will inform our decisions about which standards and elements to keep, and which are more trouble than they’re worth.

For example, one use case that has been discussed for process history metadata is finding every object that was digitized using a specific tool. The reasoning is that a tool that seems sufficient today could have long-term consequences that only become apparent in five or ten years. This means that we absolutely need to include elements that describe things like the manufacturer and model name of digitization tools, and that this information needs to be drawn from a controlled vocabulary to ensure accurate recall when searching.
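To make that use case concrete, here is a minimal sketch in Python of what "find every object digitized with this tool" might look like against XML records. Everything specific in it is an assumption for illustration: the element names (`records`, `object`, `tool`, `manufacturer`, `model`), the `id` attribute, and the tool names in the vocabulary are all hypothetical and are not the format we are actually designing.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary of (manufacturer, model) pairs.
TOOL_VOCABULARY = {
    ("Example Scanners Inc.", "FilmScan 2000"),
    ("Sample Video Co.", "TapeDeck X1"),
}

# Hypothetical process history records; the structure is illustrative only.
RECORDS_XML = """
<records>
  <object id="obj-001">
    <tool><manufacturer>Example Scanners Inc.</manufacturer><model>FilmScan 2000</model></tool>
  </object>
  <object id="obj-002">
    <tool><manufacturer>Sample Video Co.</manufacturer><model>TapeDeck X1</model></tool>
  </object>
  <object id="obj-003">
    <tool><manufacturer>Example Scanners</manufacturer><model>Filmscan 2000</model></tool>
  </object>
  <object id="obj-004">
    <tool><manufacturer>Example Scanners Inc.</manufacturer><model>FilmScan 2000</model></tool>
  </object>
</records>
"""


def tools_for(obj):
    """Return the (manufacturer, model) pairs recorded for one object."""
    return [
        (
            tool.findtext("manufacturer", default="").strip(),
            tool.findtext("model", default="").strip(),
        )
        for tool in obj.findall("tool")
    ]


def find_objects_digitized_with(root, manufacturer, model):
    """Return the ids of objects whose records name exactly this tool."""
    return [
        obj.get("id")
        for obj in root.findall("object")
        if (manufacturer, model) in tools_for(obj)
    ]


def uncontrolled_terms(root):
    """Return (object id, tool) pairs whose tool is not in the vocabulary."""
    return {
        (obj.get("id"), tool)
        for obj in root.findall("object")
        for tool in tools_for(obj)
        if tool not in TOOL_VOCABULARY
    }


root = ET.fromstring(RECORDS_XML)
matches = find_objects_digitized_with(root, "Example Scanners Inc.", "FilmScan 2000")
problems = uncontrolled_terms(root)
```

In this made-up data, `obj-003` was digitized with the same scanner but recorded as a free-text variant, so an exact-match search misses it. That is the recall problem a controlled vocabulary is meant to prevent, and a check like `uncontrolled_terms` is one way such records could be caught before they cause trouble.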

I also find it really useful to think about the potential workflow that will result from creating metadata the way you propose. In my case, I know that media conservators already have quite a lot on their plates. Although process history is something they already record in a basic form in free-text documents, asking them to create extremely granular, lengthy metadata records is not sustainable; the conservators simply don’t have the time to do detailed cataloging work in addition to all their regular duties. Although I, as a strange person who loves metadata and cataloging, would be happy to write detailed process history records, I have to keep in mind that the metadata systems I’m designing are not for me, the metadata specialist, but rather for media conservation specialists, for whom this is only a small part of their overall workload.

As disappointing as it can be to have to shelve grand ideas about beautifully intricate metadata records, I think it can also be really satisfying to present a metadata implementation that fits extremely well with your use cases and institution. It is no small feat taking an existing standard and molding it into something that fits your local needs while still maintaining the aspects of the standard that make it interoperable and useful. To me, it’s like the metadata version of finding the perfect foot for the glass slipper and living happily ever after with your XML.

NDSR – NY

National Digital Stewardship Residency in New York
