NDSR – NY

National Digital Stewardship Residency in New York


PBCore RDF Ontology Hackathon at Code4Lib 2015

Posted on February 23, 2015 by Peggy Griesinger

Earlier this month, I had the opportunity to participate in my very first hackathon while at Code4Lib 2015 in Portland, OR. The goal of the hackathon was to evaluate the PBCore metadata standard (used to describe audiovisual materials – more about it here) and determine the best way to create a PBCore RDF ontology. There have already been some great recaps of the event, such as this one by WGBH’s Karen Cariani. This post is just my two cents on the experience of participating in a hackathon.

The hackathoners at the Airbnb house on the second day. Photo by Jack Brighton.


The hackathon took place the Saturday and Sunday before the conference at an Airbnb house a ways outside of downtown Portland. Seven of us stayed there for the entirety of the conference, and we were joined by five or six conference attendees who weren’t staying at the house. As a marked introvert, I was worried about the prospect of spending an entire week with people who were, to me, more or less strangers. What I found was that most of the housemates were just like me – introverted librarians/archivists who loved metadata, audiovisual collections, and organizing stuff. Needless to say, there were no problems. Everyone got along fabulously, and I think the fact that we were all living in the same house helped us come to decisions during the hackathon. In her post, Karen Cariani discusses the benefits of working on this project while being physically together (rather than on conference calls). It was apparent that we would not have been able to make the progress we did without getting this group of people in the same place for an extended period of time.


The Airbnb house also came with friendly neighborhood cats, including Sina, pictured here showing me some love. All conferences should involve this many cats. Photo by Jack Brighton.

The goal of the hackathon, as it was originally envisioned, was to come to a consensus about how to move PBCore forward into RDF linked data. We figured we’d spend the entire weekend considering the various options, specifically whether it would be better to design an entirely new RDF ontology for PBCore or to draw from existing ontologies (specifically, EBUCore, a European “sister” standard of PBCore – both standards developed as audiovisual-specific expansions of Dublin Core). We came to the conclusion, relatively quickly, that it would be superfluous to create an entirely new PBCore ontology when a perfectly good EBUCore ontology already existed that covered most of the information we would need to express. Granted, there are significant differences between the design, structure, and language of the two standards, as we discovered very clearly when we attempted to map from PBCore XML to EBUCore RDF/XML. However, there is enough similarity in the overall design of the standards that mapping between them is worth the effort.

It was exciting to come out of this hackathon with tangible results, rather than just ideas to present as possibilities. You can view some of these results – and contribute more! – at our GitHub repo: https://github.com/WGBH/pbucore. This repo contains our first attempts at mapping PBCore XML to EBUCore RDF/XML using XSLT – with my apologies for all those acronyms. Basically, we decided that the best way to create an ontology for PBCore was to examine each PBCore element individually, down to its attributes, and see whether it could be mapped into the language and hierarchy prescribed by EBUCore. We did find a number of gaps that will need to be filled (for example, Instantiation Generations cannot be expressed using the current EBUCore), and for those we will need to create new structures in the ontology. Whether that is as an extension of EBUCore, as its own namespace, or something else entirely remains to be seen.
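To give a flavor of that element-by-element approach, here is a minimal sketch in Python rather than the XSLT in the repo. It is not the hackathon's actual mapping: the sample record, the example.org subject IRI, and the ELEMENT_MAP table are all made up for illustration, and the target properties are Dublin Core terms standing in for whatever EBUCore properties a real mapping would choose. The idea is simply to walk each leaf element, look it up in a mapping table, emit a triple when there is a match, and keep a list of the gaps.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping table (an assumption for illustration only):
# PBCore element name -> stand-in target property IRI.
ELEMENT_MAP = {
    "pbcoreTitle": DCTERMS + "title",
    "pbcoreDescription": DCTERMS + "description",
    "pbcoreIdentifier": DCTERMS + "identifier",
}

# Made-up sample record using the PBCore XML namespace.
SAMPLE = (
    '<pbcoreDescriptionDocument '
    'xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">'
    '<pbcoreIdentifier source="example">asset-001</pbcoreIdentifier>'
    '<pbcoreTitle>Evening News</pbcoreTitle>'
    '<pbcoreInstantiation>'
    '<instantiationGenerations>Master</instantiationGenerations>'
    '</pbcoreInstantiation>'
    '</pbcoreDescriptionDocument>'
)


def map_document(xml_text, subject_iri):
    """Return (triples, unmapped) for the leaf elements of one record."""
    root = ET.fromstring(xml_text)
    triples, unmapped = [], []
    for element in root.iter():
        if len(element) > 0:
            continue  # container element: only leaves carry values here
        local_name = element.tag.split("}", 1)[-1]  # drop "{namespace}"
        value = (element.text or "").strip()
        prop = ELEMENT_MAP.get(local_name)
        if prop is None:
            unmapped.append(local_name)  # a gap the ontology must fill
        else:
            triples.append((subject_iri, prop, value))
    return triples, unmapped


def to_ntriples(triples):
    """Serialize (subject, property, literal) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        escaped = (
            o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    triples, unmapped = map_document(SAMPLE, "http://example.org/asset/1")
    print(to_ntriples(triples))
    print("No mapping yet for:", unmapped)
```

In this toy record, instantiationGenerations ends up in the unmapped list, which mirrors the kind of gap described above. A real mapping would also have to handle attributes such as source and nested structures, which this sketch ignores.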

For me, one of the biggest conceptual challenges was translating between an XML metadata standard and an RDF ontology. These are two things that are structurally and conceptually very different. Ontology Development 101 by Natalya F. Noy and Deborah L. McGuinness gives a good overview of the choices involved when designing an ontology, including things that one does not consider when making an XML schema, such as classes, domains, and ranges. Because these things are not considered when designing a metadata standard, they can cause problems and inconsistencies when trying to map an existing metadata standard to an existing ontology. Things that were easily expressible in the metadata standard may become very difficult to express in the complex language of the ontology – although, luckily, the opposite is also true.
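One way to see why domains and ranges feel so foreign coming from XML: in RDF Schema, a property's domain and range do not validate a record the way a schema does. Instead, they let a reasoner infer the class of whatever appears on either side of that property. The sketch below illustrates that with plain Python tuples. The ex: names (ex:hasInstantiation, ex:Asset, ex:Instantiation, and the two resources) are hypothetical and not taken from PBCore or EBUCore, and the prefixed strings stand in for full IRIs.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Hypothetical ontology statements: what a property's subject and
# object are understood to be.
schema = {
    ("ex:hasInstantiation", RDFS_DOMAIN, "ex:Asset"),
    ("ex:hasInstantiation", RDFS_RANGE, "ex:Instantiation"),
}

# A single data triple. Note that nothing here states a class.
data = {("ex:asset1", "ex:hasInstantiation", "ex:tape7")}


def infer_types(schema, data):
    """Apply the RDFS domain and range rules once to the data triples."""
    inferred = set()
    for subject, prop, obj in data:
        for schema_prop, relation, cls in schema:
            if schema_prop != prop:
                continue
            if relation == RDFS_DOMAIN:
                inferred.add((subject, RDF_TYPE, cls))  # subject gets the domain class
            elif relation == RDFS_RANGE:
                inferred.add((obj, RDF_TYPE, cls))  # object gets the range class
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(schema, data)):
        print(triple)
```

So a mapping that reuses an existing property also takes on whatever that property's domain and range imply about the things it connects, which is one source of the inconsistencies mentioned above. This sketch assumes the object is a resource rather than a literal value.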

This hackathon was an enormously educational experience for me. I came into it with a very basic understanding of ontologies and the challenges involved in creating them, and came out of it with firsthand experience mapping an XML metadata standard to an RDF/XML ontology. I also became much more knowledgeable about what’s happening in Europe with linked data ontologies, as we were lucky to have Jean-Pierre Evain, creator of EBUCore, with us all the way from Switzerland (a fact which made the rest of us a little less grumpy about our East Coast/Midwest jetlag!). Even though I felt overwhelmed and out of my depth at the beginning, the immersive and collaborative nature of the event and the intelligence and helpfulness of the other participants allowed me to quickly get up to speed. If you’re looking for a way to rip off the metaphorical bandaid and immerse yourself in something with which you’ve been meaning to gain more experience, I highly recommend a small, focused hackathon such as this one.

I consider my visit to Portland time well spent, with a great deal of professional development mixed with a healthy dose of sightseeing and, of course, sampling the local cuisine. The rest of Code4Lib 2015 is a topic too big for this post, although fellow NDSR resident Vicky Steeves does a great job discussing it here. For those planning on heading to Portland for AMIA 2015 or ACRL 2015: I am very jealous! Enjoy! If you have the opportunity, take the relatively brief trip out to the coast. I did, visiting Astoria, Oregon for my very first view of the Pacific Ocean:


The view from the coast of Astoria, Oregon. Photo by Peggy Griesinger.

For more specifics about the hackathon including notes, check out the hackathon wiki: http://wiki.code4lib.org/PBCore_RDF_Hackathon.

This entry was posted in 2014-2015 Resident Blog and tagged Metadata, NDSR, Ontologies, PBCore, Peggy Griesinger on February 23, 2015 by Peggy Griesinger.
