Archives Featured Preservation — 05 January 2014

We now have over 9,000 oral histories at the Louie B. Nunn Center for Oral History at the University of Kentucky Libraries, where we take great care to curate these treasures. Recently, I was going through a box of old reel-to-reel and cassette tapes that were never accessioned into the oral history collection. Some were recordings of various presentations and were therefore left out of the collection, and some were actually blank. I discovered the reel-to-reel tape pictured here with the following identifier taped on the case:

“Various excerpts of interviews with 4 people about someone or something else.”

I was reminded of the power and critical importance of metadata.

The following is an excerpt about metadata and oral history from my recent article “The Digital Mortgage: Digital Preservation of Oral History,” in Oral History in the Digital Age.


Overwhelm the Future with Metadata

Information is power, and good technical and preservation metadata collected in the present will transform future archivists’ ability to curate your digital assets effectively. Technical and preservation metadata are two critical components of future digital continuity. The bad news is that this metadata comprises a massive quantity of elements that would take hours to enter individually into a collection management system. The good news is that most of these elements can be harvested automatically using free tools such as MediaInfo. The following is an example of a MediaInfo export of technical metadata for a video file, a .mov that was given to the Nunn Center by a video-editing studio. The .mov file is a wrapper containing a variety of variables, including audio and video codecs, audio and video resolutions, audio and video bitrates, frame rates, color sampling, etc. Each of these elements is critical for making a video file work in the future. While not everyone will understand what role each element plays in the technical process, it is key to document this metadata in an archival collection management system so that future obsolescence can be monitored.

  • Complete name             : /C2KY ProRes
  • Format                    : MPEG-4
  • Format profile            : Base Media / Version 2
  • Codec ID                  : mp42
  • File size                 : 1.40 GiB
  • Duration                  : 13mn 16s
  • Overall bit rate mode     : Variable
  • Overall bit rate          : 15.1 Mbps
  • Encoded date              : UTC 2011-07-06 16:47:17
  • Tagged date               : UTC 2011-07-06 16:47:17

  • ID                        : 1
  • Format                    : AVC
  • Format/Info               : Advanced Video Codec
  • Format profile            : Main@L4.2
  • Format settings, CABAC    : Yes
  • Format settings, ReFrames : 3 frames
  • Codec ID                  : avc1
  • Codec ID/Info             : Advanced Video Coding
  • Duration                  : 13mn 16s
  • Bit rate                  : 15.0 Mbps
  • Width                     : 1 920 pixels
  • Height                    : 1 080 pixels
  • Display aspect ratio      : 16:9
  • Frame rate mode           : Constant
  • Frame rate                : 29.970 fps
  • Standard                  : NTSC
  • Color space               : YUV
  • Chroma subsampling        : 4:2:0
  • Bit depth                 : 8 bits
  • Scan type                 : Progressive
  • Bits/(Pixel*Frame)        : 0.241
  • Stream size               : 1.39 GiB (99%)
  • Language                  : English
  • Encoded date              : UTC 2011-07-06 16:47:17
  • Tagged date               : UTC 2011-07-06 16:47:17

  • ID                        : 2
  • Format                    : AAC
  • Format/Info               : Advanced Audio Codec
  • Format profile            : LC
  • Codec ID                  : 40
  • Duration                  : 13mn 16s
  • Bit rate mode             : Variable
  • Bit rate                  : 157 Kbps
  • Maximum bit rate          : 237 Kbps
  • Channel(s)                : 2 channels
  • Channel positions         : Front: L R
  • Sampling rate             : 48.0 KHz
  • Compression mode          : Lossy
  • Stream size               : 14.9 MiB (1%)
  • Language                  : English
  • Encoded date              : UTC 2011-07-06 16:47:17
  • Tagged date               : UTC 2011-07-06 16:47:17
  • Material_Duration         : 796629
  • Material_StreamSize       : 15671193

The Nunn Center’s collection management system, SPOKEdb, automatically harvests MediaInfo exports, parses out the key elements, and places them in the appropriate database fields. This allows the archivist to create a technical report on, for example, all of the interviews in the archival collection that were encoded using the H.264 video codec. When the H.264 codec becomes obsolete, the future archivist will have the information needed for migration, because the necessary technical metadata has been saved. The following is a screenshot of the technical metadata section of SPOKEdb. Most of the fields are populated automatically when we paste in the export from MediaInfo.
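To make the idea of parsing an export concrete, here is a minimal Python sketch of how a MediaInfo text export might be split into fields and then queried by codec. It is not SPOKEdb's actual code, which is not shown here. The function names, the interview identifiers, and the abbreviated sample exports are all hypothetical, and the sketch assumes that sections are separated by blank lines and that each field sits on its own "name : value" line, as in the export above. Note that MediaInfo reports H.264 video under the format name AVC.

```python
def parse_mediainfo_export(text):
    """Split a MediaInfo text export into a list of {field: value} sections."""
    sections, current = [], {}
    for raw in text.splitlines():
        # Drop surrounding whitespace and any leading bullet character.
        line = raw.strip().lstrip("•").strip()
        if not line:
            # A blank line closes the current section, if there is one.
            if current:
                sections.append(current)
                current = {}
            continue
        # Split on the first colon only; values such as "16:9" contain colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
        else:
            # Full exports label each section ("General", "Video", "Audio").
            current["_section"] = line
    if current:
        sections.append(current)
    return sections


def interviews_using_format(exports, format_name):
    """Return the identifiers whose export has a section in the given format."""
    matches = []
    for identifier, text in exports.items():
        sections = parse_mediainfo_export(text)
        if any(s.get("Format") == format_name for s in sections):
            matches.append(identifier)
    return sorted(matches)


# Abbreviated from the export above.
SAMPLE_EXPORT = """
  • Format               : MPEG-4
  • Codec ID             : mp42
  • Duration             : 13mn 16s

  • ID                   : 1
  • Format               : AVC
  • Codec ID             : avc1
  • Display aspect ratio : 16:9

  • ID                   : 2
  • Format               : AAC
  • Channel positions    : Front: L R
"""

# A hypothetical audio-only export, included for contrast.
OTHER_EXPORT = """
  • Format               : Wave

  • ID                   : 1
  • Format               : PCM
"""

if __name__ == "__main__":
    exports = {"interview_a": SAMPLE_EXPORT, "interview_b": OTHER_EXPORT}
    print(interviews_using_format(exports, "AVC"))
```

In a real system the parsed sections would be written to database fields rather than held in memory, but the same query (every interview with an AVC stream) is what makes a future migration report possible.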

There are many metadata schemas available for the archivist to choose from (see Elinor Maze’s essay on metadata standards). At this time, I particularly like PBCore 2.0. This standard was developed specifically for audiovisual material and excels at documenting the specifics of audio and visual technical metadata. The 2.0 revision of PBCore introduces many innovations that make it particularly effective for documenting born-digital content. This may not be the right schema for everyone. If your repository uses Dublin Core, try to maintain technical metadata for your audiovisual materials in a searchable database field in your archival management system.

Higher-end Open Archival Information Systems (OAIS), such as the one being implemented at the University of Kentucky, require staffing and great financial investment. The Trusted Digital Repository (TDR) standard is an exciting development; however, it again describes a system requiring great resources that are not within reach of the typical small institution or individual. At the University of Kentucky, the preservation repository utilizes the METS standard, which was developed by the Digital Library Federation and excels at interoperability. METS is a wrapper that allows multiple metadata schemas to be wrapped into the same package. For example, the Nunn Center utilizes PBCore 2.0 for descriptive and technical metadata and is using PREMIS for its preservation metadata. These individual components will be wrapped up in a METS file that will be used for ingest into the preservation system.
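As a rough illustration of what "wrapper" means here, the following Python sketch builds a skeletal METS document that carries a PBCore record and a PREMIS record side by side. It is not the University of Kentucky's ingest package. The helper names and section IDs are hypothetical, the two payloads are empty placeholder elements rather than real records, and I have assumed that PBCore is declared with MDTYPE="OTHER" plus an OTHERMDTYPE label. The sketch also omits the structural map that a schema-valid METS file requires.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def wrap(parent, section_tag, section_id, mdtype, payload, other_type=None):
    """Embed a foreign-schema record inside a METS metadata section."""
    section = ET.SubElement(parent, q(section_tag), {"ID": section_id})
    attrs = {"MDTYPE": mdtype}
    if other_type:
        attrs["OTHERMDTYPE"] = other_type
    md_wrap = ET.SubElement(section, q("mdWrap"), attrs)
    xml_data = ET.SubElement(md_wrap, q("xmlData"))
    xml_data.append(payload)
    return section


def build_mets(pbcore_record, premis_record):
    """Wrap a PBCore record and a PREMIS record in one METS document."""
    mets = ET.Element(q("mets"))
    # Descriptive (and technical) metadata: the PBCore record.
    wrap(mets, "dmdSec", "DMD1", "OTHER", pbcore_record, other_type="PBCore")
    # Preservation metadata: the PREMIS record, under administrative metadata.
    amd = ET.SubElement(mets, q("amdSec"), {"ID": "AMD1"})
    wrap(amd, "digiprovMD", "PREMIS1", "PREMIS", premis_record)
    return mets


if __name__ == "__main__":
    # Placeholder payloads; real records would carry their own namespaces.
    pbcore = ET.Element("pbcoreDescriptionDocument")
    premis = ET.Element("premis")
    print(ET.tostring(build_mets(pbcore, premis), encoding="unicode"))
```

The point of the design is that each schema keeps its own vocabulary inside its own section, so a record can be replaced or upgraded without touching the others.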

So, the ideal preservation context is becoming more and more accessible to institutions with budgets and staff. What is someone with budget and technical limitations supposed to do with regard to preservation-oriented metadata?

  • Born-digital interview masters can often reside in parts spanning multiple files.  Develop a workflow that tracks technical metadata for each of these files.
  • Be meticulous and consistent with file naming.  This is critical to automation.
  • Track technical metadata for each instantiation created.  Strive for that technical metadata to reside in a searchable field in your archival management system.
  • Harvest the checksum for each file accessioned and incorporate that checksum into your archival management system.
  • Choose a metadata schema that works with your archival repository.  Working outside of your institution’s dominant system can be counterproductive.  Explore customizations of that schema that are particularly useful for oral history.
  • Do not hesitate to ask a well-established archival repository about their metadata schema and strategy.  There is no need to reinvent the wheel.
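
Several of these recommendations can be automated with very little code. The following Python sketch walks a folder of accessioned files, checks each name against a naming convention, and records a checksum for every part of an interview. The naming pattern, the function names, and the choice of SHA-256 are my assumptions for the sake of illustration; substitute whatever convention and checksum algorithm your repository already uses.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical convention: <interview id>_part<NN>.<extension>
NAME_PATTERN = re.compile(
    r"^(?P<interview>[a-z0-9]+)_part(?P<part>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def sha256_checksum(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum, reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(directory):
    """Return (records, rejected) for the files directly inside a folder.

    records  -- one dict per file whose name follows the convention
    rejected -- names that do not follow it and need attention
    """
    records, rejected = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        match = NAME_PATTERN.match(path.name)
        if match is None:
            rejected.append(path.name)
            continue
        records.append({
            "interview": match["interview"],
            "part": int(match["part"]),
            "filename": path.name,
            "size_bytes": path.stat().st_size,
            "sha256": sha256_checksum(path),
        })
    return records, rejected
```

Each record could then be stored in a searchable field in your archival management system, and recomputing the checksum later and comparing it with the stored value will show whether a file has changed.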

“Archiving” is no longer the final step in an oral history preservation workflow.  It needs to be the first step.  Interviewers designing a project need to choose a recording format that will be sustainable.  Interviewers need to partner with an archive as early in the process as possible.  Choose an archive that has the capability to curate the type of collection that you are generating.  Ask the administrators of that archive to articulate their preservation plan to you.  If they do not have a digital preservation plan, you should place your interviews elsewhere.


About Author

Douglas A. Boyd
