Contrary to common perception, the “Wave” is not itself a “format” but a container for housing a particular set of data. The Wave as we tend to use it in oral history, folklore, and audio archival and preservation contexts is a data file based on the Resource Interchange File Format (RIFF), serving as a container for uncompressed digital audio encoded as LPCM (Linear Pulse Code Modulation). I know, too much detail. The point is, the Wave file, as we tend to think about it, is not the “format,” and it does have limitations.

  • Metadata Limitations: The basic Wave container was never designed with a standardized, widely supported set of fields for embedding descriptive metadata into the file itself.
  • File Size Limit: The Wave architecture has a built-in 4 gigabyte file size limit, and some software editors enforce a 2 gigabyte limit. Oral history interviews tend to be lengthy and the resulting files large, which can create a major problem if the expectation is that one interview equals one data file.
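
To put the size limit in practical terms, here is a rough arithmetic sketch in Python. It assumes the 4 gigabyte ceiling comes from the 32-bit size field in the RIFF header, and that the 2 gigabyte ceiling in some editors comes from treating that field as a signed number (a common explanation, but an assumption here). The function name is made up for illustration, and the small header overhead is ignored.

```python
# Rough arithmetic sketch: how much uncompressed LPCM audio fits under the
# size limits of a RIFF/Wave file. The function name is hypothetical.

RIFF_LIMIT_BYTES = 2**32 - 1    # largest value an unsigned 32-bit size field can hold
SIGNED_LIMIT_BYTES = 2**31 - 1  # assumed limit when software reads the field as signed


def max_duration_minutes(sample_rate_hz, bit_depth, channels,
                         limit_bytes=RIFF_LIMIT_BYTES):
    """Approximate longest recording, in minutes, that fits under limit_bytes.

    Ignores the few dozen bytes of header overhead.
    """
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return limit_bytes / bytes_per_second / 60


if __name__ == "__main__":
    for rate, depth in [(44_100, 16), (48_000, 24), (96_000, 24)]:
        four_gb = max_duration_minutes(rate, depth, channels=2)
        two_gb = max_duration_minutes(rate, depth, channels=2,
                                      limit_bytes=SIGNED_LIMIT_BYTES)
        print(f"{rate} Hz / {depth}-bit stereo: "
              f"about {four_gb:.0f} min (4 GB), about {two_gb:.0f} min (2 GB)")
```

At 96 kHz / 24-bit stereo this works out to roughly two hours under the 4 gigabyte limit and roughly one hour under a 2 gigabyte limit, which is well within the length of many oral history interviews.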

The Broadcast Wave extension has clearly emerged as “best practice” for preserving oral history recordings on the archival side. The Broadcast Wave, originally specified by the European Broadcasting Union, was designed with interoperability in mind. Broadcast Wave is an extension of the WAVE file that adds the ability to embed metadata into the digital data file header (IASA TC-04 2nd ed.). The metadata is stored in “extension chunks” in the .wav file. On that note, the Broadcast Wave looks, on the surface, to be a standard .wav file. When it is ingested into a system that can read the Broadcast Wave metadata, that metadata can be utilized and manipulated.
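
For readers who want to see what an “extension chunk” looks like in practice, here is a minimal Python sketch that walks the chunks of a .wav file and, if it finds the Broadcast Wave “bext” chunk, pulls out a few of its fields. The field layout used here (fixed-width text fields, a 602-byte fixed portion followed by the coding history) is my reading of the EBU specification and should be checked against it. The function names are hypothetical, and this is an illustration rather than a replacement for a proper tool.

```python
import struct
import sys

BEXT_FIXED_SIZE = 602  # assumed size of the fixed fields that precede CodingHistory


def _text(raw):
    """Decode a fixed-width, NUL-padded ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()


def parse_bext(payload):
    """Pull the main descriptive fields out of a raw bext chunk payload."""
    if len(payload) < BEXT_FIXED_SIZE:
        raise ValueError("bext chunk is shorter than expected")
    (description, originator, originator_ref, date, time_of_day,
     time_reference, version) = struct.unpack_from("<256s32s32s10s8sQH", payload, 0)
    return {
        "description": _text(description),
        "originator": _text(originator),
        "originator_reference": _text(originator_ref),
        "origination_date": _text(date),
        "origination_time": _text(time_of_day),
        "time_reference": time_reference,  # sample count since midnight
        "version": version,
        "coding_history": _text(payload[BEXT_FIXED_SIZE:]),
    }


def read_bext(stream):
    """Return the parsed bext fields, or None for a plain Wave file."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            return None  # reached the end without finding a bext chunk
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"bext":
            return parse_bext(stream.read(size))
        stream.seek(size + (size % 2), 1)  # skip payload plus pad byte, if any


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as wav:
        fields = read_bext(wav)
    print(fields if fields else "No bext chunk found: this looks like a plain Wave file.")
```

Because software that does not know about the bext chunk simply skips over it, the same file still opens as an ordinary .wav everywhere else.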

Once the file is ingested into an archival system, it is incredibly easy to transform the traditional Wave file to include the Broadcast Wave extension, assuming you are using an audio editing system that can work seamlessly with Broadcast Wave (Steinberg’s WaveLab, for example). Some software packages allow the creation of default values for metadata fields, an incredibly useful feature. Audio editors that do not work seamlessly with Broadcast Wave (like earlier versions of Sound Forge) mostly open the files and often retain the metadata associated with the Broadcast Wave extension; they just don’t allow you to manipulate the Broadcast Wave element. My current favorite software tool for embedding metadata into the Broadcast Wave extension is BWF MetaEdit (tutorial forthcoming). BWF MetaEdit is a free, open source tool developed by FADGI (the Federal Agencies Digitization Guidelines Initiative), supported by AudioVisual Preservation Solutions, that enables the batch embedding of metadata and the batch conversion of Wave files into Broadcast Wave. I love this tool; it even creates and embeds the checksum (see my earlier post/tutorial on the importance of the checksum).
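
To illustrate why a checksum on the audio itself is so useful, here is a small Python sketch that computes an MD5 over only the audio payload (the “data” chunk) of a Wave file. It assumes the checksum covers the audio alone, which is my understanding of how BWF MetaEdit approaches it; consult the tool’s documentation for the details. The function name is hypothetical and this is not BWF MetaEdit’s own code.

```python
import hashlib
import struct
import sys


def audio_md5(stream, block_size=1 << 20):
    """Return the hex MD5 of the 'data' chunk payload, or None if there is none."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            return None  # no data chunk found
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"data":
            digest = hashlib.md5()
            remaining = size
            while remaining > 0:  # read in blocks so long interviews fit in memory
                block = stream.read(min(block_size, remaining))
                if not block:
                    break  # file is truncated; hash what is there
                digest.update(block)
                remaining -= len(block)
            return digest.hexdigest()
        stream.seek(size + (size % 2), 1)  # skip other chunks, including metadata


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as wav:
        print(audio_md5(wav))
```

Because the metadata chunks are skipped, the value should stay the same when you later add or edit embedded metadata, and change only if the audio itself changes.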

Although most audio recorders being used right now by oral historians or folklorists do not create a Broadcast Wave audio file, a few currently do, and I have a feeling that this will become more common in the near future.

Even if your recorder does not create the Broadcast Wave, don’t panic; you can embed the metadata quite easily after the fact. The advantage of having the recorder create the Broadcast Wave is that it captures the born-digital metadata (such as the recorder used). As far as I am concerned, the earlier we capture or create this metadata and associate it with the audio data file, the better.

The bottom line: on the archival side, you should be utilizing the Broadcast Wave extension for embedding metadata into the files themselves when you digitize and when you process your born-digital interviews. The current guidelines for Broadcast Wave files are located at the Federal Agencies Digitization Guidelines Initiative website.

 

About Author

Douglas A. Boyd
