Antiques Roadshow is one of many iconic shows to have come from WGBH, and it is a staple in many households. When I was growing up, it was one of my family’s go-to shows. We would make a game of sitting around the TV and trying to guess the value of the objects being appraised, and although we played for nothing beyond the pride of making the closest guess, Antiques Roadshow will forever be associated with the memory of those lively guessing games.

WGBH’s Media Library and Archives (MLA) began work on digitizing and preserving older Antiques Roadshow episodes in 2018. Previously, Antiques Roadshow had digitized old episodes and footage tapes in their own department on a one-off basis, as needed to create new episodes out of vintage material from previously visited cities. After discussions with the MLA, however, Antiques Roadshow decided to take a new approach: they wanted to partner with the archives on a larger-scale project to digitize all the original raw footage tapes related to each city they planned to revisit for the season, and to ensure that the MLA had the opportunity to ingest (process and preserve) the “preservation master” files of the digitized footage before it was re-edited for broadcast. Before switching to file-based cameras, Antiques Roadshow shot about 70-100 tapes of raw footage in each city they visited, in formats including Betacam, Digital Betacam, DVCam, and Mini DV. Since each season includes seven vintage shows, that meant that we would be digitizing between 600 and 700 tapes for each season’s material.

Because of the amount of material and the short turnaround time to get it digitized and processed for Antiques Roadshow to begin their production process, the MLA decided to work with Memnon, a Sony-owned company and digitization vendor based out of Bloomington, Indiana, rather than digitizing all the material in-house.

I was fairly new to the archival profession when the project began, so it was one of the first I worked on, and it was instrumental in teaching me WGBH’s process for working with LTO tapes, a tape-based data storage technology. I explain that process in greater detail here:

An LTO tape. The MLA currently uses LTO-6 tapes, which are able to hold 2.5 TB of uncompressed data.

Antiques Roadshow’s original analog footage is stored in WGBH’s state-of-the-art, climate-controlled vault, so the first step is to identify how many tapes there are for each city and what formats they’re in. Peter Higgins, WGBH’s Archives Manager, then sends this information off to Memnon. Memnon digitizes the tapes and sends the resulting files to WGBH on hard drives. My part begins once the hard drives reach my desk.

When the hard drives full of newly digitized files from Memnon land on my desk, the first thing to do is a quick check that all of the files are present. For each episode, eight* parts are expected:

 

  • a preservation master file (large, high-quality version of the episode; 10-bit uncompressed video in a .mov wrapper)
    • a MediaInfo file with metadata on the master file (MediaInfo is a program that provides technical information about media files, including the codec, file size, bit rate, pixel size, etc.)
    • an MD5 checksum of the master file (an MD5 is a string of numbers and letters generated from a file’s contents, used to check for corruption when files are moved)
    • a QCTools report (an analytics file that helps users find possible video corruptions)
    • an .xml sheet with information about the preservation master file
  • a proxy file (smaller, lower-quality version of the episode)
    • a MediaInfo file for the proxy file
    • an MD5 checksum of the proxy file

*Memnon also sends an email with .csv files containing the MD5 checksums for the contents of the drives.

A sample of Memnon’s deliverables, with all eight parts.
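
The check for all eight parts can also be sketched in a few lines of Python. This is only an illustration, not the process WGBH uses: the file-name suffixes below are hypothetical stand-ins (Memnon’s actual naming scheme is not described in this post), and `missing_parts` is a made-up helper name.

```python
from pathlib import Path

# Hypothetical suffixes for the eight expected parts of one episode.
# Memnon's real file names are not given in this post; adjust to match.
EXPECTED_SUFFIXES = [
    "_pres.mov",            # preservation master
    "_pres.mediainfo.txt",  # MediaInfo on the master
    "_pres.md5",            # MD5 of the master
    "_pres.qctools.xml",    # QCTools report
    "_pres.xml",            # .xml sheet about the master
    "_proxy.mp4",           # proxy file
    "_proxy.mediainfo.txt", # MediaInfo on the proxy
    "_proxy.md5",           # MD5 of the proxy
]


def missing_parts(folder, episode_id, suffixes=EXPECTED_SUFFIXES):
    """Return the expected file names that are absent from folder."""
    folder = Path(folder)
    names = [episode_id + suffix for suffix in suffixes]
    return [name for name in names if not (folder / name).is_file()]
```

An empty list back from `missing_parts` would mean that every expected part for that episode is on the drive; anything else is a list of what to ask the vendor about.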

After checking that all parts are present, I put the master files into the Avid ISIS, a shared storage system that gives the Antiques Roadshow department the ability to access the files and edit them for future episodes.

Once the master files are in Avid, I put all of the files from the drives onto LTO-6 tapes, for long-term preservation in WGBH’s vault. When saving onto LTO tapes, we also create a duplicate tape, so that at the end of the process we have one tape that will be stored in WGBH’s vault, and another that will be stored in WGBH’s offsite storage facility (which I will refer to as the vault-LTO and offsite-LTO, respectively). This ensures that if one tape is damaged or becomes corrupted, a useable copy is saved elsewhere.

The MLA’s LTO station. Our two LTO decks (on the left) are hooked up to the computer, allowing us to write to two LTO tapes simultaneously.

At WGBH, we use a pre-written script to write files to LTO tape. This script tells the computer to grab files from the drive, copy the files onto the LTO, compare them against the original files on the drive to ensure that they’re an exact match, and then run FITS on the contents of the drive. FITS stands for File Information Toolset, a tool developed by Harvard University that combines a number of command line tools to generate technical metadata, including MD5s, for each file. This creates the LTO tape that goes to WGBH’s vault.
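
To give a feel for the copy-and-compare part of that process, here is a minimal sketch in Python. It is not WGBH’s script: it assumes the LTO tapes appear to the computer as ordinary folders (for example, through a tape file system), it leaves out the FITS step entirely, and the function names are invented for the example.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Compute a file's MD5 in chunks so large video files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir, destination_dirs):
    """Copy every file under source_dir to each destination and check each copy.

    Returns {relative_path: md5}; raises ValueError on the first mismatch.
    """
    source_dir = Path(source_dir)
    checksums = {}
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(source_dir)
        expected = md5_of(source)
        for destination_dir in destination_dirs:
            target = Path(destination_dir) / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            # Re-read the copy and compare it against the original on the drive.
            if md5_of(target) != expected:
                raise ValueError(f"{relative} on {destination_dir} does not match the source")
        checksums[relative.as_posix()] = expected
    return checksums
```

Passing two destination folders mirrors the idea of writing the vault-LTO and the offsite-LTO in one run, though this sketch copies to them one after the other rather than simultaneously.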

Since our script and our set-up allow us to write to two LTO tapes at once, I use the script to write to both the vault-LTO and offsite-LTO simultaneously (there are other, manual options for writing to both LTO tapes, depending on the set-up of the LTO decks). When the script has finished running, I like to first check the destination files in the diff folder, to make sure that the script ran fully and without errors. If the destination files say that all the files are identical (it will tell you at the bottom), this means that both the vault- and offsite-LTOs have identical files. Afterwards, it is vital to compare the MD5s generated by the script in the FITS folder against the .csv file of MD5 values sent from Memnon. This can be done by running a diff command on the two lists, which should match one another. If any values deviate from Memnon’s, email Memnon and ask about it (a lot of the time, it is simply that they ran the MD5 on an older version of the file and forgot to send the updated MD5).
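
As an alternative to a diff command, the same comparison can be sketched in Python. This is an illustration under assumptions: it supposes the vendor .csv has a header row with `filename` and `md5` columns (the real column names are not given in this post), and both function names are hypothetical.

```python
import csv
import io


def parse_vendor_csv(text):
    """Parse CSV text with assumed 'filename' and 'md5' columns into a dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["filename"].strip(): row["md5"].strip() for row in reader}


def compare_md5s(local, vendor):
    """Compare two {filename: md5} dicts, ignoring upper/lower case in the hashes.

    Returns three sorted lists: files whose MD5s differ, files the vendor
    listed that are missing locally, and local files the vendor did not list.
    """
    shared = local.keys() & vendor.keys()
    mismatched = sorted(n for n in shared if local[n].lower() != vendor[n].lower())
    missing_locally = sorted(vendor.keys() - local.keys())
    missing_from_vendor = sorted(local.keys() - vendor.keys())
    return mismatched, missing_locally, missing_from_vendor
```

Three empty lists would mean the two sets of checksums agree; any file name in the first list is one to raise with the vendor.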

The contents of the folder created by the script. The script creates a file list, a virus scan, a diff folder with two destination files, a fits folder with FITS files for the contents of the two tapes, and a .zip archive of the fits folder.

After confirming that everything matches (and re-uploading and re-checking any files that are confirmed to be corrupted), I record information about the LTO tapes in the MLA’s access database (PIM/MARS) and digital preservation database (DAM) so that users can request the tapes if they need to access that content. The LTO tapes are then sent off to their respective locations, and Memnon can then send the original analog tapes back to WGBH.

Hopefully this gives a sense of what the workload is like for this phase of the Antiques Roadshow project. The amount that we’ve digitized for this project to date is still just a fraction of Antiques Roadshow’s at-risk tape library, but going city by city, we can eventually make sure that all of Antiques Roadshow’s history is not only preserved but also re-usable by our producers and for contemporary viewers.



Miranda Villesvik joined WGBH’s Media Library and Archives in 2017, where she is an archivist.

 
