volunteervoices

 

Filenaming Schemes

Page history last edited by Tiffani 2 yrs ago


 

Specific Filenaming Schemes

 


 

Scheme 00, for single images with one metadata (xml) record

Scheme 01, for multi-page images with one metadata (xml) record

Scheme 02, for non-transcribed text (such as a letter), no transcription needed, one or more pages per metadata (xml) record

Scheme 03, for digitally transcribed text, one metadata (xml) record

Scheme 04, for single or multi-page, text requiring transcription, one metadata (xml) record

 

If the digital object you have in front of you does NOT fit any of these descriptions, categorize it as “other” and describe it in detail. We need as much information as possible about this object to be able to develop support for web delivery at a later date.


Main explanation of the filenaming scheme being used by Volunteer Voices as well as other DLC projects.

 

All sets of numbers are separated by underscores. The file extension is appended in lowercase (.tif for TIFFs, .xml for XML):

  • Institution_Collection_Item_Sequence number
  • 0123_000012_000432_0001.[file extension]
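As a rough illustration of the pattern above, the following Python sketch assembles a filename from its numeric parts. The function name build_filename and its parameter names are hypothetical, and referring to the last set of numbers as a "sequence" is an assumption made for readability; the padding widths (4, 6, 6, and 4 digits) are taken from the example.

```python
def build_filename(institution: int, collection: int, item: int,
                   sequence: int, extension: str) -> str:
    """Assemble institution_collection_item_sequence.extension.

    Hypothetical helper: widths follow the example 0123_000012_000432_0001.
    """
    # Extensions are appended in lowercase; tolerate a leading dot.
    ext = extension.lower().lstrip(".")
    return f"{institution:04d}_{collection:06d}_{item:06d}_{sequence:04d}.{ext}"


print(build_filename(123, 12, 432, 1, "TIF"))  # 0123_000012_000432_0001.tif
```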

 

Institution: First Set of Numbers

The first set of numbers (4 digits) identifies the institution. This corresponds to the institution’s identifying number in the vvadmin database [admindb].

  • Example: 0123_000012_000432_0001.tif came from institution #123.

 

Collection: Second Set of Numbers

The second set of numbers (6 digits) identifies the collection within that institution. This corresponds to the collection’s identifying number in the vvadmin database. Collection numbers are unique only within an institution: each institution may have collections 000001, 000002, 000003, etc. [select id from coll where name='Whittaker''s Confederate Uniforms Collection']

  • Example: 0123_000012_000432_0001.tif came from the 12th collection at institution #123.
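The lookup hinted at in the bracketed query can be sketched in Python as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for vvadmin, the coll table has just the id and name columns that the query mentions, and the sample row (id 12) is made up. Passing the name as a bound parameter avoids quoting problems with the apostrophe in the collection name.

```python
import sqlite3

# Stand-in for the vvadmin database; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coll (id INTEGER, name TEXT)")
conn.execute(
    "INSERT INTO coll (id, name) VALUES (?, ?)",
    (12, "Whittaker's Confederate Uniforms Collection"),  # made-up sample row
)


def collection_segment(connection, name):
    """Return the 6-digit collection part of a filename for a collection name."""
    row = connection.execute(
        "SELECT id FROM coll WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no collection named {name!r}")
    return f"{row[0]:06d}"


print(collection_segment(conn, "Whittaker's Confederate Uniforms Collection"))  # 000012
```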

 

Item: Third set of Numbers

The third set of numbers (6 digits) is the item number. Taken together, the first 3 sets of numbers are unique to this digital object across all digital objects in Volunteer Voices.

  • The first 3 sets of numbers are common to every file that is part of this digital object; the metadata file and all the digital files that are part of this item share this 3-part number as part of their filenames.
  • The item number must be unique for that collection in that institution.
  • Each collection may have items 000001, 000002, 000003, etc.

For example, the following files all belong to the 432nd item (or digital object) in collection 12 at institution 123. They are the metadata file and the first 3 pages of a text document:

  • 0123_000012_000432_0000.xml
  • 0123_000012_000432_0001.tif
  • 0123_000012_000432_0002.tif
  • 0123_000012_000432_0003.tif
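A minimal Python sketch of how such a set of files could be regrouped into one digital object is shown below. The names FILENAME_RE and group_by_item are hypothetical, and the pattern assumes exactly the digit widths used in the examples on this page; it also assumes that the file numbered 0000 with an .xml extension is the metadata record, as in the schemes described here.

```python
import re
from collections import defaultdict

# Assumed pattern: 4, 6, 6, and 4 digits, then a lowercase extension.
FILENAME_RE = re.compile(r"^(\d{4})_(\d{6})_(\d{6})_(\d{4})\.([a-z0-9]+)$")


def group_by_item(filenames):
    """Map (institution, collection, item) to its metadata file and ordered pages."""
    objects = defaultdict(lambda: {"metadata": None, "pages": []})
    for name in filenames:
        match = FILENAME_RE.match(name)
        if match is None:
            raise ValueError(f"unexpected filename: {name}")
        institution, collection, item, sequence, ext = match.groups()
        key = (int(institution), int(collection), int(item))
        if int(sequence) == 0 and ext == "xml":
            objects[key]["metadata"] = name
        else:
            objects[key]["pages"].append(name)
    for entry in objects.values():
        # Zero-padded names sort lexically in display order.
        entry["pages"].sort()
    return dict(objects)


files = [
    "0123_000012_000432_0002.tif",
    "0123_000012_000432_0000.xml",
    "0123_000012_000432_0003.tif",
    "0123_000012_000432_0001.tif",
]
grouped = group_by_item(files)
print(grouped[(123, 12, 432)]["metadata"])
print(grouped[(123, 12, 432)]["pages"])
```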

Example: 0123_000012_000432_0001.tif is part of the non-transcribed text object that is the 432nd item in the 12th collection from institution #123.

 

The remainder of the filename (everything after the item number) is created by the content people and must provide enough information, via the file-naming scheme adopted, for us to reconstruct the object and display it correctly. (In the example above, the last set of numbers tells us which file is the metadata record and what the display sequence is for each TIFF.) A scrapbook with sub-objects for some pages and both images and text will require different processing and display than a set of photos of the four sides of a Roman column, or a thesis with 4 movie files, 2 audio tapes, and a spreadsheet. We cannot foresee all possible combinations, so each must be covered by a different scheme. Thus, look up the scheme that applies to the object in order to make sense of the remainder of the filename. [select * from FileNameSchemes where id='03';]

 

Example: 0123_000012_000432_0001.tif is the first page of the non-transcribed text object that is the 432nd item in the 12th collection from institution #123.

 

NOTE: Content people will need to create this part of the filename (in this last example, the “_0001.tif”). The rest should be generated by the database when creating the base metadata record, whose filename (in schemes 00, 01, 02, 03, and 04) will end in “_0000.xml”:

 

Example: 0123_000012_000432_0000.xml is the database-generated filename for the metadata record for the 432nd item (non-transcribed text) in the 12th collection from institution #123.

For the scan of the first page, change the last four digits to 0001 and change the extension to “.tif”.

For the scan of the second page, change the last four digits to 0002 and change the extension to “.tif”, and so on.

The first three parts of the filename (here, “0123_000012_000432_”) should match the first three parts of the metadata record filename (here, “0123_000012_000432_0000.xml”).
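The renaming steps above can be sketched in Python as follows. The helper names page_filename and belongs_to are hypothetical; the sketch assumes the metadata filename always ends in “_0000.xml” and that page scans are TIFFs, as in the examples on this page.

```python
def page_filename(metadata_filename: str, page: int) -> str:
    """Derive the scan filename for a page from the metadata record filename."""
    suffix = "0000.xml"
    if not metadata_filename.endswith("_" + suffix):
        raise ValueError(f"not a metadata record filename: {metadata_filename}")
    if not 1 <= page <= 9999:
        raise ValueError("page number must fit in four digits and start at 1")
    # Keep the shared prefix (including the trailing underscore).
    prefix = metadata_filename[: -len(suffix)]
    return f"{prefix}{page:04d}.tif"


def belongs_to(metadata_filename: str, filename: str) -> bool:
    """Check that a file shares its leading parts with the metadata record."""
    return filename.rsplit("_", 1)[0] == metadata_filename.rsplit("_", 1)[0]


metadata = "0123_000012_000432_0000.xml"
print(page_filename(metadata, 1))  # 0123_000012_000432_0001.tif
print(page_filename(metadata, 2))  # 0123_000012_000432_0002.tif
print(belongs_to(metadata, "0123_000012_000433_0001.tif"))  # False
```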
