Glossary of Digitization Terms


Introduction


When you begin your Digitization (DiGI) project, you may encounter many new terms related to the work you have been tasked to do. As you become more comfortable with the devices in your project and workflow, these terms will continue to appear and will be used frequently by you and your team.

This page features a general glossary of terms that you will likely come across throughout your DiGI project and also the archiving and preservation fields in general. It will be helpful to familiarize yourself with these words and concepts for the future.

This vocabulary is also applicable outside of digitization, so you may find that you have a better understanding of musical equipment, general computing, and IT after acquainting yourself with these words.

This is a living list of definitions and descriptions. If you encounter terms that are unlisted here, or think that more descriptions should be added, please contact the DiGI Team or comment below on this page to have them included in this growing glossary. 

Contact Info

Phone: 604-319-7094
Emails: Ben (ben@fpcc.ca) or James (james@fpcc.ca)


Common Technology Terms


Term

Alternative Names

Meaning

Audio-cassette tape

Audiocassettes, audio-cassettes, tape cassette, compact cassette, tape, cassette

An analogue cassette with a magnetic strip used for recording and playing audio and for data storage

Betamax Tape

Beta

An alternative analogue cassette with a magnetic strip used for audio and video; this format has been discontinued and often must be outsourced for special digitization

Cassette Deck

Cassette player

A device that plays audio-cassette tapes but does not have internal speakers or power amplifiers of its own

For more information about the Cassette Deck, please review this DiGI training module:

Digitization Kit

DiGI kit, digitization rack, DiGI rack

The compact set of devices supplied by FPCC upon request to aid in digitizing audio-cassette tapes for your DiGI project

For more information about the DiGI kit, please review this DiGI training module:

Gain Control

Gain dials, knobs

The knobs on an interface that alter the power (or gain; see Common Content Editing Terms on Gain) of the signal being recorded; if the gain is too high, then the recording may be clipped (see Common Content Editing Terms on Clipping) or lost in the transfer process

For more information about controlling gain, please review this DiGI training module:

Input



Where energy or a signal enters a device or system

Microcassette



An analogue cassette that is essentially a smaller version of an audio-cassette tape and has a smaller storage capacity; because of their size and fragility, microcassettes often must be outsourced for special digitization

Mini DV tape

MiniDV, S-size tapes

A small cassette with a magnetic strip that stores audio and video in digital form; although much smaller than a VHS tape, it is a separate format rather than a miniature VHS tape, and these tapes were often used in camcorders

Monitor

Audio monitor

The speakers included with and connected to the DiGI kit; more generally, a term for loudspeakers used in audio processing and editing

Output



Where energy or a signal leaves a device or system

Reel-to-reel Tape

Reel-to-reel audio tape recording, open-reel recording

An analogue tape recording format in which the tape is not enclosed in a case (i.e. it is wound on an open reel) and is played on a spindle device; this type of tape needs a careful digitization process on a machine that can feed the magnetic strip through the player and wind it onto a second reel

VHS tape

Video home system tapes

An analogue cassette with a magnetic strip used for audio and video

 


Common Archiving Terms


Term

Alternative Names

Meaning

Acquisition

Transfer, donation, loan of materials

The process of receiving and transferring new materials into your archive through various methods; the processed, acquired material itself (e.g. 'the archive's new acquisition')

Acquisition of materials should be done through a cooperative, ethical process. The donating partner who passes on the contents to be digitized and logged should do so consensually.

For more information on informed consent, please review the article on informed consent in the sister FirstVoices Knowledge Base

Appraisal



The process and act of assessing the quality, other values (e.g. cultural, monetary if applicable), uniqueness, and sensitivities of new or potential acquisitions

Arrangement

Arrangement strategy, order

The organization of processed materials in a collection and archive, usually by provenance (see Common Archiving Terms on Provenance) or by sequence of acquisition or original creation

Arrangements do not have to be based on these principles; they can also be organized to make searching easier for users or designed using Indigenous principles and groupings

Archive

Language archive, data archive, repository

A storage space and system that holds a variety of language materials, organized in collections; archives are long-term storage strategies, generally for primary documents and other original resources

Analogue

Analog, old school devices

Relating to a continuous signal or broadcast; analogue materials use continuously varying physical measurements, rather than discrete digital values (e.g. 1s and 0s), to store, exchange, and present information

Back-up

Backup, data backup

A copy of data from a device that is stored securely so that the data can be recovered if the originals are ever lost; sometimes data are stored virtually in a space known as a cloud (see Common Data Terms on Cloud storage)

Bundle



A grouping of files or materials that are shared, or processed and categorized, together as a series, sub-series, or collection

Condition Assessment



A condition assessment is a diagnostic review of a material in a collection to determine if there is damage or wear on it; assessments are completed before digitization and recorded in a log

For more information on condition assessments, please review this DiGI training module:

Module 8: Condition Assessments

Digital Asset



Resources and data in a digital form that are processed for use with conditions of access; digital assets need to be managed and organized to be both usable and secure

Digital Asset Management

DAM

The technology and practice of managing digital assets, covering their storage, searchability, cybersecurity, presentation, distribution, and accessibility to users

For more information on DAM strategies and storage, please review these DiGI training modules:

Module 12: Storage

Module 13: Summary and Additional Resources

Digitization Log

Log

A digitization log is a document used by digitizers to record the process and condition of materials being digitized; it contains metadata and technical information about the software used for editing and the state of the recording or text

For more information on digitization logs, please review this DiGI training module:

Module 7: What is Metadata?

Finding Aid



A document or guide that directs and assists patrons and archivists in locating materials across a collection; it contains important information (and metadata) that points to the location of processed resources and entries

For more information on finding aids, please review this DiGI article and module:

Create Finding Aids

Inventory



A document or description of all materials and content in your series of collections, noting metadata information (see Common Data Terms on Metadata)

For more information on how to build an inventory, please follow the instructions in these DiGI training modules:

Provenance

respect des fonds, principle of provenance

Relating to the source and origin of materials; provenance can mean information about a family, clan, person, nation or organization in regard to the creation or acquisition of an entry in a collection

The principle of provenance, or respect des fonds, states that collections should typically be organized by their entries' provenance (i.e. grouped by origin or creator)

Unique ID

Reference Code

The unique identifier (e.g. code or combination of numbers and letters) used to find and differentiate materials and entries from each other or different collections 

 


Common Data Terms


Term

Alternative Names

Meaning

Bit Rot

Bit-rot, data degradation, data rot, data decay

The process of decay and general changes to files as a result of slowly accruing errors/issues that eventually make some materials unusable

Data

Datum (singular)

Broadly, data are storable and transportable information; in the case of digitization and archiving, data pertain to resources, materials (media), and their contents

Data Authenticity



Relating to the genuineness of a file, meaning that the data are what they claim to be and come from the original, master source

Data Integrity



Relating to the state of the data and that their content is unaltered and uncorrupted

Cloud storage

The cloud

Relating to a data centre that multiple users can access over the internet; it is an alternative storage system that uses a server (or multiple servers) to store and manage large amounts of information with little maintenance required from its users

Checksum

check-sum

A checksum is a short string of characters calculated from a file's contents and stored in or alongside the file to help detect errors introduced in storage or in transfer between devices; software can recalculate the checksum and compare it with the stored value to determine whether the file has been corrupted (i.e. to verify data integrity)

Checksums can be embedded using the BWF-MetaEdit software

For more information on BWF-MetaEdit, please review this DiGI training module:

Module 6: Installing Audacity and MetaEdit

Metadata

Meta-data

The data about your data or information about your information; metadata is descriptive, administrative, and technical information about your content and materials that aids in their preservation and location
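To make the Checksum entry in the table above more concrete, here is a minimal Python sketch of how a checksum can be calculated for a file and re-checked later. It is an illustration only and not the method used by BWF-MetaEdit: the function names (file_checksum, is_unchanged), the choice of the SHA-256 algorithm, and the file name in the usage note are all assumptions.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the checksum of a file as a hexadecimal string.

    The algorithm name is an assumption for this sketch; other tools
    may use a different one (for example MD5).
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read the file in chunks so large audio files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Recalculate the checksum and compare it with the recorded value."""
    return file_checksum(path, algorithm) == recorded_checksum.strip().lower()
```

In use, you might record file_checksum("interview_001.wav") in your digitization log when the file is created (the file name is hypothetical), and later call is_unchanged with that recorded value after copying the file to another device. If even one byte of the file has changed, the comparison returns False.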


Common Content Editing Terms


Term

Alternative Names

Meaning

Bit-depth

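Bit depth, bits per sample

The number of bits in each sample; the higher the bit-depth, the higher the quality of the audio, because more information is recorded in each sample. Bit-depth is sometimes confused with bit rate (bitrate), which is a different measure: the number of bits processed per second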

Clipping



The distortion or loss of audio at the top and bottom of a sound wave that occurs when the signal is louder than the maximum level the equipment or file can capture

Compression

Compressing, compressor

A reduction in the amount of data contained in a file, which reduces the overall size of the file; this is recommended for access files but not for master files, and it can be done in Audacity or any other audio editing program

For more information on Audacity:

Edit audio quality in Audacity

Gain

Voltage

The amount by which the power of a signal is increased as it passes through an amplifier, where it is measured and controlled

GIF

Graphics Interchange Format, JIF (resembling alternate pronunciation)

A small, compressed image file type that is best suited for short, simple moving images (animations)

Lossy



A method of compressing data that irreversibly deletes some information but significantly shrinks file size

Lossless



A method of storing or compressing data that preserves the original data with no loss of information

MP3

MPEG-1 Audio Layer III, MPEG-2 Audio Layer III

A lossy, compressed audio file type used for access copies and recommended for use on websites and as shared audio

For more information on MP3 files, please review this DiGI training module:

Module 11: Let's Digitize

MP4

MPEG-4 Part 14

A multimedia file type, usually holding lossy, compressed content, that can store video, audio, subtitles, and still images; this file type is often used to share videos with audio online

Noise Reduction

Filtering

A method of removing continuous static background noise, which can be achieved in Audacity; this is often used to 'clean up' old recordings that have 'fuzzy' sounding background noise due to age, or recordings that have continuous background noise that was not eliminated from the environment at the time of the original recording (e.g. hum from electronics, rain against the window)

For more information on Audacity:

Edit audio quality in Audacity

Normalization

Normalizing

The application of a steady gain to an audio waveform in order to bring its peaks and troughs, or its overall loudness, to a more balanced target level

This process can be done in Audacity

For more information on Audacity, please review these DiGI training modules:

Module 6: Installing Audacity and MetaEdit

Module 11: Let's Digitize

Optical Character Recognition

OCR, optical character reader

The conversion of scanned, copied, or printed text into searchable, machine-encoded characters; OCR is often an automatic process that can be run over documents once they are digitized

Sample Rate

Project Rate, Sampling Rate, Frequency Rate

The number of samples taken from a signal per unit of time (usually per second, expressed in hertz); the higher the sample rate, the wider the range of audio frequencies that can be recorded

Segment (verb)



The act of saving excerpts of a larger audio file as individual, smaller files

For more information on segmenting audio in Audacity:

https://firstvoices.atlassian.net/wiki/spaces/FIR1/pages/1706206

Segment (noun)



A smaller recording that has been pulled from a larger recording

For more information on segmenting audio in Audacity:

https://firstvoices.atlassian.net/wiki/spaces/FIR1/pages/1706206

TIFF

TIF, Tagged Image File Format

A file type, typically uncompressed, used for graphics and their metadata that is optimal for editing images; this format maintains visual quality for archiving but is not well suited for images hosted online

Timestamp

Time-stamp, time-codes

The encoding of information to reference when an occurrence or event took place; transcribers might timestamp their transcriptions by line or paragraph 

Transient

Transient burst, transient spike, pop

A quick, high-amplitude sound at the beginning of a waveform that can result in clipping and lost or distorted audio

WAV

Waveform Audio File Format, WAVE

A lossless, uncompressed audio file type used for preservation master and access master copy files; because this file type is uncompressed, it is the best suited for editing

For more information on WAV files, please review this DiGI training module:

Module 11: Let's Digitize

XML

Extensible Markup Language

A markup language that structures information so that both people and software can read it, used to store and transport data; it serves many functions, including rendering orthographies across the internet and embedding checksums on files, and it can be written in any text editor
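Several terms in the table above (Bit-depth, Sample Rate, Gain, Clipping, and Normalization) are related numerically. The following Python sketch shows those relationships on a generated test tone. It is a simplified illustration and not how Audacity or the DiGI kit works internally: the function names, the 0.9 target peak, and the values of 8000 samples per second and 440 Hz are assumptions chosen for the example.

```python
import math


def uncompressed_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio: samples per second x bits per sample
    x number of channels x duration, converted from bits to bytes."""
    bits = sample_rate_hz * bit_depth * channels * seconds
    return bits // 8


def apply_gain(samples, gain):
    """Multiply every sample by a gain factor. Values outside the range
    -1.0 to 1.0 cannot be stored, so they are cut off (clipped)."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def count_clipped(samples):
    """Count the samples sitting at the maximum or minimum level."""
    return sum(1 for s in samples if abs(s) >= 1.0)


def normalize_peak(samples, target_peak=0.9):
    """Apply one steady gain so the loudest sample reaches target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    factor = target_peak / peak
    return [s * factor for s in samples]


# One second of a 440 Hz test tone at 8000 samples per second,
# with a peak level of 0.5 (half of the maximum).
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

too_loud = apply_gain(tone, 3.0)      # gain set too high: peaks are cut off
balanced = normalize_peak(tone, 0.9)  # peaks raised to the target, no clipping
```

Here count_clipped(tone) is 0, while count_clipped(too_loud) is greater than 0 because tripling the level pushes the peaks past the maximum. As a worked size example, uncompressed_bytes(48000, 24, 2, 60) returns 17280000, so one minute of stereo audio at those illustrative settings takes roughly 17 MB before any compression.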
