Glossary of Digitization Terms
Introduction
When you begin your Digitization (DiGI) project, you may encounter many new terms related to the work you have been tasked with. As you become more comfortable with the devices in your project and workflow, these terms will continue to appear and will be used frequently by you and your team.
This page features a general glossary of terms that you will likely come across throughout your DiGI project and also the archiving and preservation fields in general. It will be helpful to familiarize yourself with these words and concepts for the future.
This vocabulary also applies outside of digitization, so becoming acquainted with these words may give you a better understanding of musical equipment, general computing, and IT.
This is a living list of definitions and descriptions. If you encounter terms that are unlisted here, or think that more descriptions should be added, please contact the DiGI Team or comment below on this page to have them included in this growing glossary.
Contact Info
Phone: 604-319-7094
Emails: Ben (ben@fpcc.ca) or James (james@fpcc.ca)
Common Technology Terms
Term | Alternative Names | Meaning |
---|---|---|
Audio-cassette tape | Audiocassettes, audio-cassettes, tape cassette, compact cassette, tape, cassette | An analogue cassette with a magnetic strip used for recording and playing audio and for data storage |
Betamax Tape | Beta | An alternative analogue cassette with a magnetic strip used for audio and video; this format has been discontinued and often must be outsourced for special digitization |
Cassette Deck | Cassette player | A device that plays audio-cassette tapes and has no internal speakers or power amplifiers. For more information about the Cassette Deck, please review this DiGI training module: |
Digitization Kit | DiGI kit, digitization rack, DiGI rack | The compact set of devices supplied by FPCC upon request to aid in digitizing audio-cassette tapes for your DiGI project. For more information about the DiGI kit, please review this DiGI training module: |
Gain Control | Gain dials, knobs | The knobs on an interface that alter the power (or gain; see Common Content Editing Terms on Gain) of the signal being recorded; if the gain is too high, the recording may be clipped (see Common Content Editing Terms on Clipping) or lost in the transfer process. For more information about controlling gain, please review this DiGI training module: |
Input | | Where energy or a signal enters a device or system |
Microcassette | | An analogue cassette that is essentially a smaller version of an audio-cassette tape with a smaller storage capacity; because of their size and fragility, microcassettes often must be outsourced for special digitization |
Mini DV tape | MiniDV, S-size tapes | A small cassette with a magnetic strip that stores audio and video digitally; it is much smaller than a VHS tape and was often used in camcorders |
Monitor | Audio monitor | The speakers included with and connected to the DiGI kit; more generally, loudspeakers used for audio processing and editing |
Output | | Where energy or a signal leaves a device or system |
Reel-to-reel Tape | Reel-to-reel audio tape recording, open-reel recording | An analogue tape recording format that is unmounted (i.e. not in a case) and played on a spindle device; this type of tape needs a careful digitization process on a machine that can feed the magnetic strip through the reel player and wind it onto a second reel |
VHS tape | Video home system tapes | An analogue cassette with a magnetic strip used for audio and video |
Common Archiving Terms
Term | Alternative Names | Meaning |
---|---|---|
Acquisition | Transfer, donation, loan of materials | The process of receiving and transferring new materials into your archive through various methods; also, the processed, acquired material itself (e.g. 'the archive's new acquisition'). Acquisition of materials should be done through a cooperative, ethical process, and the donating partner who passes on the contents to be digitized and logged should do so consensually. For more information on informed consent, please review this article on the sister FirstVoices Knowledge Base on Informed consent |
Appraisal | | The process and act of assessing the quality, uniqueness, sensitivities, and other values (e.g. cultural, or monetary if applicable) of new or potential acquisitions |
Arrangement | Arrangement strategy, order | The organization of processed materials in a collection and archive, usually by provenance (see Common Archiving Terms on Provenance) or by sequence of acquisition or original creation. Arrangements do not have to be based on these principles; they can also be organized to make searching easier for users or designed using Indigenous principles and groupings |
Archive | Language archive, data archive, repository | A storage space and system that holds a variety of language materials, organized in collections; archives are long-term storage strategies, generally for primary documents and other original resources |
Analogue | Analog, old school devices | Relating to a continuous signal or broadcast; analogue materials use continuous physical measurements rather than discrete digital values (e.g. 1s and 0s) to store, exchange, and present information |
Back-up | Backup, data backup | A copy of the data from a device that is stored securely so that the data can be recovered if the original is ever lost; sometimes data are stored virtually in a space known as a cloud (see Common Data Terms on Cloud storage) |
Bundle | | A grouping of files or materials that are shared or processed and categorized together into a series, sub-series, or collection |
Condition Assessment | | A diagnostic review of a material in a collection to determine if there is damage or wear on it; assessments are completed before digitization and recorded in a log. For more information on condition assessments, please review this DiGI training module: |
Digital Asset | | Resources and data in a digital form that are processed for use with conditions of access; digital assets need to be managed and organized to be both usable and secure |
Digital Asset Management | DAM | The technology and management of digital assets relating to their storage, searchability, cybersecurity, presentation, distribution and accessibility to users. For more information on DAM strategies and storage, please review these DiGI training modules: |
Digitization Log | Log | A digitization log is a document used by digitizers to record the process and condition of materials being digitized; it contains metadata and technical information about the software used for editing and the state of the recording or text. For more information on digitization logs, please review this DiGI training module: |
Finding Aid | | A document or guide that directs and assists patrons and archivists in locating materials across a collection; it contains important information (and metadata) that points to the location of processed resources and entries. For more information on finding aids, please review this DiGI article and module: |
Inventory | | A document or description of all materials and content in your series of collections, noting metadata information (see Common Data Terms on Metadata). For more information on how to build an inventory, please follow the instructions in these DiGI training modules: |
Provenance | respect des fonds, principle of provenance | Relating to the source and origin of materials; provenance can mean information about a family, clan, person, nation or organization in regard to the creation or acquisition of an entry in a collection. The principle of provenance, or respect des fonds, states that collections should typically be organized by their entries' provenance (e.g. grouping them by origin or creator) |
Unique ID | Reference Code | The unique identifier (e.g. code or combination of numbers and letters) used to find and differentiate materials and entries from each other or different collections |
Common Data Terms
Term | Alternative Names | Meaning |
---|---|---|
Bit Rot | Bit-rot, data degradation, data rot, data decay | The process of decay and general changes to files as a result of slowly accruing errors/issues that eventually make some materials unusable |
Data | Datum (singular) | Broadly, data are storable and transportable information; in the case of digitization and archiving, data pertain to resources, materials (media), and their contents |
Data Authenticity | | Relating to the genuineness of the file and that the data are the original, master source |
Data Integrity | | Relating to the state of the data and that their content is unaltered and uncorrupted |
Cloud storage | The cloud | Relating to a data centre that multiple users can access over the internet, including through a wireless connection; it is an alternative storage system that uses a server (or multiple servers) to store and manage large amounts of information with little maintenance |
Checksum | check-sum | A checksum is a short string of data calculated from a file's contents and stored with or alongside the file to help detect errors introduced in storage or transfer between devices; software can recalculate the checksum and compare it with the stored value to determine whether the file has been corrupted (see Data Integrity). Checksums can be embedded using the software BWF-MetaEdit. For more information on BWF-MetaEdit, please review this DiGI training module: |
Metadata | Meta-data | The data about your data or information about your information; metadata is descriptive, administrative, and technical information about your content and materials that aids in their preservation and location |
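
The Checksum entry above can be illustrated with a short Python sketch. It is only an illustration of the general idea (calculate a checksum, record it, recalculate later, compare) and is not how BWF-MetaEdit works internally. The function names and the choice of the SHA-256 algorithm are assumptions made for this example.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the checksum of a file as a hexadecimal string.

    The file is read in chunks so that large audio files do not have
    to fit in memory all at once. The algorithm is an assumption for
    this sketch; any name supported by hashlib can be passed instead.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Recalculate the checksum and compare it with the recorded value."""
    return file_checksum(path, algorithm) == recorded_checksum.lower()
```

In practice, the checksum would be calculated once when a master file is created and written down (for example, in a digitization log), then recalculated after each transfer or at regular intervals. If `is_unchanged` returns `False`, the file's contents no longer match what was originally recorded.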
Common Content Editing Terms
Term | Alternative Names | Meaning |
---|---|---|
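Bit-depth | Bit depth | The number of bits in each sample; the higher the bit-depth, the higher the quality of the audio, given the increase in information recorded. Bit-depth is sometimes confused with bit-rate (bitrate), which is the number of bits processed per unit of time |
Clipping | | The distortion or loss of audio at the top and bottom of a sound wave when the signal is louder than the recording system can capture |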
Compression | Compressing, compressor | A reduction in the amount of data contained in a file, thus reducing the overall size of the file; this is recommended for access files, and not master files, and can be done in Audacity or any other audio editing program. For more information on Audacity: |
Gain | Voltage | The strength of a signal as it passes through a device, measured and controlled through an amplifier |
GIF | Graphics Interchange Format, JIF (resembling alternate pronunciation) | A small image file type that is best suited for simple moving images |
Lossy | | A compression method that irreversibly deletes information but significantly shrinks file size |
Lossless | | A method of storing or compressing data that preserves the original data with no loss of information |
MP3 | MPEG-1 Audio Layer III, MPEG-2 Audio Layer III | A lossy, compressed audio file type used for access copies and recommended for use on websites and as shared audio. For more information on MP3 files, please review this DiGI training module: |
MP4 | MPEG-4 Part 14 | A multimedia file type, usually with lossy compression, that can store video, audio, subtitles, and also still images; this file type is often used to share videos with audio online |
Noise Reduction | Filtering | A method of removing continuous static background noise, which can be achieved in Audacity; this is often used to 'clean up' old recordings that have 'fuzzy' sounding background noise due to age, or recordings that have continuous background noise that was not eliminated from the environment at the time of the original recording (e.g. hum from electronics, rain against the window). For more information on Audacity: |
Normalization | Normalizing | The application of a steady gain or window to an audio waveform in order to bring its peaks and troughs, or its loudness, to a more balanced target level. This process can be done in the software Audacity. For more information on Audacity, please review these DiGI training modules: |
Optical Character Recognition | OCR, optical character reader | The conversion of scanned, copied, or printed text into searchable, machine-encoded characters; OCR is often an automatic process that can be run over documents once they are digitized |
Sample Rate | Project Rate, Sampling Rate, Frequency Rate | The number of samples taken from a signal per unit of time (usually per second); the higher the sample rate, the wider the range of audio frequencies that can be recorded |
Segment (verb) | | The act of saving excerpts of a larger audio file as individual, smaller files. For more information on segmenting audio in Audacity: https://firstvoices.atlassian.net/wiki/spaces/FIR1/pages/1706206 |
Segment (noun) | | A smaller recording that has been pulled from a larger recording. For more information on segmenting audio in Audacity: https://firstvoices.atlassian.net/wiki/spaces/FIR1/pages/1706206 |
TIFF | TIF, Tagged Image File Format | An uncompressed file type used for graphics and their metadata that is optimal for editing images; this format maintains visual quality for archiving but is not well suited for images hosted online |
Timestamp | Time-stamp, time-codes | The encoding of information to reference when an occurrence or event took place; transcribers might timestamp their transcriptions by line or paragraph |
Transient | Transient burst, transient spike, pop | A quick, high-amplitude sound at the beginning of a waveform that can result in clipping and lost or distorted audio |
WAV | Waveform Audio File Format, WAVE | A lossless, uncompressed audio file type used for preservation master and access master copy files; because this file type is uncompressed, it is the best suited for editing. For more information on WAV files, please review this DiGI training module: |
XML | Extensible Markup Language | A markup language that structures data so that they can be stored, transported, and read by other software; it is used for many functions, including to render orthographies across the internet and to embed checksums on files |
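
The Bit-depth, Sample Rate, and WAV entries above together determine how large an uncompressed recording will be. The following Python sketch shows the arithmetic. The function name and the default values (48,000 samples per second, 24 bits, 2 channels) are assumptions chosen for illustration only and are not DiGI recommendations; the result covers the audio data alone and ignores the small file header and any embedded metadata.

```python
def wav_data_size_bytes(duration_seconds, sample_rate=48000, bit_depth=24, channels=2):
    """Estimate the size of the audio data in an uncompressed WAV file.

    size = duration x samples per second x channels x bytes per sample
    The default settings are illustrative assumptions only.
    """
    bytes_per_sample = bit_depth // 8
    return round(duration_seconds * sample_rate * channels * bytes_per_sample)


# One minute of stereo audio at 48,000 samples per second and 24 bits:
one_minute = wav_data_size_bytes(60)
print(one_minute)  # 17280000 bytes, or about 17.3 MB

# The same minute at 44,100 samples per second and 16 bits is smaller:
print(wav_data_size_bytes(60, sample_rate=44100, bit_depth=16))  # 10584000 bytes
```

Raising either the sample rate or the bit-depth increases the information recorded and, in direct proportion, the storage space needed for master files.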
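
The Clipping, Gain, and Normalization entries above can also be sketched in Python. This is a simplified illustration of the concepts, not a description of how Audacity implements them. It assumes that audio samples are stored as numbers between -1.0 and 1.0, and the function names and the target peak of 0.9 are hypothetical choices for this example.

```python
def peak_level(samples):
    """Return the largest absolute sample value (0.0 for empty audio)."""
    return max((abs(s) for s in samples), default=0.0)


def count_clipped(samples, ceiling=1.0):
    """Count samples that sit at or beyond the maximum recordable level."""
    return sum(1 for s in samples if abs(s) >= ceiling)


def peak_normalize(samples, target_peak=0.9):
    """Apply one steady gain so that the loudest sample reaches target_peak."""
    peak = peak_level(samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_recording = [0.1, -0.5, 0.25]
louder = peak_normalize(quiet_recording)
print(peak_level(louder))                      # 0.9
print(count_clipped([0.2, 1.0, -1.0, 0.5]))    # 2
```

Because the same gain is applied to every sample, normalization changes the overall level without changing the shape of the waveform. It cannot restore audio that was already clipped during recording, which is why gain is best set carefully before the transfer begins.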