File management best practices

 


Introduction


When storing your files, especially recordings of speakers, it is important to have a well-organized and consistent file management system in place. This helps support your workflow and makes sure that someone who inherits your data will be able to look through and use your files.

 


File naming convention


The first step is to set up a file naming convention.

This will be the rule you follow when you name different types of files. The most important thing when determining a file naming convention is consistency. If you are consistent when naming files, it will be easier to organize them and find them.

The following are examples of file naming conventions that we recommend using.

Master Files​

[Speaker’s Initials]_[Date]_[Session number/Topic].wav

Example: KF_2019-08-20_Greetings.wav​

Word Files

[Speaker’s Initials]_[Date]_[Word/Phrase].wav​

Example: KF_2019-08-20_Hello.wav​

File names are sensitive to certain characters. Our system cannot process file names that contain special characters (e.g. characters like č, ł, ā), spaces, or punctuation other than - and _.
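The convention above can be applied with a small helper. The following is a minimal sketch, not part of our system: the function name build_filename and its parameters are hypothetical, and it assumes that the allowed characters are unaccented Latin letters, digits, - and _.

```python
import re
from datetime import date

# Assumption: only unaccented Latin letters, digits, "-" and "_" are allowed.
_ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def build_filename(initials: str, recorded: date, label: str, ext: str = "wav") -> str:
    """Return '[Initials]_[Date]_[Label].ext', rejecting disallowed characters."""
    # date.isoformat() gives YYYY-MM-DD, the date style used in the examples above.
    stem = f"{initials}_{recorded.isoformat()}_{label}"
    if not _ALLOWED.fullmatch(stem):
        raise ValueError(f"Disallowed character in file name: {stem!r}")
    return f"{stem}.{ext}"


print(build_filename("KF", date(2019, 8, 20), "Greetings"))  # KF_2019-08-20_Greetings.wav
```

Building every name through one function is one way to keep the convention consistent, since a name with a space or a special character is rejected before the file is saved.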

File Naming FAQ

Question (Q)

Answer (A)

Should my filenames be in English or my language?

This is up to you. Some teams prefer to use English to avoid special characters (see next question).

How can I write my file names in my language if I can't use special characters?

Work with your team to agree on character conversion conventions. For every special character you need in a file name, choose a replacement that uses only unaccented Latin letters, - and _, and use that replacement consistently when writing your file names.

Examples:

  Ł → l_

  ƛ̓ → tl-

  č → cv

  ā → aa

Why should I include the speaker's name and the date?

This information provides context for others working with your files (this could be others on your team or someone who inherits your data in the future). This is important for archival reasons too.

It also helps to maintain the legacy of those who have contributed to language work in your community, ensuring that future language workers can track who worked on the project and when.
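The character conversion conventions described in the FAQ above can be kept in one shared table so that everyone on the team converts names the same way. The sketch below is only an illustration: the CONVERSIONS table holds the four example pairs from the FAQ, the function name to_safe_name is hypothetical, and applying each replacement to both upper- and lower-case letters is an assumption your team may not want.

```python
import unicodedata

# Example conventions from the FAQ above; a real table would be agreed on by your team.
CONVERSIONS = {
    "\u0141": "l_",         # Ł
    "\u019b\u0313": "tl-",  # ƛ followed by a combining comma above
    "\u010d": "cv",         # č
    "\u0101": "aa",         # ā
}


def to_safe_name(text: str) -> str:
    """Replace each special character with its agreed plain stand-in."""
    # NFC makes letters typed as "base letter + combining accent" match the table keys.
    text = unicodedata.normalize("NFC", text)
    # Longest keys first, so multi-character sequences are replaced before their parts.
    for special in sorted(CONVERSIONS, key=len, reverse=True):
        plain = CONVERSIONS[special]
        key = unicodedata.normalize("NFC", special)
        # Assumption: the same replacement applies to lower- and upper-case forms.
        for form in {key, key.lower(), key.upper()}:
            text = text.replace(form, plain)
    return text
```

For example, to_safe_name("\u010d\u0101") returns "cvaa". Any character that is not in the table is left unchanged, so the result should still be checked against the file naming rules before use.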

 


Set up a file management system


When you begin gathering files, set up an easy-to-follow file management system so that you can organize and find your files quickly.

We recommend managing your files within folders that are labeled and organized as follows:

 

LANGUAGE > SPEAKER > DATE > WORDS/MASTERFILES​

Example: HEBREW > John Smith > 2019-08-20 > Words
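If you create these folders often, the layout can be generated by a short script. This is a minimal sketch under stated assumptions: the function name session_folders is hypothetical, and the subfolder names "Words" and "Masterfiles" are taken from the pattern above. The demonstration runs inside a temporary folder so that nothing is written to your real storage.

```python
import tempfile
from pathlib import Path


def session_folders(root: Path, language: str, speaker: str, session_date: str) -> list[Path]:
    """Create LANGUAGE/SPEAKER/DATE/{Words,Masterfiles} under root and return both paths."""
    base = root / language / speaker / session_date
    created = []
    for kind in ("Words", "Masterfiles"):
        folder = base / kind
        # parents=True builds the whole chain; exist_ok=True makes re-runs harmless.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for folder in session_folders(Path(tmp), "HEBREW", "John Smith", "2019-08-20"):
        print(folder.relative_to(tmp))
```

To use it on real data, pass the top-level folder of your recordings as root instead of the temporary directory.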

 

This is what a fully set up file management system, with files and folders named appropriately, may look like:

 


Set up a data storage system


Back up your data in a second location, separate from your primary workstation, so that secure copies are always archived and maintained.

Your long-term data storage plan should include both hard drive and cloud storage. Hard drive storage (such as an external hard drive or the hard drive of a computer) and cloud storage (such as server storage run by your Band) have different vulnerabilities. Using both reduces the chance that files will be lost in a data disaster and increases the likelihood that you can still access your files if one of the storage systems becomes obsolete before the other.

What is a data disaster?

We refer to any event that causes data loss or compromise as a data disaster. 

Data Disaster

Vulnerability

Solution

Flood

Hard drive storage is vulnerable to flooding if the equipment becomes wet.
Cloud storage is vulnerable to flood if your organization is self-hosting the servers and the servers become wet.

Geo-redundancy: have multiple back-ups in multiple locations.

Secure storage: store external hard drives in water-proof containers (off the floor).

Fire

Hard drive storage is vulnerable to fire if the equipment is damaged by the fire itself or becomes wet from extinguishing fluids.
Cloud storage is vulnerable to fire if your organization is self-hosting the servers and the servers are burned or overheated.

Geo-redundancy: have multiple hard drives in multiple locations.

Secure storage: store external hard drives in fire-proof and temperature-controlled containers.

Theft

Computers and external hard drives are vulnerable to theft.

Geo-redundancy: use cloud storage or server storage as a redundant back-up to ensure that copies of your files are retained in the event of theft.

Secure storage: store external hard drives in locked containers.

Ransomware

Computers connected to the internet are vulnerable to ransomware.
Cloud storage and server storage are vulnerable to ransomware.

Anti-malware software: equip all computers and servers with anti-malware software and keep all software programs updated to the latest versions, to reduce the likelihood of ransomware and other malware.

Geo-redundancy: use hard drive storage as a redundant back-up to ensure that copies of your files are retained in the event of a malware attack.

Bit rot

All forms of data storage are vulnerable to bit rot.

Checksums: run checksums on a schedule to detect files that have been altered by bit rot.

Summary of recommendations

  1. Make your back-ups geo-redundant by having multiple back-ups in different locations and different formats.

  2. Store external hard drives in locked, temperature-controlled, and waterproof containers.

  3. Equip all computers and servers with anti-malware software and keep all software programs up to date.

  4. Create a schedule for saving backups of new work and running checksums.
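Running checksums, as in recommendation 4, means recording a fingerprint of each file and later comparing the files against that record. The sketch below is one possible approach, not a description of our system: it assumes SHA-256 as the checksum algorithm, and the function names sha256_of, build_manifest, and changed_files are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or no longer match the saved manifest."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items() if current.get(name) != digest)
```

In practice, the manifest would be saved alongside each backup when new work is added, and changed_files would be run against it on the schedule your team chooses. A file that appears in the result can then be restored from another back-up copy.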