Table of Contents

...

Introduction

...

Data is central to any technology-based project. It is the building block, and the information itself, that you share and use to exchange language.

...

To make sure our tools and projects continue to run smoothly and remain accessible, there needs to be stewardship and continued action to ensure the gears keep turning (so to speak).

The following article outlines ways to think about data and plan a strategy to maintain and record information about data (also known as metadata). At every step of your project, you can address and work with data differently, depending on how you plan to use it.

Info

Frequently Used Terms

In this article, there are many terms and themes around data and technology. Some of them might be more common than others.

If you are unsure about any wording or terms, please consult these resources that describe many words relating to language and technology that would be considered 'jargon' or specialized vocabulary:

...

Recording Practices

...

Recording audio is an interactive and often fun task involved with FirstVoices projects and other language initiatives.

Some ways to include data maintenance in your recording practices are:

  • Stating the time, date, place, environment, people with you, and session number at the start of your recording

This information is a type of metadata and will be helpful for you when reviewing the audio later.

  • Assessing your devices and the recording standards you are using

The type of device you are using to record will influence how detailed your recordings will be, which relates to how much data you collect.

...

A good standard is a sample rate of 48,000 Hz (48 kHz) and a bit depth of 24 bits, which you can set in recording software such as Audacity.
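To see how these settings relate to the amount of data you collect, the short Python sketch below estimates the size of uncompressed audio. It assumes plain PCM audio (as in a WAV file) and ignores the small file header; the function name `pcm_bytes` is only illustrative.

```python
def pcm_bytes(seconds, sample_rate_hz=48_000, bit_depth=24, channels=1):
    """Estimate the size in bytes of uncompressed PCM audio (header not included)."""
    bytes_per_sample = bit_depth // 8  # 24 bits = 3 bytes
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


one_minute_mono = pcm_bytes(60)
print(f"One minute of mono audio: {one_minute_mono / 1_000_000:.2f} MB")
# prints: One minute of mono audio: 8.64 MB
```

At these settings, an hour of stereo recording comes to roughly 1 GB, which is worth keeping in mind when you plan storage and backups.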

Related resources to this topic can be found here:

...

Labelling

...

Labelling might not seem like a big deal. However, once you start recording lots of audio and need to store and find individual recordings out of thousands, labelling is helpful in your workflow.

...

File naming conventions are a type of metadata. The speaker, date of recording, session, topic, and file type are all examples of information you can include in a file name.
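As one possible illustration, the Python sketch below builds and reads back file names that follow a fixed pattern. The pattern itself (speaker code, date, session number, topic, separated by underscores) and the function names are assumptions made for this example, not a FirstVoices requirement; adapt them to whatever convention your team agrees on.

```python
from datetime import date


def build_filename(speaker, recorded_on, session, topic, extension="wav"):
    """Join the identifying fields into a single file name."""
    # Use hyphens inside the topic so underscores only ever separate fields
    safe_topic = topic.strip().lower().replace(" ", "-")
    return f"{speaker}_{recorded_on.isoformat()}_s{session:02d}_{safe_topic}.{extension}"


def parse_filename(filename):
    """Recover the fields from a name made by build_filename."""
    stem, extension = filename.rsplit(".", 1)
    speaker, recorded_on, session, topic = stem.split("_")
    return {
        "speaker": speaker,
        "date": date.fromisoformat(recorded_on),
        "session": int(session[1:]),  # drop the leading "s"
        "topic": topic,
        "filetype": extension,
    }


print(build_filename("AB", date(2021, 8, 15), 3, "Animal names"))
# prints: AB_2021-08-15_s03_animal-names.wav
```

Writing the date as year-month-day means that sorting files by name also sorts them by recording date.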

How you arrange and track this recording information is another aspect of data management and maintenance. We can call this our audio inventory.
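An audio inventory can be as simple as a spreadsheet with one row per recording. The Python sketch below writes and searches such an inventory as a CSV file, which opens in most spreadsheet programs. The column names and function names are assumptions chosen for this example.

```python
import csv

# Hypothetical column names; use whatever fields your project tracks
FIELDS = ["filename", "speaker", "date", "session", "topic"]


def write_inventory(path, rows):
    """Save a list of recording descriptions (dicts) as a CSV inventory."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def find_by_speaker(path, speaker):
    """Return the file names of every recording made with one speaker."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["filename"] for row in csv.DictReader(f) if row["speaker"] == speaker]
```

Saving the file as UTF-8 keeps characters from your language's writing system intact in topics and names.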

Related resources to this topic can be found here:

...

Checksums

...

Checksums are another aspect of data that is often used in archiving. They are also helpful to adopt into your FirstVoices project. To use checksums, you will need to download and install additional software; we recommend BWF - MetaEdit.

Checksums are short pieces of data, calculated from a file's contents, that tell you if there have been changes in your file. These changes can occur for many reasons, including gradually over time as storage media degrade.
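To illustrate the idea, the Python sketch below calculates a checksum for a whole file using the standard `hashlib` module. This is a general demonstration and not how BWF - MetaEdit works internally; the choice of SHA-256 and the function name `file_checksum` are assumptions for this example.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the checksum of a file as a string of hexadecimal characters."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so that large audio files do not fill up memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If even one byte of the file changes, the checksum comes out completely different, which is what makes it a reliable signal that something has happened to the file.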

...

One important step in using checksums is reviewing and monitoring changes. It is important to establish a procedure and schedule for running BWF - MetaEdit on your files again to check on them. The software will also tell you if there have been changes to files whose checksums have already been embedded.
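The review step can be pictured as comparing today's checksums against a list saved earlier. The Python sketch below is a minimal illustration of that comparison, assuming WAV files in one folder and a list of checksums kept in a separate JSON file; the function names and the JSON format are hypothetical, and BWF - MetaEdit stores its checksums differently (embedded in the files themselves).

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of one file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_manifest(folder, manifest_path):
    """Record the current checksum of every WAV file in the folder."""
    checksums = {p.name: sha256_of(p) for p in sorted(Path(folder).glob("*.wav"))}
    Path(manifest_path).write_text(json.dumps(checksums, indent=2), encoding="utf-8")
    return checksums


def changed_files(folder, manifest_path):
    """List files that are missing or no longer match their recorded checksum."""
    recorded = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    problems = []
    for name, old_checksum in recorded.items():
        path = Path(folder) / name
        if not path.exists() or sha256_of(path) != old_checksum:
            problems.append(name)
    return problems
```

Running a check like `changed_files` on a regular schedule, for example monthly, gives you a list of files to restore from backup before the damage spreads to your other copies.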

Related resources to this topic can be found here:

...

Backups & Storage

...

Backups are copies of your data that are stored in multiple settings or locations to be used in a pinch and in case of data loss.

Depending on the amount of data you are backing up, it may be faster to enable automatic backups. Some software programs like ELAN, which is used for transcribing audio and video, can back up automatically if you change the settings within them.

Info

Some programs that will expedite the process and help you make backups quickly on your computer include:

  • FBackup: free backup software

...

  • Areca: free, open-source backup software

...

A backup is also only useful if it has copies of itself and if its storage space is protected.

...

You will need to make multiple copies to strengthen your backup plan. Three backups is a good number to have.
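For readers comfortable with a little scripting, the Python sketch below copies one file to several backup locations and confirms that each copy matches the original. The function name and the idea of passing a list of destination folders are assumptions for this example; dedicated backup software handles scheduling and whole folders for you.

```python
import filecmp
import shutil
from pathlib import Path


def back_up(source, destinations):
    """Copy one file into each destination folder and verify every copy."""
    source = Path(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also keeps the file's timestamps
        # Compare contents byte by byte instead of trusting the copy
        if not filecmp.cmp(source, target, shallow=False):
            raise IOError(f"Copy at {target} does not match the original")
        copies.append(target)
    return copies
```

In practice the three destinations would sit on different devices, such as an external drive, a server, and an off-site or cloud location, so that one failure cannot take out every copy.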

Type of Storage

The type of storage includes the settings and devices you store data on. Some cheaper options include flash drives and portable hard drives, which can store some data, but not as much as a server or cloud storage.

However, even with some of these more expensive, larger storage solutions, there are other factors to consider. If a cloud service has its servers located outside of Canada, the data stored there may be subject to the laws of another country. If you have your own server for storage, be sure that it has the latest software and updates to reduce its vulnerability to hackers or ransomware.

...

These storage considerations also relate to data security. Keep your data secure by applying passwords to storage systems and by asking IT professionals in your community or organization (if applicable) whether there are other measures that the Band takes to prevent data loss or theft.

Combining methods such as being aware of phishing scams, adding passwords, and using 2-step verification helps protect your data.

...

If you are in a location that is at risk of flooding or wildfires, investing in a waterproof & fireproof safe for hard drives and servers is advised. Generally, you want to store at least one backup off-site.

Keeping servers and archives in a cool, dry, temperature-controlled space is also suggested. Heat can increase the risk of data loss, and moisture with heat can lead to mould, which can also damage storage devices and materials.

...

Some third-party storage solutions may have off-site servers or charge your organization every time you access your backups. Sometimes, third-party organizations will provide you with an on-site server and free service requests as part of your subscription. If you use a third-party storage provider, be sure to ask about access to your storage and about their data management policies.

In general, having accessible backups is helpful in case you need to restore files. Needing to restore files, for example because of bit rot, can be a stressful situation. You do not want to add to your stress by having to search for your backups.

Backups should always be easy to locate and available to language technology workers for their use.

Related resources to this topic can be found here:

...

Check Before You Tech

...

Another resource you can use to evaluate and consider your options before beginning your language technology project is the Check Before You Tech document from the First Peoples’ Cultural Council.

...

Read more here: Check Before You Tech

File: FPCC-Check-Before-You-Tech.pdf

...

Data Maintenance Checklist

The following checklist will also help you organize and maintain your data for your FirstVoices or language technology project.

Download the PDF here:

File: Data Maintenance Checklist-AUG-2021.pdf

...