*This article is adapted from pre-existing resources located on the Digitization Knowledge Base.
Introduction
This article explains what bit rot is and how to store backups securely.
The title of this article is slightly misleading.
Bit rot is natural. Even though it might not seem like it, data can rot over time, just like a log. It is an occurrence that language team workers must account for in their asset management. In the following modules, you will learn how to better prepare for bit rot and how to make and store backups so that you can restore data that becomes corrupted.
When data corruption or bit rot affects your files, restoring them from backups saves you the extra work of re-recording any missing or corrupted audio. Especially if you are unable to work with the same Elders or language speakers again, having backups ensures that you can carry these legacies forward and keep speakers’ voices on your language site for generations to come.
What is Bit Rot?
Simply put, bit rot is data degradation: the gradual decay of data as small errors accumulate. Over a file's lifetime, changes to its bits (the smallest units of data), repeated importing and exporting, the condition of its physical and electronic storage, and even changes in electric charge can slowly alter the information within. When these changes accumulate to the point where you can no longer open or use the file, the file is corrupted.
Many newer models and tools available to project members include internal processes that prolong the lives of files and detect errors that might indicate data corruption. However, you still need to be aware of bit rot and how it can affect your data.
Certain decisions can reduce the number of environments and scenarios in which bit rot occurs.
In this module, you will explore what bit rot looks like for your data and materials.
The image above is also featured as an interactive diagram in the lesson on bit rot in the related module.
Here, you can visualize the forces that might damage your data apple or make it slightly less appealing. When it comes to securing your data and heritage materials, the concerns include thermal insulation, magnetic orientation issues, decomposition and containment errors, and even biological hazards.
If your storage system or location gets too hot or too cold, has been through many power surges over the years, or has become home to pests such as bugs or even to mold, then bit rot can occur much faster than it would in better storage conditions.
Learn more about these considerations in the lesson within the interactive module below.
Within the module, there are also suggestions on how to address bit rot when it surfaces in your collections.
In general, the single best remedy for bit rot is to prepare backups that can replace corrupted or changed files.
Another tool in your repertoire for fighting bit rot is the checksum.
A checksum is a short value calculated from a file's contents and stored with or alongside the file. If even a small part of the contents or internal structure changes, recalculating the checksum gives a different value, so comparing the two reveals the change. Checksums can clue you in to when bit rot might be occurring in entries and recordings before they become unusable. For more information on embedding checksums in your WAV files, please see this link to FV Checksum Article (once published).
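To make the idea of a checksum more concrete, here is a minimal Python sketch. It is only an illustration, not the method from the FV Checksum Article: it assumes the checksum is recorded separately (for example in a spreadsheet or text file) rather than embedded in the WAV file, it uses the SHA-256 algorithm from Python's standard library, and the function names `file_checksum` and `has_changed` are made up for this example.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large audio files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, recorded_checksum):
    """Return True if the file no longer matches its recorded checksum."""
    return file_checksum(path) != recorded_checksum
```

In practice, you might record the checksum of each recording when it is first saved and run `has_changed` on a regular schedule. A result of `True` for a file nobody has edited suggests it may be time to restore that file from a backup.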
Why Is It Important to a FirstVoices Project?
Bit rot might feel like a topic that does not relate to your FirstVoices work, but it is a concern for all technology professionals, and that now includes you!
Ensuring that you have secure files means not just that the correct people have access to them, but also that the files they access are usable and can continue to be shared.
Long-term planning should be part of your FirstVoices project, including deciding who will continue to care for and monitor the site and its files, and possibly even continue your team's work.
Depending on when your project formally ends and another team takes over, there might be data corruption that new team members need to account for and files that they need to restore. You want to leave your site and its related content (files) in the best condition possible, with a solid roadmap to project success for the next generation. Incorporating backups and anticipating bit rot are two steps that will help future language workers maintain your community site after your work is finished.
Backup Strategies
If backups are the single best way to address bit rot, then you need a strategy for storing them and making sure they are accessible, findable, and usable later.
A backup is a copy of stored data that is saved in multiple locations and later used to restore lost or corrupted data.
How you manage and create your backups varies. Workflow, human capacity, technical capacity, and storage space are huge factors that influence how you prepare and bundle these resources for use.
Preparing your Backups
Backups can be done either manually (by a team member every so often) or automatically (through a computer program). The more data and files you want to back up, the more likely it is that automatic backups will suit your workflow better.
Manually dragging and dropping folders, or even clicking a button to begin a backup, takes time out of your day, whereas many software programs will back up on a regular schedule for you.
Programs that will expedite the process and help you make backups quickly on your computer include:
- FBackup: free backup software for Windows with an easy-to-use interface
- Areca: free, open-source backup software for Windows and Linux
- The archiving program you are already using: check its settings and features to see whether automated backups are available!
When looking for a software program to help you back up your data, there are many free and open-source options to choose from. Reliable backup software should be an easy-to-use tool that facilitates the transfer of your information.
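To show roughly what an automated backup does behind the scenes, here is a minimal Python sketch. It is not how any of the programs above work internally; it simply copies one folder into a date-stamped folder at each of several locations. The function name `back_up` and the folder paths in the usage note are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up(source, destinations, label=None):
    """Copy the source folder into a dated folder at each destination."""
    source = Path(source)
    # A date-stamped name keeps earlier backups from being overwritten.
    label = label or datetime.now().strftime("%Y-%m-%d_%H%M%S")
    created = []
    for destination in destinations:
        target = Path(destination) / f"{source.name}_{label}"
        # copytree raises FileExistsError if the target folder already exists.
        shutil.copytree(source, target)
        created.append(target)
    return created
```

A call such as `back_up("recordings", ["/mnt/external_drive/backups", "/mnt/network_share/backups"])` (both paths are placeholders) would create one copy in each location. To run it on a regular schedule, a script like this would still need to be started by your operating system's scheduler or by a dedicated backup program.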
Administrative Considerations
After completing your backups, there are next steps to take to ensure that these resources are usable and maintained for the future.
Some guiding questions to consider before thinking about storage strategies include:
- Is your labeling and naming system consistent across documents and files?
- Are the folders in your backups easy to navigate?
It is important to keep in mind who is going to be restoring backups and how long these backups will exist before being updated. If you leave the organization or pass your responsibilities on to someone else, it will be important to leave a guide or to make the internal information structure of your backups (e.g. file naming and organization) understandable for the next person.
You do not want to structure your data and internal organization in a way that only you can understand.
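One way to check naming consistency before handing backups over is to scan a folder for file names that do not follow your team's convention. The Python sketch below is only an illustration: the pattern in `NAME_PATTERN` (lowercase letters, digits, hyphens, and underscores, followed by a file extension) is a made-up convention, and your team should substitute its own rules, including any characters your language's writing system needs.

```python
import re
from pathlib import Path

# Hypothetical convention: lowercase letters, digits, hyphens and
# underscores, then a lowercase extension (e.g. "elder-01_greeting.wav").
NAME_PATTERN = re.compile(r"[a-z0-9_-]+\.[a-z0-9]+")


def inconsistent_names(folder, pattern=NAME_PATTERN):
    """Return files under folder whose names do not fit the pattern."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and not pattern.fullmatch(path.name)
    )
```

Running `inconsistent_names` on a backup folder gives a list of files to rename or to explain in the guide you leave for the next person.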
A Backup Scenario
In the module below, there are scenarios that present storage plans with various pros and cons.
In these interactive images, you can peruse and explore what aspects might strengthen your backup storage and what might weaken it.
One (non-interactive) image example is below:
Some key points to remember in any given scenario and backup strategy are:
- You cannot completely prevent bit rot; you can only anticipate it and lessen its effects
- Backups must be stored in multiple locations to be useful and to be considered true backups
- The format and location in which you store your backups should not damage the contents or have long-term effects on them
Within this lesson, you will be presented with more backup scenarios like the one above and, at the end, be able to decide which one is the most feasible, useful, and secure for storing backups.
Ultimately, whatever your backup strategy turns out to be, we recommend combining storage devices that are easy to use for transferring information between FirstVoices team members with devices that are stable for long-term storage. Your backup strategy is also likely to change over time. It all depends on your current workflow and capacity.
Link to Interactive Module (Originally designed for Digitization projects)
Please follow this link to take you to both lessons within the interactive module:
For a detailed discussion of these topics that also makes use of the educational materials showcased within the module, please see the March 2021 DiGI Webinar on the topic of Backups & Bit Rot here: Join or watch Digitization Webinars.