Implementing Periodic Backups on UCloud

Protecting data from accidental deletion or corruption is paramount, especially in dynamic environments like cloud platforms. On UCloud, data removed from the /work directory during job execution is permanently deleted without being moved to the Trash folder.

To mitigate the risk of data loss, setting up periodic backups is a strategic approach. This guide introduces the use of Restic, a robust backup solution, and outlines steps to automate data backups securely and efficiently.

Introduction to Restic

Restic is a modern backup program that takes fast, secure, incremental snapshots of your data. It stores them in encrypted repositories, so your backups are protected against unauthorized access. For details, see the official Restic documentation.

Setting up Restic

We provide a streamlined script to facilitate the Restic setup and simplify the scheduling of periodic backups. This script automates the installation of necessary components, configures the environment, and schedules backups using cron jobs.

Creating the setup script

The full wrapper script is:

#!/bin/bash

# Function to install Restic and cron, and configure Restic
set_up_restic () {
        (sudo apt-get update; \
         sudo apt-get install -y cron restic; \
         sudo restic self-update; \
         sudo restic generate --bash-completion /etc/bash_completion.d/restic; \
         sudo cron) >/dev/null 2>&1
}

# Function to update the crontab with new jobs
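update_crontab () {
        # Drop any identical existing entry first, so re-running the script does not duplicate the job
        (crontab -l 2>/dev/null | grep -vxF -- "$1"; echo "$1") | crontab -
}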

# Parse command-line arguments
while [[ $# -gt 0 ]]; do
    case "$1" in
        backup|restore)
            ACTION="$1"
            shift
            ;;
        -r)
            RESTIC_REPOSITORY="$2"
            shift 2
            ;;
        -s)
            RESTIC_SOURCE="$2"
            shift 2
            ;;
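        -p)
            RESTIC_PASSWORD_FILE="$2"
            shift 2
            ;;
        -t)
            RESTIC_TARGET="$2"
            shift 2
            ;;
        *)
            EXTRA_ARGS+=("$1")
            shift
            ;;
    esac
done

# Validate required inputs
if [[ -z "${ACTION}" ]]; then
    echo "Error: No action specified (backup or restore required)." >&2
    exit 1
elif [[ -z "$RESTIC_REPOSITORY" || -z "$RESTIC_PASSWORD_FILE" ]]; then
    echo "Error: Missing required parameters. Please specify the repository (-r) and password file (-p)." >&2
    exit 1
elif [[ "$ACTION" == "backup" && -z "$RESTIC_SOURCE" ]]; then
    echo "Error: backup requires a source folder (-s)." >&2
    exit 1
elif [[ "$ACTION" == "restore" && -z "$RESTIC_TARGET" ]]; then
    echo "Error: restore requires a target folder (-t)." >&2
    exit 1
fi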

# Ensure absolute paths are used
export RESTIC_REPOSITORY=$(readlink -f "$RESTIC_REPOSITORY")
export RESTIC_PASSWORD_FILE=$(readlink -f "$RESTIC_PASSWORD_FILE")

# Add environment variables to bashrc
echo "export RESTIC_REPOSITORY=$RESTIC_REPOSITORY" >> "$HOME/.bashrc"
echo "export RESTIC_PASSWORD_FILE=$RESTIC_PASSWORD_FILE" >> "$HOME/.bashrc"

# Setup Restic and initialize repository if necessary
set_up_restic
restic snapshots >/dev/null 2>&1 || restic init

# Perform backup or restoration based on user input
if [[ ${ACTION} == "backup" ]]; then
    export RESTIC_SOURCE=$(readlink -f "$RESTIC_SOURCE")
    restic --password-file "$RESTIC_PASSWORD_FILE" -r "$RESTIC_REPOSITORY" backup --host UCloud --verbose "$RESTIC_SOURCE" "${EXTRA_ARGS[@]}"

    # Schedule periodic backup job (e.g., every 2 hours)
    BACKUP_JOB="0 */2 * * * restic --password-file $RESTIC_PASSWORD_FILE -r $RESTIC_REPOSITORY backup --host UCloud --verbose $RESTIC_SOURCE ${EXTRA_ARGS[@]} >> /work/backup.log"
    update_crontab "$BACKUP_JOB"
    echo "Backup scheduled successfully."
    crontab -l
elif [[ ${ACTION} == "restore" ]]; then
    restic -r "$RESTIC_REPOSITORY" restore latest --password-file "$RESTIC_PASSWORD_FILE" --target "$RESTIC_TARGET" --host UCloud "${EXTRA_ARGS[@]}"
    echo "Data restoration initiated."
fi

The script installs Restic and schedules a backup every 2 hours on a given source folder.

To use the script, open a terminal window inside a running job instance on UCloud and create a new file called restic_wrapper.sh. For example, with the GNU nano editor:

$ nano /work/restic_wrapper.sh

Copy and paste the content of the script above and save the file (Ctrl+O, Enter, and Ctrl+X to exit nano).

Finally, grant execution permissions:

$ chmod +x /work/restic_wrapper.sh
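
The backup and restore commands below expect a password file, which the wrapper script does not create. The following is a minimal sketch of one way to generate it. The function name create_password_file and the path in the usage note are illustrative assumptions, not part of the wrapper script.

```shell
#!/bin/bash

# Hypothetical helper: write a random passphrase to a file that only its owner can read.
# It refuses to overwrite an existing file, because replacing the passphrase of an
# existing repository would make that repository unreadable.
create_password_file () {
    local file="$1"
    if [[ -e "$file" ]]; then
        echo "Error: $file already exists, not overwriting." >&2
        return 1
    fi
    (
        umask 077                                    # create the file with mode 600
        head -c 32 /dev/urandom | base64 > "$file"   # 32 random bytes, base64-encoded
    )
}
```

For example, create_password_file /work/restic_password.txt creates the file once; a second call with the same path fails and leaves the file unchanged. Keep a copy of the passphrase somewhere outside the job, since the repository cannot be opened without it.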

Backup and recovery usage

1. Backup

To create a backup, specify the repository, source folder, and password file. The script will initialize the repository if it does not exist and schedule periodic backups.

Example command:

$ /work/restic_wrapper.sh backup -r /path/to/repository -s /path/to/source -p /path/to/password_file.txt

Example output:

created restic repository 56afb44a78 at /path/to/repository

Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.
open repository
repository 56afb44a opened (version 2, compression level auto)
created new cache in /home/ucloud/.cache/restic
lock repository
no parent snapshot found, will read all files
load index files

start scan on [/path/to/source]
start backup on [/path/to/source]
scan finished in 0.223s: 100 files, 4.395 KiB

Files:         100 new,           0 changed,           0 unmodified
Dirs:            3 new,           0 changed,           0 unmodified
Data Blobs:    100 new
Tree Blobs:      4 new
Added to the repository: 54.954 KiB (19.131 KiB stored)

processed 100 files, 4.395 KiB in 0:00
snapshot 94f26cd0 saved
Backup scheduled successfully.

2. Recovery

To restore from the latest snapshot, specify the repository, target folder, and password file.

Example command:

$ /work/restic_wrapper.sh restore -r /path/to/repository -t /path/to/target -p /path/to/password_file.txt

Example output:

repository 56afb44a opened (version 2, compression level auto)

restoring <Snapshot 94f26cd0 of [/path/to/source] at 2024-03-08 09:12:32.333577672 +0100 CET by ucloud@UCloud> to /path/to/target
Summary: Restored 103 files/dirs (4.395 KiB) in 0:00
Data restoration initiated.
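
After a restore it can be worth checking that the restored files match the originals. The sketch below assumes Restic's default behaviour of recreating the snapshot's full original path below the target folder, so that /path/to/source ends up in /path/to/target/path/to/source. The function name verify_restore is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: compare a source folder with its restored copy.
# Assumes the restore recreated the absolute source path below the target.
verify_restore () {
    local src target
    src=$(readlink -f "$1")
    target=$(readlink -f "$2")
    # diff exits with status 0 only when both trees have identical content
    diff -r "$src" "${target}${src}"
}
```

For example, verify_restore /path/to/source /path/to/target prints nothing and returns 0 when the two trees are identical; otherwise it lists the differing files and returns a non-zero status.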

Customizing backup frequency

Backups are scheduled with cron. To adapt the backup frequency to your needs, adjust the cron schedule within the script. See the cron documentation for detailed guidance on schedule expressions.

For instance, to run a backup every hour, change the schedule at the start of the BACKUP_JOB line in the script so that the resulting cron entry reads:

# Schedule a backup to run every hour
0 * * * * restic --password-file $RESTIC_PASSWORD_FILE -r $RESTIC_REPOSITORY backup --host UCloud --verbose $RESTIC_SOURCE ${EXTRA_ARGS[@]} >> /work/backup.log

Alternatively, for more frequent backups, such as every 30 minutes, the line should be adjusted to:

# Schedule a backup to run every 30 minutes
*/30 * * * * restic --password-file $RESTIC_PASSWORD_FILE -r $RESTIC_REPOSITORY backup --host UCloud --verbose $RESTIC_SOURCE ${EXTRA_ARGS[@]} >> /work/backup.log

Both examples shorten the interval relative to the default of 2 hours. To back up less often, use a longer interval instead, for example 0 */6 * * * for a backup every 6 hours.
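
Editing the script only affects jobs scheduled afterwards; an entry that is already installed in the crontab keeps its old schedule. The following is a minimal sketch of one way to change such an entry in place. The function name reschedule_restic_jobs is hypothetical, and the matching rule (any active line containing "restic" and " backup ") is an assumption about what the crontab contains.

```shell
#!/bin/bash

# Hypothetical helper: rewrite the schedule of restic backup entries in a crontab listing.
# Reads a crontab from stdin and writes the modified crontab to stdout.
# Usage: reschedule_restic_jobs "<five-field cron schedule>"
reschedule_restic_jobs () {
    local schedule="$1"
    awk -v sched="$schedule" '
        # Only touch active (non-comment) lines that run a restic backup
        !/^[ \t]*#/ && /restic/ && / backup / {
            cmd = $0
            # Strip the five leading schedule fields, keeping the command untouched
            for (i = 1; i <= 5; i++) sub(/^[ \t]*[^ \t]+[ \t]+/, "", cmd)
            print sched " " cmd
            next
        }
        { print }
    '
}
```

For example, crontab -l | reschedule_restic_jobs "0 * * * *" | crontab - switches the installed backup job to an hourly schedule and leaves all other entries as they are.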

Utilizing advanced options

Restic offers several advanced options to tailor backup and restoration processes to your specific needs. These options include flags for file change detection, excluding or including files, and managing snapshots. Here's how to use some of these advanced features in conjunction with the provided script.

1. File Change Detection

By default, Restic decides which files to re-read by comparing their metadata (such as modification time and size) with the previous snapshot. The --force flag makes Restic re-read every file in the source folder instead. This is slower, but useful when files may have changed without their metadata changing.

Example command:

$ /work/restic_wrapper.sh backup -r /path/to/my/repository -s /path/to/my/source -p /path/to/my/password_file.txt --force

2. Excluding Files

If you want to avoid backing up certain files or directories, the --exclude option can be utilized. This is particularly useful for skipping large, temporary, or sensitive files that do not need to be included in the backup.

Example command:

$ /work/restic_wrapper.sh backup -r /path/to/my/repository -s /path/to/my/source -p /path/to/my/password_file.txt --exclude /path/to/file_or_directory
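
When several patterns are needed, Restic's --exclude-file option reads them from a file, one pattern per line. The sketch below writes such a file; the function name write_exclude_file and the patterns themselves are illustrative assumptions.

```shell
#!/bin/bash

# Hypothetical helper: write a set of example exclude patterns to the given file.
write_exclude_file () {
    local file="$1"
    cat > "$file" <<'EOF'
# Lines starting with # are ignored by restic
*.tmp
__pycache__
/path/to/my/source/scratch
EOF
}
```

After write_exclude_file /work/restic_excludes.txt, append --exclude-file /work/restic_excludes.txt to the wrapper command. Use an absolute path, because the wrapper copies extra arguments into the cron entry and cron does not run from your working directory.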

3. Including Files

Conversely, if you want to back up specific files or directories that are not located within the source directory, you can list them in a text file, one path or pattern per line, and pass that file with the --files-from option. This allows for more granular control over what gets backed up, enabling the addition of important files from various locations.

Example command:

$ /work/restic_wrapper.sh backup -r /path/to/my/repository -s /path/to/my/source -p /path/to/my/password_file.txt --files-from /path/to/include_list.txt

4. Managing Snapshots

For more control over which snapshot to restore, or to manage snapshots manually, use the restic snapshots, restic restore, and restic forget commands directly.

  • Getting the list of all the repository snapshots:

    $ restic snapshots -r /path/to/my/repository --password-file /path/to/my/password_file.txt
    
  • Restoring a specific snapshot:

    $ restic -r /path/to/my/repository restore <snapshot_id> --password-file /path/to/my/password_file.txt --target /path/to/my/target --host UCloud
    
  • Using a policy to forget (delete) snapshots:

    $ restic -r /path/to/my/repository forget --keep-last 10 --password-file /path/to/my/password_file.txt
    

    This command keeps the last 10 snapshots and removes the rest. Note that forget only removes the snapshot records; to free the disk space they used, add --prune to the command or run restic prune afterwards.
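
With backups running every 2 hours, snapshots accumulate quickly, so it may help to schedule the clean-up as well. The following sketch prints a crontab line for a daily forget run. The function name build_forget_job, the 03:30 schedule, the retention values, and the log path /work/forget.log are all assumptions to adapt.

```shell
#!/bin/bash

# Hypothetical helper: print a crontab line that trims old snapshots once a day.
# Usage: build_forget_job <repository> <password_file>
build_forget_job () {
    local repository="$1" password_file="$2"
    # Daily at 03:30: keep 24 hourly, 7 daily and 4 weekly snapshots, then prune unused data
    echo "30 3 * * * restic -r $repository --password-file $password_file forget --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --prune >> /work/forget.log 2>&1"
}
```

The line can be installed in the same way the wrapper script installs its backup job, for example (crontab -l 2>/dev/null; build_forget_job /path/to/my/repository /path/to/my/password_file.txt) | crontab -.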

Alternative backup solutions

While Restic offers a robust and encrypted backup solution, other tools like Rsync and Borg can also be utilized for data backup, each with its unique features and use cases.

  • Rsync: Ideal for incremental file transfers and backups. It does not encrypt data, allowing direct access and modification of backup files. For more details, see the Rsync guide.

  • Borg: Similar to Restic, Borg provides efficient and secure incremental backups. For more information, visit Borg's official website.