Python-Based Backup Script for Linux

Introduction
Here at CDOT, our backup solution had become archaic and hard to expand on, so I decided to write a new one that runs from a single computer and backs up our entire infrastructure. As I write this, the script is not finished, but it works and is usable as a replacement for our previous system. A warning: this method of backing up across systems is not very secure, because it requires giving some users passwordless (NOPASSWD) sudo access to some or all programs. I am looking for a way around this and would appreciate any input on the matter.

Here is a copy of the script: smart-bk.py

Goals
I kept a few goals in mind for this script:
– Script resides on a single computer (complete)
– Do not run multiple backups using the same hard drive (complete)
– Check space requirements on source and destination before performing a backup (in progress)
– Emails out daily reports on success or failure (not complete)
– Logs all information to /var/log/smart-bk/ (complete)
– Easy(ish) to add a new backup schedule (complete)
– Can view all backups that are currently running (complete)
– Can view all the backups in the queue to run (complete)
– Can view all the schedules that are added (complete)
– Keeps a record of all previously run backups (not complete)
– Website to view status of currently running backups (not complete)

Not all of these goals have been completed yet, but I would like to finish them eventually. For now, I am documenting how the script currently works, what it is missing, and what my next steps will be.

Scheduler System
The core of the script is a scheduler. A person or another script adds the backups they would like performed to a schedule, using specific parameters. A schedule looks like this:

----------------------------------------------------------------------------------------------------
id|day|time|type|source host|dest host|source dir|dest dir|source user|dest user
----------------------------------------------------------------------------------------------------
1|06|11:00|archive|japan|bahamas|/etc/|/data/backup/japan/etc/|backup|backup

What do these fields mean?

id - This is just a unique field identifier.
day - This is the day the backup was last run. It is used to check whether the schedule is expired (in the past) or has already completed.
time - This is the time at which the backup will start. This allows you to order different schedules to happen earlier or later in the day.
type - This is the type of backup. There are currently three:
     - archive wraps the specified directory in a tar archive and compresses it with bzip2. Uses options: tar -cpjvf
     - rsync is a very simple rsync that preserves most things. Uses options: rsync -aHAXEvz
     - dbdump is a database dump; currently it is specifically a koji db backup. Uses options: pg_dump koji
source_host - This host is the target for backup. You want the files backed up from here.
dest_host - This host is your backup storage location. All files backed up will go here.
source_dir - This directory correlates to source_host. This is the directory that is backed up.
dest_dir - This directory correlates to dest_host. This is where the backup is stored.
source_user - User to use on the source host.
dest_user - User to use on the dest host.
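
To make the field list concrete, here is a minimal sketch of how one schedule row could be split into named fields and matched to the command for its type. This is not taken from smart-bk.py: the names Schedule, parse_schedule, base_command and BACKUP_COMMANDS are hypothetical, and only the commands and options quoted above are used.

```python
from dataclasses import dataclass

# Field order as shown in the schedule listing above.
FIELDS = ("id", "day", "time", "type", "source_host", "dest_host",
          "source_dir", "dest_dir", "source_user", "dest_user")

# Command and options quoted in the field descriptions, keyed by backup type.
# The real script presumably appends file names, hosts and so on; that part
# is not documented here, so it is left out.
BACKUP_COMMANDS = {
    "archive": ["tar", "-cpjvf"],
    "rsync": ["rsync", "-aHAXEvz"],
    "dbdump": ["pg_dump", "koji"],
}


@dataclass
class Schedule:
    """One row of the schedule (hypothetical container, not from sbk)."""
    id: int
    day: str
    time: str
    type: str
    source_host: str
    dest_host: str
    source_dir: str
    dest_dir: str
    source_user: str
    dest_user: str


def parse_schedule(row):
    """Split one pipe-delimited row, as printed in the listing, into fields."""
    values = row.strip().split("|")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    record = dict(zip(FIELDS, values))
    record["id"] = int(record["id"])
    return Schedule(**record)


def base_command(schedule):
    """Return the command prefix for the schedule's backup type."""
    if schedule.type not in BACKUP_COMMANDS:
        raise ValueError("unknown backup type: %s" % schedule.type)
    return list(BACKUP_COMMANDS[schedule.type])
```

For example, parse_schedule("1|06|11:00|archive|japan|bahamas|/etc/|/data/backup/japan/etc/|backup|backup") gives a Schedule whose base_command is the tar prefix.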

Database
All data for this script is stored in an SQLite 3 database:

sqlite> .schema 
CREATE TABLE Queue(scheduleid INTEGER, queuetime TEXT, FOREIGN KEY(scheduleid) REFERENCES Schedule(id));
CREATE TABLE Running(scheduleid INTEGER, starttime TEXT, FOREIGN KEY(scheduleid) REFERENCES Schedule(id));
CREATE TABLE Schedule(id INTEGER PRIMARY KEY, day TEXT, time TEXT, type TEXT, source_host TEXT, dest_host TEXT, source_dir TEXT, dest_dir TEXT, source_user TEXT, dest_user TEXT);
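
As a rough illustration of working with this schema from Python, the sketch below creates the three tables with the standard sqlite3 module, adds a schedule, and lists schedules that are neither queued nor running. The function names (open_db, add_schedule, idle_schedules) and the query are my own assumptions, not code from smart-bk.py.

```python
import sqlite3

# Same tables as the .schema output above, with Schedule created first.
SCHEMA = """
CREATE TABLE Schedule(id INTEGER PRIMARY KEY, day TEXT, time TEXT, type TEXT,
    source_host TEXT, dest_host TEXT, source_dir TEXT, dest_dir TEXT,
    source_user TEXT, dest_user TEXT);
CREATE TABLE Queue(scheduleid INTEGER, queuetime TEXT,
    FOREIGN KEY(scheduleid) REFERENCES Schedule(id));
CREATE TABLE Running(scheduleid INTEGER, starttime TEXT,
    FOREIGN KEY(scheduleid) REFERENCES Schedule(id));
"""


def open_db(path=":memory:"):
    """Open a database and create the tables (in memory by default)."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_schedule(conn, day, time, backup_type, source_host, dest_host,
                 source_dir, dest_dir, source_user, dest_user):
    """Insert one schedule and return the id SQLite assigned to it."""
    cur = conn.execute(
        "INSERT INTO Schedule(day, time, type, source_host, dest_host,"
        " source_dir, dest_dir, source_user, dest_user)"
        " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (day, time, backup_type, source_host, dest_host,
         source_dir, dest_dir, source_user, dest_user))
    conn.commit()
    return cur.lastrowid


def idle_schedules(conn):
    """Return (id, source_host, dest_host) for schedules not queued or running."""
    return conn.execute(
        "SELECT id, source_host, dest_host FROM Schedule"
        " WHERE id NOT IN (SELECT scheduleid FROM Queue)"
        " AND id NOT IN (SELECT scheduleid FROM Running)"
        " ORDER BY time, id").fetchall()
```

Parameter placeholders (?) are used instead of string formatting so that host names and paths are never interpreted as SQL.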

How To Use sbk
Checking all the available options:

[backup@bahamas ~]$ sbk -h

Output:

Usage: sbk [options]

The smart backup scheduler program sbk is used to run backups from computer to
computer. sbk does this by adding and removing schedules from a schedule
database. Once added to the schedule database, sbk should be run with '--
queue' in order to intelligently add hosts to a queue and start running
backups. It is recommended to run this as a cron job fairly often, more
fequently depending on the number of schedules.

Options:
  -h, --help          show this help message and exit
  -q, --queue         queue schedules and start backups
  -a, --add           add new schedule at specific time
  -s, --show          show the schedule and host info
  -r, --remove        remove existing schedule
  --remove-queue      remove existing schedule from queue
  --remove-run        remove existing schedule from running
  --expire            expire the day in schedule
  --add-queue         add a single schedule to queue
  --sid=scheduleid    specify schedule id for removing schedules
  --time=18:00        specify the time to run the backup
  --backup-type=type  archive, pg_dump, rsync
  --source-host=host  specify the source backup host
  --source-dir=dir    specify the source backup dir
  --source-user=user  specify the source user
  --dest-host=host    specify the destination backup host
  --dest-dir=dir      specify the destination backup dir
  --dest-user=user    specify the destination user
  --log-dir=dir       specify the directory to save logs
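
The help text above has the layout that Python's optparse module produces, so a command line like this could plausibly be declared as in the sketch below. This is an assumption on my part rather than the script's actual code: it covers only a subset of the options, and build_parser, missing_for_add and REQUIRED_FOR_ADD are hypothetical names.

```python
from optparse import OptionParser

# Options that adding a schedule needs, named by their optparse destinations
# (optparse turns "--source-host" into the attribute "source_host").
REQUIRED_FOR_ADD = ("time", "backup_type", "source_host", "source_dir",
                    "source_user", "dest_host", "dest_dir", "dest_user")


def build_parser():
    """Declare a subset of the sbk options shown in the help output."""
    parser = OptionParser(prog="sbk")
    parser.add_option("-q", "--queue", action="store_true", default=False,
                      help="queue schedules and start backups")
    parser.add_option("-a", "--add", action="store_true", default=False,
                      help="add new schedule at specific time")
    parser.add_option("--time", metavar="18:00",
                      help="specify the time to run the backup")
    parser.add_option("--backup-type", metavar="type",
                      help="archive, pg_dump, rsync")
    # The six host/dir/user options follow one pattern, so build them in a loop.
    for side, label in (("source", "source"), ("dest", "destination")):
        for field in ("host", "dir", "user"):
            parser.add_option("--%s-%s" % (side, field), metavar=field,
                              help="specify the %s %s" % (label, field))
    return parser


def missing_for_add(options):
    """Return the required option names that were not given on the command line."""
    return [name for name in REQUIRED_FOR_ADD
            if getattr(options, name) is None]
```

A caller would run options, args = build_parser().parse_args() and, when options.add is set, refuse to continue if missing_for_add(options) is not empty.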

Showing Schedule Information
Show all schedules, schedules in queue, and running schedules:

[backup@bahamas ~]$ sbk -s

Output:

        -[Schedule]-
----------------------------------------------------------------------------------------------------
id|day|time|type|source host|dest host|source dir|dest dir|source user|dest user
----------------------------------------------------------------------------------------------------
1|06|11:00|archive|japan|bahamas|/etc/|/data/backup/japan/etc/|backup|backup
2|06|11:00|archive|romania|bahamas|/etc/|/data/backup/romania/etc/|backup|backup
----------------------------------------------------------------------------------------------------

-[Queue]-
----------------------------------------------------------------------------------------------------
|schedule id|queue time|
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------

-[Running]-
----------------------------------------------------------------------------------------------------
|schedule id|start time|
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------

Adding new schedules
Add a new schedule. Unfortunately, all of the options shown here are required:

[backup@bahamas ~]$ sbk --add  --time=11:00 --backup-type=archive --source-host=japan --dest-host=bahamas --source-dir=/etc/ --dest-dir=/data/backup/japan/etc/ --source-user=backup --dest-user=backup

Removing schedules
In order to remove a schedule, a “sid” must be specified. This is simply the “id” of the schedule, which is unique to schedules.
Remove a schedule:

[backup@bahamas ~]$ sbk --remove --sid=1

Start the Backups
Start intelligently queuing schedules and starting backups (best run from crontab):

sbk -q
or
sbk --queue
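
To give an idea of what "intelligently" could mean here, the sketch below picks which candidate schedules may start without two backups sharing a host. It assumes that each host stands for one hard drive, which is my simplification of the "same hard drive" goal and not necessarily how smart-bk.py decides. The function name pick_startable is hypothetical.

```python
def pick_startable(candidates, running):
    """Choose schedule ids that can start without sharing a host.

    candidates: list of (schedule_id, source_host, dest_host), in the order
                they should be considered (for example, by time).
    running:    list of (source_host, dest_host) for backups in progress.
    """
    # Every host already taking part in a backup counts as busy.
    busy = set()
    for source_host, dest_host in running:
        busy.update((source_host, dest_host))

    startable = []
    for schedule_id, source_host, dest_host in candidates:
        if source_host in busy or dest_host in busy:
            continue  # would hit a disk that is already in use; try later
        startable.append(schedule_id)
        # Hosts of a backup we are about to start are busy as well.
        busy.update((source_host, dest_host))
    return startable
```

With the two example schedules shown earlier, which both write to bahamas, this rule would start schedule 1 and leave schedule 2 for a later run of sbk -q.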

If you found this post interesting, there is more information about this backup system and its uses on the Zenit wiki:
http://zenit.senecac.on.ca/wiki/index.php/OSTEP_Infrastructure#Backup_System

About oatleywillisa

Computer Networking Student
This entry was posted in SBR600.
