Introduction

You will find here a collection of examples showing how to use several features of the dar suite command-line tools.

Contents

  Dar and remote backup server
  dar and ssh
  Bytes, bits, kilo, mega etc.
  Running DAR in background
  Files' extension used
  Running command or scripts from DAR
  Convention for DUC files
  Convention for DBP files
  User target in DCF
  Using data protection with DAR & Parchive
  Examples of file filtering
  Decremental Backup
  Door inodes (Solaris)
Dar and remote backup server

The situation is the following: you have a host (called "local host" below) running an operational system that you want to back up regularly, without disturbing users. For security reasons you want to store the backup on another host (called "remote host" below) that is only used for backups. Of course, you do not have much space on the local host to store the archive.

Between these two hosts, you could use NFS and nothing more would be necessary to use dar as usual. But if for security reasons you do not want to use NFS (insecure network, local users must not have access to backups), and prefer to communicate through an encrypted session (using ssh for example), then you need a feature brought by dar version 1.1.0:
dar can output its archive to stdout instead of a given file. To activate it, use "-" as basename. Here is an example:

  dar -c - -R / -z | some_program

or

  dar -c - -R / -z > named_pipe_or_file

Note that slice splitting is not available here, as it has not much meaning when writing to a pipe: a pipe has no name, and there is no way to skip (or seek) in a pipe, while dar needs to set back a flag in a slice header when that slice is not the last of the set. At the other end of the pipe (on the remote host), the data can be redirected to a file with a proper filename (something that matches "*.1.dar"):

  some_other_program > backup_name.1.dar

It is also possible to redirect the output to dar_xform, which can in turn, on the remote host, split the data flow into several slices, pausing between them if necessary, exactly as dar itself can do:

  some_other_program | dar_xform -s 100M - backup_name

This will create backup_name.1.dar and so on. The resulting archive is totally compatible with an archive directly generated by dar.
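If you want to see the shape of such a sliced data flow without dar at hand, here is a stand-in sketch: `head -c` plays the part of dar writing an archive to stdout, and GNU `split -b` plays the part of dar_xform cutting the stream into fixed-size pieces. The 250-byte stream, the 100-byte "pieces" and the file names are arbitrary placeholders, not dar's slice naming:

```shell
dir=$(mktemp -d)
# "local host" side: produce a 250-byte stream (stand-in for: dar -c - ...)
# "remote host" side: cut the stream into 100-byte pieces
#                     (stand-in for: dar_xform -s 100M - backup_name)
head -c 250 /dev/zero | ( cd "$dir" && split -b 100 -d - piece. )
ls "$dir"        # piece.00  piece.01  piece.02
```

With the real tools, the pipe in the middle would typically be an ssh session between the two hosts.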
So you can now back up the local filesystem to a remote server through a secure socket session, into a full-featured dar archive, without using NFS. But now suppose you want to make a differential backup taking this archive as reference. How to do that?

The simplest way is to use the feature called "isolation", which extracts the catalogue from the archive and stores it in a small file. On the remote backup server you would type:

  dar -A backup_name -C CAT_backup_name -z

If the catalogue is too big to fit on a floppy, you can split it as usual using dar:

  dar -A backup_name -C CAT_backup_name -z -s 1440k

The generated archive (CAT_backup_name.1.dar, and so on) only contains the catalogue, but it can still be used as reference for a new backup (or for a backup of the internal catalogue of the archive, using -x and -A at the same time). You just need to transfer it back to the local host, either using floppies, through a secured socket session, or even by directly isolating the catalogue to a pipe that goes from the remote host to the local host:
  on the remote host: dar -A backup_name -C - -z | some_program
  on the local host:  some_other_program > CAT_backup_name.1.dar

or use dar_xform as previously if you need splitting:

  some_other_program | dar_xform -s 1440k - CAT_backup_name

Then you can make your differential backup as usual:

  dar -A CAT_backup_name -c - -z -R / | some_program

or, if this time you prefer to save the archive locally:

  dar -A CAT_backup_name -c backup_diff -z -R /

For differential backups, instead of isolating the catalogue, it is also possible to read an archive or its extracted catalogue through pipes. Yes, two pipes are required for dar to be able to read an archive. The first goes from dar to the external program "dar_slave" and carries orders (requests for some portions of the archive), and the other pipe goes from "dar_slave" back to "dar" and carries the requested data for reading.
By default, if you specify "-" as basename for -l, -t, -d, -x, or for -A (when used with -C or -c), dar and dar_slave will use their standard input and output to communicate. Thus you need an additional program to connect the output of the first to the input of the second, and vice versa. Warning: you cannot use named pipes that way, because dar and dar_slave would block upon opening the first named pipe, waiting for the peer to open it as well, even before they have started (a deadlock at shell level). For named pipes, there are the -i and -o options: they receive a filename as argument, which may be a named pipe. The -i argument is used instead of stdin, and -o instead of stdout. Note that for dar, -i and -o are only available when "-" is used as basename. Let's take an example: you now want to restore an archive from your remote backup server. Thus you have to run dar_slave this way on the remote server:

  some_prog | dar_slave backup_name | some_other_prog

or

  dar_slave -o /tmp/pipe_todar -i /tmp/pipe_toslave backup_name

and on the local host you have to run dar this way:

  some_prog | dar -x - -v ... | some_other_prog

or

  dar -x - -i /tmp/pipe_todar -o /tmp/pipe_toslave -v ...

There is no required order in which to start dar or dar_slave, and dar can use -i and/or -o while dar_slave does not. What is important here is to connect their inputs and outputs one way or another; it does not matter how. The only restriction is that the communication channel must be perfect: no data loss, no duplication, no reordering; thus communication over TCP should be fine.
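To experiment with this two-pipe plumbing without dar installed, here is a sketch where `tr` stands in for dar_slave (turning each "order" into an "answer") and the echo/cat pair stands in for dar. Only the FIFO wiring is the point here; no dar binary is involved:

```shell
# plumbing sketch only: two named pipes connect two processes back to back
dir=$(mktemp -d)
mkfifo "$dir/pipe_todar" "$dir/pipe_toslave"

# the "slave" side, launched first in the background:
# it reads orders on one pipe and answers on the other
tr 'a-z' 'A-Z' < "$dir/pipe_toslave" > "$dir/pipe_todar" &

# the "dar" side: send an order, then read the answer back
echo "send slice 1" > "$dir/pipe_toslave"
reply=$(cat "$dir/pipe_todar")
echo "$reply"        # -> SEND SLICE 1

wait
```

Opening a FIFO blocks until both a reader and a writer are present, which is exactly why dar and dar_slave must be given the pipe names through -i/-o rather than through shell redirections performed before they start.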
Of course, you can also isolate a catalogue through pipes, test an archive, make a comparison, use a reference catalogue this way, etc., and even output the resulting archive to a pipe! If you use -C or -c with "-" while also using -A with "-", it is then mandatory to use -o: the output catalogue will be generated on standard output, so to send orders to dar_slave you must use another channel, thanks to -o:

       LOCAL HOST                              REMOTE HOST
 +-----------------+                 +-----------------------------+
 | filesystem      |                 |     backup of reference     |
 |      |          |                 |            |                |
 |      V          |                 |            V                |
 |   +-----+       | backup of ref.  |      +-----------+          |
 |   | DAR |--<-]======================[-<--| DAR_SLAVE |          |
 |   |     |-->-]======================[->--|           |          |
 |   +-----+       | orders to slave |      +-----------+          |
 |      |          |                 |      +-----------+          |
 |      +--->--]=======================[->--| DAR_XFORM |--> backup|
 |                 | saved data      |      +-----------+ to slices|
 +-----------------+                 +-----------------------------+

on the local host:

  dar -c - -A - -i /tmp/pipe_todar -o /tmp/pipe_toslave | some_prog

on the remote host:

  dar_slave -i /tmp/pipe_toslave -o /tmp/pipe_todar full_backup

(dar_slave provides the full_backup archive for dar's -A option)

  some_other_prog | dar_xform - diff -s 140M -p ...

(dar_xform builds the slices of the new archive produced by dar)

See below an example with netcat and another using ssh.
Running DAR in background

DAR can be run in the background:

  dar [command-line arguments] < /dev/null &
Files' extension used

dar suite programs use several types of files. While for slices the extension and even the filename format cannot be customized (basename.slicenumber.dar), there is no mandatory rule for the other types of files. In case you have no idea how to name these, here are the extensions I use:

  "*.dcf": Dar Configuration File, aka DCF files (used with dar's -B option)
  "*.dmd": Dar Manager Database, aka DMD files (used with dar_manager's -B and -C options)
  "*.duc": Dar User Command, aka DUC files (used with dar's -E, -F, -~ options)
  "*.dbp": Dar Backup Preparation, aka DBP files (used with dar's -= option)
  "*.dfl": Dar Filter List, aka DFL files (used with dar's -[ or -] options)

But you are totally free to use the filenames you want! ;-)
Running command or scripts from DAR

You can run commands from dar at two different places:

A - Between slices

This concerns the -E, -F and -~ options. They all receive a string as argument. Thus, if the argument must be a command with its own arguments, you have to put these between quotes, so they appear as a single string to the shell that interprets the dar command-line. For example, if you want to call

  df .

[this is two words: "df" (the command) and "." (its argument)], then you have to use the following on the DAR command-line:

  -E "df ."
or
  -E 'df .'

DAR provides several substitution strings in this context. The number of the slice (%n) is either the slice just written or the next slice to be read: if you create a new archive (using -c, -C or -+), the %n macro in a -E option is the number of the last completed slice; otherwise (using -t, -d, -A (with -c or -C), -l or -x), it is the number of the slice that will be required very soon. %c (the context) is substituted by "init", "operation" or "last_slice".
What is the use of this feature? Suppose, for example, that you want to burn the brand-new slices on CD as soon as they are available. Let's build a little script for that:

  %cat burner
  #!/bin/bash

  if [ "$1" == "" -o "$2" == "" ] ; then
     echo "usage: $0 <filename> <number>"
     exit 1
  fi

  mkdir T
  mv "$1" T
  mkisofs -o /tmp/image.iso -r -J -V "archive_$2" T
  cdrecord dev=0,0 speed=8 -data /tmp/image.iso
  rm /tmp/image.iso
  if diff "/mnt/cdrom/$1" "T/$1" ; then
     rm -rf T
  else
     exit 2
  fi
  %

This little script receives the slice filename and its number as arguments. What it does is burn a CD with the slice, then compare the resulting CD with the original slice. Upon failure, the script returns 2 (or 1 if the syntax is not correct on the command-line). Note that this script is only here for illustration; there are many more interesting user scripts made by several dar users. These are available in the examples part of the documentation.
One could then use it this way:

  -E "./burner %p/%b.%n.dar %n"

which can lead to the following DAR command-line:

  dar -c ~/tmp/example -z -R / usr/local -s 650M -E "./burner %p/%b.%n.dar %n" -p

First note that as our script does not change the CD in the device, we need to pause between slices (-p option). The pause takes place after the execution of the command (-E option). Thus we could add to the script a command to send a mail or play music to inform us that the slice has been burned. The advantage here is that we do not have to come back twice per slice: once when the slice is ready, and once when the slice is burnt.
Another example: you want to send a huge file by email. (OK, it would be better to use FTP, but sometimes people think that the less you can do, the more they control you, and thus they disable many services, either by fear of the unknown or by stupidity.) So let's suppose that you only have mail available to transfer your data:

  dar -c toto -s 2M my_huge_file -E "uuencode %b.%n.dar %b.%n.dar | mail -s 'slice %n' your@email.address ; rm %b.%n.dar ; sleep 300"

Here we make an archive with slices of 2 megabytes, because our mail system does not allow larger emails. We save only one file, "my_huge_file" (but we could even save a whole filesystem, it would also work). The command executed each time a slice is ready uuencodes the slice, mails it with the slice number in the subject, removes the slice, and then sleeps 300 seconds before the next one. Note that we did not use the %p substitution string, as the slices are saved in the current directory.

The last example is about extraction: in case the slices cannot all be present in the filesystem, you need a script or a command to fetch the next slice to be requested. It could use ftp, lynx, ssh, etc. I let you write the script as an exercise. :-) Note: if you plan to share your DUC files, please follow the convention for DUC files.
B - Before and after saving a file

This concerns the -=, -< and -> options. The -< (include) and -> (exclude) options let you define which files will need a command to be run before and after their backup, while the -= option lets you define which command to run for those files.

Let's suppose you have a very large file that changes often, located in /home/my/big/file, and several databases, each consisting of several files under /home/*/database/data, that need to have a coherent status and also change very often.

Saving them without precaution will most probably get your big file flagged as "dirty" in dar's archive, which means that the saved status of the file may be a status that never existed for that file: when dar saves a file it reads the first byte, then the second, etc., up to the end of the file. While dar is reading the middle of the file, an application may change the very beginning and then the very end of that file, but only the modified ending of that file will be saved, leading the archive to contain a copy of the file in a state it never had.

For a database this is even worse: two or more files may need to have a coherent status between them. If dar saves a first file while another file is modified at the same time, this will not get the saved files flagged as "dirty", but may lead the database to have its files saved in states that are incoherent between them, thus leading you to have saved the database in a corrupted state. To prevent this situation, we will use the following options:

  -R / "-<" home/my/big/file "-<" "home/*/database/data"
First, you must pay attention to quote the -< and -> options, for the shell not to consider them as requests for redirection to stdout or from stdin.

Back to the example: this says that for the file /home/my/big/file and for any "database/data" directory (or file) in the home directory of any user, a command will be run before and after saving that directory or file. We thus need to define the command to run, using the following option:

  -= "/root/scripts/before_after_backup.sh %f %p %c"

As you see, here too we may (and should) use substitution macros. Our script could then look like this:

  cat /root/scripts/before_after_backup.sh
  #!/bin/sh

  # called by dar as: before_after_backup.sh %f %p %c
  # $1 = filename, $2 = path, $3 = context ("start" or "end")

  if [ "$1" = "data" ]; then
     if [ "$3" = "start" ]; then
        : # action to stop the database located in "$2"
     else
        : # action to restart the database located in "$2"
     fi
  else
     if [ "$2" = "/home/my/big/file" ]; then
        if [ "$3" = "start" ]; then
           : # suspend the application that writes to that file
        else
           : # resume the application that writes to that file
        fi
     else
        : # do nothing, or warn that no action is defined for that file
     fi
  fi

So now, if we run dar with all these options, dar will execute our script once before entering any database/data directory located in the home directory of some user, and once again when all the files of that directory have been saved. It will also run our script before and after saving our /home/my/big/file file. If you plan to share your DBP files, please follow the DBP convention.
Convention for DUC files

Since version 1.2.0, dar can call a command or script between slices, thanks to the -E, -F and -~ options; such scripts are called DUC files. To be able to easily share your DUC commands or scripts, I propose the following convention:

  - use the ".duc" extension to show anyone that the script/command respects the following rules
  - it must be callable from dar with the following arguments:

      example.duc %p %b %n %e %c [other optional arguments]

  - when called without argument, it must provide brief help on what it does and what the expected arguments are. This is the standard "usage:" convention.

Then any user can share DUC files without having to bother much about how to use them. Moreover, it becomes easy to chain them: if, for example, two persons created their own scripts, one "burn.duc" which burns a slice on DVD-R(W) and one "par.duc" which makes a Parchive redundancy file from a slice, anybody could use both at a time, giving the following argument to dar:

  -E "par.duc %p %b %n %e %c 1 ; burn.duc %p %b %n %e %c"

or, since version 2.1.0, with the following arguments:

  -E "par.duc %p %b %n %e %c 1" -E "burn.duc %p %b %n %e %c"

Of course, a script does not have to use all its arguments; in the case of burn.duc for example, the %c (context) is probably useless and not used inside the script, while it is still possible to give it all the "normal" arguments of a DUC file: extra unused arguments are simply ignored.

If you have interesting DUC scripts, you are welcome to contact me by email, so I can add them to the web site and to the following releases. For now, check the doc/samples directory for a few examples of DUC files. Note that all DUC scripts are expected to return an exit status of zero, meaning that the operation succeeded. If another exit status is returned, dar asks the user for a decision (or aborts if no user interaction is possible, for example when dar is not run under a controlling terminal).
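Following this convention, a minimal DUC skeleton could look like the sketch below. The file name, the usage wording and the echoed message are placeholders of my own, not an official example:

```shell
# create the skeleton in a temporary place and try it out
duc=$(mktemp -d)/example.duc
cat > "$duc" <<'EOF'
#!/bin/sh
# example.duc - minimal skeleton following the DUC convention
# called by dar as: example.duc %p %b %n %e %c [optional extra args]
if [ $# -eq 0 ]; then
    echo "usage: $0 <path> <basename> <slice number> <extension> <context>"
    exit 1
fi
# placeholder action: just report which slice we were called for
echo "processing slice $1/$2.$3.$4 (context: $5)"
exit 0
EOF
chmod +x "$duc"

"$duc" || true        # no argument: prints the "usage:" line, returns 1
out=$("$duc" /tmp/archives full 3 dar operation)
echo "$out"           # -> processing slice /tmp/archives/full.3.dar (context: operation)
```

Replacing the placeholder echo with a real action (burning, mailing, making parity files, ...) is all that is needed to get a shareable DUC file.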
Convention for DBP files

As above, the following convention is proposed to ease the sharing of Dar Backup Preparation files:

  - use the ".dbp" extension to show anyone that the script/command respects the following rules
  - it must be callable from dar with the following arguments:

      example.dbp %p %f %u %g %c [other optional arguments]

  - when called without argument, it must provide brief help on what it does and what the expected arguments are. This is the standard "usage:" convention.

Identically to DUC files, DBP files are expected to return an exit status of zero; otherwise the backup process is suspended for the user to decide whether to retry, ignore the failure, or abort the whole backup process.
Using data protection with DAR & Parchive

Parchive (PAR in the following) is a very nice program that makes it possible to recover a file which has been corrupted. It creates redundancy data stored in a separate file (or set of files), which can be used to repair the original file. This additional data may also be damaged: PAR will be able to repair the original file as well as the redundancy files, up to a certain point, of course. This point is defined by the percentage of redundancy you defined for a given file. For more, check the official PAR sites:

  http://parchive.sourceforge.net (original site, no longer maintained today)
  https://github.com/BlackIkeEagle/par2cmdline (fork from the official site, maintained since December 2013)

Since version 2.4.0, dar is provided with a default /etc/darrc file. It contains a set of user targets, among which is "par2". This user target invokes the dar_par.dcf file provided beside dar, which automatically creates a parity file for each slice during backup, and verifies and, if necessary, repairs slices when testing an archive. So now you only need to use dar this way to activate Parchive with dar:

  dar [options] par2

Simple, no?
Examples of file filtering

File filtering is what defines which files are saved, listed, restored, compared, tested, and so on. In brief, in the following we will say which files are elected for the operation, meaning by "operation" either a backup, a restoration, an archive contents listing, an archive comparison, etc.

File filtering is done using the following options: -X, -I, -P, -R, -[, -] or -g.

OK, let's start with some concrete examples:

  dar -c toto

This will back up the current directory and everything located in it, to build the toto archive, also located in the current directory. Usually you should get a warning telling you that you are about to back up the archive itself.

Now let's see something less obvious:

  dar -c toto -R / -g home/ftp

The -R option tells dar to consider all files under the / root directory, while the -g "home/ftp" argument tells dar to restrict the operation to the home/ftp subdirectory of the given root directory, thus here /home/ftp. But this is a little bit different from the following:

  dar -c toto -R /home/ftp

Here dar will save any file under /home/ftp without any restriction. So what is the difference? Exactly the same files will be saved as just above, but the file /home/ftp/welcome.msg, for example, will be stored as <ROOT>/welcome.msg, where <ROOT> will be replaced by the argument given to the -R option (which defaults to ".") at restoration or comparison time. In the previous example the same file would have been stored with the path <ROOT>/home/ftp/welcome.msg.

  dar -c toto -R / -P home/ftp/pub -g home/ftp -g etc

As previously, but the -P option makes all files under /home/ftp/pub not be considered for the operation. Additionally, the /etc directory and its subdirectories are saved.

  dar -c toto -R / -P etc/password -g etc

Here we save all of /etc except the /etc/password file. Arguments given to -P can be plain files too. But when they are directories, the exclusion applies to the directory itself and its contents. Note that using -X to exclude "password" does not have exactly the same effect:

  dar -c toto -R / -X "password" -g etc

This will save all the /etc directory except any file whose name is "password". Thus, of course, /etc/password will not be saved, but if it exists, /etc/rc.d/password will not be saved either, if it is not a directory. Yes, if a directory /etc/rc.d/password exists, it will not be affected by the -X option. Like the -I option, the -X option does not apply to directories. The reason is to be able to filter some kinds of files without excluding a particular directory. For example, you want to save all mp3 files and only MP3 files:

  dar -c toto -R / -I "*.mp3" -I "*.MP3" home/ftp

This will save any file ending in "mp3" or "MP3" under the /home/ftp directory and its subdirectories. If instead -I (or -X) applied to directories, we would only be able to recurse into subdirectories ending in ".mp3" or ".MP3". If you had a directory named "/home/ftp/Music", for example, full of mp3 files, you would not have been able to save it.

Note that glob expressions (where the shell-like wild-cards '*', '?' and so on come from) can do much more complicated things, like "*.[mM][pP]3". You could thus replace the previous example by:

  dar -c toto -R / -I "*.[mM][pP]3" home/ftp

This would cover all .mp3, .mP3, .Mp3 and .MP3 files. One step further, the -acase option makes the filtering arguments that follow it case sensitive (which is the default), while -ano-case (-an for short) switches to case insensitive mode for the filtering arguments that follow it. In shorter form, we could have:

  dar -c toto -R / -an -I "*.mp3" home/ftp

And instead of glob expressions you can use regular expressions (regex), thanks to the -aregex option; you can also switch back to glob expressions using -aglob. Each -aregex / -aglob option defines the expected type of expression in the -I/-X/-P/-g/-u/-U/-Z/-Y options that follow it, up to the end of the line or the next -aregex / -aglob option.

Last, a more complete example:

  dar -c toto -R / -P "*/.mozilla/*/[Cc]ache" -X ".*~" -I "*.[Mm][pP][123]" -g home/ftp -g "fake"

So what? OK, here we save everything under /home/ftp and /fake, but we do not save the contents of "*/.mozilla/*/[Cc]ache" directories, like for example the "/home/ftp/.mozilla/ftp/abcd.slt/Cache" directory and its contents. In these directories we save any file matching "*.[Mm][pP][123]", except those ending with a tilde (~ character): thus, for example, files named "toto.mp3" or ".bloup.Mp2".

Now the internal algorithm: a file is elected for the operation if

  1 - its name does not match any -X option, or it is a directory,
 *and*
  2 - if some -I options are given, the file is either a directory or matches at least one of the -I options,
 *and*
  3 - its path and filename do not match any -P option,
 *and*
  4 - if some -g options are given, the path to the file matches at least one of the -g options.

The algorithm detailed above is the default one, which is historical and called the unordered method. Since version 2.2.x there is also an ordered method (activated by adding the -am option), which gives even more power to filters; the dar man page will give you all the details.

In parallel to file filtering, you will find Extended Attributes filtering thanks to the -u and -U options (they work the same way as the -X and -I options but apply to EA). You will also find file compression filtering (-Z and -Y options), which defines which files to compress or not to compress; here too they work the same way as the -X and -I options, and the -ano-case / -acase options apply here as well, as does the -am option. Last, all these filters (file, EA, compression) can also use regular expressions in place of glob expressions (thanks to the -ar / -ag options).

As a very last point, note that the --backup-hook-include and --backup-hook-exclude options act the same way as the -P and -g options, but apply to the files about to be saved, and provide the user the possibility to perform an action (--backup-hook-execute) before and after saving the files matching these masks. The dar man page will give you all the necessary details to use this feature.
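As a sketch of how the four election rules combine, here is a hypothetical shell function that mimics the unordered method for a single -X, -I, -P and -g argument each. This is an illustrative approximation of my own, not dar's actual implementation: dar accepts several patterns of each kind, and its -P/-g matching is richer than the prefix test used here. Glob matching is done with the shell's own case patterns:

```shell
# elect <path> <is_dir: y|n> <X_pattern> <I_pattern> <P_pattern> <g_prefix>
# returns 0 (elected) or 1 (excluded); an empty pattern means "option not given"
elect() {
    path=$1; is_dir=$2; xpat=$3; ipat=$4; ppat=$5; gpre=$6
    name=${path##*/}
    # rule 1: the name must not match -X, unless this is a directory
    if [ "$is_dir" != y ] && [ -n "$xpat" ]; then
        case $name in $xpat) return 1;; esac
    fi
    # rule 2: if -I is given, the file must be a directory or match -I
    if [ -n "$ipat" ] && [ "$is_dir" != y ]; then
        case $name in $ipat) ;; *) return 1;; esac
    fi
    # rule 3: the path and filename must not match -P
    if [ -n "$ppat" ]; then
        case $path in $ppat) return 1;; esac
    fi
    # rule 4: if -g is given, the path must be under the -g tree
    if [ -n "$gpre" ]; then
        case $path in "$gpre"|"$gpre"/*) ;; *) return 1;; esac
    fi
    return 0
}

# mimic: dar -c toto -R / -X "password" -g etc
elect etc/passwd    n "password" "" "" etc && echo "etc/passwd: saved"
elect etc/password  n "password" "" "" etc || echo "etc/password: excluded by -X"
elect home/ftp/file n "password" "" "" etc || echo "home/ftp/file: not under -g etc"
```

Running the three sample calls shows rule 1 excluding etc/password and rule 4 excluding everything outside the etc tree, while etc/passwd passes all four rules.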