Issue
I am currently writing a bash script for rsync. I am pretty sure I am doing something wrong. But I can't tell what it is. I will try to elaborate everything in detail so hopefully someone can help me.
The goal of the script is to make full and incremental backups using rsync. Everything seems to work well except for one crucial thing: even though I use the --link-dest parameter, it still seems to copy all the files. I checked the sizes with du -chs.
First here is my script:
#!/bin/bash
while getopts m:p: flags
do
    case "$flags" in
        m) mode=${OPTARG};;
        p) prev=${OPTARG};;
        *) echo "usage: $0 [-m] [-p]" >&2
           exit 1 ;;
    esac
done
date="$(date '+%Y-%m-%d')";
#Create Folders If They Do Not Exist (-p parameter)
mkdir -p /Backups/Full && mkdir -p /Backups/Inc
FullBackup() {
    #Backup Content Of Website
    mkdir -p /Backups/Full/$date/Web/html
    rsync -av user@IP:/var/www/html/ /Backups/Full/$date/Web/html/
    #Backup All Config Files NEEDED. Saving Storage Is Key ;)
    mkdir -p /Backups/Full/$date/Web/etc
    rsync -av user@IP:/etc/apache2/ /Backups/Full/$date/Web/etc/
    #Backup Fileserver
    mkdir -p /Backups/Full/$date/Fileserver
    rsync -av user@IP:/srv/samba/private/ /Backups/Full/$date/Fileserver/
    #Backup MongoDB
    ssh user@IP /usr/bin/mongodump --out /home/DB
    rsync -av root@BackupServerIP:/home/DB/ /Backups/Full/$date/DB
    ssh user@IP rm -rf /home/DB
}
IncrementalBackup(){
    Method="";
    if [ "$prev" == "full" ]
    then
        Method="Full";
    elif [ "$prev" == "inc" ]
    then
        Method="Inc";
    fi
    if [ -z "$prev" ]
    then
        echo "-p Parameter Empty";
    else
        #Get Latest Folder - Ignore the hacky method, it works.
        cd /Backups/$Method
        NewestBackup=$(find . ! -path . -type d | sort -nr | head -1 | sed s@^./@@)
        IFS='/'
        read -a strarr <<< "$NewestBackup"
        Latest_Backup="${strarr[0]}";
        cd /Backups/
        #Incremental-Backup Content Of Website
        mkdir -p /Backups/Inc/$date/Web/html
        rsync -av --link-dest /Backups/$Method/"$Latest_Backup"/Web/html/ user@IP:/var/www/html/ /Backups/Inc/$date/Web/html/
        #Incremental-Backup All Config Files NEEDED
        mkdir -p /Backups/Inc/$date/Web/etc
        rsync -av --link-dest /Backups/$Method/"$Latest_Backup"/Web/etc/ user@IP:/etc/apache2/ /Backups/Inc/$date/Web/etc/
        #Incremental-Backup Fileserver
        mkdir -p /Backups/Inc/$date/Fileserver
        rsync -av --link-dest /Backups/$Method/"$Latest_Backup"/Fileserver/ user@IP:/srv/samba/private/ /Backups/Inc/$date/Fileserver/
        #Backup MongoDB
        ssh user@IP /usr/bin/mongodump --out /home/DB
        rsync -av root@BackupServerIP:/home/DB/ /Backups/Full/$date/DB
        ssh user@IP rm -rf /home/DB
    fi
}
if [ "$mode" == "full" ]
then
    FullBackup;
elif [ "$mode" == "inc" ]
then
    IncrementalBackup;
fi
The commands I used:
Full-Backup
bash script.sh -m full
Incremental
bash script.sh -m inc -p full
Executing the script is not giving any errors at all. As I mentioned above, it just seems like it's still copying all the files. Here are some tests I did.
Output of du -chs
root@Backup:/Backups# du -chs /Backups/Full/2021-11-20/*
36K /Backups/Full/2021-11-20/DB
6.5M /Backups/Full/2021-11-20/Fileserver
696K /Backups/Full/2021-11-20/Web
7.2M total
root@Backup:/Backups# du -chs /Backups/Inc/2021-11-20/*
36K /Backups/Inc/2021-11-20/DB
6.5M /Backups/Inc/2021-11-20/Fileserver
696K /Backups/Inc/2021-11-20/Web
7.2M total
Output of ls -li
root@Backup:/Backups# ls -li /Backups/Full/2021-11-20/
total 12
1290476 drwxr-xr-x 4 root root 4096 Nov 20 19:26 DB
1290445 drwxrwxr-x 6 root root 4096 Nov 20 18:54 Fileserver
1290246 drwxr-xr-x 4 root root 4096 Nov 20 19:26 Web
root@Backup:/Backups# ls -li /Backups/Inc/2021-11-20/
total 12
1290506 drwxr-xr-x 4 root root 4096 Nov 20 19:28 DB
1290496 drwxrwxr-x 6 root root 4096 Nov 20 18:54 Fileserver
1290486 drwxr-xr-x 4 root root 4096 Nov 20 19:28 Web
Rsync Output when doing the incremental backup and changing/adding a file
receiving incremental file list
./
lol.html
sent 53 bytes received 194 bytes 164.67 bytes/sec
total size is 606 speedup is 2.45
receiving incremental file list
./
sent 33 bytes received 5,468 bytes 11,002.00 bytes/sec
total size is 93,851 speedup is 17.06
receiving incremental file list
./
sent 36 bytes received 1,105 bytes 760.67 bytes/sec
total size is 6,688,227 speedup is 5,861.72
*Irrelevant MongoDB Dump Text*
sent 146 bytes received 2,671 bytes 1,878.00 bytes/sec
total size is 2,163 speedup is 0.77
I suspect that the ./ has something to do with it. I might be wrong, but it looks suspicious. When I execute the same command again, though, the ./ entries are not in the log, probably because I ran it on the same day, so it wrote into the existing /Backups/Inc/2021-11-20 folder.
Let me know if you need more information. I have been experimenting for a long time now. Maybe I am simply wrong, the links are being made, and disk space is being saved.
Solution
I didn't read the entire code because the main problem didn't seem to lie there.
Verify the disk usage of your /Backups directory with du -sh /Backups and then compare it with the sum of du -sh /Backups/Full and du -sh /Backups/Inc.
I'll show you why with a little test:
Create a directory containing a file of 1 MiB:
mkdir -p /tmp/example/data
dd if=/dev/zero of=/tmp/example/data/zerofile bs=1M count=1
Do a "full" backup:
rsync -av /tmp/example/data/ /tmp/example/full
Do an "incremental" backup:
rsync -av --link-dest=/tmp/example/full /tmp/example/data/ /tmp/example/incr
Now let's see what we got:
with ls -l
ls -l /tmp/example/*
-rw-rw-r-- 1 user group 1048576 Nov 21 00:24 /tmp/example/data/zerofile
-rw-rw-r-- 2 user group 1048576 Nov 21 00:24 /tmp/example/full/zerofile
-rw-rw-r-- 2 user group 1048576 Nov 21 00:24 /tmp/example/incr/zerofile
and with du -sh
du -sh /tmp/example/*
1.0M /tmp/example/data
1.0M /tmp/example/full
0 /tmp/example/incr
- Oh? There was a 1 MiB file in /tmp/example/incr but du missed it?
Actually no. As the file wasn't modified since the previous backup (referenced with --link-dest), rsync created a hard link to it instead of copying its content. Hard links are several directory entries that point to the same inode, so the file's data is stored on disk only once.
du detects hard links and reports the real disk usage, but only when all the hard-linked files are reachable (even through subdirectories) from the arguments of a single invocation. A file that du meets twice in one run is counted only the first time. If you run du -sh on /tmp/example/incr alone, the file is counted there:
du -sh /tmp/example/incr
1.0M /tmp/example/incr
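The same comparison can be scripted for the real backup tree. The following is only a sketch: the function names usage_kib and compare_usage are made up for illustration, and it assumes a du that accepts the -s, -c and -k options (GNU and BSD du both do).

```shell
#!/bin/bash
# usage_kib DIR...: disk usage in KiB of all arguments together.
# Everything is measured in ONE du invocation, so a hard-linked file
# that appears under several arguments is counted only once.
usage_kib() {
    du -sck -- "$@" | tail -n 1 | cut -f1
}

# compare_usage FULL INC: print the usage measured separately and together.
# The difference is the space that the two trees share through hard links.
compare_usage() {
    local full=$1 inc=$2
    local separate combined
    separate=$(( $(usage_kib "$full") + $(usage_kib "$inc") ))
    combined=$(usage_kib "$full" "$inc")
    echo "separate: ${separate} KiB"
    echo "combined: ${combined} KiB"
    echo "shared via hard links: $(( separate - combined )) KiB"
}
```

Calling compare_usage /Backups/Full /Backups/Inc should show a "combined" figure that is clearly smaller than the "separate" one if --link-dest did its job.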
- How do you detect that there are hard links to a file?
ls -l actually showed it to us:
-rw-rw-r-- 2 user group 1048576 Nov 21 00:24 /tmp/example/full/zerofile
           ^
           HERE
This number is the link count. Here it means that two hard links to the file exist: this file itself and another one on the same filesystem.
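Reading link counts by eye does not scale to a whole backup tree, so here is a small sketch of how the check could be automated. The helper names count_linked_files and same_file are hypothetical. It relies on find's -links test and on bash's -ef file comparison, which is true when both paths refer to the same device and inode.

```shell
#!/bin/bash
# count_linked_files DIR: number of regular files under DIR whose link
# count is greater than 1, i.e. files that share their data with at
# least one other name. (File names containing newlines would be miscounted.)
count_linked_files() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# same_file A B: succeeds if A and B are hard links to the same inode.
same_file() {
    [[ "$1" -ef "$2" ]]
}
```

For example, count_linked_files /Backups/Inc/2021-11-20 should be close to the total number of files there when almost nothing changed since the previous backup, and same_file can confirm a single pair of paths.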
About your code
It doesn't change the behavior, but I would replace:
#Get Latest Folder - Ignore the hacky method, it works.
cd /Backups/$Method
NewestBackup=$(find . ! -path . -type d | sort -nr | head -1 | sed s@^./@@)
IFS='/'
read -a strarr <<< "$NewestBackup"
Latest_Backup="${strarr[0]}";
cd /Backups/
with:
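#Get Latest Folder
glob='20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]' # match a timestamp (more or less)
# ISO dates sort lexicographically, so a plain reverse sort puts the newest first.
# NewestBackup holds the full path with a trailing slash, e.g. /Backups/Full/2021-11-20/
NewestBackup=$(compgen -G "/Backups/$Method/$glob/" | sort -r | head -n 1)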
- glob makes sure that the directories/files found by compgen -G will have the right format.
- Adding / at the end of a glob makes sure that it matches directories only.
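Since the rsync commands in the question expect only the date directory name in Latest_Backup, the lookup could be wrapped in a function that strips the path again. This is a sketch under that assumption. The function name newest_backup is invented here, and compgen is a bash builtin, so the script has to run under bash.

```shell
#!/bin/bash
# newest_backup BASE: print the name (not the full path) of the most recent
# YYYY-MM-DD directory directly under BASE; return 1 if there is none.
newest_backup() {
    local base=$1
    local glob='20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]'
    local newest
    # The trailing slash restricts the match to directories.
    newest=$(compgen -G "$base/$glob/" | sort -r | head -n 1)
    [ -n "$newest" ] || return 1
    basename "$newest"
}
```

In the script this could be used as Latest_Backup=$(newest_backup "/Backups/$Method"), followed by a check of the exit status, so that the backup stops instead of running rsync with an empty --link-dest path.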
Answered By - Fravadona