Gheek.net

September 7, 2010

RSYNC CentOS

Filed under: linux, RSYNC, shell scripts — lancevermilion @ 10:21 am

Here is a script to rsync (mirror) a CentOS distribution. I use it to pull whichever release I need (CentOS 5.4 in this case). It is assembled from a variety of scripts that I found online, with some parts added to make it more complete for my purposes.

#!/bin/bash
#
#--------------[ To Do List ]--------------
#-- Fix logging to be complete.
#--
#--
#--
#--
Version='1.0'

# Default values for CLI and HELP
# 0 Disabled
# 1 Enabled
CLI=0
HELP=0

Usage () {
echo "
+------------------------------------------------------------------------------------+
| Script: `basename $0`
| Description: This script is used to rsync a CentOS distribution from a known
| mirror. This script can only have one instance running
| at a time. The default log location is \"/var/log/rsync.log\".
| Version: $Version
| Possible Arguments:
| --cli or -c Run from command line and DO NOT output progress.
| --type or -t ALL = to rsync everything. updates = rsync only updates
| --help or -h Help menu (this output).
|
| Syntax Examples:
| For a cron script
| `basename $0` -c -t updates
|
| To watch progress of the rsync
| `basename $0` -t updates
| or
| `basename $0` -t ALL
+------------------------------------------------------------------------------------+
"
}

if [ "$#" -eq 0 ]; then
Usage
RETVAL=1
exit $RETVAL
fi

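# Long options are declared with -l so that --cli, --type and --help work as documented.
args=$(getopt -o ct:h -l cli,type:,help -- "$@")
eval set -- "$args"
while [ ! -z "$1" ]
do
case "$1" in

-c|--cli) CLI=1;;

-t|--type) # "t:" means the argument is required; getopt hands it over as $2.
case "$2" in
"") # Not there.
HELP=1
RETVAL=1
;;

ALL) # Got it
TYPE='ALL'
;;

updates) # Got it
TYPE='updates'
;;

*) # Help since we don't support anything else
HELP=1
RETVAL=1
;;

esac
shift # Skip the option's argument; the shift at the bottom skips the option itself.
;;

-h|--help) HELP=1
RETVAL=0
;;

--) # getopt marks the end of the options with "--".
shift
break
;;

*) HELP=1
RETVAL=1;;
esac
shift
done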

if [ "$HELP" -eq 1 ]; then
if [ "$CLI" -eq 0 ]; then
Usage
fi
RETVAL=35
exit $RETVAL
fi

if [ "$TYPE" != 'ALL' ]; then
if [ "$TYPE" != 'updates' ]; then
if [ "$CLI" -eq 0 ]; then
Usage
fi
RETVAL=45
exit $RETVAL
fi
fi

# User/Group ID to chown files to afterwards
CUID=root
CGID=root

# Define where the rsync binary is based on distro
# Linux
RSYNC_PATH=/usr/bin/rsync
# FreeBSD
#RSYNC_PATH=/usr/local/bin/rsync

# name the file we are going to log our output to.
LOGFILE="/var/log/rsync.log"

# Specify where the temporary lock file will be created
TMPLOCKFILE='/var/lock/subsys/rsync'

# Used to tell cleanup not to remove lock file.
# 1 = DELETE TEMPORARY LOCK FILE
# 0 = DO NOT DELETE TEMPORARY LOCK FILE
DELLOCK=1

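# Is the log file writable by the user running the script?
# 1 = Yes
# 0 = No
# A log file that does not exist yet counts as writable if its directory is.
if [ -w "$LOGFILE" ] || { [ ! -e "$LOGFILE" ] && [ -w "`dirname $LOGFILE`" ]; }; then
WRITABLE='1'
else
WRITABLE='0'
fi

if [ "$WRITABLE" = '0' ]; then
FILEPERMS=`ls -l $LOGFILE 2>/dev/null | awk '{print $1,$3,$4,$9}'`
# The log file cannot be written to, so report the problem on stderr instead.
printf "Writable Log File: Log file ($LOGFILE) is not writable ($FILEPERMS)\n" >&2
RETVAL=1
exit $RETVAL
fi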

#----------------------------------------------------------
# Function to perform some housekeeping before exiting.
#----------------------------------------------------------
function cleanup {
#
# Did we create a temporary lock file?
if [[ -f "$TMPLOCKFILE" && "$DELLOCK" -eq 1 ]]; then
# YES, then delete it before we exit.
rm $TMPLOCKFILE
if [ -f "$TMPLOCKFILE" ]; then
printf "Remove Temporary Lock File: FAILED to remove $TMPLOCKFILE. See administrator.\n" >> $LOGFILE
else
printf "Remove Temporary Lock File: Successfully removed $TMPLOCKFILE\n" >> $LOGFILE
fi
fi

# Did we create a temporary file?
if [ -f "$TMPFILE" ]; then
# YES, then delete it before we exit.
rm $TMPFILE
if [ -f "$TMPFILE" ]; then
printf "Remove Temporary File: FAILED to remove $TMPFILE. See administrator.\n" >> $LOGFILE
else
printf "Remove Temporary File: Successfully removed $TMPFILE\n" >> $LOGFILE
fi
fi
# Terminate the script with a return code of 0 (normal termination) or any other number (abnormal termination).
printf "********** STOP: `date +'[%A %b %d %Y] - [%r]'` **********\n" >> $LOGFILE
exit $RETVAL
}

#----------------------------------------------------------
# Function to check if rsync terminated without errors?
#----------------------------------------------------------
function rsync_return_state {
if [[ "$RETVAL" -ne 0 ]]; then
# NO, there was a problem with rsync. Write a FAILED notice to our log file, then exit.
case $RETVAL in
1) REASON="Syntax or usage error";;
2) REASON="Protocol incompatibility";;
3) REASON="Errors selecting input/output files, dirs";;
4) REASON="Requested action not supported";;
5) REASON="Error starting client-server protocol";;
6) REASON="Daemon unable to append to log-file";;
10) REASON="Error in socket I/O";;
11) REASON="Error in file I/O";;
12) REASON="Error in rsync protocol data stream";;
13) REASON="Errors with program diagnostics";;
14) REASON="Error in IPC code";;
20) REASON="Received SIGUSR1 or SIGINT";;
21) REASON="Some error returned by waitpid()";;
22) REASON="Error allocating core memory buffers";;
23) REASON="Partial transfer due to error";;
24) REASON="Partial transfer due to vanished source files";;
25) REASON="The --max-delete limit stopped deletions";;
30) REASON="Timeout in data send/receive";;
35) REASON="Timeout waiting for daemon connection";;
*) REASON="Undefined error";;
esac
printf "Repository update status FAILED with error code $RETVAL: $REASON\n" >> $LOGFILE
fi
}

# Trap CTRL-C and execute cleanup before exiting
trap cleanup INT

if [ -f "$TMPLOCKFILE" ]; then
DELLOCK=0
printf "\n\n********** START: `date +'[%A %b %d %Y] - [%r]'` **********\n" >> $LOGFILE
printf "Repository update status FAILED: An rsync temporary lock file ($TMPLOCKFILE) already exists.\n" >> $LOGFILE
RETVAL=1
cleanup
fi

# Create the temporary lock file.
touch $TMPLOCKFILE

# Was the temporary lock file created without errors?
if [ "$?" -ne 0 ]; then
# NO, print a message to our log file then terminate.
printf "\n\n********** START: `date +'[%A %b %d %Y] - [%r]'` **********\n" >> $LOGFILE
printf "Repository update status FAILED: Unable to create temporary lock file ($TMPLOCKFILE)\n" >> $LOGFILE
RETVAL=1
cleanup
else
printf "\n\n********** START: `date +'[%A %b %d %Y] - [%r]'` **********\n" >> $LOGFILE
printf "Created Temporary Lock File: $TMPLOCKFILE\n" >> $LOGFILE
fi

# Create a temporary file. Creates file in /tmp
TMPFILE=`mktemp -t rsync.XXXXXXXXXX`

# Was the temporary file created without errors?
if [ $? -ne 0 ]; then
# NO, print a message to our log file then terminate.
printf "Repository update FAILED: Unable to create temporary file ($TMPFILE)\n" >> $LOGFILE
RETVAL=1
cleanup
else
printf "Created Temporary File: $TMPFILE\n" >> $LOGFILE
fi

#if [ "$CLI" -eq 1 ]; then
echo ""
echo "Only download progress for RSYNC will display to STDOUT. You should view the log files \"$LOGFILE\""
echo "and \"$TMPFILE\"."
echo "Example: tail -f $LOGFILE"
echo "Example: tail -f $TMPFILE"
echo ""
#fi

# Variables for RSYNC
DL_PATH='/var/local/yum'
OS='linux'
DISTRIBUTION='centos'
VERSION='5.4'
# Only usable if you are getting the current stable version
#MIRROR_SOURCE="msync.centos.org::CentOS/$VERSION/";
# Great for older stable versions
MIRROR_SOURCE="rsync://mirrors.usc.edu/centos/$VERSION/";

# Parameters for RSYNC, kept in a variable
# All flags used below are explained in detail at the very bottom of this file.
# -a, --archive archive mode; same as -rlptgoD (no -H)
# -q, --quiet suppress non-error messages
# -t, --times preserve times
# -z, --compress compress file data during the transfer
# -H, --hard-links preserve hard links
# --exclude=PATTERN exclude files matching PATTERN
# --progress show progress during transfer
# --delete delete files that don’t exist on sender
# --delay-updates put all updated files into place at end
# --stats Give us some extra file transfer stats. Good during an interactive session.

if [ "$TYPE" = "ALL" ]; then
RSYNC="$RSYNC_PATH --progress --stats -aqtzH --delete --delay-updates --exclude=x86_64 --exclude=SRPM*";
else
RSYNC="$RSYNC_PATH --progress --stats -aqtzH --delete --delay-updates --exclude=SRPM* --exclude=isos --exclude=x86_64";
BASELIST="updates"
fi

printf "Repository update status: Started\n" >> $LOGFILE
printf "Repository update status: Type = $TYPE\n" >> $LOGFILE
if [ "$TYPE" = "updates" ]; then
for BASE in $BASELIST
do
# RSYNC the stuff and change ownership accordingly.

# Sync only the $BASE subtree of the mirror into the matching local directory.
if [ $CLI -eq 1 ]; then
printf "Repository update status Syntax: $RSYNC $MIRROR_SOURCE$BASE/ $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE >> $TMPFILE 2>&1\n" >> $LOGFILE
$RSYNC $MIRROR_SOURCE$BASE/ $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE >> $TMPFILE 2>&1
else
printf "Repository update status Syntax: $RSYNC $MIRROR_SOURCE$BASE/ $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE\n" >> $LOGFILE
$RSYNC $MIRROR_SOURCE$BASE/ $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE
fi

# get the return value from rsync and assign it to RETVAL.
RETVAL=$?
rsync_return_state

printf "Repository update status: Finished\n" >> $LOGFILE
printf "Chowning Repository Syntax: chown -R $CUID:$CGID $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE\n" >> $LOGFILE
chown -R $CUID:$CGID $DL_PATH/$OS/$DISTRIBUTION/$VERSION/$BASE
done

else
# RSYNC the stuff and change ownership accordingly.

if [ $CLI -eq 1 ]; then
printf "Repository update status Syntax: $RSYNC $MIRROR_SOURCE $DL_PATH/$OS/$DISTRIBUTION/$VERSION/ >> $TMPFILE 2>&1\n" >> $LOGFILE
$RSYNC $MIRROR_SOURCE $DL_PATH/$OS/$DISTRIBUTION/$VERSION/ >> $TMPFILE 2>&1
else
printf "Repository update status Syntax: $RSYNC $MIRROR_SOURCE $DL_PATH/$OS/$DISTRIBUTION/$VERSION/\n" >> $LOGFILE
$RSYNC $MIRROR_SOURCE $DL_PATH/$OS/$DISTRIBUTION/$VERSION/
fi

# get the return value from rsync and assign it to RETVAL.
RETVAL=$?
rsync_return_state

printf "Repository update status: Finished\n" >> $LOGFILE
printf "Chowning Repository Syntax: chown -R $CUID:$CGID $DL_PATH/$OS/$DISTRIBUTION/$VERSION\n" >> $LOGFILE
chown -R $CUID:$CGID $DL_PATH/$OS/$DISTRIBUTION/$VERSION
fi

if [[ "$RETVAL" -ne 0 ]]; then
RETVAL=2
cleanup
else

RETVAL=0
cleanup

fi

#
# Excerpt from the man page for all flags that are used.
#
# -a, --archive
# This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve
# almost everything (with -H being a notable omission). The only exception to the above equivalence is
# when --files-from is specified, in which case -r is not implied.
#
# Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must
# separately specify -H.
#
# -q, --quiet
# This option decreases the amount of information you are given during the transfer, notably suppressing
# information messages from the remote server. This flag is useful when invoking rsync from cron.
#
#
# -t, --times
# This tells rsync to transfer modification times along with the files and update them on the remote
# system. Note that if this option is not used, the optimization that excludes files that have not been
# modified cannot be effective; in other words, a missing -t or -a will cause the next transfer to behave as
# if it used -I, causing all files to be updated (though the rsync algorithm will make the update fairly
# efficient if the files haven’t actually changed, you’re much better off using -t).
#
# -z, --compress
# With this option, rsync compresses the file data as it is sent to the destination machine, which reduces
# the amount of data being transmitted -- something that is useful over a slow connection.
#
# Note that this option typically achieves better compression ratios than can be achieved by using a
# compressing remote shell or a compressing transport because it takes advantage of the implicit information
# in the matching data blocks that are not explicitly sent over the connection.
#
# -H, --hard-links
# This tells rsync to look for hard-linked files in the transfer and link together the corresponding files
# on the receiving side. Without this option, hard-linked files in the transfer are treated as though
# they were separate files.
#
# Note that rsync can only detect hard links if both parts of the link are in the list of files being
# sent.
#
# --delete
# This tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending
# side), but only for the directories that are being synchronized. You must have asked rsync to send the
# whole directory (e.g. "dir" or "dir/") without using a wildcard for the directory’s contents (e.g.
# "dir/*") since the wildcard is expanded by the shell and rsync thus gets a request to transfer
# individual files, not the files’ parent directory. Files that are excluded from transfer are also excluded
# from being deleted unless you use the --delete-excluded option or mark the rules as only matching on the
# sending side (see the include/exclude modifiers in the FILTER RULES section).
#
# Prior to rsync 2.6.7, this option would have no effect unless --recursive was in effect. Beginning with
# 2.6.7, deletions will also occur when --dirs (-d) is in effect, but only for directories whose contents
# are being copied.
#
# This option can be dangerous if used incorrectly! It is a very good idea to run first using the
# --dry-run option (-n) to see what files would be deleted to make sure important files aren’t listed.
#
# If the sending side detects any I/O errors, then the deletion of any files at the destination will be
# automatically disabled. This is to prevent temporary filesystem failures (such as NFS errors) on the
# sending side causing a massive deletion of files on the destination. You can override this with the
# --ignore-errors option.
#
# The --delete option may be combined with one of the --delete-WHEN options without conflict, as well as
# --delete-excluded. However, if none of the --delete-WHEN options are specified, rsync will currently
# choose the --delete-before algorithm. A future version may change this to choose the --delete-during
# algorithm. See also --delete-after.
#
# --delay-updates
# This option puts the temporary file from each updated file into a holding directory until the end of the
# transfer, at which time all the files are renamed into place in rapid succession. This attempts to make
# the updating of the files a little more atomic. By default the files are placed into a directory named
# ".~tmp~" in each file’s destination directory, but if you’ve specified the --partial-dir option, that
# directory will be used instead. See the comments in the --partial-dir section for a discussion of how
# this ".~tmp~" dir will be excluded from the transfer, and what you can do if you want rsync to cleanup
# old ".~tmp~" dirs that might be lying around. Conflicts with --inplace and --append.
#
# This option uses more memory on the receiving side (one bit per file transferred) and also requires
# enough free disk space on the receiving side to hold an additional copy of all the updated files. Note
# also that you should not use an absolute path to --partial-dir unless (1) there is no chance of any of
# the files in the transfer having the same name (since all the updated files will be put into a single
# directory if the path is absolute) and (2) there are no mount points in the hierarchy (since the delayed
# updates will fail if they can’t be renamed into place).
#
# See also the "atomic-rsync" perl script in the "support" subdir for an update algorithm that is even
# more atomic (it uses --link-dest and a parallel hierarchy of files).
#
# --progress
# This option tells rsync to print information showing the progress of the transfer. This gives a bored
# user something to watch. Implies --verbose if it wasn’t already specified.
#
# When the file is transferring, the data looks like this:
#
# 782448 63% 110.64kB/s 0:00:04
#
# This tells you the current file size, the percentage of the transfer that is complete, the current
# calculated file-completion rate (including both data over the wire and data being matched locally), and the
# estimated time remaining in this transfer.
#
# After a file is complete, the data looks like this:
#
# 1238099 100% 146.38kB/s 0:00:08 (5, 57.1% of 396)
#
# This tells you the final file size, that it’s 100% complete, the final transfer rate for the file, the
# amount of elapsed time it took to transfer the file, and the addition of a total-transfer summary in
# parentheses. These additional numbers tell you how many files have been updated, and what percent of
# the total number of files has been scanned.
#
# --exclude=PATTERN
# This option is a simplified form of the --filter option that defaults to an exclude rule and does not
# allow the full rule-parsing syntax of normal filter rules.
#
# See the FILTER RULES section for detailed information on this option.
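
One caveat about the lock handling in the script: checking for the lock file and then creating it with touch are two separate steps, so two instances started at the same moment could both pass the check. Below is a minimal sketch of one alternative, assuming a lock directory is acceptable in place of a lock file. The LOCKDIR path and the acquire_lock/release_lock names are hypothetical and not part of the script above.

```bash
#!/bin/bash
# Hypothetical lock location for this sketch only; the script above
# uses the lock file /var/lock/subsys/rsync instead.
LOCKDIR="${TMPDIR:-/tmp}/rsync-mirror.lock.d"

# mkdir either creates the directory or fails, in a single step,
# so two instances cannot both succeed.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# rmdir only removes an empty directory, which is all we ever create.
release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

if acquire_lock; then
    echo "lock acquired: $LOCKDIR"
    # ... the rsync work would go here ...
    release_lock
else
    echo "lock is held, another instance appears to be running" >&2
fi
```

In a real run it would probably make sense to register release_lock with trap (for example on EXIT) so the lock is cleaned up however the script ends.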

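
The man page excerpt recommends a --dry-run pass before trusting --delete. Here is a rough sketch of how that could be wired in, with the rsync options held in a bash array instead of a string so that a pattern like SRPM* is never expanded by the shell. The build_rsync_cmd function and the RSYNC_CMD array are names made up for illustration; the option list is copied from the ALL case of the script. The sketch only prints the command and does not run rsync.

```bash
#!/bin/bash
# Fill the global array RSYNC_CMD with the rsync command line.
# $1 = "yes" appends --dry-run so nothing is transferred or deleted.
build_rsync_cmd() {
    local dry_run="$1"
    RSYNC_CMD=(/usr/bin/rsync --progress --stats -aqtzH --delete
               --delay-updates --exclude=x86_64 --exclude='SRPM*')
    if [ "$dry_run" = "yes" ]; then
        RSYNC_CMD+=(--dry-run)
    fi
}

build_rsync_cmd yes
# Print one argument per line to show how the shell will pass them.
printf '%s\n' "${RSYNC_CMD[@]}"
```

To actually use it, the source and destination would be appended when the array is expanded, as in "${RSYNC_CMD[@]}" "$MIRROR_SOURCE" "$DEST", where DEST stands in for the destination path built in the script.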