Synchronize files between a MacBook and an Ubuntu machine using rsync via ssh
As a precondition, it is assumed that you have two computers connected via a network. I am using a MacBook with Mac OS X 10.4 Tiger and an Ubuntu PC with Ubuntu 7.10 Gutsy, connected via WLAN, and I have verified all the steps on this setup. Some of the steps described are Mac or Ubuntu specific. Most of the information should be valid for other operating systems where ssh and rsync can be installed (e.g. other Linux distributions), but check the specifics of your distribution if something does not work. Many ideas came from (6).
Make sure that the ssh client and server are installed and running
For general information about SSH see (1). If you don't mind sending your data unencrypted (e.g. if you are using a private cable network connection), you don't need SSH and you can leave out this step.
Start on the destination machine (Ubuntu = server). Set the Power Management Preferences so that the computer is never put to sleep when inactive (the hard drive and display can still go to sleep).
Change to the source machine (Mac OS = client). As your user ("Steffen" in my case), try to log in to the destination machine (Ubuntu) using ssh, specifying either an IP address or a hostname (if the hostname appears in /etc/hosts).
Note: You can add a host name with the command:
sudo nano /etc/hosts
You'll be asked for your (Mac OS) password. Add a line containing the IP address followed by the host name. Of course, instead of nano you can use your preferred editor (e.g. Emacs or Vim).
To log in to a remote computer (destination machine) running an ssh server, open a terminal and log in with ssh <user>@<server> like this:
ssh steffen@myubuntu
You'll be asked for a password. This is the password for the user on the destination machine (Ubuntu), not the local password.
Note: Replace "steffen@myubuntu" with whatever your username@hostname/ip is. It would be a good idea to use the same login name on both machines, because then you could log in with just ssh myubuntu.
Type "yes" to the authentication message. Enter your password. If this works, then ssh works and you should type "exit" to return to the source machine. If not, then test ssh on the server machine (Ubuntu) with the following command:
ssh localhost
If you can log in, the ssh server is running.
If you get the error message "ssh: connect to host localhost port 22: Connection refused", install the OpenSSH server and client with the command:
sudo apt-get install ssh
Note: This is Ubuntu specific. You might need a different command to install ssh if you have a different server system.
Try again. It should work now. If you can log in with ssh from your client to your server, you can proceed with the next step.
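The "Connection refused" case can also be checked from a script before anything else is attempted. The following is only a minimal sketch in Python 3 (not the Python version used later in this article); the function name ssh_port_open is made up for illustration, and the check only tells you that something accepts TCP connections on port 22, not that it really is an ssh server.

```python
import socket


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unknown host names.
        return False
```

For example, ssh_port_open("myubuntu") should return False as long as no ssh server is installed on the Ubuntu machine.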
Set up a public/private key pair
On the client (Mac OS) generate a private and a public key by typing:
ssh-keygen -t rsa
Follow the prompts. This will yield the id_rsa.pub and id_rsa files (the public and private key pair):
...Generating public/private rsa key pair. [Enter]
...Enter file in which to save the key (/home/ross/.ssh/id_rsa): [Enter]
...Created directory '/home/ross/.ssh'. [Enter: you might not see this message]
...Enter passphrase (empty for no passphrase): [Enter a passphrase]
...Enter same passphrase again: [Enter a passphrase]
...Your identification has been saved in /Users/Steffen/.ssh/id_rsa.
...Your public key has been saved in /Users/Steffen/.ssh/id_rsa.pub.
Password and passphrase do different things. The password belongs to the user account on the target system (Ubuntu stores its hash in /etc/shadow; /etc/passwd holds the account entry). The passphrase is used to decrypt your private key on your own system. The security advantage of public key authentication over password authentication is that two things are needed to get access:
your (encrypted) private key
your passphrase (which is needed to decrypt the private key)
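Because the private key is one of the two things an attacker would need, ssh refuses to use a private key file that other users can read. A small check along these lines can confirm that the file is protected. This is a sketch in Python 3; the function name private_key_is_private is hypothetical, and the path in the usage note is the default one shown in the ssh-keygen output above.

```python
import os
import stat


def private_key_is_private(path):
    """Return True if only the owner has any access to the key file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any permission bit set for group or others means the key is exposed.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

Calling private_key_is_private(os.path.expanduser("~/.ssh/id_rsa")) should return True for a key created by ssh-keygen; if it does not, chmod 600 on the file is the usual remedy.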
Copy the public key to the destination machine
On the Mac type:
scp ~/.ssh/id_rsa.pub steffenadmin@myubuntu:~/.ssh/authorized_keys
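Note: Replace "steffenadmin@myubuntu" with whatever your username@hostname/ip is. This command overwrites an existing authorized_keys file on the destination, and the ~/.ssh directory must already exist there.
scp asks for the password of the remote user and returns to the local prompt when the copy has finished. It does not leave you logged in to the remote machine (Ubuntu), so there is nothing to log off from.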
On Ubuntu, logged in as administrator, copy the file to other accounts where necessary, e.g.:
sudo cp ~/.ssh/authorized_keys /home/steffen/.ssh/authorized_keys
Note: Replace "/home/steffen/" with whatever your accounts are.
An alternative to using scp and cp could have been:
ssh-copy-id -i ~/.ssh/id_rsa.pub steffen@myubuntu
This alternative didn't work for me (permission denied, I guess, because I didn't use an administrator account to log into Ubuntu).
Another alternative to scp, which appends the key to authorized_keys instead of overwriting the file, could be (but I haven't tried it yet):
cat ~/.ssh/id_rsa.pub | ssh steffenadmin@myubuntu "cat >> ~/.ssh/authorized_keys"
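The difference between overwriting and appending matters as soon as more than one key should be allowed to log in. The following sketch shows the append variant as a Python 3 function that would run on the destination machine. The function name add_public_key is made up for this example; creating the directory with mode 700 and the file with mode 600 is a common convention, not something the steps above require.

```python
import os


def add_public_key(authorized_keys_path, public_key_line):
    """Append a public key unless it is already listed. Return True if added."""
    key = public_key_line.strip()
    ssh_dir = os.path.dirname(authorized_keys_path)
    if ssh_dir and not os.path.isdir(ssh_dir):
        os.makedirs(ssh_dir, mode=0o700)
    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as f:
            content = f.read()
    if key in [line.strip() for line in content.splitlines()]:
        return False  # already authorized, nothing to do
    with open(authorized_keys_path, "a") as f:
        # Make sure the new key starts on its own line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)
    return True
```

Keys that are already in the file are left untouched, so running it twice with the same key does no harm.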
Use the new keys with ssh
On the Mac type:
ssh steffen@myubuntu
Note: Replace "steffen@myubuntu" with whatever your username@hostname/ip is.
You should no longer be asked for the password but for the passphrase. Enter your passphrase, and provided your Ubuntu machine is configured to allow key-based logins, you should then be logged in. If it works, it will work for all ssh connections for that user. If not, take a look at (2) (also helpful for making your Ubuntu machine's ssh server more secure) or search for the error message on Google.
Password-based authentication is enabled by default in Ubuntu. If you want to stop users from logging in remotely using passwords, disable password authentication manually by setting "PasswordAuthentication no" in the file /etc/ssh/sshd_config. Do not forget to restart your ssh server after changing the configuration (sudo /etc/init.d/ssh restart).
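To see which setting is in effect, the configuration file can be inspected with a few lines of code. The sketch below (Python 3, hypothetical function name) relies on two documented sshd_config rules: the first value found for a keyword wins, and PasswordAuthentication defaults to yes when it is not set. It deliberately stops at the first Match block and does not follow Include lines, so treat the result as a hint only.

```python
def password_authentication_enabled(config_text):
    """Return the global PasswordAuthentication setting found in config_text."""
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        keyword, value = parts
        if keyword.lower() == "match":
            break  # conditional blocks are outside the scope of this sketch
        if keyword.lower() == "passwordauthentication":
            # The first occurrence of a keyword is the one sshd uses.
            return value.strip().lower() == "yes"
    return True  # sshd default when the keyword is absent
```

On the Ubuntu machine you could call it with open("/etc/ssh/sshd_config").read() as the argument.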
Using rsync via ssh to backup files from Mac OS (source) to Ubuntu (destination)
Rsync is a free file transfer program, distributed under the GNU General Public License, that is capable of efficient remote updates via a fast differencing algorithm. To use rsync to mirror files from a source machine to a destination machine via ssh, both ssh and rsync must be available on both machines and ssh must be configured correctly (see description above). As with all good command line tools, the power to bend rsync to your will lies in the switches you provide in the rsync call (e.g. "rsync -avz"). Note that you can only use switches that are available in both the rsync on the source machine and the rsync on the destination machine. To see all the available options, type "rsync -h" or "man rsync" in the terminal.
A few of the (for me) most interesting switches are:
-a, --archive | archive mode (recurse into directories, copy symlinks as symlinks, preserve permissions, owner, group, times and devices); equivalent to -rlptgoD |
-e, --rsh=COMMAND | specify the remote shell; -e ssh tunnels the file transfer over an encrypted ssh connection |
-n, --dry-run | show what would have been transferred without doing any file transfers |
-u, --update | update only (don't overwrite files that are newer on the receiver) |
-v, --verbose | increase verbosity |
-z, --compress | compress data during transfer using gzip; saves bandwidth but needs more CPU power so use it for slow/expensive connections only |
-E, --extended-attributes | Apple-specific option to copy extended attributes, resource forks, and ACLs. Requires at least Mac OS X 10.4 or a suitably patched rsync. This switch doesn't work for copying files from my MacBook to Ubuntu Gutsy (7.10) because it is not available in the rsync on my Ubuntu machine |
-P | equivalent to --partial --progress (keep partially transferred files, show progress during transfer) |
-S, --sparse | Try to handle sparse files efficiently so they take up less space on the destination. NOTE: Don't use this option when the destination is a Solaris "tmpfs" filesystem. It ends up corrupting the files. |
-x, --one-file-system | don't cross filesystem boundaries (ignore mounted volumes) |
--delay-updates | put all updated files into place at transfer's end, very useful for live systems (is not available on my Mac OS rsync version) |
--delete-after | delete files in the target folder that are not in the source folder; the receiver deletes them after the transfer has finished |
--exclude=PATTERN | exclude files matching PATTERN e.g.: --exclude "*.bak" --exclude "*~" to ignore "*.bak" and "*~" files |
--exclude-from=FILE | exclude patterns listed in FILE (one per line) |
--include=PATTERN | don't exclude files matching PATTERN |
--include-from=FILE | don't exclude patterns listed in FILE |
--stats | give some file transfer stats |
rsync -e ssh -nvauxPS --stats --exclude '.DS_Store' --exclude "*bak" --exclude "*~" ~/Documents/ steffen@myubuntu:'/media/sda1/Dokumente\ und\ Einstellungen/Steffen/Eigene\ Dateien'
The above command logs in as user "steffen" via ssh. As we have not set up passphrase-less keys, it halts and asks for the passphrase for the key '/Users/Steffen/.ssh/id_rsa'. It then updates the directory '/media/sda1/Dokumente und Einstellungen/Steffen/Eigene Dateien' on my Ubuntu machine ("myubuntu") from the directory ~/Documents/ on my MacBook. '.DS_Store', "*bak" and "*~" files are ignored.
Notice the backslashes in front of the spaces in the directory names: the quotes protect the path from the local shell, and the backslashes protect it from the shell on the remote machine. The trailing slash of the source directory also matters. If it were left out, a subdirectory 'Documents' would be created in the destination directory. Furthermore, a single colon after the host name is needed to transfer via the remote shell (here ssh). A double colon would address an rsync daemon on the destination machine instead.
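The trailing-slash rule is easy to get wrong, so here is a small model of it. This is not rsync code, only a Python 3 sketch (with a made-up function name) that computes where a file below the source directory would end up, following the rule described above.

```python
import posixpath


def remote_path_for(source, destination, relative_file):
    """Return where source/relative_file lands below destination."""
    if source.endswith("/"):
        # "Documents/" means: copy the contents of Documents.
        return posixpath.join(destination, relative_file)
    # "Documents" means: copy the directory itself, including its name.
    return posixpath.join(destination, posixpath.basename(source), relative_file)


print(remote_path_for("/Users/Steffen/Documents/", "/backup", "letters/a.txt"))
print(remote_path_for("/Users/Steffen/Documents", "/backup", "letters/a.txt"))
```

The first call prints /backup/letters/a.txt, the second /backup/Documents/letters/a.txt. The destination /backup is only a placeholder.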
Check the output of the dry-run command. If everything seems to be OK, leave out the -n switch to actually do the transfer.
rsync -e ssh -vauxPS --stats --exclude '.DS_Store' --exclude "*bak" --exclude "*~" ~/Documents/ steffen@myubuntu:'/media/sda1/Dokumente\ und\ Einstellungen/Steffen/Eigene\ Dateien'
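If the command is started from a program rather than typed, it can be assembled as an argument list so that the dry-run switch is a simple flag. The sketch below is written in Python 3 and uses hypothetical helper names. It assumes an rsync that hands the remote path to the remote shell (as the version described here does), which is why spaces in the remote part are escaped with a backslash; no local quoting is needed because an argument list bypasses the local shell.

```python
import subprocess


def escape_remote_spaces(remote_spec):
    """Backslash-escape spaces in the path part of user@host:path."""
    host, sep, path = remote_spec.partition(":")
    if not sep:
        return remote_spec  # local path, no remote shell involved
    return host + sep + path.replace(" ", "\\ ")


def build_rsync_command(source, destination, dry_run=True, excludes=()):
    """Return the rsync call as an argument list for subprocess."""
    command = ["rsync", "-e", "ssh", "-vauxPS", "--stats"]
    if dry_run:
        command.append("-n")  # only show what would be transferred
    for pattern in excludes:
        command.extend(["--exclude", pattern])
    command.extend([source, escape_remote_spaces(destination)])
    return command


def run_rsync(command):
    """Run the command without a local shell and return rsync's exit code."""
    return subprocess.call(command)
```

A dry run is the default here on purpose: pass dry_run=False only after the output of the dry run has been checked.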
SSH Key Management
If entering the key passphrase each time you use ssh bothers you, consider using SSHKeychain (for Mac OS X 10.4 Tiger). It stores the passphrase and acts as a gateway to the ssh-agent, so you will only be asked for your passphrase once per session, no matter how many commands use ssh afterwards. SSHKeychain also has an option to store the passphrase in the Apple Keychain, so the key can be used just by unlocking the Keychain, which also makes usage within scripts much easier. The easy installation procedure and the usage are described at the SSHKeychain homepage. I have installed SSHKeychain 0.8.2 and it is working nicely.
Mac OS X 10.5 (Leopard) seems to have built-in ssh-agent support for SSH key management, but as I don't have Mac OS X 10.5, I did not try this.
If you are using another system as client, you might find similar tools (e.g. Keyring in Ubuntu), but I haven't checked whether they provide similar functionality.
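ssh finds a running agent through the SSH_AUTH_SOCK environment variable, so a script can at least check whether that variable points to an existing socket before it starts ssh. The following Python 3 sketch does only that; the function name is made up, and whether your key management tool exports the variable to the process that starts the script is an assumption you would have to verify.

```python
import os


def agent_socket(environ=None):
    """Return the ssh-agent socket path from the environment, or None."""
    env = os.environ if environ is None else environ
    path = env.get("SSH_AUTH_SOCK", "")
    # The variable alone is not enough: the socket file has to exist as well.
    if path and os.path.exists(path):
        return path
    return None
```

If agent_socket() returns None, a non-interactive ssh or rsync call is likely to fail because nobody can enter the passphrase.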
Make a backup script with rsync
In order not to have to enter the above rsync command in the shell every time, a shell script can be used (which should work on other UNIX-based operating systems too). Such scripts using rsync can easily be found on the Web. The following describes one way to make a simple bash script.
nano backup.bash
Copy the rsync command into the file, save (CTRL+O) and exit (CTRL+X), then make backup.bash executable with:
chmod 744 backup.bash
From the directory in which backup.bash was saved, type:
./backup.bash
to run the backup.
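In case the number 744 looks cryptic: each digit is a set of permission bits for the owner, the group and everybody else. Python's standard library can translate the octal mode into the notation that ls -l shows. This is a Python 3 sketch; describe_mode is a name chosen for this example only.

```python
import stat


def describe_mode(octal_text):
    """Translate an octal mode such as "744" into ls -l notation."""
    mode = int(octal_text, 8)
    # S_IFREG marks the entry as a regular file, as ls would show for a script.
    return stat.filemode(stat.S_IFREG | mode)


print(describe_mode("744"))  # -rwxr--r--
```

So 744 lets the owner read, write and execute the script, while everybody else may only read it.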
You can add some logging functionality to the shell script:
#!/bin/bash
# All output is appended to one log file.
LOG=~/Documents/Programming/Shell_scripts/rsync.log
echo "================================ rsync Backup script =================================" >> "$LOG"
date >> "$LOG"
echo "==start rsync logging==" >> "$LOG"
# -n makes this a dry run; remove it to actually transfer files.
rsync -e ssh -nvauxPS --stats --exclude '.DS_Store' --exclude "*bak" --exclude "*~" ~/Documents/ steffen@myubuntu:'/media/sda1/Dokumente\ und\ Einstellungen/Steffen/Eigene\ Dateien' >> "$LOG"
echo "==rsync Backup Ended==" >> "$LOG"
sleep 2m
echo "===== Backup Complete =====" >> "$LOG"
# Show the log file (open is Mac OS specific).
open "$LOG"
If desired, more functionality could be added in the shell script. Examples (e.g. for an incremental backup) can be found in (3).
But instead of perfecting the shell script, I decided to write a Python script (as I like Python) that wraps the rsync shell command and reads an XML file (which supplies the paths that I want to copy via rsync). The core of the Python script is: subprocess.call("rsync -e ssh -options source destination", shell=True). As starting rsync -e ssh via a shell call from Python doesn't prompt the user for a passphrase, this only works if ssh-agent is properly set up (see the chapter about SSH Key Management above). Otherwise the shell call will terminate with an error message (see the messages in the Console for details).
The following assumes that you have at least Python 2.5 installed on your client (the script uses the with statement) and know how to start Python scripts. I recommend installing the newest version of Mac Python from (4), as the Python version that comes installed with Mac OS X 10.4 doesn't support all the Python commands that I have used in the following script.
#!/usr/local/bin/python
# Filename: RsyncMacWithUbuntu.py
#
# I confirm that, to the best of my knowledge and belief, this contribution is free of any claims of third parties under
# copyright, patent or other rights or interests ("claims").
#
# Copyright 2008 Steffen Hellmich
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
"""Copies important data from my Mac to my Ubuntu machine with Rsync via SSH.
This script uses rsync to copy source folders to destination folders via ssh.
Required: XML-file "FoldersForRsync.xml" in the form of:
<Folders>
    <options>-e ssh -nvauxPS --stats --exclude '*.DS_Store' --exclude '*bak' --exclude '*~'</options>
    <rsyncTransfer use="yes">
        <source>/Users/Steffen/Documents/</source>
        <destination>steffen@myubuntu:/media/sda1/Dokumente und Einstellungen/Steffen/Eigene Dateien</destination>
        <description>Important Documents</description>
    </rsyncTransfer>
    <rsyncTransfer use="no">
        <source>/Users/Steffen/Documents/</source>
        <destination></destination>
        <description> ... </description>
    </rsyncTransfer>
</Folders>
Status messages indicate progress
Error messages indicate failure
The transfer is logged in "/Users/Steffen/Documents/Programming/Python_scripts/rsync.log"
"""
from __future__ import with_statement
import subprocess
import sys
import os
import string
import xml.sax # XML handling module
import time
import xml.sax.handler
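def escapeSpaces(text):  # quote paths with spaces; add a backslash in front of spaces in remote paths
    if text.find(' ') == -1:
        return text                  # no spaces: nothing to do
    elif text.find(':') == -1:
        return ''.join(["'", text, "'"])  # local path: quotes are enough
    else:
        # remote path (user@host:path): quotes for the local shell,
        # backslashes for the shell on the remote machine
        return ''.join(["'", text.replace(" ", "\\ "), "'"])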
class RsyncTransferHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        self.inOptions = 0
        self.options = ""
        self.inDestination = 0
        self.inDescription = 0
        self.inSource = 0
        self.destination = ""
        self.description = ""
        self.source = ""
        self.use = ""

    def startElement(self, name, attributes):
        if name == "options":
            self.inOptions = 1
            self.options = ""
        elif name == "rsyncTransfer":
            # a new transfer begins: reset the collected values
            self.source = ""
            self.destination = ""
            self.description = ""
            self.use = attributes["use"]
        elif name == "source":
            self.inSource = 1
        elif name == "destination":
            self.inDestination = 1
        elif name == "description":
            self.inDescription = 1

    def characters(self, data):
        if self.inOptions:
            self.options += data
        elif self.inDestination:
            self.destination += data
        elif self.inDescription:
            self.description += data
        elif self.inSource:
            self.source += data

    def endElement(self, name):
        if name == "options":
            self.inOptions = 0
        elif name == "destination":
            self.inDestination = 0
        elif name == "description":
            self.inDescription = 0
        elif name == "source":
            self.inSource = 0
        elif name == "rsyncTransfer":
            # the transfer is completely described: run it if it is enabled
            if self.use == "yes":
                print 'Rsync', self.description, 'with the following command via standard shell.'
                cmd = "rsync"
                logfile = "/Users/Steffen/Documents/Programming/Python_scripts/rsync.log"
                rsyncCommand = string.join([cmd, self.options, escapeSpaces(self.source),
                                            escapeSpaces(self.destination), '>>', logfile])
                print rsyncCommand
                # write the log header; the file is closed before rsync appends to it
                with open(logfile, 'a') as f:
                    f.write("======================== rsync backup script =============================\n")
                    tDate = "-".join([time.strftime('%Y'), time.strftime('%m'), time.strftime('%d')])
                    tTime = ":".join([time.strftime('%H'), time.strftime('%M'), time.strftime('%S')])
                    f.write(" ".join([tDate, tTime, "== start rsync logging ==\n"]))
                    f.write(" ".join(["Rsync", self.description, "with\n", rsyncCommand, "\n"]))
                try:
                    retcode = subprocess.call(rsyncCommand, shell=True)
                    if retcode < 0:
                        print >>sys.stderr, "Child was terminated by signal", -retcode
                    else:
                        print >>sys.stderr, "Child returned", retcode
                except OSError, e:
                    print >>sys.stderr, "Execution failed:", e
                # the with statement closes the file, no explicit close() is needed
                with open(logfile, 'a') as f:
                    f.write("======================== rsync backup ended ==============================\n")
try:
    shellCmd = "ping -c 1 myubuntu"
    retcode = subprocess.call(shellCmd, shell=True) # check if Ubuntu machine is reachable
    if retcode == 0: # ping ok
        print "Server myubuntu seems to be reachable. Return code for:", shellCmd, "=", retcode, "\n"
        shellCmd = "ssh steffen@myubuntu ls"
        retcode = subprocess.call(shellCmd, shell=True) # check ssh login to Ubuntu machine
        if retcode == 0: # ssh login ok --> start parsing XML-file for rsync transfer
            print "SSH login to myubuntu works. Return code for:", shellCmd, "=", retcode, "\n"
            parser = xml.sax.make_parser()
            handler = RsyncTransferHandler()
            parser.setContentHandler(handler)
            parser.parse("FoldersForRsync.xml") # name of XML-file to be parsed
            secondsTimeout = 3
            print '\nRsync finished.\n\nExiting in', secondsTimeout, 'seconds.'
            time.sleep(secondsTimeout)
        else: # ping ok, but ssh login not --> destination machine runs probably with Windows
            print "SSH login to myubuntu failed. Return code for:", shellCmd, "=", retcode, "\n"
            print '\nErrors happened. Check in detail the messages above.\n'
            i = raw_input("Press enter to finish.") # wait until input
    else: # Ubuntu machine not reachable
        print "Server myubuntu is not reachable. Return code for:", shellCmd, "=", retcode, "\n"
        print '\nErrors happened. Check in detail the messages above.\n'
        i = raw_input("Press enter to finish.") # wait until input
except OSError, e:
    print >>sys.stderr, "Execution shell call", shellCmd, "failed. Error:", e
Run the Python script via double-clicking from the desktop
To make it fast and easy to run the Python script, I created a shell script called backup.command that is placed directly on my desktop with the following content:
cd ~/Documents/Programming/Python_scripts/
python RsyncMacWithUbuntu.py
Double-clicking it will open a Terminal and run the shell script.
Further ideas
Automate the backup process by scheduling it with cron
You could automate the backup process by creating a cron job (scheduled task) or a repeating alarm in iCal (see (5)) to call the backup script, e.g. every night. That might be especially helpful if you have to back up a lot of data and both machines are running at night.
As I usually switch off my machines at night and don't have a fixed time when both machines are running, I prefer to start the backup manually.
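For those who do want a nightly run, a crontab entry consists of five time fields (minute, hour, day of month, month, day of week) followed by the command. The small Python 3 helper below, with a made-up name and a placeholder script path, builds such a line for a daily job. Keep in mind that a cron job has no terminal, so the passphrase has to come from an agent as described in the SSH Key Management chapter.

```python
def daily_crontab_line(hour, minute, command):
    """Return a crontab line that runs command every day at hour:minute."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    # Field order: minute hour day-of-month month day-of-week command
    return "%d %d * * * %s" % (minute, hour, command)


print(daily_crontab_line(2, 30, "$HOME/backup.bash"))  # 30 2 * * * $HOME/backup.bash
```

The resulting line can be added with crontab -e.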
Create multiple copies of anything (similar to a real backup)
I use the Python script above only to mirror or synchronize data files/directories between my two machines. I don't use it to make multiple copies (as a backup scheme would). Others might have different needs (see page 2 of (5) or (7)).
A final word
The steps described above are only one way of using rsync and ssh. If you follow them, try to understand what you are doing and check carefully with the -n ("dry run") switch that rsync will do what you want before actually doing the transfer.
Rsync is perfectly good for synchronizing or backing up data files. It should be enough to recover most of my important data (e.g. my music files and documents) if one of my hard drives crashes. If you need a real backup of your whole system (e.g. entire bootable filesystem images), you might want to consider other ways.
Furthermore, you may have better or more efficient ways of doing this. Please post them so others can see what options there are. And, of course, I may have made mistakes that I have not found yet. Please help me to correct them.
References
(1) http://en.wikipedia.org/wiki/Secure_Shell
(2) https://help.ubuntu.com/community/AdvancedOpenSSH
(3) http://rsync.samba.org/examples.html
(4) http://www.python.org/download/
(5) http://www.macdevcenter.com/pub/a/mac/2005/07/22/backup.html?page=1
(6) http://ubuntuforums.org/showthread.php?t=15082
(7) http://www.egg-tech.com/mac_backup/ (backup to external FireWire, USB and network drives using rsync)
(8) http://troy.jdmz.net/rsync/index.html