
I am writing a very simple Bash script that tars a given directory, encrypts the output of that, and then splits the resultant file into multiple smaller files since the backup media doesn’t support huge files.

I don't have a lot of experience with Bash scripting. I believe I’m having issues with quoting my variables properly to allow spaces in the parameters. The script follows:

#! /bin/bash

# This script tars the given directory, encrypts it, and transfers
# it to the given directory (likely a USB key).

if [ $# -ne 2 ]
then
    echo "Usage: `basename $0` DIRECTORY BACKUP_DIRECTORY"
    exit 1
fi

DIRECTORY=$1
BACKUP_DIRECTORY=$2
BACKUP_FILE="$BACKUP_DIRECTORY/`date +%Y-%m-%dT%H-%M-%S.backup`"

TAR_CMD="tar cv $DIRECTORY"
SPLIT_CMD="split -b 1024m - \"$BACKUP_FILE\""

ENCRYPT_CMD='openssl des3 -salt'

echo "$TAR_CMD | $ENCRYPT_CMD | $SPLIT_CMD"

$TAR_CMD | $ENCRYPT_CMD | $SPLIT_CMD

say "Done backing up"

Running this command fails with:

split: "foo/2009-04-27T14-32-04.backup"aa: No such file or directory

I can fix it by removing the quotes around $BACKUP_FILE where I set $SPLIT_CMD. But, if I have a space in the name of my backup directory, it doesn't work. Also, if I copy and paste the output from the "echo" command directly into the terminal, it works fine. Clearly there's something I don't understand about how Bash is escaping things.
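A minimal repro of the confusion, with printf standing in for the real commands (the variable name here is made up for illustration): quotes stored inside a variable are literal characters, not quoting syntax, so Bash word-splits the expansion but never re-parses the quotes.

```shell
#!/bin/bash
# Quotes inside an expanded variable are data, not syntax.
cmd='printf [%s] "a b"'
$cmd                   # prints ["a][b"] -- split on the space, quotes kept literally
echo
printf '[%s]' "a b"    # prints [a b]   -- quoting works when typed directly
```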

3 Comments

  • Why would you embed $BACKUP_FILE in SPLIT_CMD when you could just put it after $SPLIT_CMD in the pipeline? Commented Apr 27, 2009 at 18:50
  • Well, I could do that, but at that point there's not really much point in having variables contain my commands, and I may as well just expand it all out as in Juliano's answer below. Commented Apr 27, 2009 at 19:44
  • mywiki.wooledge.org/BashFAQ/050 Commented Dec 15, 2015 at 8:57

5 Answers


Simply don't put whole commands in variables. You'll get into a lot of trouble trying to recover quoted arguments.

Also:

  1. Avoid using all-capitals variable names in scripts. It is an easy way to shoot yourself in the foot.
  2. Don't use backquotes. Use $(...) instead; it nests better.

#! /bin/bash

if [ $# -ne 2 ]
then
    echo "Usage: $(basename "$0") DIRECTORY BACKUP_DIRECTORY"
    exit 1
fi

directory=$1
backup_directory=$2
current_date=$(date +%Y-%m-%dT%H-%M-%S)
backup_file="${backup_directory}/${current_date}.backup"

tar cv "$directory" | openssl des3 -salt | split -b 1024m - "$backup_file"

5 Comments

Yeah I guess I'll probably do it this way. Seems less elegant to me than having the commands in variables, but I guess that's the nature of bash scripting.
@wxs: There is nothing elegant whatsoever about putting commands in variables. You don't gain any type of flexibility; on the contrary, you just cause bugs due to word splitting. What you might have intended to do is put commands in functions. You execute functions. You should never execute variable content. ever.
I don't understand 1. care to explain? How/why does it make easier to shoot oneself in the foot?
@ata: Because you might inadvertently clobber environment variables (which are always upper-case)?
@ata: see POSIX convention for environment variable names at pubs.opengroup.org/onlinepubs/009695399/basedefs/… -- explicitly, all-caps names are used by variables with meaning to the operating system or shell, and names with at least one lower-case character are guaranteed to be safe for applications to use without inadvertently modifying shell behavior.

eval is not an acceptable practice if your directory names can be generated by untrusted sources. See BashFAQ #48 for more on why eval should not be used, and BashFAQ #50 for more on the root cause of this problem and its proper solutions, some of which are touched on below:

If you need to build up your commands over time, use arrays:

tar_cmd=( tar cv "$directory" )
split_cmd=( split -b 1024m - "$backup_file" )
encrypt_cmd=( openssl des3 -salt )
"${tar_cmd[@]}" | "${encrypt_cmd[@]}" | "${split_cmd[@]}"
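A quick way to see why the array form is safe (an illustrative snippet, not part of the answer's script, with a made-up filename): printing each element shows that a path with a space stays a single word.

```shell
#!/bin/bash
# Illustrative only: backup_file is a hypothetical name containing a space.
backup_file='my backups/2009-04-27.backup'
split_cmd=( split -b 1024m - "$backup_file" )
printf '<%s>\n' "${split_cmd[@]}"
# Last line printed: <my backups/2009-04-27.backup> -- one argument, space intact
```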

Alternately, if this is just about defining your commands in one central place, use functions:

tar_cmd() { tar cv "$directory"; }
split_cmd() { split -b 1024m - "$backup_file"; }
encrypt_cmd() { openssl des3 -salt; }
tar_cmd | encrypt_cmd | split_cmd

3 Comments

As long as you're not using eval with uncontrolled input it does have its uses. Using heredocs with variables can mimic macro definitions via eval, for example. It's not inherently evil or useless, or it wouldn't be in bash.
@ata, filenames are generally speaking uncontrolled inputs. Code equivalent to for f in "$dir"/* or the like is so common that any shell function that becomes dangerous in its presence is categorically unwise. Modern bash has tools to allow safer eval use, but eval existed long before those tools did -- it dates back to days when "the internet" wasn't a thing, and neither were users from untrusted organizations.
100% agree, I was making a point about the (relative) eval safety when working with controlled input, which filenames are clearly not. Maybe I didn't interpret in full the Bash FAQ entries, but I got the impression they were a blanket "all eval is evil". Sorry if I misinterpreted.

I am not sure, but it might be worth running eval on the commands first (the Bash manual's description, near "The args are read and concatenated together", explains what it does).

This will let Bash expand the variables $TAR_CMD and such in full (just as the echo command does to the console, which you say works).

Bash will then read the line a second time with the variables expanded.

eval "$TAR_CMD | $ENCRYPT_CMD | $SPLIT_CMD"

Page Bash: Why use eval with variable expansion? looks like it might do a decent job at explaining why that is needed.
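A toy demonstration of the difference (no real files touched; the variable values are taken from the question's naming, with a hypothetical path): counting arguments with `set --` shows that plain expansion leaves the embedded quotes literal, while eval re-parses them.

```shell
#!/bin/bash
# Toy sketch: how many words does the stored command produce?
backup_file='foo bar/out.backup'
split_cmd="split -b 1024m - \"$backup_file\""

set -- $split_cmd          # plain expansion: quotes stay literal
echo "$# words"            # prints: 6 words (the filename is split in two)

eval "set -- $split_cmd"   # eval re-parses the quotes
echo "$# words"            # prints: 5 words (the filename stays one argument)
```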

4 Comments

That seems to do it as well, but I feel like it makes everything a little too convoluted. Oh well, I see why people invented other scripting languages.
This is a major security risk. See BashFAQ #50 for the best-practice alternative: mywiki.wooledge.org/BashFAQ/050 -- and BashFAQ #48 for a description of why eval carries risks: mywiki.wooledge.org/BashFAQ/048
Good catch @CharlesDuffy, if this is being used on a shared system where other users are granted access to the script to run with elevated rights, it is a risk.
@Eddie, not just on a shared system, but against data you don't control. If you've got a script mirroring files from a remote server and unpacking them, and those filenames can be modified by someone malicious, eval carries unreasonable risks. Sometimes, "malicious" isn't even necessary -- I once saw a major data loss event caused by a buffer overflow dumping random content into filenames that were later processed by a sloppily-written script.

There is a point to putting only commands and options in variables.

#! /bin/bash

if [ $# -ne 2 ]
then
    echo "Usage: `basename $0` DIRECTORY BACKUP_DIRECTORY"
    exit 1
fi

. standard_tools

directory=$1
backup_directory=$2
current_date=$(date +%Y-%m-%dT%H-%M-%S)
backup_file="${backup_directory}/${current_date}.backup"

${tar_create} "${directory}" | ${openssl} | ${split_1024} "$backup_file"

You can relocate the commands to another file you source, so you can reuse the same commands and options across many scripts. This is very handy when you have a lot of scripts and you want to control how they all use tools. So standard_tools would contain:

export tar_create="tar cv"
export openssl="openssl des3 -salt"
export split_1024="split -b 1024m -"

3 Comments

tar_create is missing but helped nonetheless
This doesn't actually solve the problem for complex arguments. If your tar_create was tar cv --exclude="* *", it would have failed in much the same way as the original. And the exports do nothing useful here -- these variables are used in the same shell, so polluting subprocess's environment space is simply a waste.
Better to set aliases (using per system conditionals) than hiding commands inside of variables. Or you could set argument variables per command (again according to per system conditionals).

Quoting spaces inside variables such that the shell will reinterpret things properly is hard. It's this type of thing that prompts me to reach for a stronger language. Whether that's Perl, Python, Ruby, or whatever (I choose Perl, but that's not always for everyone), it's just something that will allow you to bypass the shell for quoting.

It's not that I've never managed to get it right with liberal doses of eval; it's just that eval gives me the heebie-jeebies (it becomes a whole new headache when you want to take user input and eval it, though in this case you'd be evaling stuff you wrote yourself), and that debugging it has given me headaches.

With Perl, as my example, I'd be able to do something like:

@tar_cmd = ( qw(tar cv), $directory );
@encrypt_cmd = ( qw(openssl des3 -salt) );
@split_cmd = ( qw(split -b 1024m -), $backup_file );

The hard part here is doing the pipes - but a bit of IO::Pipe, fork, and reopening standard output and standard error, and it's not bad. Some would say that's worse than quoting the shell properly, and I understand where they're coming from, but, for me, this is easier to read, maintain, and write. Heck, someone could take the hard work out of this and create an IO::Pipeline module and make the whole thing trivial ;-)

1 Comment

It's only a hard problem if you make it a hard problem by using eval. There's no good reason to do that.
