29

I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.

Somewhere I found the following ugliness:

b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")

which will set $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?

Maybe I'll just use a for loop of some sort, but it'd be even better if the script was smart enough to automatically detect each parameter and maybe build an array that looks something like this:

${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok

How could I write code to do that?

3
  • I just noticed that I'm actually stuck on Bash 3. Does anyone have a simple, secure solution that will not involve associative arrays? Commented Oct 13, 2010 at 16:30
  • 1
    See my edited answer for an alternative to associative arrays (also be sure to read the page I linked to ( BashFAQ/006 ). Commented Oct 19, 2010 at 1:32
  • this link will help you to solve your issue easily stackoverflow.com/questions/17021640/… Commented Jun 11, 2013 at 5:23

19 Answers

56

Try this:

saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS

Now you have this:

parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok

In Bash 4, which has associative arrays, you can do this (using the array created above):

declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
    array[${parm[i]}]=${parm[i+1]}
done

which will give you this:

array[a]=123
array[b]=456
array[c]=ok

Edit:

To use indirection in Bash 2 and later (using the parm array created above):

for ((i=0; i<${#parm[@]}; i+=2))
do
    declare var_${parm[i]}=${parm[i+1]}
done

Then you will have:

var_a=123
var_b=456
var_c=ok

You can access these directly:

echo $var_a

or indirectly:

for p in a b c
do
    name="var_$p"
    echo ${!name}
done

If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.
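Because the keys arrive from the network, one option is to check each name before turning it into a variable. The following is only a minimal sketch under some assumptions: it relies on printf -v (Bash 3.1 or later), and the function name parse_qs_vars and the name check are illustrative choices, not part of the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a query string into var_<key> variables,
# skipping any key that is not a plain identifier.
parse_qs_vars() {
    local qs=$1 pair key value
    local re='^[A-Za-z_][A-Za-z0-9_]*$'
    local -a pairs
    # Split on '&' only; read -a does not perform globbing.
    IFS='&' read -r -a pairs <<< "$qs"
    for pair in "${pairs[@]}"; do
        key=${pair%%=*}
        value=${pair#*=}
        # A pair without '=' has no value.
        [[ $pair == *=* ]] || value=
        # Assign only when the key is a valid identifier.
        if [[ $key =~ $re ]]; then
            printf -v "var_$key" '%s' "$value"
        fi
    done
}

parse_qs_vars 'a=123&b=456&c=ok&bad-name=x'
echo "$var_a $var_b $var_c"
```

Here the pair with the key bad-name is silently dropped; whether to drop, rename, or reject such keys is a design decision for the script.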


4 Comments

+1 for the parm array generation. But all methods presented to loop that array fail to properly handle repeated keys. Each occurrence will overwrite the previous. For example, a=1&a=2&a=x would result in parm[a]=x
@MestreLion: You can add logic to deal with the possibility of repeated keys, but you would need to decide how you want to deal with them. You could do first-precedent or last-precedent or some method of accumulation.
parm=($QUERY_STRING) subjects the words resulting from the expansion of $QUERY_STRING to globbing, which is probably undesired. A more robust alternative that also saves you the trouble of saving and restoring $IFS: IFS='&=' read -ra parm <<<"$QUERY_STRING". It's better not to use all-uppercase shell-variable names in order to avoid conflicts with environment variables and special shell variables.
@dmnc: You can set a variable on the same line as a command (e.g. read) to have that variable value active only in the environment of the command. The same is not true when doing two assignments on the same line. In that case, the value persists. So in the example in my answer it is necessary to save and restore IFS. There are other ways to avoid saving and restoring, such as using a subshell. So I rolled back your edit.
18

You can break $QUERY down using IFS, for example by setting it to &:

$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok

$ array=($@)

$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok

And you can save to a hash/dictionary in Bash 4+

$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456

3 Comments

Except where you rely on word-splitting, please double-quote your variable references. Note that set -- $QUERY makes the words in $QUERY subject to globbing, which is probably undesired. It's better not to use all-uppercase shell-variable names in order to avoid conflicts with environment variables and special shell variables.
@mklement0: This relies on word-splitting, in particular on splitting at &. Globbing isn't an issue as the query string is url-encoded.
@MSalters: Thanks; I hadn't considered the constrained contents of the strings. The same applies to unquoted use of $1, $2, and $3, which happen to be fine in this particular case. I was only referring to these variable references with "please double-quote", noting "except where you rely on word-splitting" (set -- $QUERY and set -- $i). To promote good habits, (a) $1, $2, and $3 should still be double-quoted, and (b) the general pitfall of unintended globbing is worth pointing out directly in the answer.
7

Please don't use the evil eval junk.

Here's how you can reliably parse the string and get an associative array:

declare -A param   
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
    param["$key"]=$value
done <<<"${QUERY_STRING}&"

If you don't like the key check, you could do this instead:

declare -A param   
while IFS='=' read -r -d '&' key value; do
    param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"

Listing all the keys and values from the array:

for key in "${!param[@]}"; do
    echo "$key: ${param[$key]}"
done


4

I packaged the sed command up into another script:

$cat getvar.sh

s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo $QUERY_STRING | sed -n $s | sed "s/%20/ /g"

and I call it from my main cgi as:

id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`

...etc, etc. You get the idea.

This works for me even with a very basic BusyBox appliance (my PVR in this case).

1 Comment

👎 for the unquoted $QUERY_STRING. You really, really must use double quotes around the variable.
4

To convert the contents of QUERY_STRING into bash variables, use the following command:

eval $(echo ${QUERY_STRING//&/;})

The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.

The result can then be used as bash variables.

echo $a
echo $b
echo $c

The assumptions are:

  • values will never contain '&'
  • values will never contain ';'
  • QUERY_STRING will never contain malicious code

4 Comments

Ouch! Security alert! Never evaluate anything from the network.
@ceving ...except if it is your test/mock system, or the http requests are coming from your own program.
The most trivial exploit: your.host/your.cgi?rm%20-rf%20%7e <-- this makes your web server execute rm -rf ~ :-)
I voted the post up because it is ingeniously simple. By the way, I think a simple eval "${QUERY_STRING//&/;}" would be enough; you don't need to echo the variable and then substitute its output into the arguments of eval.
3

While the accepted answer is probably the most elegant one, there may be cases where security is paramount and also needs to be clearly visible in your script.

In such a case, I wouldn't use bash for the task in the first place. But if it has to be done in bash for some reason, it might be better to avoid the newer array and dictionary features, because you can't be sure exactly how they handle escaping.

In this case, the good old primitive solutions might work:

QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
  nameval="${QS%%&*}"
  QS="${QS#"$nameval"}"
  QS="${QS#&}"
  name="${nameval%%=*}"
  val="${nameval#"$name"}"
  val="${val#=}"

  # and here we have $name and $val as names and values

  # ...

done

This iterates over the name-value pairs of QUERY_STRING, and no tricky escape sequence can circumvent it: double quotes are very strong in bash, and the only thing expanded inside them is a single variable substitution that is fully under our control, so nothing can be injected.

Furthermore, you can inject your own processing code into "# ...". This enables you to allow only your own, well-defined (and, ideally, short) list of the allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)

Furthermore, no variable is exported, and only QS, nameval, name and val are used.
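The allow-list idea could be sketched roughly as follows. The parameter names id, ds and dt and the qs_ prefix are assumptions made for illustration, and the sample QUERY_STRING stands in for what a web server would provide.

```shell
#!/usr/bin/env bash
# Walk the name=value pairs and keep only names on a fixed allow-list.
# Sample input for illustration; a real CGI receives this from the server.
QUERY_STRING='id=7&ds=2010-10-13&LD_PRELOAD=evil.so'

qs=$QUERY_STRING
while [ -n "$qs" ]; do
    nameval=${qs%%&*}          # first name=value pair
    if [ "$nameval" = "$qs" ]; then
        qs=                    # that was the last pair
    else
        qs=${qs#*&}            # drop the pair and its '&'
    fi
    name=${nameval%%=*}
    val=${nameval#"$name"}
    val=${val#=}

    case $name in
        id) qs_id=$val ;;
        ds) qs_ds=$val ;;
        dt) qs_dt=$val ;;
        *)  ;;                 # anything else is ignored
    esac
done

echo "id=$qs_id ds=$qs_ds"
```

Since every accepted name is spelled out in the case statement, a request cannot create or overwrite any variable that is not listed there.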


2

Following the accepted answer, I made some changes to support array variables, as in this other question. I also added a decode function whose author I could not find to give credit.

The code looks somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.

function cgi_decodevar() {
    [ $# -ne 1 ] && return
    local v t h
    # replace all + with whitespace and append %%
    t="${1//+/ }%%"
    while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
        v="${v}${t%%\%*}" # digest up to the first %
        t="${t#*%}"       # remove digested part
        # decode if there is anything to decode and if not at end of string
        if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
            h=${t:0:2} # save first two chars
            t="${t:2}" # remove these
            v="${v}"`echo -e \\\\x${h}` # convert hex to special char
        fi
    done
    # return decoded string
    echo "${v}"
    return
}

saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS

for ((i=0; i<${#VARS[@]}; i+=2))
do
  curr="$(cgi_decodevar ${VARS[i]})"
  next="$(cgi_decodevar ${VARS[i+2]})"
  prev="$(cgi_decodevar ${VARS[i-2]})"
  value="$(cgi_decodevar ${VARS[i+1]})"

  array=${curr%"[]"}

  if  [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
      j=0
      declare var_${array}[$j]="$value"
  elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
    j=$((j + 1))
    declare var_${array}[$j]="$value"
  else
    declare var_$curr="$value"
  fi
done
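For comparison, percent-decoding can also be written more compactly with parameter expansion and printf's %b format. This is a sketch assuming Bash; the name urldecode is illustrative, and malformed escapes (a % not followed by two hex digits) are not validated.

```shell
#!/usr/bin/env bash
# Decode form-urlencoded text: '+' becomes a space and each %XX escape
# becomes the byte with that hexadecimal value.
urldecode() {
    local s=${1//+/ }            # '+' encodes a space
    s=${s//\\/\\\\}              # protect literal backslashes from %b
    printf '%b' "${s//%/\\x}"    # rewrite %XX as \xXX and let printf expand it
}

urldecode 'hello+world%21'
echo
```

The backslash-doubling step is there so that a raw backslash sent by a client is printed literally instead of being interpreted as an escape sequence.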


2

I would simply replace each & with ;. The query string then becomes something like:

a=123;b=456;c=ok

Now you just need to evaluate it and read your variables:

eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c

1 Comment

This is not only a security risk, but also fragile, given that values could themselves contain ; or start with ~.
2

One can use bash-cgi.sh, which processes:

  • the query string into the $QUERY_STRING_GET key and value array;

  • the post request data (x-www-form-urlencoded) into the $QUERY_STRING_POST key and value array;

  • the cookies data into the $HTTP_COOKIES key and value array.

Requires bash version 4.0 or higher (to define the key and value arrays above).

All processing is done by bash alone (i.e. in a single process), without any external dependencies or additional processes.

It has:

  • the check for the maximum length of data that can be passed to its input, as well as processed as query string and cookies;

  • the redirect() procedure to produce a redirect to itself with the extension changed to .html (useful for single-page sites);

  • the http_header_tail() procedure to output the last two strings of the HTTP(S) response header;

  • the $REMOTE_ADDR value sanitizer from possible injections;

  • the parser and evaluator of the escaped UTF-8 symbols embedded into the values passed to the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES;

  • the sanitizer of the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES values against possible SQL injections (the escaping like the mysql_real_escape_string php function does, plus the escaping of @ and $).

It is available here:

https://github.com/VladimirBelousov/fancy_scripts


1

A nice way to handle CGI query strings is to use Haserl which acts as a wrapper around your Bash cgi script, and offers convenient and secure query string parsing.


1

This works in dash, using a for ... in loop:

IFS='&'
for f in $query_string; do
   value=${f##*=}
   key=${f%%=*}
    # if you need environment variable -> eval "qs_$key=$value"
done


0

To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:

q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
  q=${q##*${BASH_REMATCH[0]}}       
  [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done

If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.

Here is the same thing as a function, in a short test script that iterates over the array and outputs the query string's parameters and their values:

#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'

get_query_string() {
  local q="$QUERY_STRING"
  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'
  while [[ $q =~ $re1 ]]; do
    q=${q##*${BASH_REMATCH[0]}}
    [[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
  done
}

declare -A params
get_query_string params

for k in "${!params[@]}"
do
  v="${params[$k]}"
  echo "$k : $v"
done          

Note the parameters end up in the array in reverse order (it's associative so that shouldn't matter).

1 Comment

@starfry thanks for this, but it does not work with a few characters that are admissible in parameter values, e.g. the simple hyphen "-". When parsing such parameters, e.g. p=foo-bar, only the first part of the value is returned (foo).
0

why not this

    $ echo "${QUERY_STRING}"
    name=carlo&last=lanza&city=pfungen-CH
    $ saveIFS=$IFS
    $ IFS='&'
    $ eval $QUERY_STRING
    $ IFS=$saveIFS

now you have this

    name = carlo
    last = lanza
    city = pfungen-CH

    $ echo "name is ${name}"
    name is carlo
    $ echo "last is ${last}"
    last is lanza
    $ echo "city is ${city}"
    city is pfungen-CH

1 Comment

This is kind of dangerous - any variable from the script could be rewritten by those params.
0

@giacecco

To include a hyphen in the regex, you could change the following two lines in the answer from @starfry.

Change these two lines:

  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'

To these two lines:

  local re1='^(\w+=(\w+|-|)+)&?'
  local re2='^(\w+)=((\w+|-|)+)$'


0

For all those who couldn't get it working with the posted answers (like me), this guy figured it out.

Can't upvote his post unfortunately...

Let me repost the code here real quick:

#!/bin/sh

if [ "$REQUEST_METHOD" = "POST" ]; then
  if [ "$CONTENT_LENGTH" -gt 0 ]; then
      read -n $CONTENT_LENGTH POST_DATA <&0
  fi
fi

#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA

#2- Value1
#4- Value2
#6- Value3
#8- Value4

echo $2 $4 $6 $8

echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "Data received: $POST_DATA"
echo "</body></html>"

Hope this is of help for anybody.

Cheers
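As a variation on reading the POST body, the length can be checked before it is used, and head -c can read the bytes. This is a sketch under assumptions: the function name read_post_body is hypothetical, head -c is assumed to be available (it is in GNU coreutils and BusyBox), and the request variables are set by hand here only to simulate a web server.

```shell
#!/usr/bin/env bash
# Print the request body: exactly CONTENT_LENGTH bytes from standard input.
read_post_body() {
    local len=${CONTENT_LENGTH:-0}
    # Treat a non-numeric length as zero.
    case $len in
        ''|*[!0-9]*) len=0 ;;
    esac
    if [ "$REQUEST_METHOD" = "POST" ] && [ "$len" -gt 0 ]; then
        head -c "$len"
    fi
}

# Simulated request for illustration; a web server would set these.
REQUEST_METHOD=POST
CONTENT_LENGTH=11
POST_DATA=$(printf '%s' 'a=1&b=2&c=3 trailing bytes' | read_post_body)
echo "$POST_DATA"
```

The resulting POST_DATA has the same key=value&key=value shape as QUERY_STRING, so it can be split in the same way.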


0

Actually I liked bolt's answer, so I made a version which works with Busybox as well (ash in Busybox does not support here strings). This code will accept the key1 and key2 parameters; all others will be ignored.

while IFS= read -r -d '&' KEYVAL && [[ -n "$KEYVAL" ]]; do
    case ${KEYVAL%%=*} in
        key1) KEY1=${KEYVAL#*=} ;;
        key2) KEY2=${KEYVAL#*=} ;;
    esac
done <<END
$(echo "${QUERY_STRING}&")
END


0

Using @Henning approach as inspiration, I decided to expand it a bit further so that it's more self contained and self documenting.

## @fn cgi_query_string_get_value_by_key()
## @brief get value from cgi env string (Limitation: does not handle quoted string)
## @param search_key keyword in CGI query string
## @param default_value default value. Defaults to empty string if omitted.
## @return found value or default value
function cgi_query_string_get_value_by_key () {
    search_key=$1
    default_value=${2-}

    # Search each parameter for a matching key
    saveIFS=$IFS
    IFS='&'
    for f in $QUERY_STRING; do
        key=${f%%=*}
        value=${f##*=}
        if [ "$key" == "$search_key" ]; then
            # Param Found
            IFS=$saveIFS
            echo "$value"
            return
        fi
    done

    # Param Is Missing. Return Default Value
    IFS=$saveIFS
    echo "$default_value"
}

As for how you can use it, below is some example usage

QUERY_STRING="hello=32&world=23"
echo "Given a GCI query string: '$QUERY_STRING'"

hello=$(cgi_query_string_get_value_by_key "hello")
if [ ! -z "$hello" ]; then
    echo "'hello' key is found got $hello"
else
    echo "'hello' key is missing"
fi

world=$(cgi_query_string_get_value_by_key "world")
if [ ! -z "$world" ]; then
    echo "'world' key is found got $world"
else
    echo "'world' key is missing"
fi

unknown=$(cgi_query_string_get_value_by_key "unknown")
if [ ! -z "$unknown" ]; then
    echo "'unknown' key is found as $unknown"
else
    echo "'unknown' key is missing"
fi

default=$(cgi_query_string_get_value_by_key "default" "4")
if [ ! -z "$default" ]; then
    echo "'default' is missing, but got fallback value as $default"
else
    echo "'default' is missing"
fi

This will lead to this output

Given a GCI query string: 'hello=32&world=23'
'hello' key is found got 32
'world' key is found got 23
'unknown' key is missing
'default' is missing, but got fallback value as 4

This was also checked against set -euo pipefail for any obvious errors. This will hopefully be useful for those expecting to deal only with simple values that do not have repeating keys.


0

This seems to work for me:

export $(echo ${QUERY_STRING} | tr '&' '\n' | xargs -L 1)

So if the QUERY_STRING was FOO=value1&BAR=value2, the above would give these as environment variables:

env
...
FOO=value1
BAR=value2
...


0

Just for the fun of it, I found another way:

For example, this extracts only the first argument and its value:

argument=$(echo "$QUERY_STRING" | cut -d'&' -f1 | cut -d= -f1)
value=$(echo "$QUERY_STRING" | cut -d'&' -f1 | cut -d= -f2-)
