4

I am new to bash scripting and I am writing a small script where I need to decode the encoded string and convert it to a byte array.

my_array=(`echo "dGVzdA==" | base64 -d`)

When I print the size of the array, it shows 1 instead of 4.

echo "${#my_array[*]}"

Question: doesn't base64 -d convert the string to a byte array? The equivalent Java code does this for me:

byte [] b = Base64.getDecoder().decode("dGVzdA==");

Any thoughts?

2
  • 2
    Bash doesn't have a byte type, or any other types. Vars are strings. Commented Sep 23, 2020 at 16:55
  • 3
    It's not encrypted, it's encoded. Commented Sep 23, 2020 at 17:07

4 Answers 4

12

to byte array

There are no "byte arrays" in shells. You can basically choose from two types of variables:

  • a string
  • an array of strings (bash extension, not available on all shells)

How you interpret the data is up to you; the shell can only store strings. Bash has an extension, arrays, that lets you create an array of strings. Note that strings are zero terminated, so it's impossible to store the zero byte in one.
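A small sketch of what that limitation looks like in practice, assuming bash and a POSIX wc (the variable names are made up for illustration): a NUL byte passes through a pipe untouched, but it is dropped as soon as the data is captured into a variable.

```bash
#!/bin/bash
# Three bytes go through a pipe: 'a', NUL, 'b'.
piped=$(( $(printf 'a\0b' | wc -c) ))

# Capturing the same data in a variable drops the NUL byte.
# Newer bash versions print a warning about it, which is silenced here.
{ captured=$(printf 'a\0b'); } 2>/dev/null

echo "bytes through the pipe: $piped"
echo "characters in the variable: ${#captured}"
```

The pipe carries three bytes, while the variable ends up holding only the two characters a and b.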

Question: doesn't base64 -d convert the string to a byte array?

The string dGVzdA== is the base64 encoding of the string test. So you can convert the string dGVzdA== to the string test:

$ echo "dGVzdA==" | base64 -d
test   # no newline is printed after "test", because the decoded data contains none
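This is also why the array in the question has size 1. A minimal sketch, assuming GNU-style base64 -d as used above: the unquoted command substitution is only split on whitespace, and the decoded text contains none.

```bash
#!/bin/bash
# Word splitting only breaks the decoded text on whitespace,
# so "test" ends up as a single array element.
my_array=( $(echo "dGVzdA==" | base64 -d) )

echo "elements: ${#my_array[@]}"
echo "characters in element 0: ${#my_array[0]}"
```

The array holds one element, and that element is the 4-character string test.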

Any thoughts ?

You could convert the string test to a string that contains the representation of each byte as a hexadecimal number, typically with xxd:

$ echo "dGVzdA==" | base64 -d | xxd -p
74657374

Now you could insert a newline after every two characters in the string and read the output as a newline-separated list of strings into an array:

$ readarray -t array < <(echo "dGVzdA==" | base64 -d | xxd -p -c1)
$ declare -p array
declare -a array=([0]="74" [1]="65" [2]="73" [3]="74")

Each element of the array is one byte in the input represented as a string that contains a number in base16. This may be as close as you can get to a "byte array".
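If xxd is not installed, roughly the same array can be built with od, and each element can then be turned into a decimal number with bash arithmetic. This is only a sketch: the tr/sed clean-up and the names hex and dec are choices made for this example, not part of the approach above.

```bash
#!/bin/bash
# Decode, dump every byte as two hex digits, and put one value per line.
readarray -t hex < <(
    printf '%s' 'dGVzdA==' | base64 -d |
        od -An -v -tx1 | tr -s ' ' '\n' | sed '/^$/d'
)

# Convert each base-16 string to a decimal number with bash arithmetic.
dec=()
for h in "${hex[@]}"; do
    dec+=( "$((16#$h))" )
done

echo "hex: ${hex[*]}"
echo "dec: ${dec[*]}"
```

The elements are still strings; $((16#...)) just reinterprets them as numbers when you need to compute with them.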


2 Comments

"Note that strings are zero terminated - it's impossible to store the zero byte." - you can store a null byte in a file: printf "\0" >somefile and you can send a null byte through a pipe: printf "a\0b" | xargs -0 -n1 echo
@milahu, yes, the context was clearly indicating that it's impossible to store the zero byte literally in a bash string.
2
#!/bin/bash

read -r -a my_array < <(echo 'dGVzdA==' | base64 -d | sed 's/./& /g')
echo "${#my_array[@]}"

Comments

2

Edit Apr 12, 2025

Added a pure bash version without a loop, requiring bash 5.2+

Introduction

Syntax:

 encoded='dGVzdA=='
 myVar=$(base64 -d <<<"$encoded")

This works fine, but if you plan to decode a lot of encoded strings, you can wrap it in a function:

base64Decoder() {
    local -n _result=${2:-B64Decoded}
    _result=$(base64 -d <<<"$1")
}

Calling it is then enough:

base64Decoder $encoded decoded
echo $decoded
test
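A short sketch of how the nameref in this function behaves, assuming bash 4.3+ for local -n (the variable name greeting and the second encoded string are made up for this example): when no second argument is given, the result lands in the default variable B64Decoded.

```bash
#!/bin/bash
base64Decoder() {
    # _result is an alias for the variable named in $2 (default: B64Decoded).
    local -n _result=${2:-B64Decoded}
    _result=$(base64 -d <<<"$1")
}

base64Decoder 'dGVzdA=='             # no name given: stored in B64Decoded
base64Decoder 'SGVsbG8=' greeting    # stored in the caller's "greeting"

echo "$B64Decoded / $greeting"
```

One caveat with this pattern: the caller should not pass a variable that is itself named _result, because the nameref would then refer to itself.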

But having to fork base64 -d for each operation can become expensive...

Pure bash BASE64 decoder function.

Since this question is titled

Bash script - decode encoded string to byte array

and the decoder may be used as a library function, there is a very efficient pure bash way. For small strings, it is a lot quicker than forking to base64!

#!/bin/bash
# Base64 decoder - pure bash function
# <C> 2024 F-Hauri.ch - [email protected]
# Licensed under terms of GPL v3. www.gnu.org
# shellcheck disable=SC2154  # not assigned 'myVar' ??
shopt -s extglob

declare -a B64=( {A..Z} {a..z} {0..9} + / '=' )
declare -Ai 'B64R=()'
for i in "${!B64[@]}"; do B64R["${B64[i]}"]=i%64; done
# shellcheck disable=SC2034 # Unused 'B64R' ??
declare -r B64R
unset B64
b64dec() {
    local _4B _Tail _hval _v _opt OPTIND
    local -i iFd _24b _ar
    while getopts "av:" _opt; do case $_opt in
        a) _ar=1;; v) _v=${OPTARG};; *) return 1;; esac; done
    shift $((OPTIND-1))
    if [[ $_v ]];then local -n _res=${_v}; else local _res; fi
    if [[ $1 ]]; then        exec {iFd}<<<"$1"      # Open Input FD from string
    else            exec {iFd}<&0   ; fi    # Open Input FD from STDIN
    _res=()
    while read -rn4 -u $iFd _4B; do
        if [[ "$_4B" ]]; then
            _Tail=$_4B
            _24b=" B64R['${_4B::1}'] << 18 | B64R['${_4B:1:1}'] << 12 |
                   B64R['${_4B:2:1}'] << 6 | B64R['${_4B:3:1}'] "
            printf -v _hval %02x\  $((_24b>>16)) $((_24b>>8&255)) $((_24b&255))
            read -ra _hval <<<"$_hval"
            _res+=("${_hval[@]}")
        fi
    done
    exec {iFd}<&-
    _Tail=${_Tail##*([^=])}
    while [[ $_Tail ]]; do
        unset "_res[-1]"
        _Tail=${_Tail:1}
    done
    ((_ar==0)) && printf -v _res %b "${_res[@]/#/\\x}" && _res=("${_res[0]}")
    [[ -z $_v ]] && echo "${_res[@]}"
}
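The arithmetic inside the loop can be followed on a single group of four characters. This is a simplified sketch and not the function above: it looks each character up by its position in a plain alphabet string (a hypothetical helper variable) instead of the B64R associative array, then regroups four 6-bit values into three bytes.

```bash
#!/bin/bash
# The 64 base64 symbols in index order; a character's position is its value.
alphabet=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
quad=dGVz     # first four characters of dGVzdA==
word=0

for ((i = 0; i < 4; i++)); do
    prefix=${alphabet%%"${quad:i:1}"*}    # everything before the character
    word=$(( word << 6 | ${#prefix} ))    # prefix length = 6-bit value
done

# Split the 24-bit word into three bytes, printed as hex.
printf -v hex '%02x %02x %02x' $((word >> 16)) $((word >> 8 & 255)) $((word & 255))
echo "$hex"
```

Here d, G, V and z map to 29, 6, 21 and 51, which combine to 0x746573, the bytes of "tes". The remaining group dA== supplies the final "t", and each trailing = drops one byte from the result.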

For handling binary data, I use an array of byte values, because that is the only way to handle null bytes (\x00) in bash. I use hexadecimal values so that conversion back to binary is easy with the printf %b "${varname[@]/#/\\x}" syntax.
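A minimal sketch of that round trip, with a hypothetical three-byte payload that contains a null byte (od and wc are only used here to inspect the output):

```bash
#!/bin/bash
# Hypothetical payload: 'a', a NUL byte, 'b', stored as hex strings.
bytes=(61 00 62)

# Prefix every element with \x and let printf %b emit the raw bytes.
count=$(( $(printf %b "${bytes[@]/#/\\x}" | wc -c) ))
dump=$(printf %b "${bytes[@]/#/\\x}" | od -An -v -tx1 | tr -d ' \n')

echo "$count bytes written: $dump"
```

The null byte only ever exists on the pipe, never inside a bash variable, which is why the array of hex strings can carry it safely.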

Usage: b64dec [-v varname] [-a] [string]
    -v VARNAME  Assign the result to $VARNAME instead of printing it on STDOUT
    -a          Return an array of hexadecimal values, for handling binary data
If no "string" is submitted as an argument, b64dec waits for input on STDIN.

Then:

echo -n '1 '; b64dec 'dGVzdA=='
1 test
echo -n '2 '; b64dec <<<'dGVzdA=='
2 test
b64dec -v myVar 'dGVzdA=='
echo "3 $myVar"
3 test
b64dec -av myVar 'dGVzdA=='
printf -v str %b "${myVar[@]/#/\\x}"
printf '4 %s <= [%s].\n' "$str" "${myVar[*]}"
4 test <= [74 65 73 74].
lstr='H4sIAAAAAAACA/NIzcnJVyjPL8pJU'
for str in ${lstr}{eQCAEHkqbINAAAA,VTkAgAYI85FDgAAAA==,VRU5AIAj62OVw8AAAA=}; do
    b64dec -v test -a <<<"$str"
    zcat < <(printf %b "${test[@]/#/\\x}")
done
Hello world!
Hello world!!
Hello world!!!
b64dec -av myVar < <( seq 1 100 | column | gzip | base64 )
gzip -d < <(printf %b "${myVar[@]/#/\\x}")
1      11      21      31      41      51      61      71      81      91
2      12      22      32      42      52      62      72      82      92
3      13      23      33      43      53      63      73      83      93
4      14      24      34      44      54      64      74      84      94
5      15      25      35      45      55      65      75      85      95
6      16      26      36      46      56      66      76      86      96
7      17      27      37      47      57      67      77      87      97
8      18      28      38      48      58      68      78      88      98
9      19      29      39      49      59      69      79      89      99
10     20      30      40      50      60      70      80      90      100

Here $myVar holds a lot of null bytes:

declare -p myVar
declare -a myVar=([0]="1f" [1]="8b" [2]="08" [3]="00" [4]="00" [5]="00" [6]="00"
 [7]="00" [8]="00" [9]="03" [10]="05" [11]="c1" [12]="49" [13]="01" [14]="00" [1
5]="30" [16]="08" [17]="04" [18]="b1" [19]="f7" [20]="54" [21]="0d" [22]="cb" [2
...
"67" [147]="c8" [148]="ec" [149]="7d" [150]="3d" [151]="d9" [152]="29" [153]="b9
" [154]="24" [155]="01" [156]="00" [157]="00")

Version 2025, using bash V5.2+

Without any loop!

# Base64 decoder - pure bash function - require bash V5.2+
# (C) 2021-2025 F-Hauri.ch - [email protected]
# Version: 0.1.2 -- Last update: Fri Apr 11 14:23:11 CEST 2025
# Licensed under terms of GPL v3. www.gnu.org
# shellcheck disable=SC2154  # not assigned 'myVar' ??
shopt -s extglob

declare -a B64=( {A..Z} {a..z} {0..9} + / '=' )
printf -v _b64_tstr '["\44{B64[%d]}"]=%%d%%%%64 ' {0..64}
# shellcheck disable=SC2059  # format is variable.
printf -v _b64_tstr "$_b64_tstr" {0..64}
declare -Ai "B64R=($_b64_tstr)"
unset B64 _b64_tstr
declare -r B64R
b64dec() {
    local _lines _Tail _resArry _tmpStr _v _opt OPTIND
    local -i iFd _24b _ar
    while getopts "av:" _opt; do case $_opt in
        a) _ar=1;; v) _v=${OPTARG};; *) return 1;; esac; done
    shift $((OPTIND-1))
    if [[ $_v ]];then local -n _res=${_v}; else local _res; fi
    if [[ $1 ]]; then    exec {iFd}<<<"$1"          # Open Input FD from string
    else                 exec {iFd}<&0        ; fi  # Open Input FD from STDIN
    mapfile -tu $iFd _lines
    read -ra _resArry <<<"${_lines[*]//?/& }"
    printf -v _tmpStr '"B64R[%s]<<18|B64R[%s]<<12|B64R[%s]<<6|B64R[%s]" ' \
           "${_resArry[@]}"
    local -ia "_tmpArry=($_tmpStr)"
    printf -v _tmpStr '%06X' "${_tmpArry[@]}"
    read -ra _res <<<"${_tmpStr//??/& }"
    exec {iFd}<&-
    _Tail=${_lines[-1]##*([^=])}
    _res=("${_res[@]::${#_res[@]}-${#_Tail}}")
    ((_ar==0)) && printf -v _res %b "${_res[@]/#/\\x}" && _res=("${_res[0]}")
    [[ -z $_v ]] && echo "${_res[@]}"
}

Same usage, same tests, but somewhat quicker! See my comparison at Decode a base64 string and encode it as hex using xxd

2 Comments

Full rewrite in 2024! The new function is 100% pure bash, with two nice options: -a to output an array of hexadecimal values and -v to store the result in a variable.
Added a new version using bash 5.2+, without a loop!
1

Edited (thank you Charles Duffy for calling my attention back to this from 2020!)

Beware the X/Y problem - depending on what you are doing, you may not need this split out into a "real" array. You can also access individual characters in a string variable positionally to do this.

$: v=$( base64 -d <<< "dGVzdA==" )
$: for pos in 2 1 3; do echo "${v:pos:1}"; done
s
e
t

If you absolutely do need a real array made up of the characters in the decoded string -

$: my_array=()
$: v=$( base64 -d <<< "dGVzdA==" )
$: for ((i=0; i<${#v}; i++)); do my_array+=( "${v:i:1}" ); done
$: for i in 2 1 3; do echo "${my_array[i]}"; done
s
e
t

Original:

$: v=$( echo "dGVzdA==" | base64 -d )
$: my_array=( $( for(( i=0; i<${#v}; i++ )); do echo "${v:i:1}"; done ) )
$: echo ${my_array[2]}
s

This isn't a one-liner, but aside from the base64 call I think it's all done in the parser.

1 Comment

Why eat the cost (and correctness impact) of word-splitting? my_array=(); for ((i=0; i<${#v}; i++)); do my_array+=( "${v:i:1}" ); done is faster (no subshells!) and has less room for surprises (no undefined behavior via echo)
