Is there any comprehensive list of characters that need to be escaped in Bash? Can it be checked just with sed
?
In particular, I was checking whether %
needs to be escaped or not. I tried
echo "h%h" | sed 's/%/i/g'
and worked fine, without escaping %
. Does it mean %
does not need to be escaped? Was this a good way to check the necessity?
And more general: are they the same characters to escape in shell
and bash
?
This question is related to
bash
shell
unix
escaping
special-characters
To save someone else from having to RTFM... in bash:
Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of
$
,`
,\
, and, when history expansion is enabled,!
.
...so if you escape those (and the quote itself, of course) you're probably okay.
If you take a more conservative 'when in doubt, escape it' approach, it should be possible to avoid getting instead characters with special meaning by not escaping identifier characters (i.e. ASCII letters, numbers, or '_'). It's very unlikely these would ever (i.e. in some weird POSIX-ish shell) have special meaning and thus need to be escaped.
I noticed that bash automatically escapes some characters when using auto-complete.
For example, if you have a directory named dir:A
, bash will auto-complete to dir\:A
Using this, I runned some experiments using characters of the ASCII table and derived the following lists:
Characters that bash escapes on auto-complete: (includes space)
!"$&'()*,:;<=>?@[\]^`{|}
Characters that bash does not escape:
#%+-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
(I excluded /
, as it cannot be used in directory names)
I presume that you're talking about bash strings. There are different types of strings which have a different set of requirements for escaping. eg. Single quotes strings are different from double quoted strings.
The best reference is the Quoting section of the bash manual.
It explains which characters needs escaping. Note that some characters may need escaping depending on which options are enabled such as history expansion.
${var@Q}
Under bash, you could store your variable content with Parameter Expansion's @
command for Parameter transformation:
${parameter@operator} Parameter transformation. The expansion is either a transforma- tion of the value of parameter or information about parameter itself, depending on the value of operator. Each operator is a single letter: Q The expansion is a string that is the value of parameter quoted in a format that can be reused as input. ... A The expansion is a string in the form of an assignment statement or declare command that, if evaluated, will recreate parameter with its attributes and value.
Sample:
$ var=$'Hello\nGood world.\n'
$ echo "$var"
Hello
Good world.
$ echo "${var@Q}"
$'Hello\nGood world.\n'
$ echo "${var@A}"
var=$'Hello\nGood world.\n'
There is a special printf
format directive (%q
) built for this kind of request:
printf [-v var] format [arguments]
%q causes printf to output the corresponding argument in a format that can be reused as shell input.
read foo
Hello world
printf "%q\n" "$foo"
Hello\ world
printf "%q\n" $'Hello world!\n'
$'Hello world!\n'
This could be used through variables too:
printf -v var "%q" "$foo
"
echo "$var"
$'Hello world\n'
Note that all bytes from 128 to 255 have to be escaped.
for i in {0..127} ;do
printf -v var \\%o $i
printf -v var $var
printf -v res "%q" "$var"
esc=E
[ "$var" = "$res" ] && esc=-
printf "%02X %s %-7s\n" $i $esc "$res"
done |
column
This must render something like:
00 E '' 1A E $'\032' 34 - 4 4E - N 68 - h
01 E $'\001' 1B E $'\E' 35 - 5 4F - O 69 - i
02 E $'\002' 1C E $'\034' 36 - 6 50 - P 6A - j
03 E $'\003' 1D E $'\035' 37 - 7 51 - Q 6B - k
04 E $'\004' 1E E $'\036' 38 - 8 52 - R 6C - l
05 E $'\005' 1F E $'\037' 39 - 9 53 - S 6D - m
06 E $'\006' 20 E \ 3A - : 54 - T 6E - n
07 E $'\a' 21 E \! 3B E \; 55 - U 6F - o
08 E $'\b' 22 E \" 3C E \< 56 - V 70 - p
09 E $'\t' 23 E \# 3D - = 57 - W 71 - q
0A E $'\n' 24 E \$ 3E E \> 58 - X 72 - r
0B E $'\v' 25 - % 3F E \? 59 - Y 73 - s
0C E $'\f' 26 E \& 40 - @ 5A - Z 74 - t
0D E $'\r' 27 E \' 41 - A 5B E \[ 75 - u
0E E $'\016' 28 E \( 42 - B 5C E \\ 76 - v
0F E $'\017' 29 E \) 43 - C 5D E \] 77 - w
10 E $'\020' 2A E \* 44 - D 5E E \^ 78 - x
11 E $'\021' 2B - + 45 - E 5F - _ 79 - y
12 E $'\022' 2C E \, 46 - F 60 E \` 7A - z
13 E $'\023' 2D - - 47 - G 61 - a 7B E \{
14 E $'\024' 2E - . 48 - H 62 - b 7C E \|
15 E $'\025' 2F - / 49 - I 63 - c 7D E \}
16 E $'\026' 30 - 0 4A - J 64 - d 7E E \~
17 E $'\027' 31 - 1 4B - K 65 - e 7F E $'\177'
18 E $'\030' 32 - 2 4C - L 66 - f
19 E $'\031' 33 - 3 4D - M 67 - g
Where first field is hexa value of byte, second contain E
if character need to be escaped and third field show escaped presentation of character.
,
?You could see some characters that don't always need to be escaped, like ,
, }
and {
.
So not always but sometime:
echo test 1, 2, 3 and 4,5.
test 1, 2, 3 and 4,5.
or
echo test { 1, 2, 3 }
test { 1, 2, 3 }
but care:
echo test{1,2,3}
test1 test2 test3
echo test\ {1,2,3}
test 1 test 2 test 3
echo test\ {\ 1,\ 2,\ 3\ }
test 1 test 2 test 3
echo test\ {\ 1\,\ 2,\ 3\ }
test 1, 2 test 3
Using the print '%q'
technique, we can run a loop to find out which characters are special:
#!/bin/bash
special=$'`!@#$%^&*()-_+={}|[]\\;\':",.<>?/ '
for ((i=0; i < ${#special}; i++)); do
char="${special:i:1}"
printf -v q_char '%q' "$char"
if [[ "$char" != "$q_char" ]]; then
printf 'Yes - character %s needs to be escaped\n' "$char"
else
printf 'No - character %s does not need to be escaped\n' "$char"
fi
done | sort
It gives this output:
No, character % does not need to be escaped
No, character + does not need to be escaped
No, character - does not need to be escaped
No, character . does not need to be escaped
No, character / does not need to be escaped
No, character : does not need to be escaped
No, character = does not need to be escaped
No, character @ does not need to be escaped
No, character _ does not need to be escaped
Yes, character needs to be escaped
Yes, character ! needs to be escaped
Yes, character " needs to be escaped
Yes, character # needs to be escaped
Yes, character $ needs to be escaped
Yes, character & needs to be escaped
Yes, character ' needs to be escaped
Yes, character ( needs to be escaped
Yes, character ) needs to be escaped
Yes, character * needs to be escaped
Yes, character , needs to be escaped
Yes, character ; needs to be escaped
Yes, character < needs to be escaped
Yes, character > needs to be escaped
Yes, character ? needs to be escaped
Yes, character [ needs to be escaped
Yes, character \ needs to be escaped
Yes, character ] needs to be escaped
Yes, character ^ needs to be escaped
Yes, character ` needs to be escaped
Yes, character { needs to be escaped
Yes, character | needs to be escaped
Yes, character } needs to be escaped
Some of the results, like ,
look a little suspicious. Would be interesting to get @CharlesDuffy's inputs on this.
Characters that need escaping are different in Bourne or POSIX shell than Bash. Generally (very) Bash is a superset of those shells, so anything you escape in shell
should be escaped in Bash.
A nice general rule would be "if in doubt, escape it". But escaping some characters gives them a special meaning, like \n
. These are listed in the man bash
pages under Quoting
and echo
.
Other than that, escape any character that is not alphanumeric, it is safer. I don't know of a single definitive list.
The man pages list them all somewhere, but not in one place. Learn the language, that is the way to be sure.
One that has caught me out is !
. This is a special character (history expansion) in Bash (and csh) but not in Korn shell. Even echo "Hello world!"
gives problems. Using single-quotes, as usual, removes the special meaning.
Source: Stackoverflow.com