When encoding a query string to be sent to a web server, when do you use escape() and when do you use encodeURI() or encodeURIComponent()?
Use escape:
escape("% +&=");
OR
use encodeURI() / encodeURIComponent()
encodeURI("http://www.google.com?var1=value1&var2=value2");
encodeURIComponent("var1=value1&var2=value2");
encodeURI() - the escape() function is for javascript escaping, not HTTP.
Also remember that they all encode different sets of characters, and select the one you need appropriately. encodeURI() encodes fewer characters than encodeURIComponent(), which encodes fewer (and also different, to dannyp's point) characters than escape().
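To see the different character sets side by side, here is a minimal sketch that runs all three functions on the same string. The sample value is made up for illustration; any string containing reserved characters would do.

```javascript
// Hypothetical sample value containing a space and several reserved characters.
var sample = "a b&c=d/e?f@g+h";

var results = {
  escape: escape(sample),                         // leaves / @ + untouched
  encodeURI: encodeURI(sample),                   // only the space is encoded here
  encodeURIComponent: encodeURIComponent(sample)  // encodes every reserved character
};

console.log(results.escape);             // a%20b%26c%3Dd/e%3Ff@g+h
console.log(results.encodeURI);          // a%20b&c=d/e?f@g+h
console.log(results.encodeURIComponent); // a%20b%26c%3Dd%2Fe%3Ff%40g%2Bh
```

Only the encodeURIComponent() result is safe to drop into a query string as a single value, because it is the only one in which &, =, ? and + can no longer be mistaken for separators.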
The accepted answer is good. To extend on the last part:
Note that encodeURIComponent does not escape the ' character. A common bug is to use it to create html attributes such as href='MyUrl', which could suffer an injection bug. If you are constructing html from strings, either use " instead of ' for attribute quotes, or add an extra layer of encoding (' can be encoded as %27).
If you want to be on the safe side, the characters that encodeURIComponent leaves unescaped (! ' ( ) *) should be percent-encoded as well.
You can use this method to escape them (source: Mozilla):
function fixedEncodeURIComponent(str) {
    return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
        return '%' + c.charCodeAt(0).toString(16);
    });
}
// fixedEncodeURIComponent("'") --> "%27"
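To make the attribute problem concrete, here is a small sketch of what happens when a value containing a single quote is placed into a single-quoted href. The input string and the /search path are invented for the example, and the helper is repeated only so the snippet runs on its own.

```javascript
// Same helper as in the answer, repeated so this snippet is self-contained.
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

// Hypothetical user input and URL path, for illustration only.
var userInput = "it's";

// The raw quote survives encodeURIComponent and ends the attribute value early.
var unsafeHtml = "<a href='/search?q=" + encodeURIComponent(userInput) + "'>link</a>";

// With the extra encoding step the quote becomes %27 and the attribute stays intact.
var saferHtml = "<a href='/search?q=" + fixedEncodeURIComponent(userInput) + "'>link</a>";

console.log(unsafeHtml); // <a href='/search?q=it's'>link</a>
console.log(saferHtml);  // <a href='/search?q=it%27s'>link</a>
```

In the first string the browser would treat the quote inside "it's" as the end of the href value, which is the injection risk described above.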
Small comparison table Java vs. JavaScript vs. PHP.
1. Java URLEncoder.encode (using UTF8 charset)
2. JavaScript encodeURIComponent
3. JavaScript escape
4. PHP urlencode
5. PHP rawurlencode
char  Java   JavaScript    PHP
      (1)    (2)    (3)    (4)    (5)
[ ]   +      %20    %20    +      %20
[!]   %21    !      %21    %21    %21
[*]   *      *      *      %2A    %2A
[']   %27    '      %27    %27    %27
[(]   %28    (      %28    %28    %28
[)]   %29    )      %29    %29    %29
[;]   %3B    %3B    %3B    %3B    %3B
[:]   %3A    %3A    %3A    %3A    %3A
[@]   %40    %40    @      %40    %40
[&]   %26    %26    %26    %26    %26
[=]   %3D    %3D    %3D    %3D    %3D
[+]   %2B    %2B    +      %2B    %2B
[$]   %24    %24    %24    %24    %24
[,]   %2C    %2C    %2C    %2C    %2C
[/]   %2F    %2F    /      %2F    %2F
[?]   %3F    %3F    %3F    %3F    %3F
[#]   %23    %23    %23    %23    %23
[[]   %5B    %5B    %5B    %5B    %5B
[]]   %5D    %5D    %5D    %5D    %5D
----------------------------------------
[~]   %7E    ~      %7E    %7E    ~
[-]   -      -      -      -      -
[_]   _      _      _      _      _
[%]   %25    %25    %25    %25    %25
[\]   %5C    %5C    %5C    %5C    %5C
----------------------------------------
char  Java     JavaScript        PHP
      (1)      (2)      (3)      (4)      (5)
[ä]   %C3%A4   %C3%A4   %E4      %C3%A4   %C3%A4
[ф]   %D1%84   %D1%84   %u0444   %D1%84   %D1%84
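I have this function...
var escapeURIparam = function(url) {
    // Prefer encodeURIComponent and fall back to the older functions.
    // typeof avoids a ReferenceError when a global is not defined.
    if (typeof encodeURIComponent === 'function') url = encodeURIComponent(url);
    else if (typeof encodeURI === 'function') url = encodeURI(url);
    else url = escape(url);
    // encodeURI and escape leave "+" alone, so force it to %2B.
    url = url.replace(/\+/g, '%2B');
    return url;
};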
JavaScript provides three built-in functions for encoding:
escape()
- does not encode @ * _ + - . /
This function has been deprecated since ECMAScript 3, so it should be avoided.
encodeURI()
- does not encode ~!@#$&*()=:/,;?+'-_.
It assumes that its input is a complete URI, so it does not encode reserved characters that have special meaning in a URI.
Use it when the intent is to encode a complete URL rather than one segment of a URL.
Example - encodeURI('http://stackoverflow.com');
will give - http://stackoverflow.com
encodeURIComponent()
- does not encode - _ . ! ~ * ' ( )
This function encodes a Uniform Resource Identifier (URI) component by replacing each instance of certain characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character. Use it to encode a single component of a URL, for instance when some user input needs to be appended to a URL.
Example - encodeURIComponent('http://stackoverflow.com');
will give - http%3A%2F%2Fstackoverflow.com
encodeURI() and encodeURIComponent() both encode using UTF-8, i.e. each character is converted to its UTF-8 bytes before being percent-encoded; escape() does not.
encodeURIComponent() differs from encodeURI() in that it also encodes the reserved characters and the number sign (#), which encodeURI() leaves untouched.
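The practical consequence shows up when user input is appended as a query parameter. A minimal sketch, assuming a made-up base URL and parameter value:

```javascript
// Hypothetical base URL and user-supplied value, for illustration only.
var base = "http://example.com/search";
var value = "rock & roll?";

// Encode only the component that is being appended.
var good = base + "?q=" + encodeURIComponent(value);

// Encoding the whole URL afterwards leaves & and ? in the value untouched,
// so the server would see a second, empty-named parameter.
var bad = encodeURI(base + "?q=" + value);

console.log(good); // http://example.com/search?q=rock%20%26%20roll%3F
console.log(bad);  // http://example.com/search?q=rock%20&%20roll?
```

In the second result the & splits the value in two, which is why encodeURI() is unsuitable for individual parameter values.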
I've found that experimenting with the various methods is a good sanity check even after having a good handle of what their various uses and capabilities are.
Towards that end I have found this website extremely useful to confirm my suspicions that I am doing something appropriately. It has also proven useful for decoding an encodeURIComponent'ed string which can be rather challenging to interpret. A great bookmark to have:
I recommend not using any of those methods as is. Write your own function that does the right thing.
MDN gives a good example of URL encoding, shown below.
var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" + encodeRFC5987ValueChars(fileName);
console.log(header);
// logs "Content-Disposition: attachment; filename*=UTF-8''my%20file%282%29.txt"
function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
        // The following are not required for percent-encoding per RFC5987,
        // so we can allow for a little better readability over the wire: |`^
        replace(/%(?:7C|60|5E)/g, unescape);
}
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
encodeURIComponent doesn't encode -_.!~*'(), which caused a problem when posting data to PHP in an XML string.
For example:
<xml><text x="100" y="150" value="It's a value with single quote" />
</xml>
General escape with encodeURI:
%3Cxml%3E%3Ctext%20x=%22100%22%20y=%22150%22%20value=%22It's%20a%20value%20with%20single%20quote%22%20/%3E%20%3C/xml%3E
You can see that the single quote is not encoded. To resolve the issue, I created two functions in my project. For encoding the URL:
function encodeData(s:String):String{
    // Percent-encode the characters that encodeURIComponent leaves untouched.
    return encodeURIComponent(s)
        .replace(/\-/g, "%2D").replace(/\_/g, "%5F").replace(/\./g, "%2E")
        .replace(/\!/g, "%21").replace(/\~/g, "%7E").replace(/\*/g, "%2A")
        .replace(/\'/g, "%27").replace(/\(/g, "%28").replace(/\)/g, "%29");
}
For Decoding URL:
function decodeData(s:String):String{
    try{
        // decodeURIComponent already decodes %2D, %5F, %2E, %21, %7E, %2A, %27, %28 and %29.
        // Replacing them by hand first would corrupt an encoded literal "%" (e.g. "%252D").
        return decodeURIComponent(s);
    }catch (e:Error) {
        // Malformed percent sequence: fall through and return "".
    }
    return "";
}
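As a quick check of the round trip, here is a plain JavaScript sketch (without the type annotations used above). The single-regex encodeData below is a compact variant written for this example, not the author's exact code, and the sample string is invented. It suggests that decodeURIComponent() alone is enough on the decoding side.

```javascript
// Compact, hypothetical variant of encodeData: one regex instead of nine replace calls.
function encodeData(s) {
  return encodeURIComponent(s).replace(/[-_.!~*'()]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Made-up sample containing every character that encodeURIComponent skips.
var original = "It's a (test)_value-1.0!~*";
var encoded = encodeData(original);
var decoded = decodeURIComponent(encoded);

console.log(encoded);              // It%27s%20a%20%28test%29%5Fvalue%2D1%2E0%21%7E%2A
console.log(decoded === original); // true
```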
Just try encodeURI() and encodeURIComponent() yourself...
console.log(encodeURIComponent('@#$%^&*'));
Input: @#$%^&*. Output: %40%23%24%25%5E%26*. So, wait, what happened to *? Why wasn't this converted? It could definitely cause problems if you tried to do linux command "$string". TLDR: You actually want fixedEncodeURIComponent() and fixedEncodeURI(). Long-story...
When to use encodeURI()? Never. encodeURI() fails to adhere to RFC3986 with regard to bracket-encoding. Use fixedEncodeURI(), as defined and further explained at the MDN encodeURI() Documentation...
function fixedEncodeURI(str) { return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']'); }
When to use encodeURIComponent()? Never. encodeURIComponent() fails to adhere to RFC3986 with regard to encoding: !'()*. Use fixedEncodeURIComponent(), as defined and further explained at the MDN encodeURIComponent() Documentation...
function fixedEncodeURIComponent(str) { return encodeURIComponent(str).replace(/[!'()*]/g, function(c) { return '%' + c.charCodeAt(0).toString(16); }); }
Then you can use fixedEncodeURI() to encode a complete URL, whereas fixedEncodeURIComponent() will encode URL pieces and connectors alike, which is what you want for a single piece such as a parameter value; or, simply, fixedEncodeURI() will not encode +@?=:#;,$& (as & and + are common URL operators), but fixedEncodeURIComponent() will.
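A short sketch of the two fixed helpers in action. The IPv6 URL and the sample strings are assumptions chosen to exercise the bracket and !'()* cases; the helpers are repeated so the snippet runs on its own.

```javascript
// Helpers repeated from the answer so the snippet is self-contained.
function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

// Hypothetical URL with an IPv6 host, where brackets must stay literal.
var url = "http://[::1]:8080/a b";

var plainUri = encodeURI(url);      // brackets are percent-encoded
var fixedUri = fixedEncodeURI(url); // brackets are restored

var plainPart = encodeURIComponent("a*b(c)");      // *, ( and ) pass through
var fixedPart = fixedEncodeURIComponent("a*b(c)"); // all three are encoded

console.log(plainUri);  // http://%5B::1%5D:8080/a%20b
console.log(fixedUri);  // http://[::1]:8080/a%20b
console.log(plainPart); // a*b(c)
console.log(fixedPart); // a%2ab%28c%29
```

Note that toString(16) yields lowercase hex digits (%2a); percent-encoded octets are case-insensitive, so this decodes the same as %2A.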
Inspired by Johann's table, I've decided to extend the table. I wanted to see which ASCII characters get encoded.
var ascii = " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";

var encoded = [];

ascii.split("").forEach(function (char) {
    var obj = { char };
    if (char != encodeURI(char))
        obj.encodeURI = encodeURI(char);
    if (char != encodeURIComponent(char))
        obj.encodeURIComponent = encodeURIComponent(char);
    if (obj.encodeURI || obj.encodeURIComponent)
        encoded.push(obj);
});

console.table(encoded);
Table shows only the encoded characters. Empty cells mean that the original and the encoded characters are the same.
Just to be extra, I'm adding another table for urlencode() vs rawurlencode(). The main difference is the encoding of the space character (+ versus %20); urlencode() also encodes ~, which rawurlencode() leaves alone.
<script>
<?php
$ascii = str_split(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~", 1);
$encoded = [];
foreach ($ascii as $char) {
    $obj = ["char" => $char];
    if ($char !== urlencode($char))
        $obj["urlencode"] = urlencode($char);
    if ($char !== rawurlencode($char))
        $obj["rawurlencode"] = rawurlencode($char);
    // Keep the row if either function changed the character.
    if (isset($obj["urlencode"]) || isset($obj["rawurlencode"]))
        $encoded[] = $obj;
}
echo "var encoded = " . json_encode($encoded) . ";";
?>
console.table(encoded);
</script>
The difference between encodeURI() and encodeURIComponent() is exactly 11 characters, which are encoded by encodeURIComponent() but not by encodeURI(): ; / ? : @ & = + $ , #
I generated the table of these characters with console.table in Google Chrome using this code:
var arr = [];
for (var i = 0; i < 256; i++) {
    var char = String.fromCharCode(i);
    if (encodeURI(char) !== encodeURIComponent(char)) {
        arr.push({
            character: char,
            encodeURI: encodeURI(char),
            encodeURIComponent: encodeURIComponent(char)
        });
    }
}
console.table(arr);
I found this article enlightening: Javascript Madness: Query String Parsing
I found it when I was trying to understand why decodeURIComponent was not decoding '+' correctly. Here is an extract:
String: "A + B"
Expected Query String Encoding: "A+%2B+B"
escape("A + B") = "A%20+%20B" Wrong!
encodeURI("A + B") = "A%20+%20B" Wrong!
encodeURIComponent("A + B") = "A%20%2B%20B" Acceptable, but strange
Encoded String: "A+%2B+B"
Expected Decoding: "A + B"
unescape("A+%2B+B") = "A+++B" Wrong!
decodeURI("A+%2B+B") = "A+++B" Wrong!
decodeURIComponent("A+%2B+B") = "A+++B" Wrong!
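One way to get the expected results from the extract is to handle the "+"-means-space convention of form-encoded query strings by hand. This is only a sketch; the two helper names are made up and are not part of any standard API.

```javascript
// Hypothetical helper names, introduced for this example only.
function encodeQueryComponent(s) {
  // encodeURIComponent emits %20 for a space; form encoding uses "+" instead.
  // A literal plus has already become %2B, so the swap is unambiguous.
  return encodeURIComponent(s).replace(/%20/g, '+');
}

function decodeQueryComponent(s) {
  // Turn "+" back into a space first; a literal plus arrives as %2B
  // and is restored by decodeURIComponent afterwards.
  return decodeURIComponent(s.replace(/\+/g, ' '));
}

console.log(encodeQueryComponent("A + B"));   // A+%2B+B
console.log(decodeQueryComponent("A+%2B+B")); // A + B
```

The order matters on the decoding side: replacing "+" after calling decodeURIComponent() would also turn the decoded literal plus into a space.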
Modern rewrite of @johann-echavarria's answer:
console.log(
    Array(256)
        .fill()
        .map((ignore, i) => String.fromCharCode(i))
        // keep only the characters the two functions encode differently
        .filter((char) => encodeURI(char) !== encodeURIComponent(char))
        // build one row per remaining character
        .map((char) => ({
            character: char,
            encodeURI: encodeURI(char),
            encodeURIComponent: encodeURIComponent(char)
        }))
)
Or if you can use a table, replace console.log with console.table (for the prettier output).
Source: Stackoverflow.com