[url] URL encoding the space character: + or %20?

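The confusion exists because URLs are still 'broken' to this day: the same character has to be encoded differently depending on which part of the URL it appears in.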

Take "http://www.google.com" for instance. This is a URL. A URL is a Uniform Resource Locator and is really a pointer to a web page (in most cases). URLs have had a very well-defined structure since the first specification in 1994.

We can extract detailed information about the "http://www.google.com" URL:

+---------------+-------------------+
|      Part     |      Data         |
+---------------+-------------------+
|  Scheme       | http              |
|  Host         | www.google.com    |
+---------------+-------------------+

If we look at a more complex URL such as:

"https://bob:bobby@www.lunatech.com:8080/file;p=1?q=2#third"

we can extract the following information:

+-------------------+---------------------+
|        Part       |       Data          |
+-------------------+---------------------+
|  Scheme           | https               |
|  User             | bob                 |
|  Password         | bobby               |
|  Host             | www.lunatech.com    |
|  Port             | 8080                |
|  Path             | /file;p=1           |
|  Path parameter   | p=1                 |
|  Query            | q=2                 |
|  Fragment         | third               |
+-------------------+---------------------+

https://bob:bobby@www.lunatech.com:8080/file;p=1?q=2#third
\___/   \_/ \___/ \______________/ \__/\_______/ \_/ \___/
  |      |    |          |          |      | \_/  |    |
Scheme User Password    Host       Port  Path |   | Fragment
        \_____________________________/       | Query
                       |               Path parameter
                   Authority
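
As a rough illustration of this breakdown, here is a minimal sketch using Python's standard urllib.parse module. The URL is the one from the table (user bob, password bobby, host www.lunatech.com); the variable names URL and parts are only illustrative. Note that urlparse reports the path parameter separately, so its path is "/file" rather than "/file;p=1".

```python
from urllib.parse import urlparse

# The example URL from the table above.
URL = "https://bob:bobby@www.lunatech.com:8080/file;p=1?q=2#third"

parts = urlparse(URL)

print(parts.scheme)    # https
print(parts.username)  # bob
print(parts.password)  # bobby
print(parts.hostname)  # www.lunatech.com
print(parts.port)      # 8080 (returned as an int)
print(parts.netloc)    # bob:bobby@www.lunatech.com:8080 (the authority)
print(parts.path)      # /file
print(parts.params)    # p=1 (path parameter of the last segment)
print(parts.query)     # q=2
print(parts.fragment)  # third
```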

The reserved characters are different for each part.

For HTTP URLs, a space in a path segment has to be encoded as "%20" (not, absolutely not "+"), while the "+" character in a path segment can be left unencoded.

In the query part, however, spaces may be encoded as either "+" or "%20". The "+" form exists for backwards compatibility with HTML form encoding (application/x-www-form-urlencoded), so do not try to search for it in the URI standard. As a result of this ambiguity, a literal "+" character in the query has to be escaped as "%2B".

This means that the "blue+light blue" string has to be encoded differently in the path and query parts:

"http://example.com/blue+light%20blue?blue%2Blight+blue".
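
One way to reproduce this URL, sketched with Python's standard urllib.parse module: quote handles the path segment (with "+" explicitly marked as safe, which is a choice made here, not the default) and quote_plus handles the query. The variable names are illustrative.

```python
from urllib.parse import quote, quote_plus

text = "blue+light blue"

# Path segment: space becomes %20, "+" is left alone because we mark it safe.
path_segment = quote(text, safe="+")

# Query part: space becomes "+", so a literal "+" must become %2B.
query_part = quote_plus(text)

url = "http://example.com/" + path_segment + "?" + query_part
print(url)  # http://example.com/blue+light%20blue?blue%2Blight+blue
```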

From there you can deduce that encoding a fully constructed URL is impossible without a syntactical awareness of the URL structure.
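
A small sketch of what that means in practice, again in Python: encode each part on its own, then assemble. The helper build_url and its parameters are hypothetical names chosen for this example, not part of any library. The last lines show what happens when an already assembled URL is passed through a generic encoder: the delimiters themselves get escaped.

```python
from urllib.parse import quote, urlencode, urlunsplit

def build_url(scheme, host, segments, params):
    """Hypothetical helper: encode each part with its own rules, then join."""
    # Path segments: spaces become %20, "+" stays as-is.
    path = "/" + "/".join(quote(s, safe="+") for s in segments)
    # Query: urlencode uses form-style encoding (space -> "+", "+" -> %2B).
    query = urlencode(params)
    return urlunsplit((scheme, host, path, query, ""))

good = build_url("http", "example.com",
                 ["blue+light blue"], {"q": "blue+light blue"})
print(good)   # http://example.com/blue+light%20blue?q=blue%2Blight+blue

# Encoding the whole string at once cannot tell delimiters from data.
naive = quote("http://example.com/a b?q=a b")
print(naive)  # http%3A//example.com/a%20b%3Fq%3Da%20b
```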

This boils down to:

You should have %20 before the ? and + after.
