RFC 3986 specifies the range of characters that are acceptable for use in URLs. These characters come from the US-ASCII set; any character outside this range must first be converted to valid US-ASCII characters for compatibility with the major browsers in use around the world. URL encoding, also called percent-encoding, is the process through which special characters are transformed into a format that is accepted by the transfer protocols of the Internet.
There are two major steps in the process of URL encoding. In the first step, the URL information is converted to bytes using the UTF-8 character encoding. In the second step, every byte that does not correspond to a character in the unreserved set is percent-encoded: the byte is written as a two-digit hexadecimal value preceded by a percent sign (%). A space, for example, becomes %20. Note that some characters within the US-ASCII range are nevertheless reserved, because certain URL schemes use them as delimiters.
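The two steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names UNRESERVED and percent_encode are hypothetical, and the sketch encodes every character outside the unreserved set, including reserved delimiters, which is what you would want for a single URL component but not for a complete URL.

```python
# Unreserved characters listed in RFC 3986: letters, digits, "-", "_", ".", "~".
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every byte of `text` that is not an unreserved character."""
    pieces = []
    # Step 1: convert the text to its UTF-8 byte sequence.
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            # Unreserved characters are copied through unchanged.
            pieces.append(char)
        else:
            # Step 2: write the byte as "%" plus two uppercase hex digits.
            pieces.append(f"%{byte:02X}")
    return "".join(pieces)


print(percent_encode("a b"))      # a%20b
print(percent_encode("café/1"))   # caf%C3%A9%2F1
```

In the second call, the character é occupies two bytes in UTF-8 (C3 and A9), so it produces two percent-encoded triplets, while the reserved slash becomes %2F.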
Reserved Characters
Character | Encoding
! | %21
# | %23
$ | %24
& | %26
' | %27
( | %28
) | %29
* | %2A
+ | %2B
, | %2C
/ | %2F
: | %3A
; | %3B
= | %3D
? | %3F
@ | %40
[ | %5B
] | %5D
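The reserved characters and their encodings can be checked with Python's standard urllib.parse module. The sketch below is only an illustration; the variable names RESERVED, encoded, value and url, and the example.com address, are assumptions made for the example. Passing safe="" to quote() matters here, because by default quote() leaves the slash unescaped.

```python
from urllib.parse import quote, unquote

# The reserved characters listed above.
RESERVED = "!#$&'()*+,/:;=?@[]"

# safe="" asks quote() to escape "/" as well, which it would otherwise keep.
encoded = {char: quote(char, safe="") for char in RESERVED}

for char, code in encoded.items():
    print(char, "|", code)

# A reserved character inside a data value must be encoded so that it is
# not mistaken for a delimiter, such as the "&" that separates parameters.
value = "Tom & Jerry"
url = "https://example.com/search?q=" + quote(value, safe="")
print(url)  # https://example.com/search?q=Tom%20%26%20Jerry

# Decoding restores the original value.
print(unquote("Tom%20%26%20Jerry"))  # Tom & Jerry
```

Without the encoding step, the ampersand in the value would be read as the start of a second query parameter.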
Unreserved Characters
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- | _ | . | ~