The encodeURI() function encodes a URI by replacing each instance of certain characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character (four escape sequences occur only for characters composed of two surrogate code units). Compared to encodeURIComponent(), this function encodes fewer characters, preserving those that are part of the URI syntax.
Syntax
encodeURI(uri)
Parameters
uri
- : A string to be encoded as a URI.
Return value
A new string representing the provided string encoded as a URI.
Exceptions
- URIError
- : Thrown if uri contains a lone surrogate.
Description
encodeURI() is a function property of the global object.
The encodeURI() function escapes characters by UTF-8 code units, with each octet encoded in the format %XX, left-padded with 0 if necessary. Because lone surrogates in UTF-16 do not encode any valid Unicode character, they cause encodeURI() to throw a URIError.
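As a small illustration of the per-octet escaping described above, the following sketch encodes characters whose UTF-8 forms are one to four bytes long. The variable names are chosen for this example only.

```javascript
// Each UTF-8 octet of an escaped character becomes one %XX sequence.
const oneByte = encodeURI(" "); // U+0020, 1 octet
const padded = encodeURI("\n"); // U+000A, 1 octet, left-padded with 0
const twoBytes = encodeURI("\u00E9"); // U+00E9 (e with acute), 2 octets
const threeBytes = encodeURI("\u20AC"); // U+20AC (euro sign), 3 octets
const fourBytes = encodeURI("\u{1F600}"); // U+1F600, a surrogate pair in UTF-16, 4 octets

console.log(oneByte); // "%20"
console.log(padded); // "%0A"
console.log(twoBytes); // "%C3%A9"
console.log(threeBytes); // "%E2%82%AC"
console.log(fourBytes); // "%F0%9F%98%80"
```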
encodeURI() escapes all characters except:
A–Z a–z 0–9 - _ . ! ~ * ' ( )
; / ? : @ & = + $ , #
The characters on the second line are characters that may be part of the URI syntax, and are only escaped by encodeURIComponent(). Neither encodeURI() nor encodeURIComponent() encodes the characters -.!~*'(), known as "unreserved marks", which do not have a reserved purpose but are allowed in a URI "as is". (See RFC2396.)
The encodeURI() function does not encode characters that have special meaning (reserved characters) for a URI. The following example shows all the parts that a URI can possibly contain. Note how certain characters are used to signify special meaning:
http://username:password@www.example.com:80/path/to/file.php?foo=316&bar=this+has+spaces#anchor
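To make the point concrete, this sketch passes the URI above through encodeURI() and shows that it comes back unchanged, since every delimiter in it is reserved. The second URI, with spaces and accented letters, is a made-up example for illustration.

```javascript
// The sample URI contains only unreserved and reserved characters,
// so encodeURI() returns it unchanged.
const full =
  "http://username:password@www.example.com:80/path/to/file.php?foo=316&bar=this+has+spaces#anchor";
const fullEncoded = encodeURI(full);
console.log(fullEncoded === full); // true

// Hypothetical URI with spaces and non-ASCII letters: only those get escaped,
// while :, /, ?, = and # keep their structural meaning.
const messy = "https://www.example.com/my docs/r\u00E9sum\u00E9.pdf?lang=en#page 2";
const messyEncoded = encodeURI(messy);
console.log(messyEncoded);
// "https://www.example.com/my%20docs/r%C3%A9sum%C3%A9.pdf?lang=en#page%202"
```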
Examples
encodeURI() vs. encodeURIComponent()
encodeURI() differs from encodeURIComponent() as follows:
const set1 = ";/?:@&=+$,#"; // Reserved Characters
const set2 = "-.!~*'()"; // Unreserved Marks
const set3 = "ABC abc 123"; // Alphanumeric Characters + Space
console.log(encodeURI(set1)); // ;/?:@&=+$,#
console.log(encodeURI(set2)); // -.!~*'()
console.log(encodeURI(set3)); // ABC%20abc%20123 (the space gets encoded as %20)
console.log(encodeURIComponent(set1)); // %3B%2F%3F%3A%40%26%3D%2B%24%2C%23
console.log(encodeURIComponent(set2)); // -.!~*'()
console.log(encodeURIComponent(set3)); // ABC%20abc%20123 (the space gets encoded as %20)
Note that encodeURI() by itself cannot form proper HTTP GET and POST requests, such as for XMLHttpRequest, because &, +, and =, which are treated as special characters in GET and POST requests, are not encoded. encodeURIComponent(), however, does encode these characters.
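A minimal sketch of the difference when building a query string follows. The parameter name q, the host example.com, and the sample value are assumptions made for this example.

```javascript
// A value that contains characters with special meaning in a query string.
const value = "rock & roll = 1+1";

// encodeURI() leaves &, = and + alone, so the value would be misread
// as several parameters if placed in a query string.
const viaEncodeURI = encodeURI(value);
console.log(viaEncodeURI); // "rock%20&%20roll%20=%201+1"

// encodeURIComponent() escapes them, so the value survives as one parameter.
const viaComponent = encodeURIComponent(value);
console.log(viaComponent); // "rock%20%26%20roll%20%3D%201%2B1"

// Hypothetical endpoint: encode each value separately, then assemble the URL.
const url = `https://example.com/search?q=${viaComponent}`;
const roundTripped = new URL(url).searchParams.get("q");
console.log(roundTripped === value); // true
```

The usual pattern is to apply encodeURIComponent() to each individual name and value and to leave the delimiters between them unencoded.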
Encoding a lone surrogate throws
A URIError will be thrown if one attempts to encode a surrogate which is not part of a high-low pair. For example:
// High-low pair OK
encodeURI("\uD800\uDFFF"); // "%F0%90%8F%BF"
// Lone high-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uD800");
// Lone low-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uDFFF");
You can use String.prototype.toWellFormed(), which replaces lone surrogates with the Unicode replacement character (U+FFFD), to avoid this error. You can also use String.prototype.isWellFormed() to check whether a string contains lone surrogates before passing it to encodeURI().
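One way this could look in practice is sketched below. The helper name safeEncodeURI is hypothetical, and the regular-expression fallback is an assumption added for runtimes that do not yet provide toWellFormed().

```javascript
// Hypothetical helper: encode a string without risking a URIError.
function safeEncodeURI(str) {
  if (typeof str.toWellFormed === "function") {
    // Lone surrogates are replaced with U+FFFD before encoding.
    return encodeURI(str.toWellFormed());
  }
  // Assumed fallback for older runtimes: replace a high surrogate not followed
  // by a low one, or a low surrogate not preceded by a high one.
  const loneSurrogate =
    /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;
  return encodeURI(str.replace(loneSurrogate, "\uFFFD"));
}

const repaired = safeEncodeURI("ab\uD800"); // lone high surrogate
const intact = safeEncodeURI("\uD800\uDFFF"); // valid pair is left as is

console.log(repaired); // "ab%EF%BF%BD"
console.log(intact); // "%F0%90%8F%BF"
```

If silently substituting U+FFFD is not acceptable, checking isWellFormed() first and rejecting the input is an alternative.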
Encoding for RFC3986
The more recent RFC3986 makes square brackets reserved (for IPv6) and thus not encoded when forming something which could be part of a URL (such as a host). It also reserves !, ', (, ), and *, even though these characters have no formalized URI delimiting uses. The following function encodes a string for RFC3986-compliant URL format.
function encodeRFC3986URI(str) {
  return encodeURI(str)
    .replace(/%5B/g, "[")
    .replace(/%5D/g, "]")
    .replace(
      /[!'()*]/g,
      (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
    );
}
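As a usage sketch, the function above is repeated here so the snippet runs on its own, then applied to a made-up URL with an IPv6 host. The input string is an assumption chosen to exercise both the bracket and the !'()* handling.

```javascript
// Copied from above so this snippet is self-contained.
function encodeRFC3986URI(str) {
  return encodeURI(str)
    .replace(/%5B/g, "[")
    .replace(/%5D/g, "]")
    .replace(
      /[!'()*]/g,
      (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
    );
}

// Hypothetical input: IPv6 literal host plus characters reserved by RFC3986.
const input = "http://[::1]:8080/it's (a) test!*";

// Plain encodeURI() escapes the brackets and leaves !'()* untouched.
const plain = encodeURI(input);
console.log(plain); // "http://%5B::1%5D:8080/it's%20(a)%20test!*"

// The RFC3986 variant restores the brackets and escapes !'()*.
const strict = encodeRFC3986URI(input);
console.log(strict); // "http://[::1]:8080/it%27s%20%28a%29%20test%21%2A"
```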