UNB/ CS/ David Bremner/ teaching/ cs2613/ books/ mdn/ Reference/ Global Objects/ encodeURI()

The encodeURI() function encodes a by replacing each instance of certain characters by one, two, three, or four escape sequences representing the encoding of the character (will only be four escape sequences for characters composed of two surrogate characters). Compared to encodeURIComponent(), this function encodes fewer characters, preserving those that are part of the URI syntax.

Syntax

encodeURI(uri)

Parameters

Return value

A new string representing the provided string encoded as a URI.

Exceptions

Description

encodeURI() is a function property of the global object.

The encodeURI() function escapes characters by UTF-8 code units, with each octet encoded in the format %XX, left-padded with 0 if necessary. Because lone surrogates in UTF-16 do not encode any valid Unicode character, they cause encodeURI() to throw a URIError.

encodeURI() escapes all characters except:

A–Z a–z 0–9 - _ . ! ~ * ' ( )

; / ? : @ & = + $ , #

The characters on the second line are characters that may be part of the URI syntax, and are only escaped by encodeURIComponent(). Both encodeURI() and encodeURIComponent() do not encode the characters -.!~*'(), known as "unreserved marks", which do not have a reserved purpose but are allowed in a URI "as is". (See RFC2396)

The encodeURI() function does not encode characters that have special meaning (reserved characters) for a URI. The following example shows all the parts that a URI can possibly contain. Note how certain characters are used to signify special meaning:

http://username:password@www.example.com:80/path/to/file.php?foo=316&bar=this+has+spaces#anchor

Examples

encodeURI() vs. encodeURIComponent()

encodeURI() differs from encodeURIComponent() as follows:

const set1 = ";/?:@&=+$,#"; // Reserved Characters
const set2 = "-.!~*'()"; // Unreserved Marks
const set3 = "ABC abc 123"; // Alphanumeric Characters + Space

console.log(encodeURI(set1)); // ;/?:@&=+$,#
console.log(encodeURI(set2)); // -.!~*'()
console.log(encodeURI(set3)); // ABC%20abc%20123 (the space gets encoded as %20)

console.log(encodeURIComponent(set1)); // %3B%2C%2F%3F%3A%40%26%3D%2B%24%23
console.log(encodeURIComponent(set2)); // -.!~*'()
console.log(encodeURIComponent(set3)); // ABC%20abc%20123 (the space gets encoded as %20)

Note that encodeURI() by itself cannot form proper HTTP and requests, such as for , because &, +, and = are not encoded, which are treated as special characters in GET and POST requests. encodeURIComponent(), however, does encode these characters.

Encoding a lone surrogate throws

A URIError will be thrown if one attempts to encode a surrogate which is not part of a high-low pair. For example:

// High-low pair OK
encodeURI("\uD800\uDFFF"); // "%F0%90%8F%BF"

// Lone high-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uD800");

// Lone low-surrogate code unit throws "URIError: malformed URI sequence"
encodeURI("\uDFFF");

You can use String.prototype.toWellFormed, which replaces lone surrogates with the Unicode replacement character (U+FFFD), to avoid this error. You can also use String.prototype.isWellFormed to check if a string contains lone surrogates before passing it to encodeURI().

Encoding for RFC3986

The more recent RFC3986 makes square brackets reserved (for ) and thus not encoded when forming something which could be part of a URL (such as a host). It also reserves !, ', (, ), and *, even though these characters have no formalized URI delimiting uses. The following function encodes a string for RFC3986-compliant URL format.

function encodeRFC3986URI(str) {
  return encodeURI(str)
    .replace(/%5B/g, "[")
    .replace(/%5D/g, "]")
    .replace(
      /[!'()*]/g,
      (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`,
    );
}

Specifications

Browser compatibility

See also