The codePointAt() method of String values returns a non-negative integer that is the Unicode code point value of the character starting at the given index. Note that the index is still based on UTF-16 code units, not Unicode code points.
Syntax
codePointAt(index)
Parameters
index
- : Zero-based index of the character to be returned. Converted to an integer; undefined is converted to 0.
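The integer conversion described for index can be illustrated with a short sketch. The variable names here are illustrative only and are not part of the API.

```javascript
// Sketch of how the index argument is converted before lookup.
const s = "ABC";

const noArg = s.codePointAt(); // undefined is converted to 0
const fractional = s.codePointAt(1.7); // fractional part is dropped, giving 1
const numericString = s.codePointAt("2"); // the string "2" is converted to 2
const notANumber = s.codePointAt(NaN); // NaN is converted to 0
const negative = s.codePointAt(-1); // -1 is out of range

console.log(noArg, fractional, numericString, notANumber, negative);
// 65 66 67 65 undefined
```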
Return value
A non-negative integer representing the code point value of the character at the given index.
- If index is out of the range 0 to str.length - 1, codePointAt() returns undefined.
- If the element at index is a UTF-16 leading surrogate that is followed by a trailing surrogate, returns the code point of the surrogate pair.
- If the element at index is a UTF-16 trailing surrogate, or a leading surrogate that is not followed by a trailing surrogate, returns only that surrogate code unit.
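As a quick illustration of these cases, the sketch below checks each one against a single emoji stored as a surrogate pair. The last line covers a leading surrogate that is not followed by a trailing surrogate, in which case only the lone code unit comes back. Variable names are illustrative.

```javascript
// U+1F60D stored as a surrogate pair, so the string length is 2.
const emoji = "\ud83d\ude0d";

const outOfRange = emoji.codePointAt(2); // index 2 is past the end
const fromLeading = emoji.codePointAt(0); // leading surrogate: whole pair
const fromTrailing = emoji.codePointAt(1); // trailing surrogate: code unit only
const loneLeading = "\ud83dA".codePointAt(0); // unpaired leading surrogate

console.log(outOfRange); // undefined
console.log(fromLeading.toString(16)); // 1f60d
console.log(fromTrailing.toString(16)); // de0d
console.log(loneLeading.toString(16)); // d83d
```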
Description
Characters in a string are indexed from left to right. The index of the first character is 0, and the index of the last character in a string called str is str.length - 1.
Unicode code points range from 0 to 1114111 (0x10FFFF). In UTF-16, each string index is a code unit with value 0 – 65535. Higher code points are represented by a pair of 16-bit surrogate pseudo-characters. Therefore, codePointAt() returns a code point that may span two string indices. For information on Unicode, see UTF-16 characters, Unicode code points, and grapheme clusters.
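To make the surrogate-pair relationship concrete, the following sketch recombines two code units by hand using the standard UTF-16 decoding arithmetic and compares the result with codePointAt(). The helper name combineSurrogates is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: decode a UTF-16 surrogate pair into one code point.
// Assumes lead is in 0xD800-0xDBFF and trail is in 0xDC00-0xDFFF.
function combineSurrogates(lead, trail) {
  return (lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000;
}

const pair = "\ud83d\ude0d"; // one code point stored in two string indices
const combined = combineSurrogates(pair.charCodeAt(0), pair.charCodeAt(1));

console.log(pair.length); // 2
console.log(combined); // 128525
console.log(combined === pair.codePointAt(0)); // true
```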
Examples
Using codePointAt()
"ABC".codePointAt(0); // 65
"ABC".codePointAt(0).toString(16); // 41
"😍".codePointAt(0); // 128525
"\ud83d\ude0d".codePointAt(0); // 128525
"\ud83d\ude0d".codePointAt(0).toString(16); // 1f60d
"😍".codePointAt(1); // 56845
"\ud83d\ude0d".codePointAt(1); // 56845
"\ud83d\ude0d".codePointAt(1).toString(16); // de0d
"ABC".codePointAt(42); // undefined
Looping with codePointAt()
Because using string indices for looping causes the same code point to be visited twice (once for the leading surrogate, once for the trailing surrogate), and the second time codePointAt() returns only the trailing surrogate, it's better to avoid looping by index.
const str = "\ud83d\udc0e\ud83d\udc71\u2764";
for (let i = 0; i < str.length; i++) {
  console.log(str.codePointAt(i).toString(16));
}
// '1f40e', 'dc0e', '1f471', 'dc71', '2764'
Instead, use a for...of statement or spread the string, both of which invoke the string's @@iterator, which iterates by code points. Then, use codePointAt(0) to get the code point of each element.
for (const codePoint of str) {
  console.log(codePoint.codePointAt(0).toString(16));
}
// '1f40e', '1f471', '2764'
[...str].map((cp) => cp.codePointAt(0).toString(16));
// ['1f40e', '1f471', '2764']
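If the UTF-16 index of each code point is also needed, one possible approach is to keep an index-based loop but advance by two code units whenever codePointAt() returns a value above 0xFFFF. This is only a sketch; the function name codePointsWithIndices is hypothetical.

```javascript
// Hypothetical helper: list each code point with its starting UTF-16 index.
function codePointsWithIndices(text) {
  const result = [];
  let i = 0;
  while (i < text.length) {
    const codePoint = text.codePointAt(i);
    result.push({ index: i, codePoint });
    // Values above 0xFFFF occupy two code units, so skip the trailing surrogate.
    i += codePoint > 0xffff ? 2 : 1;
  }
  return result;
}

const sample = "\ud83d\udc0e\ud83d\udc71\u2764";
const entries = codePointsWithIndices(sample);
console.log(entries.map((e) => `${e.index}:${e.codePoint.toString(16)}`));
// ['0:1f40e', '2:1f471', '4:2764']
```

Unpaired surrogates are returned as single code units at or below 0xFFFF, so the loop advances by one for them and still terminates.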