(Thanks, Kim!)
What does String.length actually measure?
Let's consult the standard!
The String type is the set of all ordered sequences of zero or more 16-bit unsigned integer values (“elements”)
The String type is generally used to represent textual data in a running ECMAScript program, in which case each element in the String is treated as a UTF-16 code unit value.
The length of a String is the number of elements (i.e., 16-bit values) within it.
(ECMAScript 6.0 Standard, §6.1.4)
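To see what that means in practice, here's a quick sketch (the emoji is just an arbitrary example of a character outside the 16-bit range):

'😀'.length;    // 2: U+1F600 is stored as two 16-bit code units (a surrogate pair)
'A😀'.length;   // 3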
A code point is a unique number given to each character
They look like this:
U+hhhh
or U+hhhhhh
Range from U+0000
to U+10FFFF
1,114,112 code points available in total
(See Unicode Chapter 3)
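For illustration, a small sketch of how the U+ notation maps onto what the engine reports (the characters are arbitrary choices):

'A'.codePointAt(0).toString(16);    // "41"     -> U+0041
'😀'.codePointAt(0).toString(16);   // "1f600"  -> U+1F600
String.fromCodePoint(0x10FFFF);     // the highest valid code point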
Code points are divided into 17 planes
U+hhhh: U+0000 to U+FFFF (Plane 0, the Basic Multilingual Plane)
U+hhhhhh: U+010000 to U+10FFFF (Planes 1–16, the supplementary planes)
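Roughly, this is how the two ranges look from UTF-16's point of view (example characters are arbitrary):

'λ'.length;                        // 1: U+03BB lives in the BMP, one code unit
'😀'.length;                       // 2: U+1F600 needs a surrogate pair
'😀'.charCodeAt(0).toString(16);   // "d83d" (high surrogate)
'😀'.charCodeAt(1).toString(16);   // "de00" (low surrogate)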
A code unit is the smallest unit of storage an encoding scheme uses to store or transmit text (16 bits in UTF-16); a single character may require more than one code unit
We want to count code points, not code units
String.prototype[@@iterator]
(String.prototype.codePointAt()
exists too)
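(A tiny illustration of codePointAt(), again with an arbitrary emoji; note that it still indexes by code unit:)

'😀'.codePointAt(0).toString(16);   // "1f600"
'😀'.codePointAt(1).toString(16);   // "de00" (index 1 lands on the low surrogate)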
When the @@iterator method is called it returns an Iterator object (25.1.1.2) that iterates over the code points of a String value, returning each code point as a String value.
// Collect every code point: for..of drives the string's @@iterator
let a = [];
for (let c of s) {
  a.push(c);
}
// Collect every code unit: indexing and .length both work in 16-bit units
var i, a = [];
for (i = 0; i < s.length; i++) {
  a.push(s[i]);
}
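For comparison, a rough check of what the two loops produce for a string containing a supplementary-plane character (the sample string is arbitrary):

let s = 'a😀';

let byCodePoint = [];
for (let c of s) byCodePoint.push(c);
// byCodePoint is ['a', '😀']: 2 entries

let byCodeUnit = [];
for (let i = 0; i < s.length; i++) byCodeUnit.push(s[i]);
// byCodeUnit is ['a', '\ud83d', '\ude00']: 3 entries, the emoji is split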
Trick: use Array.from()
(it just does this:)
function (s) {
  let a = [];
  for (let c of s) {
    a.push(c);
  }
  return a;
}
(Thanks, Kim!)
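A possible usage sketch (the string is just an example):

Array.from('a😀');          // ['a', '😀']
Array.from('a😀').length;   // 2 (code points)
'a😀'.length;               // 3 (code units)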
There are multiple ways of representing the same abstract character sequence
Normalization is useful for comparing different representations of the same abstract character sequence
(See UAX #15)
String.prototype.normalize()
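For example, a small sketch with 'é', which can be written either as the single precomposed code point U+00E9 or as 'e' followed by the combining acute accent U+0301:

let composed = '\u00E9';     // 'é' as one code point
let decomposed = 'e\u0301';  // 'é' as 'e' + combining accent

composed === decomposed;                          // false
composed.normalize() === decomposed.normalize();  // true (both default to NFC)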
When in doubt, use it!
Check the compatibility table!