[javascript] How do I split a string into an array of characters?

Old question but I should warn:

Do NOT use .split('')

You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) character sets.

Reason is that methods like .split() and .charCodeAt() only respect the characters with a code point below 65536; bec. higher code points are represented by a pair of (lower valued) "surrogate" pseudo-characters.

''.length     // —> 6
''.split('')  // —> ["?", "?", "?", "?", "?", "?"]

''.length      // —> 2
''.split('')   // —> ["?", "?"]

Use ES2015 (ES6) features where possible:

Using the spread operator:

let arr = [...str];

Or Array.from

let arr = Array.from(str);

Or split with the new u RegExp flag:

let arr = str.split(/(?!$)/u);

Examples:

[...'']        // —> ["", "", ""]
[...'']     // —> ["", "", ""]

For ES5, options are limited:

I came up with this function that internally uses MDN example to get the correct code point of each character.

function stringToArray() {
  var i = 0,
    arr = [],
    codePoint;
  while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
    arr.push(String.fromCodePoint(codePoint));
    i++;
  }
  return arr;
}

This requires knownCharCodeAt() function and for some browsers; a String.fromCodePoint() polyfill.

if (!String.fromCodePoint) {
// ES6 Unicode Shims 0.1 , © 2012 Steven Levithan , MIT License
    String.fromCodePoint = function fromCodePoint () {
        var chars = [], point, offset, units, i;
        for (i = 0; i < arguments.length; ++i) {
            point = arguments[i];
            offset = point - 0x10000;
            units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
            chars.push(String.fromCharCode.apply(null, units));
        }
        return chars.join("");
    }
}

Examples:

stringToArray('')     // —> ["", "", ""]
stringToArray('')  // —> ["", "", ""]

Note: str[index] (ES5) and str.charAt(index) will also return weird results with non-BMP charsets. e.g. ''.charAt(0) returns "?".

UPDATE: Read this nice article about JS and unicode.