[javascript] Converting between strings and ArrayBuffers

Is there a commonly accepted technique for efficiently converting JavaScript strings to ArrayBuffers and vice-versa? Specifically, I'd like to be able to write the contents of an ArrayBuffer to localStorage and to read it back.



Everything below is about getting binary strings from array buffers.

I'd recommend against using

var binaryString = String.fromCharCode.apply(null, new Uint8Array(arrayBuffer));

because it

  1. crashes on big buffers (somebody wrote about a "magic" size of 246300, but I got a "Maximum call stack size exceeded" error on a 120000-byte buffer in Chrome 29)
  2. has really poor performance (see below)

If you really need a synchronous solution, use something like

var
  binaryString = '',
  bytes = new Uint8Array(arrayBuffer),
  length = bytes.length;
for (var i = 0; i < length; i++) {
  binaryString += String.fromCharCode(bytes[i]);
}

It is as slow as the previous one but works correctly. At the time of writing there seems to be no really fast synchronous solution to this problem (all the libraries mentioned in this topic use the same approach for their synchronous features).

But what I really recommend is the Blob + FileReader approach

function readBinaryStringFromArrayBuffer (arrayBuffer, onSuccess, onFail) {
  var reader = new FileReader();
  reader.onload = function (event) {
    onSuccess(event.target.result);
  };
  reader.onerror = function (event) {
    onFail(event.target.error);
  };
  reader.readAsBinaryString(new Blob([ arrayBuffer ],
    { type: 'application/octet-stream' }));
}

The only disadvantage (and not for everyone) is that it is asynchronous. And it is about 8-10 times faster than the previous solutions! (Some details: the synchronous solution took 950-1050 ms in my environment for a 2.4 MB buffer, while the FileReader solution took about 100-120 ms for the same amount of data. I also tested both synchronous solutions on a 100 KB buffer and they took almost the same time, so the loop is not much slower than using 'apply'.)

BTW, in "How to convert ArrayBuffer to and from String" the author compares the same two approaches as I do and gets completely opposite results (his test code is here). Why such different results? Probably because his test string is only 1 KB long (he called it "veryLongStr"). My buffer was a really big JPEG image of 2.4 MB.


From emscripten:

function stringToUTF8Array(str, outU8Array, outIdx, maxBytesToWrite) {
  if (!(maxBytesToWrite > 0)) return 0;
  var startIdx = outIdx;
  var endIdx = outIdx + maxBytesToWrite - 1;
  for (var i = 0; i < str.length; ++i) {
    var u = str.charCodeAt(i);
    if (u >= 55296 && u <= 57343) {
      var u1 = str.charCodeAt(++i);
      u = 65536 + ((u & 1023) << 10) | u1 & 1023
    }
    if (u <= 127) {
      if (outIdx >= endIdx) break;
      outU8Array[outIdx++] = u
    } else if (u <= 2047) {
      if (outIdx + 1 >= endIdx) break;
      outU8Array[outIdx++] = 192 | u >> 6;
      outU8Array[outIdx++] = 128 | u & 63
    } else if (u <= 65535) {
      if (outIdx + 2 >= endIdx) break;
      outU8Array[outIdx++] = 224 | u >> 12;
      outU8Array[outIdx++] = 128 | u >> 6 & 63;
      outU8Array[outIdx++] = 128 | u & 63
    } else {
      if (outIdx + 3 >= endIdx) break;
      outU8Array[outIdx++] = 240 | u >> 18;
      outU8Array[outIdx++] = 128 | u >> 12 & 63;
      outU8Array[outIdx++] = 128 | u >> 6 & 63;
      outU8Array[outIdx++] = 128 | u & 63
    }
  }
  outU8Array[outIdx] = 0;
  return outIdx - startIdx
}

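Use like this (the output array needs room for the string's bytes plus the trailing 0 that the function writes, hence 4 bytes for a 3-character ASCII string):

stringToUTF8Array('abs', new Uint8Array(4), 0, 4);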
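Sizing the output array is the awkward part of calling this function. As a rough sketch (the helper name utf8ByteLength is my own and not part of emscripten's snippet above), you can count the UTF-8 bytes first and add one for the terminating 0:

```javascript
// Hypothetical helper: counts the UTF-8 bytes needed for a JavaScript string,
// using the same code point ranges as the encoder above.
function utf8ByteLength(str) {
  var len = 0;
  for (var i = 0; i < str.length; ++i) {
    var u = str.charCodeAt(i);
    // Combine a surrogate pair into one code point, as the encoder does.
    if (u >= 0xD800 && u <= 0xDFFF) {
      u = 0x10000 + ((u & 0x3FF) << 10) | (str.charCodeAt(++i) & 0x3FF);
    }
    if (u <= 0x7F) len += 1;
    else if (u <= 0x7FF) len += 2;
    else if (u <= 0xFFFF) len += 3;
    else len += 4;
  }
  return len;
}

// One extra byte leaves room for the trailing 0.
var needed = utf8ByteLength('abs') + 1;
var out = new Uint8Array(needed);
```

The resulting `needed` value can then be passed as both the array size and maxBytesToWrite.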

  function stringToArrayBuffer(byteString) {
    var byteArray = new Uint8Array(byteString.length);
    for (var i = 0; i < byteString.length; i++) {
      // Values above 255 are truncated when stored into the Uint8Array
      byteArray[i] = byteString.codePointAt(i);
    }
    // Return the underlying ArrayBuffer, as the function name promises
    return byteArray.buffer;
  }
  function arrayBufferToString(buffer) {
    var byteArray = new Uint8Array(buffer);
    var byteString = '';
    for (var i = 0; i < byteArray.byteLength; i++) {
      byteString += String.fromCodePoint(byteArray[i]);
    }
    return byteString;
  }

If you have a huge array (for example arr.length = 1000000), you can use this code to avoid call stack problems:

function ab2str(buf) {
var bufView = new Uint16Array(buf);
var unis =""
for (var i = 0; i < bufView.length; i++) {
    unis=unis+String.fromCharCode(bufView[i]);
}
return unis
}

The reverse function, from mangini's answer:

function str2ab(str) {
    var buf = new ArrayBuffer(str.length*2); // 2 bytes for each char
    var bufView = new Uint16Array(buf);
    for (var i=0, strLen=str.length; i<strLen; i++) {
        bufView[i] = str.charCodeAt(i);
    }
    return buf;
}

The "native" binary string that atob() returns holds one byte per character.

So we shouldn't store 2 bytes per character.

var arrayBufferToString = function(buffer) {
  return String.fromCharCode.apply(null, new Uint8Array(buffer));
}

var stringToArrayBuffer = function(str) {
  return (new Uint8Array([].map.call(str,function(x){return x.charCodeAt(0)}))).buffer;
}

Blob is much slower than String.fromCharCode.apply(null, array);

but that fails if the array buffer gets too big. The best solution I have found is to use String.fromCharCode.apply(null, array); and split the work into chunks that won't blow the stack but are still faster than handling a single char at a time.

The best solution for a large array buffer is:

function arrayBufferToString(buffer){

    var bufView = new Uint16Array(buffer);
    var length = bufView.length;
    var result = '';
    var addition = Math.pow(2,16)-1;

    for(var i = 0;i<length;i+=addition){

        if(i + addition > length){
            addition = length - i;
        }
        result += String.fromCharCode.apply(null, bufView.subarray(i,i+addition));
    }

    return result;

}

I found this to be about 20 times faster than using Blob. It also works for large strings of over 100 MB.


I'd recommend NOT using deprecated APIs like BlobBuilder

BlobBuilder has long been deprecated in favor of the Blob constructor. Compare the code in Dennis' answer, where BlobBuilder is used, with the code below:

function arrayBufferGen(str, cb) {

  var b = new Blob([str]);
  var f = new FileReader();

  f.onload = function(e) {
    cb(e.target.result);
  }

  f.readAsArrayBuffer(b);

}

Note how much cleaner and less bloated this is compared to the deprecated method... Yeah, this is definitely something to consider here.


Yes:

const encstr = ('TextEncoder' in window) ? new TextEncoder().encode(str) : Uint8Array.from(str, c => c.codePointAt(0));

For node.js and also for browsers using https://github.com/feross/buffer

function ab2str(buf: Uint8Array) {
  return Buffer.from(buf).toString('base64');
}
function str2ab(str: string) {
  return new Uint8Array(Buffer.from(str, 'base64'))
}

Note: The solutions here didn't work for me. I need to support both node.js and browsers, and I just want to serialize a Uint8Array to a string. I could serialize it as a number[], but that takes up unnecessary space. With this solution I don't need to worry about encodings, since the result is base64. Just in case other people struggle with the same problem... my two cents.


var decoder = new TextDecoder ();
var string = decoder.decode (arrayBuffer);

See https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/decode
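For completeness, here is a small round-trip sketch with the Encoding API (the variable names are only illustrative). It assumes UTF-8, which is the only encoding TextEncoder produces and the default for TextDecoder:

```javascript
// String -> bytes. encode() returns a Uint8Array of UTF-8 bytes.
var bytes = new TextEncoder().encode('héllo €');

// Copy exactly the viewed bytes into a standalone ArrayBuffer.
var buffer = bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);

// Bytes -> string. decode() accepts an ArrayBuffer, a typed array or a DataView.
var roundTripped = new TextDecoder().decode(buffer);

// With { fatal: true }, invalid UTF-8 throws a TypeError instead of being
// silently replaced by U+FFFD.
var strict = new TextDecoder('utf-8', { fatal: true });
var failed = false;
try {
  strict.decode(new Uint8Array([0xff, 0xfe, 0xfd]));
} catch (e) {
  failed = e instanceof TypeError;
}
```

The fatal option is worth considering when the buffer may hold arbitrary binary data rather than text, because the default mode replaces invalid sequences and so loses the original bytes.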


See here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView (a C-like interface for strings based upon the JavaScript ArrayBuffer interface)


Although Dennis' and gengkev's solutions using Blob/FileReader work, I wouldn't suggest taking that approach. It is an async approach to a simple problem, and it is much slower than a direct solution. I've made a post on html5rocks with a simpler (and much faster) solution: http://updates.html5rocks.com/2012/06/How-to-convert-ArrayBuffer-to-and-from-String

And the solution is:

function ab2str(buf) {
  return String.fromCharCode.apply(null, new Uint16Array(buf));
}

function str2ab(str) {
  var buf = new ArrayBuffer(str.length*2); // 2 bytes for each char
  var bufView = new Uint16Array(buf);
  for (var i=0, strLen=str.length; i<strLen; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}
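One caveat with the Uint16Array view, shown as a small sketch (padToEven is a hypothetical helper of mine, not part of the article): buffers produced by str2ab always have an even byte length, but arbitrary binary data may not, and a 16-bit view over an odd-length buffer throws.

```javascript
// A 3-byte buffer cannot be viewed as 16-bit units.
var odd = new Uint8Array([104, 105, 33]).buffer;
var threw = false;
try {
  new Uint16Array(odd);
} catch (e) {
  threw = e instanceof RangeError;
}

// Hypothetical workaround: copy into a buffer padded to an even length.
// Note that this appends a zero byte, so the data is no longer identical.
function padToEven(buf) {
  if (buf.byteLength % 2 === 0) return buf;
  var padded = new Uint8Array(buf.byteLength + 1);
  padded.set(new Uint8Array(buf));
  return padded.buffer;
}

var units = new Uint16Array(padToEven(odd));
```

Because padding changes the data, a Uint8Array view is probably the safer choice when the buffer holds raw bytes rather than a string written by str2ab.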

EDIT:

The Encoding API helps solve the string conversion problem. Check out the response from Jeff Posnick on Html5Rocks.com to the original article above.

Excerpt:

The Encoding API makes it simple to translate between raw bytes and native JavaScript strings, regardless of which of the many standard encodings you need to work with.

<pre id="results"></pre>

<script>
  if ('TextDecoder' in window) {
    // The local files to be fetched, mapped to the encoding that they're using.
    var filesToEncoding = {
      'utf8.bin': 'utf-8',
      'utf16le.bin': 'utf-16le',
      'macintosh.bin': 'macintosh'
    };

    Object.keys(filesToEncoding).forEach(function(file) {
      fetchAndDecode(file, filesToEncoding[file]);
    });
  } else {
    document.querySelector('#results').textContent = 'Your browser does not support the Encoding API.'
  }

  // Use XHR to fetch `file` and interpret its contents as being encoded with `encoding`.
  function fetchAndDecode(file, encoding) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', file);
    // Using 'arraybuffer' as the responseType ensures that the raw data is returned,
    // rather than letting XMLHttpRequest decode the data first.
    xhr.responseType = 'arraybuffer';
    xhr.onload = function() {
      if (this.status == 200) {
        // The decode() method takes a DataView as a parameter, which is a wrapper on top of the ArrayBuffer.
        var dataView = new DataView(this.response);
        // The TextDecoder interface is documented at http://encoding.spec.whatwg.org/#interface-textdecoder
        var decoder = new TextDecoder(encoding);
        var decodedString = decoder.decode(dataView);
        // Add the decoded file's text to the <pre> element on the page.
        document.querySelector('#results').textContent += decodedString + '\n';
      } else {
        console.error('Error while requesting', file, this);
      }
    };
    xhr.send();
  }
</script>

I used this and it works for me.

function arrayBufferToBase64( buffer ) {
    var binary = '';
    var bytes = new Uint8Array( buffer );
    var len = bytes.byteLength;
    for (var i = 0; i < len; i++) {
        binary += String.fromCharCode( bytes[ i ] );
    }
    return window.btoa( binary );
}



function base64ToArrayBuffer(base64) {
    var binary_string =  window.atob(base64);
    var len = binary_string.length;
    var bytes = new Uint8Array( len );
    for (var i = 0; i < len; i++)        {
        bytes[i] = binary_string.charCodeAt(i);
    }
    return bytes.buffer;
}
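To tie this back to the localStorage part of the question, here is a sketch of a save/load round trip built on the same base64 idea. The names saveBuffer, loadBuffer and createMemoryStorage are made up for illustration; the in-memory object only mimics the getItem/setItem shape of localStorage so the sketch also runs outside a browser.

```javascript
// Minimal in-memory stand-in with the getItem/setItem shape of localStorage.
function createMemoryStorage() {
  var items = {};
  return {
    setItem: function (key, value) { items[key] = String(value); },
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    }
  };
}

// ArrayBuffer -> binary string -> base64 -> storage
function saveBuffer(storage, key, buffer) {
  var bytes = new Uint8Array(buffer);
  var binary = '';
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  storage.setItem(key, btoa(binary));
}

// storage -> base64 -> binary string -> ArrayBuffer (null if the key is absent)
function loadBuffer(storage, key) {
  var base64 = storage.getItem(key);
  if (base64 === null) return null;
  var binary = atob(base64);
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

var storage = createMemoryStorage();
saveBuffer(storage, 'demo', new Uint8Array([0, 127, 128, 255]).buffer);
var restored = new Uint8Array(loadBuffer(storage, 'demo'));
```

In a browser you would pass window.localStorage instead of the stand-in. Base64 makes the stored value about a third larger than the raw bytes, but it survives storage as an ordinary string.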

I recently needed to do this for one of my projects, so I did some research and found a result from Google's developer community that states it simply:

For ArrayBuffer to String

function ab2str(buf) {
  return String.fromCharCode.apply(null, new Uint16Array(buf));
}
// Uint16 here can be a different type, such as Uint8 or Uint32, depending on your buffer's value type.

For String to ArrayBuffer

function str2ab(str) {
  var buf = new ArrayBuffer(str.length*2); // 2 bytes for each char
  var bufView = new Uint16Array(buf);
  for (var i=0, strLen=str.length; i < strLen; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}
// The same applies to the Uint16Array here.

For a more detailed reference, see this blog post by Google.


You can use TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, to convert string to and from ArrayBuffers:

var uint8array = new TextEncoder().encode(string);
var string = new TextDecoder(encoding).decode(uint8array);

(Update: please see the 2nd half of this answer, where I have (hopefully) provided a more complete solution.)

I also ran into this issue; the following works for me in FF 6 (for one direction):

var buf = new ArrayBuffer( 10 );
var view = new Uint8Array( buf );
view[ 3 ] = 4;
alert(Array.prototype.slice.call(view).join(""));

Unfortunately, of course, you end up with ASCII text representations of the values in the array rather than characters. It should still be much more efficient than a loop, though. E.g. for the example above the result is 0004000000, rather than several null chars and a chr(4).

Edit:

After looking on MDC here, you may create an ArrayBuffer from an Array as follows:

var arr = new Array(23);
// New Uint8Array() converts the Array elements
//  to Uint8s & creates a new ArrayBuffer
//  to store them in & a corresponding view.
//  To get at the generated ArrayBuffer,
//  you can then access it as below, with the .buffer property
var buf = new Uint8Array( arr ).buffer;

To answer your original question, this allows you to convert ArrayBuffer <-> String as follows:

var buf, view, str;
buf = new ArrayBuffer( 256 );
view = new Uint8Array( buf );

view[ 0 ] = 7; // Some dummy values
view[ 2 ] = 4;

// ...

// 1. Buffer -> String (as byte array "list")
str = bufferToString(buf);
alert(str); // Alerts "7,0,4,..."

// 2. String (as byte array) -> Buffer
buf = stringToBuffer(str);
alert(new Uint8Array( buf )[ 2 ]); // Alerts "4"

// Converts any ArrayBuffer to a string
//  (a comma-separated list of ASCII ordinals,
//  NOT a string of characters from the ordinals
//  in the buffer elements)
function bufferToString( buf ) {
    var view = new Uint8Array( buf );
    return Array.prototype.join.call(view, ",");
}
// Converts a comma-separated ASCII ordinal string list
//  back to an ArrayBuffer (see note for bufferToString())
function stringToBuffer( str ) {
    var arr = str.split(",")
      , view = new Uint8Array( arr );
    return view.buffer;
}

For convenience, here is a function for converting a raw string to an ArrayBuffer (it only works with ASCII/one-byte characters, because the higher bits of each char code are masked off):

function rawStringToBuffer( str ) {
    var idx, len = str.length, arr = new Array( len );
    for ( idx = 0 ; idx < len ; ++idx ) {
        arr[ idx ] = str.charCodeAt(idx) & 0xFF;
    }
    // You may create an ArrayBuffer from a standard array (of values) as follows:
    return new Uint8Array( arr ).buffer;
}

// Alerts "97"
alert(new Uint8Array( rawStringToBuffer("abc") )[ 0 ]);

The above allows you to go from ArrayBuffer -> String and back to ArrayBuffer again, where the string may be stored in e.g. localStorage :)

Hope this helps,

Dan


I had problems with this approach, basically because I was trying to write the output to a file and it was not encoded properly. Since JS seems to use UCS-2 encoding (source, source), we need to take this solution a step further. Here's my enhanced solution, which works for me.

I had no difficulties with generic text, but with Arabic or Korean the output file didn't contain all the chars and showed error characters instead.

File output: ","10k unit":"",Follow:"Õ©íüY‹","Follow %{screen_name}":"%{screen_name}U“’Õ©íü",Tweet:"ĤüÈ","Tweet %{hashtag}":"%{hashtag} ’ĤüÈY‹","Tweet to %{name}":"%{name}U“xĤüÈY‹"},ko:{"%{followers_count} followers":"%{followers_count}…X \Ì","100K+":"100Ì tÁ","10k unit":"Ì è",Follow:"\°","Follow %{screen_name}":"%{screen_name} Ø \°X0",K:"œ",M:"1Ì",Tweet:"¸","Tweet %{hashtag}":"%{hashtag}

Original: ","10k unit":"?",Follow:"??????","Follow %{screen_name}":"%{screen_name}???????",Tweet:"????","Tweet %{hashtag}":"%{hashtag} ???????","Tweet to %{name}":"%{name}?????????"},ko:{"%{followers_count} followers":"%{followers_count}?? ???","100K+":"100? ??","10k unit":"? ??",Follow:"???","Follow %{screen_name}":"%{screen_name} ? ?????",K:"?",M:"??",Tweet:"??","Tweet %{hashtag}":"%{hashtag}

I took the information from dennis' solution and this post I found.

Here's my code:

function encode_utf8(s) {
  return unescape(encodeURIComponent(s));
}

function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}

 function ab2str(buf) {
   var s = String.fromCharCode.apply(null, new Uint8Array(buf));
   // Decode once: str2ab encodes once, and decoding twice fails on non-ASCII text
   return decode_utf8(s)
 }

function str2ab(str) {
   var s = encode_utf8(str)
   var buf = new ArrayBuffer(s.length); 
   var bufView = new Uint8Array(buf);
   for (var i=0, strLen=s.length; i<strLen; i++) {
     bufView[i] = s.charCodeAt(i);
   }
   return bufView;
 }

This allows me to save the content to a file without encoding problems.

How it works: it takes the individual 8-bit units (bytes) that make up each UTF-8 character and saves them as single characters, so one UTF-8 character built this way can consist of 1-4 of these characters. UTF-8 encodes characters in a format that varies from 1 to 4 bytes in length. What we do here is encode the string as a URI component and then translate each percent-escaped byte of that component into the corresponding 8-bit character. This way we don't lose the information carried by UTF-8 characters that are more than 1 byte long.
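To make the intermediate steps visible, here is a small trace of the trick on a two-character string (the variable names are only for illustration). Note that escape and unescape are legacy functions, so this is a sketch of the mechanism rather than a recommendation:

```javascript
var text = '\u00e9\u20ac';                       // "é€"

// Step 1: percent-encode the UTF-8 bytes of the string.
var percentEncoded = encodeURIComponent(text);   // "%C3%A9%E2%82%AC"

// Step 2: turn every %XX escape into one character with that char code.
var byteString = unescape(percentEncoded);

// Each character of byteString now holds exactly one UTF-8 byte.
var codes = [];
for (var i = 0; i < byteString.length; i++) {
  codes.push(byteString.charCodeAt(i));          // 195, 169, 226, 130, 172
}

// The reverse direction restores the original string.
var back = decodeURIComponent(escape(byteString));
```

The 2-byte "é" and the 3-byte "€" end up as 5 one-byte characters, which is why the result can be copied into a Uint8Array without losing anything.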


The following is a working TypeScript implementation:

function bufferToString(buffer: ArrayBuffer): string {
    return String.fromCharCode.apply(null, Array.from(new Uint16Array(buffer)));
}

function stringToBuffer(value: string): ArrayBuffer {
    let buffer = new ArrayBuffer(value.length * 2); // 2 bytes per char
    let view = new Uint16Array(buffer);
    for (let i = 0, length = value.length; i < length; i++) {
        view[i] = value.charCodeAt(i);
    }
    return buffer;
}

I've used this for numerous operations while working with crypto.subtle.


In case you have binary data in a string (obtained from nodejs + readFile(..., 'binary'), cypress + cy.fixture(..., 'binary'), etc.), you can't use TextEncoder. It supports only UTF-8, so bytes with values >= 128 are each turned into 2 bytes.
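The doubling is easy to see on a tiny binary string. This is just a sketch with made-up sample bytes, comparing TextEncoder with a per-character conversion:

```javascript
// A 3-character binary string holding the bytes 0x02, 0x86 and 0xFF.
var s = '\x02\x86\xff';

// TextEncoder treats the characters as text and emits UTF-8,
// so 0x86 and 0xFF each become two bytes.
var viaTextEncoder = new TextEncoder().encode(s);   // 2, 194, 134, 195, 191

// Reading the char codes directly keeps one byte per character.
var viaCharCodes = Uint8Array.from(s, function (ch) {
  return ch.charCodeAt(0);
});                                                 // 2, 134, 255
```

Only the second result matches the bytes the binary string was meant to carry.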

ES2015:

a = Uint8Array.from(s, x => x.charCodeAt(0))

Uint8Array(33) [2, 134, 140, 186, 82, 70, 108, 182, 233, 40, 143, 247, 29, 76, 245, 206, 29, 87, 48, 160, 78, 225, 242, 56, 236, 201, 80, 80, 152, 118, 92, 144, 48

s = String.fromCharCode.apply(null, a)

"ºRFl¶é(÷LõÎW0 Náò8ìÉPPv\0"


Well, here's a somewhat convoluted way of doing the same thing:

var string = "Blah blah blah", output;
var bb = new (window.BlobBuilder||window.WebKitBlobBuilder||window.MozBlobBuilder)();
bb.append(string);
var f = new FileReader();
f.onload = function(e) {
  // do whatever
  output = e.target.result;
}
f.readAsArrayBuffer(bb.getBlob());

Edit: BlobBuilder has long been deprecated in favor of the Blob constructor, which did not exist when I first wrote this post. Here's an updated version. (And yes, this has always been a very silly way to do the conversion, but it was just for fun!)

var string = "Blah blah blah", output;
var f = new FileReader();
f.onload = function(e) {
  // do whatever
  output = e.target.result;
};
f.readAsArrayBuffer(new Blob([string]));

Let's say you have an arrayBuffer binaryStr:

let text = String.fromCharCode.apply(null, new Uint8Array(binaryStr));

and then you assign the text to the state.


Based on the answer of gengkev, I created functions for both ways, because BlobBuilder can handle String and ArrayBuffer:

function string2ArrayBuffer(string, callback) {
    var bb = new BlobBuilder();
    bb.append(string);
    var f = new FileReader();
    f.onload = function(e) {
        callback(e.target.result);
    }
    f.readAsArrayBuffer(bb.getBlob());
}

and

function arrayBuffer2String(buf, callback) {
    var bb = new BlobBuilder();
    bb.append(buf);
    var f = new FileReader();
    f.onload = function(e) {
        callback(e.target.result)
    }
    f.readAsText(bb.getBlob());
}

A simple test:

string2ArrayBuffer("abc",
    function (buf) {
        var uInt8 = new Uint8Array(buf);
        console.log(uInt8); // Returns `Uint8Array { 0=97, 1=98, 2=99}`

        arrayBuffer2String(buf, 
            function (string) {
                console.log(string); // returns "abc"
            }
        )
    }
)

After playing with mangini's solution for converting from ArrayBuffer to String - ab2str (which is the most elegant and useful one I have found - thanks!), I had some issues when handling large arrays. More specifically, calling String.fromCharCode.apply(null, new Uint16Array(buf)); throws an error:

arguments array passed to Function.prototype.apply is too large.

In order to solve it (bypass) I have decided to handle the input ArrayBuffer in chunks. So the modified solution is:

function ab2str(buf) {
   var str = "";
   var ab = new Uint16Array(buf);
   var abLen = ab.length;
   var CHUNK_SIZE = Math.pow(2, 16);
   var offset, len, subab;
   for (offset = 0; offset < abLen; offset += CHUNK_SIZE) {
      len = Math.min(CHUNK_SIZE, abLen-offset);
      subab = ab.subarray(offset, offset+len);
      str += String.fromCharCode.apply(null, subab);
   }
   return str;
}

The chunk size is set to 2^16 because this was the size I have found to work in my development landscape. Setting a higher value caused the same error to reoccur. It can be altered by setting the CHUNK_SIZE variable to a different value. It is important to have an even number.

Note on performance - I did not make any performance tests for this solution. However, since it is based on the previous solution, and can handle large arrays, I see no reason why not to use it.


Unlike the solutions here, I needed to convert to/from UTF-8 data. For this purpose, I coded the following two functions, using the (un)escape/(en)decodeURIComponent trick. They're pretty wasteful of memory, allocating 9 times the length of the encoded utf8-string, though those should be recovered by gc. Just don't use them for 100mb text.

function utf8AbFromStr(str) {
    var strUtf8 = unescape(encodeURIComponent(str));
    var ab = new Uint8Array(strUtf8.length);
    for (var i = 0; i < strUtf8.length; i++) {
        ab[i] = strUtf8.charCodeAt(i);
    }
    return ab;
}

function strFromUtf8Ab(ab) {
    return decodeURIComponent(escape(String.fromCharCode.apply(null, ab)));
}

Checking that it works:

strFromUtf8Ab(utf8AbFromStr('latin????????aß?de???????'))
-> "latin????????aß?de???????"
