[javascript] How do I split a string with multiple separators in JavaScript?

How do I split a string with multiple separators in JavaScript?

I'm trying to split on both commas and spaces, but AFAIK JavaScript's split() function only supports one separator.

This question is related to javascript regex split

The answer is


You can pass a regex into Javascript's split operator. For example:

"1,2 3".split(/,| /) 
["1", "2", "3"]

Or, if you want to allow multiple separators together to act as one only:

"1, 2, , 3".split(/(?:,| )+/) 
["1", "2", "3"]

(You have to use the non-capturing (?:) parens because otherwise it gets spliced back into the result. Or you can be smart like Aaron and use a character class.)

(Examples tested in Safari + FF)


You could just lump all the characters you want to use as separators either singularly or collectively into a regular expression and pass them to the split function. For instance you could write:

console.log( "dasdnk asd, (naks) :d skldma".split(/[ \(,\)]+/) );

And the output will be:

["dasdnk", "asd", "naks", ":d", "skldma"]

Tricky method:

var s = "dasdnk asd, (naks) :d skldma";
var a = s.replace('(',' ').replace(')',' ').replace(',',' ').split(' ');
console.log(a);//["dasdnk", "asd", "naks", ":d", "skldma"]

I ran into this question wile looking for a replacement for the C# string.Split() function which splits a string using the characters in its argument.

In JavaScript you can do the same using map an reduce to iterate over the splitting characters and the intermediate values:

let splitters = [",", ":", ";"]; // or ",:;".split("");
let start= "a,b;c:d";
let values = splitters.reduce((old, c) => old.map(v => v.split(c)).flat(), [start]);
// values is ["a", "b", "c", "d"]

flat() is used to flatten the intermediate results so each iteration works on a list of strings without nested arrays. Each iteration applies split to all of the values in old and then returns the list of intermediate results to be split by the next value in splitters. reduce() is initialized with an array containing the initial string value.


Lets keep it simple: (add a "[ ]+" to your RegEx means "1 or more")

This means "+" and "{1,}" are the same.

var words = text.split(/[ .:;?!~,`"&|()<>{}\[\]\r\n/\\]+/); // note ' and - are kept

My refactor of @Brian answer

_x000D_
_x000D_
var string = 'and this is some kind of information and another text and simple and some egample or red or text';_x000D_
var separators = ['and', 'or'];_x000D_
_x000D_
function splitMulti(str, separators){_x000D_
            var tempChar = 't3mp'; //prevent short text separator in split down_x000D_
            _x000D_
            //split by regex e.g. \b(or|and)\b_x000D_
            var re = new RegExp('\\b(' + separators.join('|') + ')\\b' , "g");_x000D_
            str = str.replace(re, tempChar).split(tempChar);_x000D_
            _x000D_
            // trim & remove empty_x000D_
            return str.map(el => el.trim()).filter(el => el.length > 0);_x000D_
}_x000D_
_x000D_
console.log(splitMulti(string, separators))
_x000D_
_x000D_
_x000D_


Here is a new way to achieving same in ES6:

_x000D_
_x000D_
function SplitByString(source, splitBy) {_x000D_
  var splitter = splitBy.split('');_x000D_
  splitter.push([source]); //Push initial value_x000D_
_x000D_
  return splitter.reduceRight(function(accumulator, curValue) {_x000D_
    var k = [];_x000D_
    accumulator.forEach(v => k = [...k, ...v.split(curValue)]);_x000D_
    return k;_x000D_
  });_x000D_
}_x000D_
_x000D_
var source = "abc,def#hijk*lmn,opq#rst*uvw,xyz";_x000D_
var splitBy = ",*#";_x000D_
console.log(SplitByString(source, splitBy));
_x000D_
_x000D_
_x000D_

Please note in this function:

  • No Regex involved
  • Returns splitted value in same order as it appears in source

Result of above code would be:

enter image description here


I don't know the performance of RegEx, but here is another alternative for RegEx leverages native HashSet and works in O( max(str.length, delimeter.length) ) complexity instead:

var multiSplit = function(str,delimiter){
    if (!(delimiter instanceof Array))
        return str.split(delimiter);
    if (!delimiter || delimiter.length == 0)
        return [str];
    var hashSet = new Set(delimiter);
    if (hashSet.has(""))
        return str.split("");
    var lastIndex = 0;
    var result = [];
    for(var i = 0;i<str.length;i++){
        if (hashSet.has(str[i])){
            result.push(str.substring(lastIndex,i));
            lastIndex = i+1;
        }
    }
    result.push(str.substring(lastIndex));
    return result;
}

multiSplit('1,2,3.4.5.6 7 8 9',[',','.',' ']);
// Output: ["1", "2", "3", "4", "5", "6", "7", "8", "9"]

multiSplit('1,2,3.4.5.6 7 8 9',' ');
// Output: ["1,2,3.4.5.6", "7", "8", "9"]

Not the best way but works to Split with Multiple and Different seperators/delimiters

html

<button onclick="myFunction()">Split with Multiple and Different seperators/delimiters</button>
<p id="demo"></p>

javascript

<script>
function myFunction() {

var str = "How : are | you doing : today?";
var res = str.split(' | ');

var str2 = '';
var i;
for (i = 0; i < res.length; i++) { 
    str2 += res[i];

    if (i != res.length-1) {
      str2 += ",";
    }
}
var res2 = str2.split(' : ');

//you can add countless options (with or without space)

document.getElementById("demo").innerHTML = res2;
</script>

I use regexp:

str =  'Write a program that extracts from a given text all palindromes, e.g. "ABBA", "lamal", "exe".';

var strNew = str.match(/\w+/g);

// Output: ["Write", "a", "program", "that", "extracts", "from", "a", "given", "text", "all", "palindromes", "e", "g", "ABBA", "lamal", "exe"]

An easy way to do this is to process each character of the string with each delimiter and build an array of the splits:

splix = function ()
{
  u = [].slice.call(arguments); v = u.slice(1); u = u[0]; w = [u]; x = 0;

  for (i = 0; i < u.length; ++i)
  {
    for (j = 0; j < v.length; ++j)
    {
      if (u.slice(i, i + v[j].length) == v[j])
      {
        y = w[x].split(v[j]); w[x] = y[0]; w[++x] = y[1];
      };
    };
  };
  
  return w;
};

_x000D_
_x000D_
console.logg = function ()
{
  document.body.innerHTML += "<br>" + [].slice.call(arguments).join();
}

splix = function() {
  u = [].slice.call(arguments);
  v = u.slice(1);
  u = u[0];
  w = [u];
  x = 0;
  console.logg("Processing: <code>" + JSON.stringify(w) + "</code>");

  for (i = 0; i < u.length; ++i) {
    for (j = 0; j < v.length; ++j) {
      console.logg("Processing: <code>[\x22" + u.slice(i, i + v[j].length) + "\x22, \x22" + v[j] + "\x22]</code>");
      if (u.slice(i, i + v[j].length) == v[j]) {
        y = w[x].split(v[j]);
        w[x] = y[0];
        w[++x] = y[1];
        console.logg("Currently processed: " + JSON.stringify(w) + "\n");
      };
    };
  };

  console.logg("Return: <code>" + JSON.stringify(w) + "</code>");
};

setTimeout(function() {
  console.clear();
  splix("1.23--4", ".", "--");
}, 250);
_x000D_
@import url("http://fonts.googleapis.com/css?family=Roboto");

body {font: 20px Roboto;}
_x000D_
_x000D_
_x000D_

Usage: splix(string, delimiters...)

Example: splix("1.23--4", ".", "--")

Returns: ["1", "23", "4"]


I will provide a classic implementation for a such function. The code works in almost all versions of JavaScript and is somehow optimum.

  • It doesn't uses regex, which is hard to maintain
  • It doesn't uses new features of JavaScript
  • It doesn't uses multiple .split() .join() invocation which require more computer memory

Just pure code:

var text = "Create a function, that will return an array (of string), with the words inside the text";

println(getWords(text));

function getWords(text)
{
    let startWord = -1;
    let ar = [];

    for(let i = 0; i <= text.length; i++)
    {
        let c = i < text.length ? text[i] : " ";

        if (!isSeparator(c) && startWord < 0)
        {
            startWord = i;
        }

        if (isSeparator(c) && startWord >= 0)
        {
            let word = text.substring(startWord, i);
            ar.push(word);

            startWord = -1;
        }
    }

    return ar;
}

function isSeparator(c)
{
    var separators = [" ", "\t", "\n", "\r", ",", ";", ".", "!", "?", "(", ")"];
    return separators.includes(c);
}

You can see the code running in playground: https://codeguppy.com/code.html?IJI0E4OGnkyTZnoszAzf


For fun, I solved this with reduce and filter. In real life I would probably use Aarons answere here. Nevertheless I think my version isn't totally unreadable or inefficient:

[' ','_','-','.',',',':','@'].reduce(
(segs, sep) => segs.reduce(
(out, seg) => out.concat(seg.split(sep)), []), 
['E-mail Address: [email protected], Phone Number: +1-800-555-0011']
).filter(x => x)

Or as a function:

function msplit(str, seps) {
  return seps.reduce((segs, sep) => segs.reduce(
    (out, seg) => out.concat(seg.split(sep)), []
  ), [str]).filter(x => x);
}

This will output:

['E','mail','Address','user','domain','com','0','Phone','Number','+1','800','555','0011']

Without the filter at the end you would get empty strings in the array where two different separators are next to each other.


I find that one of the main reasons I need this is to split file paths on both / and \. It's a bit of a tricky regex so I'll post it here for reference:

var splitFilePath = filePath.split(/[\/\\]/);

Hi for example if you have split and replace in String 07:05:45PM

var hour = time.replace("PM", "").split(":");

Result

[ '07', '05', '45' ]

a = "a=b,c:d"

array = ['=',',',':'];

for(i=0; i< array.length; i++){ a= a.split(array[i]).join(); }

this will return the string without a special charecter.


I think it's easier if you specify what you wanna leave, instead of what you wanna remove.

As if you wanna have only English words, you can use something like this:

text.match(/[a-z'\-]+/gi);

Examples (run snippet):

_x000D_
_x000D_
var R=[/[a-z'\-]+/gi,/[a-z'\-\s]+/gi];_x000D_
var s=document.getElementById('s');_x000D_
for(var i=0;i<R.length;i++)_x000D_
 {_x000D_
  var o=document.createElement('option');_x000D_
  o.innerText=R[i]+'';_x000D_
  o.value=i;_x000D_
  s.appendChild(o);_x000D_
 }_x000D_
var t=document.getElementById('t');_x000D_
var r=document.getElementById('r');_x000D_
_x000D_
s.onchange=function()_x000D_
 {_x000D_
  r.innerHTML='';_x000D_
  var x=s.value;_x000D_
  if((x>=0)&&(x<R.length))_x000D_
   x=t.value.match(R[x]);_x000D_
  for(i=0;i<x.length;i++)_x000D_
   {_x000D_
    var li=document.createElement('li');_x000D_
    li.innerText=x[i];_x000D_
    r.appendChild(li);_x000D_
   }_x000D_
 }
_x000D_
<textarea id="t" style="width:70%;height:12em">even, test; spider-man_x000D_
_x000D_
But saying o'er what I have said before:_x000D_
My child is yet a stranger in the world;_x000D_
She hath not seen the change of fourteen years,_x000D_
Let two more summers wither in their pride,_x000D_
Ere we may think her ripe to be a bride._x000D_
_x000D_
—Shakespeare, William. The Tragedy of Romeo and Juliet</textarea>_x000D_
_x000D_
<p><select id="s">_x000D_
 <option selected>Select a regular expression</option>_x000D_
 <!-- option value="1">/[a-z'\-]+/gi</option>_x000D_
 <option value="2">/[a-z'\-\s]+/gi</option -->_x000D_
</select></p>_x000D_
 <ol id="r" style="display:block;width:auto;border:1px inner;overflow:scroll;height:8em;max-height:10em;"></ol>_x000D_
</div>
_x000D_
_x000D_
_x000D_


Starting from @stephen-sweriduk solution (that was the more interesting to me!), I have slightly modified it to make more generic and reusable:

/**
 * Adapted from: http://stackoverflow.com/questions/650022/how-do-i-split-a-string-with-multiple-separators-in-javascript
*/
var StringUtils = {

  /**
   * Flatten a list of strings
   * http://rosettacode.org/wiki/Flatten_a_list
   */
  flatten : function(arr) {
    var self=this;
    return arr.reduce(function(acc, val) {
        return acc.concat(val.constructor === Array ? self.flatten(val) : val);
    },[]);
  },

  /**
   * Recursively Traverse a list and apply a function to each item
   * @param list array
   * @param expression Expression to use in func
   * @param func function of (item,expression) to apply expression to item
   *
   */
  traverseListFunc : function(list, expression, index, func) {
    var self=this;
    if(list[index]) {
        if((list.constructor !== String) && (list[index].constructor === String))
            (list[index] != func(list[index], expression)) ? list[index] = func(list[index], expression) : null;
        (list[index].constructor === Array) ? self.traverseListFunc(list[index], expression, 0, func) : null;
        (list.constructor === Array) ? self.traverseListFunc(list, expression, index+1, func) : null;
    }
  },

  /**
   * Recursively map function to string
   * @param string
   * @param expression Expression to apply to func
   * @param function of (item, expressions[i])
   */
  mapFuncToString : function(string, expressions, func) {
    var self=this;
    var list = [string];
    for(var i=0, len=expressions.length; i<len; i++) {
        self.traverseListFunc(list, expressions[i], 0, func);
    }
    return self.flatten(list);
  },

  /**
   * Split a string
   * @param splitters Array of characters to apply the split
   */
  splitString : function(string, splitters) {
    return this.mapFuncToString(string, splitters, function(item, expression) {
      return item.split(expression);
    })
  },

}

and then

var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
var splittedString=StringUtils.splitString(stringToSplit, splitList);
console.log(splitList, stringToSplit, splittedString);

that gives back as the original:

[ ' ', '_', '/' ] 'people and_other/things' [ 'people', 'and', 'other', 'things' ]

For those of you who want more customization in their splitting function, I wrote a recursive algorithm that splits a given string with a list of characters to split on. I wrote this before I saw the above post. I hope it helps some frustrated programmers.

splitString = function(string, splitters) {
    var list = [string];
    for(var i=0, len=splitters.length; i<len; i++) {
        traverseList(list, splitters[i], 0);
    }
    return flatten(list);
}

traverseList = function(list, splitter, index) {
    if(list[index]) {
        if((list.constructor !== String) && (list[index].constructor === String))
            (list[index] != list[index].split(splitter)) ? list[index] = list[index].split(splitter) : null;
        (list[index].constructor === Array) ? traverseList(list[index], splitter, 0) : null;
        (list.constructor === Array) ? traverseList(list, splitter, index+1) : null;    
    }
}

flatten = function(arr) {
    return arr.reduce(function(acc, val) {
        return acc.concat(val.constructor === Array ? flatten(val) : val);
    },[]);
}

var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
splitString(stringToSplit, splitList);

Example above returns: ["people", "and", "other", "things"]

Note: flatten function was taken from Rosetta Code


Splitting URL by .com/ or .net/

url.split(/\.com\/|\.net\//)

Another simple but effective method is to use split + join repeatedly.

"a=b,c:d".split('=').join(',').split(':').join(',').split(',')

Essentially doing a split followed by a join is like a global replace so this replaces each separator with a comma then once all are replaced it does a final split on comma

The result of the above expression is:

['a', 'b', 'c', 'd']

Expanding on this you could also place it in a function:

function splitMulti(str, tokens){
        var tempChar = tokens[0]; // We can use the first token as a temporary join character
        for(var i = 1; i < tokens.length; i++){
            str = str.split(tokens[i]).join(tempChar);
        }
        str = str.split(tempChar);
        return str;
}

Usage:

splitMulti('a=b,c:d', ['=', ',', ':']) // ["a", "b", "c", "d"]

If you use this functionality a lot it might even be worth considering wrapping String.prototype.split for convenience (I think my function is fairly safe - the only consideration is the additional overhead of the conditionals (minor) and the fact that it lacks an implementation of the limit argument if an array is passed).

Be sure to include the splitMulti function if using this approach to the below simply wraps it :). Also worth noting that some people frown on extending built-ins (as many people do it wrong and conflicts can occur) so if in doubt speak to someone more senior before using this or ask on SO :)

    var splitOrig = String.prototype.split; // Maintain a reference to inbuilt fn
    String.prototype.split = function (){
        if(arguments[0].length > 0){
            if(Object.prototype.toString.call(arguments[0]) == "[object Array]" ) { // Check if our separator is an array
                return splitMulti(this, arguments[0]);  // Call splitMulti
            }
        }
        return splitOrig.apply(this, arguments); // Call original split maintaining context
    };

Usage:

var a = "a=b,c:d";
    a.split(['=', ',', ':']); // ["a", "b", "c", "d"]

// Test to check that the built-in split still works (although our wrapper wouldn't work if it didn't as it depends on it :P)
        a.split('='); // ["a", "b,c:d"] 

Enjoy!


Examples related to javascript

need to add a class to an element How to make a variable accessible outside a function? Hide Signs that Meteor.js was Used How to create a showdown.js markdown extension Please help me convert this script to a simple image slider Highlight Anchor Links when user manually scrolls? Summing radio input values How to execute an action before close metro app WinJS javascript, for loop defines a dynamic variable name Getting all files in directory with ajax

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to split

Parameter "stratify" from method "train_test_split" (scikit Learn) Pandas split DataFrame by column value How to split large text file in windows? Attribute Error: 'list' object has no attribute 'split' Split function in oracle to comma separated values with automatic sequence How would I get everything before a : in a string Python Split String by delimiter position using oracle SQL JavaScript split String with white space Split a String into an array in Swift? Split pandas dataframe in two if it has more than 10 rows