Regular expression to get a string between two strings in JavaScript

I have found very similar posts, but I can't quite get my regular expression right here.

I am trying to write a regular expression which returns a string which is between two other strings. For example: I want to get the string which resides between the strings "cow" and "milk".

My cow always gives milk

would return

"always gives"

Here is the expression I have pieced together so far:

(?=cow).*(?=milk)

However, this returns the string "cow always gives".


The answer is


The method match() searches a string for a match and returns an Array object.

// Original string
var str = "My cow always gives milk";

// Index [0] is the whole match:
// "cow always gives milk"
str.match(/cow(.*)milk/)[0]


// Index [1] is the first capture group (note the surrounding spaces):
// " always gives "
str.match(/cow(.*)milk/)[1]

  • You need to capture the .*
  • You can (but don't have to) make the .* nongreedy
  • There's really no need for the lookahead.

    > /cow(.*?)milk/i.exec('My cow always gives milk');
    ["cow always gives milk", " always gives "]
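To see why the nongreedy version can matter, here is a small sketch. The input string with a second "milk" is a made-up example, not from the question; with only one "milk" in the input, both patterns capture the same text.

```javascript
// Hypothetical input with two "milk" occurrences (not from the question).
const sample = "My cow always gives milk and more milk";

// Greedy: .* grabs as much as possible, so the match ends at the LAST "milk".
const greedy = sample.match(/cow(.*)milk/)[1];

// Lazy: .*? grabs as little as possible, so the match ends at the FIRST "milk".
const lazy = sample.match(/cow(.*?)milk/)[1];

console.log(JSON.stringify(greedy)); // " always gives milk and more "
console.log(JSON.stringify(lazy));   // " always gives "
```

Both captures keep the spaces around the text, so call trim() on the result if you don't want them.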
    

You can use destructuring to only focus on the part of your interest.

So you can do:

let str = "My cow always gives milk";

let [, result] = str.match(/\bcow\s+(.*?)\s+milk\b/) || [];

console.log(result);

In this way you ignore the first element (the complete match) and get only the capture group's match. The addition of || [] is useful if you are not sure there will be a match at all. In that case match returns null, which cannot be destructured, so we fall back to [] instead, and result will then be undefined.

The additional \b ensures the surrounding words "cow" and "milk" are really separate words (e.g. not "milky"). The \s+ on each side keeps the outer whitespace out of the captured text.
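A quick sketch of how the fallback and the word boundaries behave. The wrapper function betweenCowAndMilk and the second and third inputs are made up for illustration:

```javascript
// betweenCowAndMilk is a hypothetical wrapper around the pattern above.
function betweenCowAndMilk(text) {
  // If match() returns null, destructure an empty array instead,
  // which leaves result as undefined rather than throwing.
  const [, result] = text.match(/\bcow\s+(.*?)\s+milk\b/) || [];
  return result;
}

console.log(betweenCowAndMilk("My cow always gives milk"));      // "always gives"
console.log(betweenCowAndMilk("My cow always gives milky tea")); // undefined (\b rejects "milky")
console.log(betweenCowAndMilk("Nothing to see here"));           // undefined (no match at all)
```

Without the || [] part, the last two calls would throw a TypeError, because null cannot be destructured.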


I was able to get what I needed using Martinho Fernandes' solution below. The code is:

var test = "My cow always gives milk";

var testRE = test.match("cow(.*)milk");
alert(testRE[1]);

You'll notice that I am indexing into the testRE variable as an array. This is because match() returns an array: element 0 is the whole match and element 1 is the text captured by the first group. The output from:

My cow always gives milk

Changes into:

always gives

Regular expression to get a string between two strings in JavaScript

The most complete solution that will work in the vast majority of cases is a capturing group with a lazy dot matching pattern. However, a dot . in a JavaScript regex does not match line break characters, so what will work in 100% of cases is a [^] or [\s\S] / [\d\D] / [\w\W] construct.

ECMAScript 2018 and newer compatible solution

In JavaScript environments supporting ECMAScript 2018, s modifier allows . to match any char including line break chars, and the regex engine supports lookbehinds of variable length. So, you may use a regex like

var result = s.match(/(?<=cow\s+).*?(?=\s+milk)/gs); // Returns multiple matches if any
// Or
var result = s.match(/(?<=cow\s*).*?(?=\s*milk)/gs); // Same but whitespaces are optional

In both cases, the lookbehind checks that the current position is immediately preceded by cow followed by whitespace (one or more characters in the first pattern, zero or more in the second). Then any 0+ chars, as few as possible, are matched and consumed (that is, added to the match value). Finally, the lookahead checks that milk follows, preceded by the same amount of whitespace, without consuming it.
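As a rough illustration of the lookbehind version, here is a sketch on a made-up input that contains a line break and two cow ... milk pairs. It assumes an engine with ES2018 lookbehind and s flag support:

```javascript
// Hypothetical sample input (not from the question): one line break, two pairs.
const text = "My cow always\ngives milk, their cow also gives milk";

// g: return all matches; s: let . match line breaks as well.
// The lookarounds only assert, so the delimiters are not part of the matches.
const found = text.match(/(?<=cow\s+).*?(?=\s+milk)/gs);

console.log(found); // [ 'always\ngives', 'also gives' ]
```

Since nothing but the middle text is consumed, no capture group or trimming is needed here.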

Scenario 1: Single-line input

This and all other scenarios below are supported by all JavaScript environments. See usage examples at the bottom of the answer.

cow (.*?) milk

cow is found first, then a space, then any 0+ chars other than line break chars, as few as possible as *? is a lazy quantifier, are captured into Group 1 and then a space with milk must follow (and those are matched and consumed, too).

Scenario 2: Multiline input

cow ([\s\S]*?) milk

Here, cow and a space are matched first, then any 0+ chars as few as possible are matched and captured into Group 1, and then a space with milk are matched.
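To make the difference between Scenario 1 and Scenario 2 concrete, here is a small sketch with a made-up two-line input:

```javascript
// Hypothetical input with a line break between the delimiters.
const multiline = "My cow always\ngives milk";

// Scenario 1 pattern: . cannot cross the line break, so there is no match.
const withDot = multiline.match(/cow (.*?) milk/);

// Scenario 2 pattern: [\s\S] matches any char, including line breaks.
const withClass = multiline.match(/cow ([\s\S]*?) milk/);

console.log(withDot);                      // null
console.log(JSON.stringify(withClass[1])); // "always\ngives"
```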

Scenario 3: Overlapping matches

If you have a string like >>>15 text1>>>67 text2>>> and you need to get 2 matches in-between >>>+number+whitespace and >>>, you can't use />>>\d+\s(.*?)>>>/g as this will only find 1 match, because the >>> before 67 is already consumed upon finding the first match. You may use a positive lookahead to check for the text presence without actually "gobbling" it (i.e. appending to the match):

/>>>\d+\s(.*?)(?=>>>)/g

This yields text1 and text2 as the Group 1 contents.
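A minimal sketch comparing the consuming pattern with the lookahead one, assuming the input is >>>15 text1>>>67 text2>>> and that String#matchAll is available:

```javascript
const input = ">>>15 text1>>>67 text2>>>";

// Consuming the trailing >>> leaves nothing for the second match to start from.
const consuming = Array.from(input.matchAll(/>>>\d+\s(.*?)>>>/g), m => m[1]);

// The lookahead only checks for >>>, so it can open the next match.
const withLookahead = Array.from(input.matchAll(/>>>\d+\s(.*?)(?=>>>)/g), m => m[1]);

console.log(consuming);     // [ 'text1' ]
console.log(withLookahead); // [ 'text1', 'text2' ]
```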

Also see How to get all possible overlapping matches for a string.

Performance considerations

A lazy dot matching pattern (.*?) inside a regex may slow down script execution if the input is very long. In many cases, the unroll-the-loop technique helps considerably. When grabbing everything between cow and milk from "Their\ncow\ngives\nmore\nmilk", we just need to match all lines that are not exactly milk; thus, instead of cow\n([\s\S]*?)\nmilk we can use:

/cow\n(.*(?:\n(?!milk$).*)*)\nmilk/gm

If the input can contain \r\n line endings, use /cow\r?\n(.*(?:\r?\n(?!milk$).*)*)\r?\nmilk/gm instead. With this small test string, the performance gain is negligible, but with very large text you will feel the difference (especially if the lines are long and line breaks are not very numerous).
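As a sanity check, this sketch runs the lazy pattern and the unrolled one on the same test string to show they capture the same text. It says nothing about speed, which would need a much larger input to measure:

```javascript
const lines = "Their\ncow\ngives\nmore\nmilk";

// Lazy version: tries to match \nmilk after every single char it adds.
const lazyRe = /cow\n([\s\S]*?)\nmilk/;

// Unrolled version: takes a whole line with .*, then keeps taking further
// lines as long as the next line is not exactly "milk".
const unrolledRe = /cow\n(.*(?:\n(?!milk$).*)*)\nmilk/m;

const lazyCapture = lazyRe.exec(lines)[1];
const unrolledCapture = unrolledRe.exec(lines)[1];

console.log(JSON.stringify(lazyCapture));     // "gives\nmore"
console.log(JSON.stringify(unrolledCapture)); // "gives\nmore"
```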

Sample regex usage in JavaScript:

// Single/first match expected: use no global modifier and access match[1]
console.log("My cow always gives milk".match(/cow (.*?) milk/)[1]);
// Multiple matches: use the global modifier and trim the delimiters from
// each match when their lengths are known ("cow " is 4 chars, " milk" is 5)
var s = "My cow always gives milk, their cow also gives milk";
console.log(s.match(/cow (.*?) milk/g).map(function(x) {return x.slice(4, -5);}));
// Or use RegExp#exec inside a loop to collect all the Group 1 contents
var result = [], m, rx = /cow (.*?) milk/g;
while ((m=rx.exec(s)) !== null) {
  result.push(m[1]);
}
console.log(result);

Using the modern String#matchAll method

const s = "My cow always gives milk, their cow also gives milk";
const matches = s.matchAll(/cow (.*?) milk/g);
console.log(Array.from(matches, x => x[1]));


Just use the following regular expression:

(?<=My cow\s).*?(?=\smilk)

The chosen answer didn't work for me...hmm...

Just add space after cow and/or before milk to trim spaces from " always gives "

/(?<=cow ).*(?= milk)/



Task

Extract the substring between two strings (excluding the two strings themselves)

Solution

let allText = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum";
let textBefore = "five centuries,";
let textAfter = "electronic typesetting";
var regExp = new RegExp(`(?<=${textBefore}\\s)(.+?)(?=\\s+${textAfter})`, "g");
var results = regExp.exec(allText);
if (results && results.length > 1) {
    console.log(results[0]);
}
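One caveat with building the pattern from variables: textBefore and textAfter are interpolated as regex source, so a delimiter containing characters like ( or . would be read as regex syntax. A possible way around it is to escape them first. In this sketch escapeRegExp and between are hypothetical helper names (not built-ins), the input string is made up, and a capture group is used instead of lookarounds:

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: text between two literal delimiters, or null if absent.
function between(text, before, after) {
  const re = new RegExp(
    `${escapeRegExp(before)}\\s*([\\s\\S]*?)\\s*${escapeRegExp(after)}`
  );
  const m = re.exec(text);
  return m ? m[1] : null;
}

// Made-up input whose delimiters contain regex metacharacters.
console.log(between("Price (USD): 42 [approx.]", "(USD):", "[approx.]")); // "42"
console.log(between("Price (USD): 42", "(USD):", "[approx.]"));           // null
```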

I find regex to be tedious and time consuming given the syntax. Since you are already using javascript it is easier to do the following without regex:

const text = 'My cow always gives milk'
const start = `cow`;
const end = `milk`;
const middleText = text.split(start)[1].split(end)[0]
console.log(middleText) // prints " always gives " (with the surrounding spaces)
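Note that the split version throws a TypeError when the start string is missing, because split(start)[1] is then undefined. If that can happen in your data, a variant based on indexOf may be safer. This is only a sketch; textBetween is a made-up helper name:

```javascript
// Hypothetical helper: returns the trimmed text between start and end,
// or null when either delimiter is missing.
function textBetween(text, start, end) {
  const from = text.indexOf(start);
  if (from === -1) return null;

  const contentStart = from + start.length;
  // Look for the end delimiter only after the start delimiter.
  const to = text.indexOf(end, contentStart);
  if (to === -1) return null;

  return text.slice(contentStart, to).trim();
}

console.log(textBetween("My cow always gives milk", "cow", "milk"));  // "always gives"
console.log(textBetween("My goat always gives milk", "cow", "milk")); // null
```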

If the data is on multiple lines then you may have to use the following,

/My cow ([\s\S]*)milk/gm

My cow always gives 
milk



Here's a regex which will grab what's between cow and milk (without leading/trailing space):

var srctext = "My cow always gives milk.";
var re = /(.*cow\s+)(.*)(\s+milk.*)/;
var newtext = srctext.replace(re, "$2"); // "always gives"

An example: http://jsfiddle.net/entropo/tkP74/


You can use the method match() to extract a substring between two strings. Try the following code:

var str = "My cow always gives milk";
var subStr = str.match("cow(.*)milk");
console.log(subStr[1]);

Output:

always gives

See a complete example here : How to find sub-string between two strings.

