[java] Using Java to find substring of a bigger string using Regular Expression

If I have a string like this:

FOO[BAR]

I need a generic way to get the "BAR" string out of the string so that no matter what string is between the square brackets it would be able to get the string.

e.g.

FOO[DOG] = DOG
FOO[CAT] = CAT

This question is related to java regex string

The answer is


If you simply need to get whatever is between [], the you can use \[([^\]]*)\] like this:

Pattern regex = Pattern.compile("\\[([^\\]]*)\\]");
Matcher m = regex.matcher(str);
if (m.find()) {
    result = m.group();
}

If you need it to be of the form identifier + [ + content + ] then you can limit extracting the content only when the identifier is a alphanumerical:

[a-zA-Z][a-z-A-Z0-9_]*\s*\[([^\]]*)\]

This will validate things like Foo [Bar], or myDevice_123["input"] for instance.

Main issue

The main problem is when you want to extract the content of something like this:

FOO[BAR[CAT[123]]+DOG[FOO]]

The Regex won't work and will return BAR[CAT[123 and FOO.
If we change the Regex to \[(.*)\] then we're OK but then, if you're trying to extract the content from more complex things like:

FOO[BAR[CAT[123]]+DOG[FOO]] = myOtherFoo[BAR[5]]

None of the Regexes will work.

The most accurate Regex to extract the proper content in all cases would be a lot more complex as it would need to balance [] pairs and give you they content.

A simpler solution

If your problems is getting complex and the content of the [] arbitrary, you could instead balance the pairs of [] and extract the string using plain old code rathe than a Regex:

int i;
int brackets = 0;
string c;
result = "";
for (i = input.indexOf("["); i < str.length; i++) {
    c = str.substring(i, i + 1);
    if (c == '[') {
        brackets++;
    } else if (c == ']') {
        brackets--;
        if (brackets <= 0) 
            break;
    }
    result = result + c;
}   

This is more pseudo-code than real code, I'm not a Java coder so I don't know if the syntax is correct, but it should be easy enough to improve upon.
What count is that this code should work and allow you to extract the content of the [], however complex it is.


"FOO[DOG]".replaceAll("^.*?\\[|\\].*", "");

This will return a string taking only the string inside square brackets.

This remove all string outside from square brackets.

You can test this java sample code online: http://tpcg.io/wZoFu0

You can test this regex from here: https://regex101.com/r/oUAzsS/1


I think your regular expression would look like:

/FOO\[(.+)\]/

Assuming that FOO going to be constant.

So, to put this in Java:

Pattern p = Pattern.compile("FOO\\[(.+)\\]");
Matcher m = p.matcher(inputLine);

the non-regex way:

String input = "FOO[BAR]", extracted;
extracted = input.substring(input.indexOf("["),input.indexOf("]"));

alternatively, for slightly better performance/memory usage (thanks Hosam):

String input = "FOO[BAR]", extracted;
extracted = input.substring(input.indexOf('['),input.lastIndexOf(']'));

Like this its work if you want to parse some string which is coming from mYearInDB.toString() =[2013] it will give 2013

Matcher n = MY_PATTERN.matcher("FOO[BAR]"+mYearInDB.toString());
while (n.find()) {
 extracredYear  = n.group(1);
 // s now contains "BAR"
    }
    System.out.println("Extrated output is : "+extracredYear);

I'd define that I want a maximum number of non-] characters between [ and ]. These need to be escaped with backslashes (and in Java, these need to be escaped again), and the definition of non-] is a character class, thus inside [ and ] (i.e. [^\\]]). The result:

FOO\\[([^\\]]+)\\]

This is a working example :

RegexpExample.java

package org.regexp.replace;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpExample
{
    public static void main(String[] args)
    {
        String string = "var1[value1], var2[value2], var3[value3]";
        Pattern pattern = Pattern.compile("(\\[)(.*?)(\\])");
        Matcher matcher = pattern.matcher(string);

        List<String> listMatches = new ArrayList<String>();

        while(matcher.find())
        {
            listMatches.add(matcher.group(2));
        }

        for(String s : listMatches)
        {
            System.out.println(s);
        }
    }
}

It displays :

value1
value2
value3

This regexp works for me:

form\[([^']*?)\]

example:

form[company_details][0][name]
form[company_details][0][common_names][1][title]

output:

Match 1
1.  company_details
Match 2
1.  company_details

Tested on http://rubular.com/


assuming that no other closing square bracket is allowed within, /FOO\[([^\]]*)\]/


import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String get_match(String s, String p) {
    // returns first match of p in s for first group in regular expression 
    Matcher m = Pattern.compile(p).matcher(s);
    return m.find() ? m.group(1) : "";
}

get_match("FOO[BAR]", "\\[(.*?)\\]")  // returns "BAR"

public static List<String> get_matches(String s, String p) {
    // returns all matches of p in s for first group in regular expression 
    List<String> matches = new ArrayList<String>();
    Matcher m = Pattern.compile(p).matcher(s);
    while(m.find()) {
        matches.add(m.group(1));
    }
    return matches;
}

get_matches("FOO[BAR] FOO[CAT]", "\\[(.*?)\\]")) // returns [BAR, CAT]

String input = "FOO[BAR]";
String result = input.substring(input.indexOf("[")+1,input.lastIndexOf("]"));

This will return the value between first '[' and last ']'

Foo[Bar] => Bar

Foo[Bar[test]] => Bar[test]

Note: You should add error checking if the input string is not well formed.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript