[java] List of all special characters that need to be escaped in a regex

I am trying to create an application that matches a message template with a message that a user is trying to send. I am using Java regex for matching the message. The template/message may contain special characters.

How would I get the complete list of special characters that need to be escaped in order for my regex to work and match in the maximum possible cases?

Is there a universal solution for escaping all special characters in Java regex?

This question is related to java regex

The answer is


According to the String Literals / Metacharacters documentation page, they are:

<([{\^-=$!|]})?*+.>

Also it would be cool to have that list refereed somewhere in code, but I don't know where that could be...


  • Java characters that have to be escaped in regular expressions are:
    \.[]{}()<>*+-=!?^$|
  • Two of the closing brackets (] and }) are only need to be escaped after opening the same type of bracket.
  • In []-brackets some characters (like + and -) do sometimes work without escape.

although the answer is for Java, but the code can be easily adapted from this Kotlin String extension I came up with (adapted from that @brcolow provided):

private val escapeChars = charArrayOf(
    '<',
    '(',
    '[',
    '{',
    '\\',
    '^',
    '-',
    '=',
    '$',
    '!',
    '|',
    ']',
    '}',
    ')',
    '?',
    '*',
    '+',
    '.',
    '>'
)

fun String.escapePattern(): String {
    return this.fold("") {
      acc, chr ->
        acc + if (escapeChars.contains(chr)) "\\$chr" else "$chr"
    }
}

fun main() {
    println("(.*)".escapePattern())
}

prints \(\.\*\)

check it in action here https://pl.kotl.in/h-3mXZkNE


Assuming that you have and trust (to be authoritative) the list of escape characters Java regex uses (would be nice if these characters were exposed in some Pattern class member) you can use the following method to escape the character if it is indeed necessary:

private static final char[] escapeChars = { '<', '(', '[', '{', '\\', '^', '-', '=', '$', '!', '|', ']', '}', ')', '?', '*', '+', '.', '>' };

private static String regexEscape(char character) {
    for (char escapeChar : escapeChars) {
        if (character == escapeChar) {
            return "\\" + character;
        }
    }
    return String.valueOf(character);
}

The Pattern.quote(String s) sort of does what you want. However it leaves a little left to be desired; it doesn't actually escape the individual characters, just wraps the string with \Q...\E.

There is not a method that does exactly what you are looking for, but the good news is that it is actually fairly simple to escape all of the special characters in a Java regular expression:

regex.replaceAll("[\\W]", "\\\\$0")

Why does this work? Well, the documentation for Pattern specifically says that its permissible to escape non-alphabetic characters that don't necessarily have to be escaped:

It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.

For example, ; is not a special character in a regular expression. However, if you escape it, Pattern will still interpret \; as ;. Here are a few more examples:

  • > becomes \> which is equivalent to >
  • [ becomes \[ which is the escaped form of [
  • 8 is still 8.
  • \) becomes \\\) which is the escaped forms of \ and ( concatenated.

Note: The key is is the definition of "non-alphabetic", which in the documentation really means "non-word" characters, or characters outside the character set [a-zA-Z_0-9].


on the other side of the coin, you should use "non-char" regex that looks like this if special characters = allChars - number - ABC - space in your app context.

String regepx = "[^\\s\\w]*";

To escape you could just use this from Java 1.5:

Pattern.quote("$test");

You will match exacty the word $test


On @Sorin's suggestion of the Java Pattern docs, it looks like chars to escape are at least:

\.[{(*+?^$|

Combining what everyone said, I propose the following, to keep the list of characters special to RegExp clearly listed in their own String, and to avoid having to try to visually parse thousands of "\\"'s. This seems to work pretty well for me:

final String regExSpecialChars = "<([{\\^-=$!|]})?*+.>";
final String regExSpecialCharsRE = regExSpecialChars.replaceAll( ".", "\\\\$0");
final Pattern reCharsREP = Pattern.compile( "[" + regExSpecialCharsRE + "]");

String quoteRegExSpecialChars( String s)
{
    Matcher m = reCharsREP.matcher( s);
    return m.replaceAll( "\\\\$0");
}