[java] How to replace special characters in a string?

I have a string with lots of special characters. I want to remove all those, but keep alphabetical characters.

How can I do this?

This question is related to java string

The answer is


Here is a function I used to remove all possible special characters from the string

let name = name.replace(/[&\/\\#,+()$~%!.„'":*‚^_¤?<>|@ª{«»§}©®™ ]/g, '').toLowerCase();

You can use the following method to keep alphanumeric characters.

replaceAll("[^a-zA-Z0-9]", "");

And if you want to keep only alphabetical characters use this

replaceAll("[^a-zA-Z]", "");

You can get unicode for that junk character from charactermap tool in window pc and add \u e.g. \u00a9 for copyright symbol. Now you can use that string with that particular junk caharacter, don't remove any junk character but replace with proper unicode.


That depends on what you mean. If you just want to get rid of them, do this:
(Update: Apparently you want to keep digits as well, use the second lines in that case)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these can be significantly improved by precompiling the regex pattern and storing it in a constant)

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ascii, look at these questions:


string Output = Regex.Replace(Input, @"([ a-zA-Z0-9&, _]|^\s)", "");

Here all the special characters except space, comma, and ampersand are replaced. You can also omit space, comma and ampersand by the following regular expression.

string Output = Regex.Replace(Input, @"([ a-zA-Z0-9_]|^\s)", "");

Where Input is the string which we need to replace the characters.


You can use basic regular expressions on strings to find all special characters or use pattern and matcher classes to search/modify/delete user defined strings. This link has some simple and easy to understand examples for regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html


Following the example of the Andrzej Doyle's answer, I think the better solution is to use org.apache.commons.lang3.StringUtils.stripAccents():

package bla.bla.utility;

import org.apache.commons.lang3.StringUtils;

public class UriUtility {
    public static String normalizeUri(String s) {
        String r = StringUtils.stripAccents(s);
        r = r.replace(" ", "_");
        r = r.replaceAll("[^\\.A-Za-z0-9_]", "");
        return r;
    }
}

For spaces use "[^a-z A-Z 0-9]" this pattern


I am using this.

s = s.replaceAll("\\W", ""); 

It replace all special characters from string.

Here

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character


Replace any special characters by

replaceAll("\\your special character","new character");

ex:to replace all the occurrence of * with white space

replaceAll("\\*","");

*this statement can only replace one type of special character at a time