[java] How to get UTF-8 working in Java webapps?

I need to get UTF-8 working in my Java webapp (servlets + JSP, no framework used) to support äöå etc. for regular Finnish text and Cyrillic alphabets like ??? for special cases.

My setup is the following:

  • Development environment: Windows XP
  • Production environment: Debian

Database used: MySQL 5.x

Users mainly use Firefox2 but also Opera 9.x, FF3, IE7 and Google Chrome are used to access the site.

How to achieve this?

This question is related to java mysql tomcat encoding utf-8

The answer is


Faced the same issue on Spring MVC 5 + Tomcat 9 + JSP.
After the long research, came to an elegant solution (no need filters and no need changes in the Tomcat server.xml (starting from 8.0.0-RC3 version))

  1. In the WebMvcConfigurer implementation set default encoding for messageSource (for reading data from messages source files in the UTF-8 encoding.

    @Configuration
    @EnableWebMvc
    @ComponentScan("{package.with.components}")
    public class WebApplicationContextConfig implements WebMvcConfigurer {
    
        @Bean
        public MessageSource messageSource() {
            final ResourceBundleMessageSource messageSource = new ResourceBundleMessageSource();
    
            messageSource.setBasenames("messages");
            messageSource.setDefaultEncoding("UTF-8");
    
            return messageSource;
        }
    
        /* other beans and methods */
    
    }
    
  2. In the DispatcherServletInitializer implementation @Override the onStartup method and set request and resource character encoding in it.

    public class DispatcherServletInitializer extends AbstractAnnotationConfigDispatcherServletInitializer {
    
        @Override
        public void onStartup(final ServletContext servletContext) throws ServletException {
    
            // https://wiki.apache.org/tomcat/FAQ/CharacterEncoding
            servletContext.setRequestCharacterEncoding("UTF-8");
            servletContext.setResponseCharacterEncoding("UTF-8");
    
            super.onStartup(servletContext);
        }
    
        /* servlet mappings, root and web application configs, other methods */
    
    }
    
  3. Save all message source and view files in UTF-8 encoding.

  4. Add <%@ page contentType="text/html;charset=UTF-8" %> or <%@ page pageEncoding="UTF-8" %> in each *.jsp file or add jsp-config descriptor to web.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee"
     xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
     id="WebApp_ID" version="3.0">
        <display-name>AppName</display-name>
    
        <jsp-config>
            <jsp-property-group>
                <url-pattern>*.jsp</url-pattern>
                <page-encoding>UTF-8</page-encoding>
            </jsp-property-group>
        </jsp-config>
    </web-app>
    

I want also to add from here this part solved my utf problem:

runtime.encoding=<encoding>

For my case of displaying Unicode character from message bundles, I don't need to apply "JSP page encoding" section to display Unicode on my jsp page. All I need is "CharsetFilter" section.


Previous responses didn't work with my problem. It was only in production, with tomcat and apache mod_proxy_ajp. Post body lost non ascii chars by ? The problem finally was with JVM defaultCharset (US-ASCII in a default instalation: Charset dfset = Charset.defaultCharset();) so, the solution was run tomcat server with a modifier to run the JVM with UTF-8 as default charset:

JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=UTF-8" 

(add this line to catalina.sh and service tomcat restart)

Maybe you must also change linux system variable (edit ~/.bashrc and ~/.profile for permanent change, see https://perlgeek.de/en/article/set-up-a-clean-utf8-environment)

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

export LANGUAGE=en_US.UTF-8


Nice detailed answer. just wanted to add one more thing which will definitely help others to see the UTF-8 encoding on URLs in action .

Follow the steps below to enable UTF-8 encoding on URLs in firefox.

  1. type "about:config" in the address bar.

  2. Use the filter input type to search for "network.standard-url.encode-query-utf8" property.

  3. the above property will be false by default, turn that to TRUE.
  4. restart the browser.

UTF-8 encoding on URLs works by default in IE6/7/8 and chrome.


Some time you can solve problem through MySQL Administrator wizard. In

Startup variables > Advanced >

and set Def. char Set:utf8

Maybe this config need restart MySQL.


About CharsetFilter mentioned in @kosoant answer ....

There is a build in Filter in tomcat web.xml (located at conf/web.xml). The filter is named setCharacterEncodingFilter and is commented by default. You can uncomment this ( Please remember to uncomment its filter-mapping too )

Also there is no need to set jsp-config in your web.xml (I have test it for Tomcat 7+ )


I think you summed it up quite well in your own answer.

In the process of UTF-8-ing(?) from end to end you might also want to make sure java itself is using UTF-8. Use -Dfile.encoding=utf-8 as parameter to the JVM (can be configured in catalina.bat).


I'm with a similar problem, but, in filenames of a file I'm compressing with apache commons. So, i resolved it with this command:

convmv --notest -f cp1252 -t utf8 * -r

it works very well for me. Hope it help anyone ;)


This is for Greek Encoding in MySql tables when we want to access them using Java:

Use the following connection setup in your JBoss connection pool (mysql-ds.xml)

<connection-url>jdbc:mysql://192.168.10.123:3308/mydatabase</connection-url>
<driver-class>com.mysql.jdbc.Driver</driver-class>
<user-name>nts</user-name>
<password>xaxaxa!</password>
<connection-property name="useUnicode">true</connection-property>
<connection-property name="characterEncoding">greek</connection-property>

If you don't want to put this in a JNDI connection pool, you can configure it as a JDBC-url like the next line illustrates:

jdbc:mysql://192.168.10.123:3308/mydatabase?characterEncoding=greek

For me and Nick, so we never forget it and waste time anymore.....


To add to kosoant's answer, if you are using Spring, rather than writing your own Servlet filter, you can use the class org.springframework.web.filter.CharacterEncodingFilter they provide, configuring it like the following in your web.xml:

 <filter>
    <filter-name>encoding-filter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
       <param-name>encoding</param-name>
       <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
       <param-name>forceEncoding</param-name>
       <param-value>FALSE</param-value>
    </init-param>
 </filter>
 <filter-mapping>
    <filter-name>encoding-filter</filter-name>
    <url-pattern>/*</url-pattern>
 </filter-mapping>

One other point that hasn't been mentioned relates to Java Servlets working with Ajax. I have situations where a web page is picking up utf-8 text from the user sending this to a JavaScript file which includes it in a URI sent to the Servlet. The Servlet queries a database, captures the result and returns it as XML to the JavaScript file which formats it and inserts the formatted response into the original web page.

In one web app I was following an early Ajax book's instructions for wrapping up the JavaScript in constructing the URI. The example in the book used the escape() method, which I discovered (the hard way) is wrong. For utf-8 you must use encodeURIComponent().

Few people seem to roll their own Ajax these days, but I thought I might as well add this.


In case you have specified in connection pool (mysql-ds.xml), in your Java code you can open the connection as follows:

DriverManager.registerDriver(new com.mysql.jdbc.Driver());
Connection conn = DriverManager.getConnection(
    "jdbc:mysql://192.168.1.12:3308/mydb?characterEncoding=greek",
    "Myuser", "mypass");

Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to mysql

Implement specialization in ER diagram How to post query parameters with Axios? PHP with MySQL 8.0+ error: The server requested authentication method unknown to the client Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver' phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' is not supported How to resolve Unable to load authentication plugin 'caching_sha2_password' issue Connection Java-MySql : Public Key Retrieval is not allowed How to grant all privileges to root user in MySQL 8.0 MySQL 8.0 - Client does not support authentication protocol requested by server; consider upgrading MySQL client

Examples related to tomcat

Jersey stopped working with InjectionManagerFactory not found The origin server did not find a current representation for the target resource or is not willing to disclose that one exists. on deploying to tomcat Spring boot: Unable to start embedded Tomcat servlet container Tomcat 404 error: The origin server did not find a current representation for the target resource or is not willing to disclose that one exists Spring Boot application in eclipse, the Tomcat connector configured to listen on port XXXX failed to start Kill tomcat service running on any port, Windows Tomcat 8 is not able to handle get request with '|' in query parameters? 8080 port already taken issue when trying to redeploy project from Spring Tool Suite IDE 403 Access Denied on Tomcat 8 Manager App without prompting for user/password Difference between Xms and Xmx and XX:MaxPermSize

Examples related to encoding

How to check encoding of a CSV file UnicodeEncodeError: 'ascii' codec can't encode character at special name Using Javascript's atob to decode base64 doesn't properly decode utf-8 strings What is the difference between utf8mb4 and utf8 charsets in MySQL? The character encoding of the plain text document was not declared - mootool script UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128) How to encode text to base64 in python UTF-8 output from PowerShell Set Encoding of File to UTF8 With BOM in Sublime Text 3 Replace non-ASCII characters with a single space

Examples related to utf-8

error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte Changing PowerShell's default output encoding to UTF-8 'Malformed UTF-8 characters, possibly incorrectly encoded' in Laravel Encoding Error in Panda read_csv Using Javascript's atob to decode base64 doesn't properly decode utf-8 strings What is the difference between utf8mb4 and utf8 charsets in MySQL? what is <meta charset="utf-8">? Pandas df.to_csv("file.csv" encode="utf-8") still gives trash characters for minus sign UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128) Android Studio : unmappable character for encoding UTF-8