[mysql] Incorrect string value: '\xF0\x9F\x8E\xB6\xF0\x9F...' MySQL

I am trying to store a tweet in my MYSQL table. The tweet is:

quiero que me escuches, no te burles no te rias, anoche tuve un sueño que te fuiste de mi vida 🎶🎶

The final two characters are both 'MULTIPLE MUSICAL NOTES' (U+1F3B6), for which the UTF-8 encoding is 0xf09f8eb6.

The tweet_text field in my table is encoded in utf8mb4. But when I try to store the tweet in that column I get the following error message:

Incorrect string value: '\xF0\x9F\x8E\xB6\xF0\x9F...' for column 'tweet_text' at row 1.

What is going wrong? How can I fix this? I need to store multiple languages as well and this character set works for all languages but not for the special characters like emoticons and emojis.

This is my create table statement:

CREATE TABLE `twitter_status_data` (
  `unique_status_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `metadata_result_type` text CHARACTER SET utf8,
  `created_at` text CHARACTER SET utf8 NOT NULL COMMENT 'UTC time when this Tweet was    created.',
  `id` bigint(20) unsigned NOT NULL COMMENT 'Unique tweet identifier',
  `id_str` text CHARACTER SET utf8 NOT NULL,
  `tweet_text` text COMMENT 'Actual UTF-8 text',
  `user_id_str` text CHARACTER SET utf8,
  `user_name` text COMMENT 'User''s name',
  `user_screen_name` text COMMENT 'Twitter handle',
  `coordinates` text CHARACTER SET utf8,
  PRIMARY KEY (`unique_status_id`),
  KEY `user_id_index` (`user_id`),
  FULLTEXT KEY `tweet_text_index` (`tweet_text`)
) ENGINE=InnoDB AUTO_INCREMENT=82451 DEFAULT CHARSET=utf8mb4;

This question is related to mysql twitter utf-8 emoticons

The answer is


Change database charset and collation

ALTER DATABASE
    database_name
    CHARACTER SET = utf8mb4
    COLLATE = utf8mb4_unicode_ci;

change specific table's charset and collation

ALTER TABLE
    table_name
    CONVERT TO CHARACTER SET utf8mb4
    COLLATE utf8mb4_unicode_ci;

change connection charset in mysql driver

before

charset=utf8&parseTime=True&loc=Local

after

charset=utf8mb4&collation=utf8mb4_unicode_ci&parseTime=True&loc=Local

From this article https://hackernoon.com/today-i-learned-storing-emoji-to-mysql-with-golang-204a093454b7


FOR SQLALCHEMY AND PYTHON

The encoding used for Unicode has traditionally been 'utf8'. However, for MySQL versions 5.5.3 on forward, a new MySQL-specific encoding 'utf8mb4' has been introduced, and as of MySQL 8.0 a warning is emitted by the server if plain utf8 is specified within any server-side directives, replaced with utf8mb3. The rationale for this new encoding is due to the fact that MySQL’s legacy utf-8 encoding only supports codepoints up to three bytes instead of four. Therefore, when communicating with a MySQL database that includes codepoints more than three bytes in size, this new charset is preferred, if supported by both the database as well as the client DBAPI, as in:

e = create_engine(
    "mysql+pymysql://scott:tiger@localhost/test?charset=utf8mb4")
All modern DBAPIs should support the utf8mb4 charset.

enter link description here


According to the create table statement, the default charset of the table is already utf8mb4. It seems that you have a wrong connection charset.

In Java, set the datasource url like this: jdbc:mysql://127.0.0.1:3306/testdb?useUnicode=true&characterEncoding=utf-8.

"?useUnicode=true&characterEncoding=utf-8" is necessary for using utf8mb4.

It works for my application.


I had hit the same problem and learnt the following-

Even though database has a default character set of utf-8, it's possible for database columns to have a different character set in MySQL. Modified dB and the problematic column to UTF-8:

mysql> ALTER DATABASE MyDB CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci'

mysql> ALTER TABLE database.table MODIFY COLUMN column_name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;

Now creating new tables with:

> CREATE TABLE My_Table_Name (
    twitter_id_str VARCHAR(255) NOT NULL UNIQUE,
    twitter_screen_name VARCHAR(512) CHARACTER SET utf8 COLLATE utf8_unicode_ci,
    .....
  ) CHARACTER SET utf8 COLLATE utf8_unicode_ci;

It may be obvious, but it still was surprising to me, that SET NAMES utf8 is not compatible with utf8mb4 encoding. So for some apps changing table/column encoding was not enough. I had to change encoding in app configuration.

Redmine (ruby, ROR)

In config/database.yml:

production:
  adapter: mysql2
  database: redmine
  host: localhost
  username: redmine
  password: passowrd
  encoding: utf8mb4

Custom Yii application (PHP)

In config/db.php:

return [
    'class' => yii\db\Connection::class,
    'dsn' => 'mysql:host=localhost;dbname=yii',
    'username' => 'yii',
    'password' => 'password',
    'charset' => 'utf8mb4',
],

If you have utf8mb4 as a column/table encoding and still getting errors like this, make sure that you have configured correct charset for DB connection in your application.


Examples related to mysql

Implement specialization in ER diagram How to post query parameters with Axios? PHP with MySQL 8.0+ error: The server requested authentication method unknown to the client Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver' phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' is not supported How to resolve Unable to load authentication plugin 'caching_sha2_password' issue Connection Java-MySql : Public Key Retrieval is not allowed How to grant all privileges to root user in MySQL 8.0 MySQL 8.0 - Client does not support authentication protocol requested by server; consider upgrading MySQL client

Examples related to twitter

Twitter - How to embed native video from someone else's tweet into a New Tweet or a DM How to get user's high resolution profile picture on Twitter? How to extract hours and minutes from a datetime.datetime object? Incorrect string value: '\xF0\x9F\x8E\xB6\xF0\x9F...' MySQL Twitter - share button, but with image How to access elements of a JArray (or iterate over them) Simplest PHP example for retrieving user_timeline with Twitter API version 1.1 Twitter API returns error 215, Bad Authentication Data bower command not found How do I fix twitter-bootstrap on IE?

Examples related to utf-8

error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte Changing PowerShell's default output encoding to UTF-8 'Malformed UTF-8 characters, possibly incorrectly encoded' in Laravel Encoding Error in Panda read_csv Using Javascript's atob to decode base64 doesn't properly decode utf-8 strings What is the difference between utf8mb4 and utf8 charsets in MySQL? what is <meta charset="utf-8">? Pandas df.to_csv("file.csv" encode="utf-8") still gives trash characters for minus sign UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128) Android Studio : unmappable character for encoding UTF-8

Examples related to emoticons

Incorrect string value: '\xF0\x9F\x8E\xB6\xF0\x9F...' MySQL