09 February 2012

MySQL 5.1/5.5 and Unicode

I noticed that MySQL 5.1 and 5.5 differ in their support for Unicode characters. MySQL 5.1 only supports the Basic Multilingual Plane (BMP), that is, code points between 0 and 65535 (U+0000 to U+FFFF). Characters whose code points are greater than 65535 (U+FFFF) are called supplementary characters, and support for these is included in MySQL 5.5. Read more at http://dev.mysql.com/doc/refman/5.5/en/charset-unicode.html.

A Java String supports both the Basic Multilingual Plane and supplementary characters, so this may cause problems if a String is sent directly to MySQL. Typically you will see an error message like this:
java.sql.SQLException: Incorrect string value: '\xC2\x9F' for column 'APN' at row 1
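
One way to guard against this on the Java side is to check a String for supplementary characters before it is sent to the database. The following is only a minimal sketch: the class name BmpCheck and its two methods are made up for illustration, and replacing the offending characters with a placeholder is just one possible policy (rejecting the input or switching the column to a 4-byte character set are others).

```java
// Hypothetical helper, not part of any library: detects and replaces
// characters outside the Basic Multilingual Plane (code points > U+FFFF).
final class BmpCheck {

    private BmpCheck() {
    }

    /** Returns true if the string contains at least one supplementary character. */
    static boolean hasSupplementary(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                return true;
            }
            // A supplementary character occupies two chars (a surrogate pair).
            i += Character.charCount(cp);
        }
        return false;
    }

    /** Returns a copy of s in which every supplementary character is replaced. */
    static String replaceSupplementary(String s, String replacement) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is written as the surrogate pair \uD83D\uDE00 in Java source.
        String input = "a\uD83D\uDE00b";
        System.out.println(hasSupplementary(input));
        System.out.println(replaceSupplementary(input, "?"));
    }
}
```

Calling BmpCheck.hasSupplementary(value) before binding the value to a PreparedStatement lets the application decide what to do with such input instead of failing inside the JDBC driver.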


A little warning: even in MySQL 5.5 the "utf8" charset does NOT support supplementary characters, because it stores at most three bytes per character. Use utf8mb4 instead.
