How To Dump And Import Utf 8 Mysql Databases In A Safe Way

  • Published: October 17, 2016


If you have a MySQL database with its character set encoded in UTF-8, then you need to treat it in a special way when dumping and restoring the database to be able to read its special characters.

Dumping a database

The common and quickest way of dumping a database with mysqldump does not treat utf-8 encoding right.

Avoid doing it like:

This is the correct way to dump a MySQL database safely:

$ mysqldump -uroot -p database -r utf8.dump

The -r or, the same,--result-file option, will create the output in ASCII format.

--result-file=file_name, -r file_name Direct output to the named file. The result file is created and its previous contents overwritten, even if an error occurs while generating the dump. This option should be used on Windows to prevent newline ā€œ\nā€ characters from being converted to ā€œ\r\nā€ carriage return/newline sequences.

Restoring a database

Usually you will restore a database simply inserting the data with an input file descriptor <, but this won’t handle utf-8 encoding properly, so this should be avoided:

$ mysql -uroot -p --default-character-set=utf8 database
mysql>  SET names 'utf8'
mysql>  SOURCE utf8.dump

Related Definitions

Character set

A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set. Suppose that we have an alphabet with four letters: A, B, a, b. We give each letter a number: A = 0, B = 1, a = 2, b = 3. The letter A is a symbol, the number 0 is the encoding for A, and the combination of all four letters and their encodings is a character set.
Character Sets and Collations in General MySQL Docs


UTF-8 (Unicode Transformation Format with 8-bit units) is an alternative way to store Unicode data. It is implemented according to RFC 3629, which describes encoding sequences that take from one to four bytes.
The utf8 Character Set (3-Byte UTF-8 Unicode Encoding) MySQL Docs


Marcelo Canina
I'm Marcelo Canina, a developer from Uruguay. I build websites and web-based applications from the ground up and share what I learn here.
comments powered by Disqus
Except as otherwise noted, the content of this page is licensed under CC BY-NC-ND 4.0 ·