I recently had to build a small web application that reads from a MySQL database and displays the data on a webpage via PHP. The application had to support many languages, and Russian proved the hardest to get right. Many forums suggested switching the character encoding to Windows-1251 (CP1251), a standard Russian encoding on Windows. I needed every language to work, though, and juggling multiple character encodings wasn’t a headache I wanted. It turns out you can display all of these characters with UTF-8, provided the PHP and MySQL sides are both set up correctly. Read on to find out how to display Russian and other languages with PHP and MySQL.
Switching to UTF-8: MySQL
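MySQL has historically defaulted to the latin1 character set with the latin1_swedish_ci collation (MySQL 8.0 changed the default to utf8mb4), but we want UTF-8 with the utf8_general_ci collation, so let’s change our database, tables, and fields to reflect this. Note that MySQL’s utf8 character set stores at most three bytes per character. That covers Russian and most other languages, but characters outside the Basic Multilingual Plane, such as emoji, need utf8mb4 with a collation such as utf8mb4_general_ci.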
Change Database Collation
In phpMyAdmin, select the database, click the Operations tab, and change the collation to utf8_general_ci.
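If phpMyAdmin is not available, the same change can be made with a single SQL statement. The sketch below only builds that statement. The function name `buildAlterDatabaseSql` and the database name `my_app` are made up for illustration, and it assumes you run the result through whatever database connection you already have.

```php
<?php
// Hypothetical helper (not part of PHP or MySQL): builds the SQL that sets a
// database's default character set and collation.
function buildAlterDatabaseSql($database, $charset = 'utf8', $collation = 'utf8_general_ci')
{
    // Backtick-quote the identifier and double any embedded backticks.
    $quoted = '`' . str_replace('`', '``', $database) . '`';
    return "ALTER DATABASE $quoted CHARACTER SET $charset COLLATE $collation";
}

// 'my_app' is a placeholder database name.
echo buildAlterDatabaseSql('my_app'), "\n";
```

Keep in mind that this only sets the default for tables created afterwards. Existing tables and columns keep their current collation until they are changed as well.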
Change Table(s) Collation
You can follow the same procedure to change the collation of each table in the database, but here is a script that does it for every table at once, which helps a lot when you have many tables. Note that it sets each table’s default collation and leaves existing fields as they are.
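[syntax type="html|php|js|css"]
<?php
// Uses the legacy mysql_* extension, which was removed in PHP 7.0.
$con = mysql_connect('localhost', 'USER_DBUSER', 'PASSWORD');
if (!$con) {
    die('Could not connect to database.');
}
mysql_select_db('USER_DBUSER');

// mysql_fetch_row() returns each table name once; mysql_fetch_array()
// returns it under two keys, which would run every ALTER twice.
$result = mysql_query('SHOW TABLES');
while ($row = mysql_fetch_row($result)) {
    // Sets the table's default collation; existing columns are not converted.
    mysql_query("ALTER TABLE `{$row[0]}` COLLATE utf8_general_ci");
}
echo 'Collation has been changed.';
?>
[/syntax]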
Change Field(s) Collation
To change the collation of your fields, go to the table and click the Structure tab. Then you can select every field, edit it, and change the collation to utf8_general_ci. Note that this only needs to be done for fields that will hold strings.
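Editing every field by hand gets tedious. MySQL’s ALTER TABLE ... CONVERT TO CHARACTER SET statement changes a table’s default and converts its existing string columns in one step, so a sketch like the following could replace the manual pass. It assumes the mysqli extension and assumes the existing data is stored correctly in its declared character set. The function names are made up for illustration. Back up the database first, since the conversion rewrites the stored data.

```php
<?php
// Hypothetical helper: builds one conversion statement for a table.
function buildConvertTableSql($table, $charset = 'utf8', $collation = 'utf8_general_ci')
{
    // Backtick-quote the identifier and double any embedded backticks.
    $quoted = '`' . str_replace('`', '``', $table) . '`';
    return "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation";
}

// Hypothetical helper: converts every base table in the currently selected
// database and returns the names of the tables that were converted.
function convertAllTables(mysqli $db)
{
    $converted = array();
    // Skip views, which cannot be altered this way.
    $result = $db->query("SHOW FULL TABLES WHERE Table_type = 'BASE TABLE'");
    if ($result === false) {
        return $converted;
    }
    // fetch_row() returns a numerically indexed array; column 0 is the table name.
    while ($row = $result->fetch_row()) {
        if ($db->query(buildConvertTableSql($row[0]))) {
            $converted[] = $row[0];
        }
    }
    $result->free();
    return $converted;
}
```

With a connection such as `$db = new mysqli('localhost', 'USER', 'PASSWORD', 'DATABASE');` (placeholder credentials), calling `convertAllTables($db)` would run one statement per table.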
Switching to UTF-8: PHP
The connection between PHP and MySQL also typically defaults to latin1 unless told otherwise, so we need to switch it to UTF-8 in our code too. Immediately after connecting to your database, insert this line to make the change.
[syntax type="html|php|js|css"]<?php mysql_set_charset('utf8'); ?>[/syntax]
Another option, and the one most often cited on forums about this topic, is to use the following line:
[syntax type="html|php|js|css"]<?php mysql_query("SET NAMES 'utf8'"); ?>[/syntax]
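However, the PHP documentation says that the second method (via mysql_query) is not recommended. A query changes the character set on the server side only, so the client library, and with it mysql_real_escape_string(), does not know about the change.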
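The mysql_* functions used above were deprecated in PHP 5.5 and removed in PHP 7.0. On current PHP versions the same step can be done with mysqli, roughly as sketched here. The function name `connectUtf8` is made up for illustration, and the credentials you pass in are your own.

```php
<?php
// Hypothetical helper: opens a mysqli connection and switches it to UTF-8.
// Returns the connection, or null if connecting or setting the charset fails.
function connectUtf8($host, $user, $password, $database, $charset = 'utf8')
{
    // Report failures through return values and error properties, not exceptions.
    mysqli_report(MYSQLI_REPORT_OFF);

    $db = @new mysqli($host, $user, $password, $database);
    if ($db->connect_errno) {
        return null;
    }

    // mysqli::set_charset() is the mysqli counterpart of mysql_set_charset().
    if (!$db->set_charset($charset)) {
        $db->close();
        return null;
    }
    return $db;
}
```

A call such as `connectUtf8('localhost', 'USER', 'PASSWORD', 'DATABASE')` would then hand back a connection that is already talking UTF-8, or null if something went wrong.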
Displaying UTF-8 in the Browser
Now, to have the browser display the correct characters, we need to tell it that the page is UTF-8. This can be done with the header() function in PHP or with a meta tag in the HTML itself. Note that the HTTP header sent by PHP takes precedence over the meta tag in the HTML, and that header() must be called before any output is sent.
Method 1: PHP header() Function
Add this to your PHP:
[syntax type="html|php|js|css"]<?php header("Content-type: text/html; charset=utf-8"); ?>[/syntax]
Method 2: HTML <head> Tags
Add this between your <head> tags in HTML:
[syntax type="html|php|js|css"]<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />[/syntax]
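Putting the two methods together, a page might be produced along these lines. This is only a sketch: `renderPage` is a made-up helper, the Russian sample text is illustrative, and the script file itself is assumed to be saved as UTF-8. It passes 'UTF-8' to htmlspecialchars() explicitly, since PHP versions before 5.4 assumed ISO-8859-1 there.

```php
<?php
// Hypothetical helper: returns a minimal HTML page declared as UTF-8.
function renderPage($title, $body)
{
    // Escape with an explicit encoding so multibyte characters pass through intact.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $safeBody  = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n"
        . "<title>$safeTitle</title>\n"
        . "</head>\n<body>\n<p>$safeBody</p>\n</body>\n</html>\n";
}

// Send the HTTP header before any output, then print the page.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
echo renderPage('Привет', 'Здравствуй, мир!');
```

In a real page the title and body would come from the database rows fetched over the UTF-8 connection set up earlier.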