10.6 UTF8 for Metadata
The metadata is the data about the data. Anything that describes the
database, as opposed to being the contents of the database, is
metadata. Thus column names, database names, usernames, version names,
and most of the string results from `SHOW' are metadata.
Representation of metadata must satisfy these requirements:
* All metadata must be in the same character set. Otherwise, `SHOW'
wouldn't work properly because different rows in the same column
would be in different character sets.
* Metadata must include all characters in all languages. Otherwise,
users wouldn't be able to name columns and tables in their own
In order to satisfy both requirements, MySQL stores metadata in a
Unicode character set, namely UTF8. This will not cause any disruption
if you never use accented characters. But if you do, you should be
aware that metadata is in UTF8.
This means that the `USER()', `CURRENT_USER()', and `VERSION()'
functions will have the UTF8 character set by default. So will any
synonyms, such the `SESSION_USER()' and `SYSTEM_USER()' synonyms for
The server sets the `character_set_system' system variable to the name
of the metadata character set:
mysql> SHOW VARIABLES LIKE 'character_set_system';
| Variable_name | Value |
| character_set_system | utf8 |
Storage of metadata using Unicode does _not_ mean that the headers of
columns and the results of `DESCRIBE' functions will be in the
`character_set_system' character set by default. When you say `SELECT
column1 FROM t', the name `column1' itself will be returned from the
server to the client in the character set as determined by the `SET
NAMES' statement. More specifically, the character set used is
determined by the value of the `character_set_results' system variable.
If this variable is set to `NULL', no conversion is performed and the
server returns metadata using its original character set (the set
indicated by `character_set_system').
If you want the server to pass metadata results back in a non-UTF8
character set, then use `SET NAMES' to force the server to perform
character set conversion ( Charset-connection), or else set the
client to do the conversion. It is always more efficient to set the
client to do the conversion, but this option will not be available for
many clients until late in the MySQL 4.x product cycle.
If you are just using, for example, the `USER()' function for
comparison or assignment within a single statement, don't worry. MySQL
will do some automatic conversion for you.
SELECT * FROM Table1 WHERE USER() = latin1_column;
This will work because the contents of `latin1_column' are
automatically converted to UTF8 before the comparison.
INSERT INTO Table1 (latin1_column) SELECT USER();
This will work because the contents of `USER()' are automatically
converted to `latin1' before the assignment. Automatic conversion is
not fully implemented yet, but should work correctly in a later version.
Although automatic conversion is not in the SQL standard, the SQL
standard document does say that every character set is (in terms of
supported characters) a "subset" of Unicode. Since it is a well-known
principle that "what applies to a superset can apply to a subset," we
believe that a collation for Unicode can apply for comparisons with
automatically generated byinfo2html