Note: This topic was updated for 9.3.1.
PostgreSQL database clusters (the PostgreSQL instance) are created by default to use the standard C locale functionality of the operating system on which PostgreSQL is installed. The C locale is the most flexible choice when an international environment is needed.
The locale is composed of the following six subcategories:
LC_COLLATE—The character sort order
LC_CTYPE—The character classification
LC_MESSAGES—The language for messages
LC_MONETARY—Determines the formatting of currency
LC_NUMERIC—Determines number formatting
LC_TIME—Determines date and time formatting
The first two subcategories, LC_COLLATE and LC_CTYPE, cannot be changed after the PostgreSQL instance is created. To learn how to alter the other locale settings, consult the PostgreSQL documentation.
In addition to the locale, designate a character set for your databases. You specify a character set for the PostgreSQL database cluster, but you can also specify a different character set for an individual database. To do this, specify the encoding option (-E) when issuing the createdb command. If you do not specify a character set when you create the database, the character set of the PostgreSQL template database is used.
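As a minimal sketch of the createdb usage described above, the following builds a command that creates a UTF-8 database. The database name gisdb is a hypothetical placeholder. The -T template0 option is an assumption added here: newer PostgreSQL releases generally require copying from template0 when the requested encoding differs from that of the default template.

```shell
#!/bin/sh
# Hypothetical database name; replace with your own.
DB_NAME="gisdb"

# -E sets the character encoding for the new database.
# -T template0 (an assumption, see above) copies from the pristine template
# so the new database does not have to match the default template's encoding.
CREATEDB_CMD="createdb -E UTF8 -T template0 $DB_NAME"

# Print the command for review; run it yourself against a live server.
echo "$CREATEDB_CMD"
```

Printing the command instead of executing it keeps the sketch safe to try on a machine where no PostgreSQL server is running.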
ESRI recommends you use the UTF-8 character set because it can represent the characters of all supported languages. However, be aware that UTF-8 can require more storage space than most other character sets, because characters outside the ASCII range occupy more than one byte.
Do not designate a database character encoding that contradicts the locale set for the database cluster. The locale you set implies that a corresponding character set is used for your databases. For example, if your database cluster is set to the Spanish (Spain) locale, the encoding for your databases is expected to be LATIN1 (Western European languages), LATIN9 (LATIN1 plus Euro and accents), WIN1252 (Western European languages, Windows only), or UTF-8 (all languages).
If you plan to create databases with character encodings that differ from the one implied by the database cluster's locale, you should use C for the database cluster locale.
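One way to set the C locale is at the time the database cluster is initialized. The sketch below builds an initdb command for that purpose. The data directory path is a hypothetical placeholder, and your installation may create the cluster through an installer or service script instead of calling initdb directly.

```shell
#!/bin/sh
# Hypothetical data directory; replace with the location used by your installation.
PGDATA_DIR="/var/lib/pgsql/data"

# --locale=C sets every locale subcategory, including LC_COLLATE and LC_CTYPE, to C.
# -E UTF8 sets the default encoding used by the template databases.
INITDB_CMD="initdb --locale=C -E UTF8 -D $PGDATA_DIR"

# Print the command for review; run it yourself as the PostgreSQL owner account.
echo "$INITDB_CMD"
```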
The character set for each database is stored in the pg_database system catalog; therefore, to find out what character encoding a database is using, you can query this catalog.
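The catalog query mentioned above might look like the following sketch. The pg_database catalog stores the encoding as a numeric ID, so the query uses the built-in pg_encoding_to_char() function to convert it to a readable name. The connection database postgres in the commented psql call is an assumption; use any database you can connect to.

```shell
#!/bin/sh
# SQL that lists every database in the cluster with its encoding name.
# pg_encoding_to_char() converts the numeric encoding ID to a name such as UTF8.
ENCODING_QUERY="SELECT datname, pg_encoding_to_char(encoding) AS encoding FROM pg_database ORDER BY datname;"

# Print the query for review. To run it, pass it to psql, for example:
#   psql -d postgres -c "$ENCODING_QUERY"
echo "$ENCODING_QUERY"
```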