Configuration keywords in geodatabases
When you create a dataset in a file geodatabase, you can choose a configuration keyword to customize how the data is stored. Each keyword optimizes storage for a particular type of data, slightly improving storage efficiency and performance. Four keywords are available: DEFAULTS, TEXT_UTF16, MAX_FILE_SIZE_4GB, and MAX_FILE_SIZE_256TB. These keywords cannot be customized.
In most cases, you will specify the DEFAULTS keyword. DEFAULTS works well in almost every situation. The main exception is a large raster dataset that exceeds 1 TB in size, for which you would specify the MAX_FILE_SIZE_256TB keyword. If you don't specify any configuration keyword, DEFAULTS is used.
Text storage: UTF8 vs. UTF16
UTF8 is the most efficient storage format if your text data is in English, another Western European language, or any other language that uses the Latin alphabet, such as Polish, Turkish, or Indonesian. UTF8 stores each nonaccented Latin character in just 1 byte, and each accented character or any character not found in the Latin alphabet in a variable number of bytes, ranging from 2 to 4. Since UTF8 stores the vast majority of text characters in these languages in just 1 byte, it results in lower storage requirements and improved performance for them.
UTF16 is the most efficient storage format for text data in a non-Latin alphabet such as Chinese, Japanese, Korean, Russian, Greek, or Arabic. For these languages, this format uses just 2 bytes per character. The UTF8 representation of the same character takes 2 to 4 bytes (3 bytes for most Chinese, Japanese, and Korean characters), which can increase storage requirements and slightly slow performance for these languages. UTF16 text storage is only available with the TEXT_UTF16 keyword, which comes with a 1 TB size limit.
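As a rough illustration of the difference between the two formats, the following Python sketch encodes a few sample strings both ways and compares the byte counts. It uses Python's standard codecs rather than the file geodatabase's storage engine, and the sample strings and the function name encoded_sizes are assumptions chosen for illustration, so actual on-disk sizes will differ.

```python
# Compare the encoded size of sample text in UTF-8 and UTF-16.
# The sample strings are illustrative assumptions. "utf-16-le" is used
# so that no byte-order mark is included in the count.
SAMPLES = {
    "English": "Main Street",
    "Chinese": "\u5317\u4eac\u5e02",   # three Chinese characters
    "Japanese": "\u6771\u4eac\u90fd",  # three Japanese characters
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for language, text in SAMPLES.items():
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{language}: UTF-8 = {utf8_bytes} bytes, UTF-16 = {utf16_bytes} bytes")
```

For the English sample, UTF-8 needs half the bytes of UTF-16. For the Chinese and Japanese samples, UTF-16 is the smaller of the two.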
The MAX_FILE_SIZE_4GB keyword stores datasets that are less than 4 GB in size slightly more efficiently than the DEFAULTS keyword, although the saving is relatively insignificant at 1 byte per record, or about 1 MB per million records. As an example, all the roads in California (2,092,079 records) store as 312 MB with the DEFAULTS keyword and 310 MB with the MAX_FILE_SIZE_4GB keyword.
This keyword restricts a dataset to a maximum size of 4 GB, so specify it only if you know a feature class or raster dataset will never need to grow beyond this size.
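The savings estimate can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 1-byte-per-record figure quoted above holds uniformly. The function name estimated_savings_mb is hypothetical, and a megabyte is taken to be 1,048,576 bytes.

```python
# Estimate the storage saved by MAX_FILE_SIZE_4GB relative to DEFAULTS,
# assuming a uniform saving of 1 byte per record (figure from the text).
BYTES_SAVED_PER_RECORD = 1
BYTES_PER_MB = 1024 * 1024


def estimated_savings_mb(record_count):
    """Return the approximate saving in megabytes for record_count records."""
    return record_count * BYTES_SAVED_PER_RECORD / BYTES_PER_MB


california_roads = 2092079  # record count from the California roads example
print(f"Estimated saving: {estimated_savings_mb(california_roads):.1f} MB")
```

The result of about 2 MB is consistent with the difference between the 312 MB and 310 MB sizes reported for the California roads.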
Specifying the MAX_FILE_SIZE_256TB configuration keyword allows you to create a dataset that is up to 256 TB in size. You would normally only specify this keyword to store a large raster dataset.
NOTE: Although the file geodatabase will allow you to store datasets of this size, be sure you have enough disk space to do so.
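The guidance on this page can be summarized as a small decision helper. The sketch below is hypothetical: suggest_keyword and its parameters are not part of any ArcGIS API, the size limits are taken from the text above, and the order in which the rules are applied (size first, then text storage, then the 4 GB optimization) is an assumption.

```python
# Hypothetical helper that suggests a file geodatabase configuration keyword.
# Size limits follow the text above; the rule ordering is an assumption.
GB = 1024 ** 3
TB = 1024 ** 4


def suggest_keyword(max_size_bytes, non_latin_text=False, size_is_fixed=False):
    """Suggest a keyword for a dataset expected to grow to max_size_bytes."""
    if max_size_bytes > 256 * TB:
        raise ValueError("dataset exceeds the 256 TB limit")
    if max_size_bytes > 1 * TB:
        # Only MAX_FILE_SIZE_256TB allows datasets larger than 1 TB.
        return "MAX_FILE_SIZE_256TB"
    if non_latin_text:
        # UTF16 text storage is only available with TEXT_UTF16.
        return "TEXT_UTF16"
    if size_is_fixed and max_size_bytes < 4 * GB:
        # Small saving, but only safe if the dataset will never outgrow 4 GB.
        return "MAX_FILE_SIZE_4GB"
    return "DEFAULTS"


print(suggest_keyword(3 * TB))                        # large raster dataset
print(suggest_keyword(500 * GB, non_latin_text=True))
print(suggest_keyword(2 * GB, size_is_fixed=True))
print(suggest_keyword(2 * GB))
```

The helper returns DEFAULTS unless one of the specific conditions applies, which mirrors the recommendation to use DEFAULTS in most cases.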