ArcGIS Server Banner

Configuring an Oracle database to support XML columns

Configuring an Oracle database to support XML columns

Release 9.3 E-mail This TopicPrintable VersionGive Us feedback

About configuring an Oracle database to support XML columns

ArcSDE XML columns require that Oracle Text for Oracle 10g and Oracle9i be installed and that user accounts granted privileges to use the Oracle Text components. Once this is accomplished, the database and ArcSDE must be properly configured and maintained to optimize search results and performance. In particular, the database and ArcSDE must be properly configured to support searching XML columns written in languages other than English.

XML documents are stored in the database as an Oracle Large Object (LOB). The LOB storage and caching parameters, the tablespace's block size, and the buffer cache configuration all affect the speed at which LOB data is retrieved, which in turn, affects performance when searching. It is also important to manage the text indexes properly to maintain good performance. ArcSDE for Oracle includes two scripts to help you configure your database to store XML data. See Tuning an Oracle instance for XML data storage for details.

These steps are described in the following sections.

How to configure an Oracle database to support XML columns

Install and configure the text component for Oracle

  1. Follow the instructions in the Oracle documentation to install the appropriate text component for your version of the database.
  2. Oracle Text is installed by default with a typical installation of Oracle 10g and Oracle9i. However, if you upgraded an earlier version of the database for which the text component was not installed, Oracle Text may not be installed as part of the upgrade. If this is the case, it must be installed as an additional component after the upgrade is completed.

  3. Follow the instructions in the Oracle documentation to verify the text component has been installed and is working properly.
  4. Log in to SQL*Plus as the ctxsys user to grant privileges to the user who will own the XML column by issuing the command:
  5. sql>grant execute on ctx_ddl to <mdUsername>;

    <mdUsername> is the user who will own the column. For ArcIMS metadata services, <mdUsername> is the user specified in the service's ArcXML configuration file.

  6. If you plan to query a table using the contents of an XML column, check the default lexer that will be used for linguistic analysis with text indexes using SQL*Plus—this is particularly important when the column will contain XML documents with text written in a language other than English. If custom settings are not provided in the ArcSDE DBTUNE table, the default Oracle Text parameters will determine how text indexes are created for ArcSDE XML columns.
  7. The default objects used for text indexes are owned by the ctxsys user and are created for the database installation, not for individual databases. Their properties are set based on the language settings used when Oracle is installed; see the sections about system-defined preferences and system parameters in the Oracle Text Reference for your version of the database to learn how these properties are set.

    Check the default text settings to determine if custom preferences are required for your XML column.

    1. Query the CTX_PARAMETERS view to determine what the default settings currently are for your Oracle installation. The parameters that affect ArcSDE XML columns are DEFAULT_LEXER, DEFAULT_STOPLIST, DEFAULT_STORAGE, DEFAULT_WORDLIST, and DEFAULT_INDEX_MEMORY.
    2. select * from ctx_parameters;

      By default these parameters are typically set to the system-defined preferences with the same name that are owned by the ctxsys user or an appropriate value. For example, the DEFAULT_LEXER is typically set to use the preference CTXSYS.DEFAULT_LEXER. However, someone in your organization may have previously configured Oracle to use a custom preference by default instead.

    3. Query the CTX_PREFERENCES view to learn about the available preferences. This is how you can learn which lexer is being used by the CTXSYS.DEFAULT_LEXER preference, for example. Any custom preferences created by other users will also be described.
    4. select * from ctx_preferences;

      For many Western European languages and for languages where a language-specific lexer is not provided, the CTXSYS.DEFAULT_LEXER preference will typically use the BASIC_LEXER lexer, the CTXSYS.DEFAULT_WORDLIST preference will often use the BASIC_WORDLIST wordlist, and so on.

    5. To successfully index and search some Western European languages, specific wordlist settings are required. If your language is French you will want to make sure the stemmer and fuzzy_match attributes for the BASIC_WORDLIST preference are set to French, for example. Query the CTX_PREFERENCE_VALUES view to ensure the system-defined preferences are correctly set for the language of your XML documents.
    6. select * from ctx_preference_values;

    7. Compare the results from the above queries to the allowed settings for lexers, wordlists, and other text objects documented in the Oracle Text Reference for your version of the database.

    If the current default settings are not correct for the language of your XML documents and existing custom text preferences are not available, you must create custom text preferences to correctly index and search your XML documents and reference those preferences using the ArcSDE DBTUNE XML_IDX_INDEX_TEXT parameter before creating your XML column.

Configure the database and DBTUNE parameters for an XML column

Setting text indexing parameters

After checking the current text settings in Oracle and consulting the Oracle Text Reference, you may want to customize the settings that will be used to index your XML documents. This requires creating the appropriate preferences in Oracle and then referencing those parameters with the ArcSDE DBTUNE parameter XML_IDX_INDEX_TEXT. Then, use the DBTUNE keyword containing the custom XML_IDX_INDEX_TEXT parameter when creating an XML column. For an ArcIMS Metadata Service you can reference a DBTUNE parameter in its ArcXML configuration file.

Setting storage parameters for XML columns

Many different DBTUNE parameters can be used to define the location in which objects associated with an XML column should be stored.

An XML column will always have an XML document table, which stores the XML documents. A text index is always created on all content in these documents to support finding all documents containing a given word. Storage parameters for this table can be defined as follows.

An XML column may optionally have an XML index table, which stores the information in individual XML elements that can be searched. A text index is always created for this table to support searching descriptive text. Storage parameters for this table can be defined as follows.

Typical storage parameters can be defined for all of the above tables and indexes with the exception of the specialized storage parameters for the large objects. However, storage parameters for an XML column's text indexes must be defined differently.

With the default DBTUNE parameters, four tables are created to manage each text index. An Oracle text preference must be created to define storage parameters for these tables, then the storage preference must be referenced in the XML_IDX_INDEX_TEXT parameter. Both text indexes must use the same storage preference.

The following example creates a text preference named "XML_STORAGE" that places any tables created for an XML column's text indexes into a tablespace named "TXTIDX".

BEGIN
CTX_DDL.CREATE_PREFERENCE('XML_STORAGE', 'BASIC_STORAGE');
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'I_TABLE_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K)'); 
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'K_TABLE_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K)'); 
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'R_TABLE_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K) LOB
        (DATA) STORE AS (CACHE)');
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'N_TABLE_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K)'); 
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'I_INDEX_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K) COMPRESS 2');
CTX_DDL.SET_ATTRIBUTE('XML_STORAGE', 'P_TABLE_CLAUSE',
        'TABLESPACE TXTIDX STORAGE (INITIAL 1K)'); 
END;
/

To use this storage preference when creating an XML column, add it to the XML_IDX_INDEX_TEXT parameter. For example, if the geodatabase administrator logged in as the user "sde" when they created the storage preference, reference it in the XML_IDX_INDEX_TEXT parameter as follows.

LEXER sde.WORLDLEXER STORAGE sde.XML_STORAGE

Consult the Oracle Text Reference for more information about creating a custom storage preference. Be sure to always include the storage clauses recommended by Oracle for text indexes.

Configuring LOB storage

XML documents are stored as large objects (LOBs) in the XML document table in the xml_doc and xml_doc_val columns, and in the XML index table in the text_tag column. It is important to configure these columns accurately to achieve the best possible search performance.

Using in-line or out-of-line storage

LOBs are stored inline if the LOB data is stored in the same block as the rest of the data in the row. However, in-line storage is only possible if the LOB data is less than 4KB in size. With out-of-line storage, the data is stored in the LOB segment and only the LOB locator is stored with the rest of the data in the row.

By default, Oracle stores LOB data inline. You can specify whether LOB data associated with an XML column is stored inline or out of line using the ArcSDE DBTUNE parameters XML_DOC_LOB_STORAGE and XML_DOC_VAL_LOB_STORAGE for the XML document table and XML_IDX_TEXT_TAG_STORAGE for the XML index table.

When LOB data is stored out of line for an XML column, by default, ArcSDE places that data in the same tablespace as the XML document table. The LOB data may be moved to a different tablespace than the one containing the XML document table by using the XML DBTUNE parameters XML_DOC_STORAGE, XML_DOC_VAL_STORAGE, and XML_IDX_TEXT_TAG_STORAGE.

A typical XML document that contains metadata describing a GIS resource will be greater than 4KB in size. Tests show XML columns associated with ArcIMS Metadata Services perform best when the LOB data is stored out of line in a separate tablespace from the XML document table. Specify out-of-line storage for the XML column by setting these parameters to "DISABLE STORAGE IN ROW". After configuring another tablespace to store the LOB data, add that tablespace's name to the parameter values.

An ArcIMS Metadata Service may contain gazetteer data instead of typical metadata XML documents; this place name information is very small, typically less than 100 bytes in size. XML columns containing small documents will perform best when the LOB data is stored inline. The default DBTUNE keyword IMS_GAZETTEER specifies in-line storage for LOB data by setting the XML_DOC_STORAGE and XML_DOC_VAL_STORAGE parameters to "ENABLE STORAGE IN ROW".

When defining storage for XPath indexes, you should store the data out of line. This is because these indexes can be of varying sizes and, therefore, some might be stored inline and others out of line. Having to search both inline and out of line can slow down the indexes. Only store the data inline if you are certain all data is of the same size and small.

The XML_IDX_TEXT_TAG_STORAGE parameter does not have a default value in the DEFAULTS keyword and is not included in the IMS_GAZETTEER keyword. Specify the following storage information for the XML_IDX_TEXT_TAG_STORAGE parameter in the DBTUNE table so XPath index storage is out of line:

NOCACHE NOLOGGING CHUNK 4K PCTVERSION 5 DISABLE STORAGE IN ROW

When appropriate, use a similar value with "ENABLE STORAGE IN ROW" to store the data inline.

For more information about inline and out of line storage, see Oracle's LOB performance guidelines.

Updating an XML column's text indexes

By default, when XML documents are stored in an XML column the text indexes associated with the column will not be automatically updated to include text from the new documents. Until the text indexes have been updated, any XML documents that have not been indexed can't be found by a search.

ArcSDE does not provide any utilities that can be used to manually trigger updates to an XML column's text indexes. For XML columns associated with an ArcIMS Metadata Service, ArcIMS provides configuration options and a command-line utility can be used to update the text indexes.

Set the optimizer parameters for Oracle9i

For Oracle9i, a few optimizer parameters should be set in the database for searches to perform well. Setting these parameters will not adversely affect other client software that accesses your ArcSDE database. You do not need to set these parameters for newer versions of Oracle.

For Oracle9i, the following parameter must be added to your Oracle server's init.ora file:

OPTIMIZER_MODE=CHOOSE

Only the optimizer mode must be set; the other parameters used with Oracle8i aren't required.

NOTE: Beginning with ArcSDE 9.2, Oracle 8i is not supported. However, if you are using ArcSDE 9.1 or lower with Oracle 8i, the following parameters must be added to your Oracle server's init.ora file:

OPTIMIZER_INDEX_CACHING = 90
OPTIMIZER_INDEX_COST_ADJ = 10
OPTIMIZER_MODE=CHOOSE

Refer to the Oracle8i documentation for a complete discussion of these parameters. In general, setting these parameters as specified directs the Oracle8i optimizer to favor index scans and nested loop execution plans over a full table scan or a sort merge.

See Also

  • Configuring a database to support XML columns
  • XML columns in the geodatabase