Minimize disk I/O contention in DB2
Minimize disk I/O contention in DB2
|
Release 9.3 |
|
Disk I/O contention within a DB2 database is minimized by properly arranging the components of the database throughout the file system. Ultimately, the database administrator must reduce the possibility of one process waiting for another to complete its I/O request. This is often referred to as waiting on I/O.
The following are recommendations to help you avoid disk I/O contention.
- For temporary user table spaces, use SMS; for all other table spaces, use DMS.
There are two types of table space storage models in DB2: System Managed Space (SMS) and Database Managed Space (DMS). ESRI recommends that you use SMS table spaces for user temporary table spaces and DMS for other table spaces for better performance especially if your data is expected to grow on a regular basis.
NOTE: At DB2 9, you will require the use of an additional DB2 global temporary table (DECLARE GLOBAL TEMPORARY TABLE). Per DB2 documentation, declaring global temp tables requires "either SYSADM or DBADM privileges" or "USE privilege on a USER TEMPORARY table space".
For DB2 9, you should consider using Automatic Storage features for databases as well as incorporating the Automatic Sizing attributes for table spaces. In DB2 9, DMS table spaces have the ability to provision additional storage as physical requirements increase.
- Set up buffer pools.
Setting up the buffer pools is critical to performance. By default, DB2 provides a single buffer pool named IBMDEFAULTBP. In DB2 9, by default, the IBMDEFAULTBP buffer pool is configured for automatic sizing. The database manager adjusts the size of this buffer pool in response to workload requirements.
You should create a separate buffer pool for each table space. The database snapshot should be reviewed to check the buffer pool physical read values. The buffer pool should be large enough that a snapshot of a map redraw results in a small number of physical reads.
To determine the efficiency of a buffer pool, calculate its buffer pool hit ratio (BPHR). An ideal BPHR, if possible, is somewhere over 90%. The formula is as follows:
BPHR (%) = (1 - (("Buffer pool data physical reads"
+ "Buffer pool index physical reads")
/ ("Buffer pool data logical reads"
+ "Buffer pool index logical reads"))) * 100
Another method for reviewing buffer pool hit ratios is through the use of the SYSIBMADM.BP_HITRATIO administrative view. This view can return hit ratios, including total hit ratio, data hit ratio, XDA hit ratio, and index hit ratio for all buffer pools.
SELECT SUBSTR(DB_NAME,1,8) AS DB_NAME,
SUBSTR(BP_NAME,1,140) AS BP_NAME,
TOTAL_HIT_RATIO_PERCENT,
DATA_HIT_RATIO_PERCENT,
INDEX_HIT_RATIO_PERCENT,
XDA_HIT_RATIO_PERCENT, DBPARTITIONNUM
FROM SYSIBMADM.BP_HITRATIO
ORDER BY DBPARTITIONNUM
Establish the threshold table size.
As a rule, store small tables together in the same table space and large tables by themselves in their own table spaces. Decide how large a table must be before it requires its own table space. Generally, the threshold corresponds in part to the maximum container size. Tables capable of filling the maximum size container should be stored in their own table space. Tables approaching this limit should also be considered. Follow the same policy for indexes.
Separate the tables and indexes into those that require their own table spaces and those that will be grouped together. Never store large or frequently accessed tables and their indexes together in the same table space.
If you are using DB2 9, you should consider incorporating Automatic Storage attributes for table spaces.
Store small tables and indexes by access.
Base the decision of which small tables to store together in the same table space on expected access. Store tables of high access in one table space and tables of low access in another. Doing so allows you to position the containers of the high access table spaces with low access containers. This same rule applies to indexes. They, too, should be divided by access.
It is recommended that the DB2 logs have their own disks apart from indexes and data. ESRI recommends that you set the operating system's kernel parameter MAXUPROC value to a higher value than the default so all processes owned by the ArcSDE and DB2 instance users can run when invoked.
For AIX operating systems, turn off db2set DB2_MMAP_READ and db2set DB2_MMAP_WRITE.
ESRI recommends turning off the following variables on AIX to improve performance.
These variables are used in conjunction with each other to allow DB2 to use a multifunction, multiservice access platform (MMAP) as an alternative method of I/O. This is used to avoid operating system locks when multiple processes are writing to different sections of the same file. Default is ON.
db2set DB2_MMAP_READ=OFF
db2set DB2_MMAP_WRITE=OFF
Separate the ArcSDE system table spaces STATES, STATE_LINEAGES, and MVTABLES_MODIFIED tables and their indexes if you are extensively using versioned editing.
The ArcSDE system table spaces store the ArcSDE and geodatabase system tables and indexes created by the ArcSDE sdesetup command. The number and placement of the table spaces depend on what you intend to use the ArcSDE database for.
The placement of these tables and their indexes is controlled by the storage parameters of the DBTUNE DATA_DICTIONARY configuration keyword. The DATA_DICTIONARY keyword is used exclusively for the creation of the ArcSDE and geodatabase system tables.
Versioned databases that support ArcGIS applications have a highly active state tree. The state tree maintains the states of all editing operations that have occurred on tables registered as versioned. Four ArcSDE system tables—STATES, STATE_LINEAGES, MVTABLES_MODIFIED, and VERSIONS—maintain the transaction information of the versioned database's state tree.
In an active versioned database, the STATES_LINEAGE table can easily grow beyond one million records, occupying more than 26 MB of table space. The STATES table is much smaller, storing approximately 5,000 records, occupying about 2 MB of table space. The MVTABLES_MODIFIED table typically has approximately 50,000 records occupying about 1 MB of table space. The VERSIONS table is usually quite small, with less than 100 rows occupying about 64 KB.
For highly active editing ArcGIS applications, the STATES, STATE_LINEAGES, and MVTABLES_MODIFIED tables and their indexes need to be created in separate table spaces and positioned across the file system to minimize disk I/O contention.
NOTE: If you are not going to use a versioned database, reduce the extent sizes of the STATE_LINEAGES, STATES, and MVTABLES_MODIFIED tables and their indexes to 40 KB. Create two 5 MB table spaces on separate disk drives—one for the tables and one for the indexes.
Create large enough table spaces for raster tables, and create a separate table space for the raster blocks table (BLK).
The raster blocks table can be quite large. Be sure to create a separate, large table space to store this table. You can create another table space to store the remaining raster tables.
When you create the table spaces for the raster block table, it is recommended you use an extent size of 64 KB and a page size of 32 KB. The extent size specifies the number of page-size pages that will be written to a container before proceeding to the next container.
NOTE: The extent size is defined at the time of table space creation and cannot be easily modified after.
When loading large amounts of data, such as raster data, set the database to use circular logging.
Circular logging uses a ring of log files to record changes to the database. When the last log in the ring is filled, the first file is reused. Since transaction information can be overwritten, when you use circular logging, roll-forward recovery is not possible; however, recovery is generally not needed during data loads. Circular logging is the default logging model.
If you need to use roll-forward recovery, use archive logging. Archive logging does not overwrite log files; rather, it creates additional logs to record all transactions since the last backup. Note that you will need more disk space to use archive logging since logs are not reused as in circular logging.
Even after tuning your database to reduce disk I/O contention, there may still be times when you run into deadlocks. For information on deadlocks see Dealing with deadlocks in a DB2 database.