Indexes in geodatabases in SQL Server

Indexes in geodatabases in SQL Server
Release 9.3

Every data table in ArcSDE for Microsoft SQL Server has a clustered index. The clustered key in an ArcSDE table is almost always the join key in a query such as the business table's spatial column or the f table's FID column. The SQL Server query processor favors clustered indices because a clustered index orders the table data at its leaf level based on the index key. A normal index would have a pointer to the data page. Clustered indices are b-tree structures containing pages with index rows holding a key value and a pointer to either a data page or a lower-level index page. All searches begin at the top level of the b-tree, the root node. This page is found in the sysindexes table in the root field. Within this page are pointers to intermediate pages, themselves holding ranges of values and pointers to other pages.

B-tree diagram of a clustered index

Here is an example of the contents of the root node (shortened for brevity's sake). The Page Header displays 0:0 for both m_nextPage and m_prevPage, indicating that it is a root node. Also, two data slots are displayed, each with a key value and a pointer to another page. The pages pointed to are intermediate pages, themselves containing more key values and pointers.

PAGE HEADER:
Page @0x19A8A000
m_pageId = (1:4713)       m_headerVersion = 1       m_type = 2
m_typeFlagBits = 0x0      m_level = 0               m_flagBits = 0x0
m_objId = 1526296497      m_indexId = 1             m_prevPage = (0:0)
m_nextPage = (0:0)        pminlen = 11              m_slotCnt = 133
m_freeCnt = 5968          m_freeData = 1958         m_reservedCnt = 0
m_lsn = (69:1958:6)       m_xactReserved = 0        m_xdesId = (0:0)
m_ghostRecCnt = 0         m_tornBits = 0            
DATA:
Slot 0, Offset 0x60
Record Type = INDEX_RECORD                          
Record Attributes =  NULL_BITMAP                    
19a8a060: 00000116  00126800  02000100      0000 .....h........
Slot 1, Offset 0x6e
Record Type = INDEX_RECORD                          
Record Attributes =  NULL_BITMAP                    
19a8a06e: 00001816  00126a00  02000100      0000 .....j........

Since a cluster orders the storage of a table, you cannot specify different file groups for the cluster and its table. For example, if you specify in the dbtune table that the feature table's clustered index should go on file group f_idx and the table should reside on f_storage, your table will be stored with the index on f_idx.

Index maintenance

When you initially create a table with a clustered index and specify a fillfactor for that index, each data page is filled to that fill percentage when you initially insert rows. Later, when you edit this table with INSERT, DELETE, and UPDATE statements, the amount of data contained in each page will tend to increase.

If a data page fills to 100 percent and that particular page incurs an UPDATE or an INSERT, the page splits approximately in half. Half of the page moves into a newly allocated extent not contiguous to its former location. This will fragment your tables and increase query time as the disk spindles must travel to more locations to fetch data.

You can use the fillfactor clause in the SDE_dbtune table to delay page splits by limiting the fill percentage of a data page when it is initially allocated. Any ensuing INSERTs or UPDATEs will add to the fill percentage of a data page. Monitor page splits with DBCC SHOWCONTIG, and when your tables become fragmented, rebuild your clustered indexes or run DBCC INDEXDEFRAG (SQL Server 2000 only) or DBCC DBREINDEX.

Rebuilding a clustered index will reorder its table and defragment it. Data pages are reset to their initial fillfactor setting. The simplest way to rebuild an index is to place an ArcSDE feature class into load_only_io then back to normal_io. However, this won't work on versioned data.

If your data is multiversioned, you don't need to worry about monitoring the ArcSDE business table for extent fragmentation. Instead, your delta tables (adds and deletes), feature tables, and spatial index will become fragmented. You can issue a DBCC DBREINDEX statement or invoke the DBCC INDEXDEFRAG command.

Here is an example of DBCC SHOWCONTIG run against a feature class that has incurred several edits-in this case, many updates and deletes. The output shown below reflects only fragmentation against the spatial index.

dbcc showcontig('hydro.s5')
DBCC SHOWCONTIG scanning 's5' table...
Table: 's5' (1622296839); index ID: 1, database ID: 7
TABLE level scan performed.
 Pages Scanned................................: 57
 Extents Scanned..............................: 11
 Extent Switches..............................: 36
 Avg. Pages per Extent........................: 5.2
 Scan Density [Best Count:Actual Count].......: 21.62% [8:37]
 Logical Scan Fragmentation ..................: 33.33%
 Extent Scan Fragmentation ...................: 27.27%
 Avg. Bytes Free per Page.....................: 3780.6
 Avg. Page Density (full).....................: 53.29%

For an exact explanation of these numbers, see the SQL Server Books Online under DBCC SHOWCONTIG. In this case, 36 extent switches are far too many to cover 57 pages. The best count reflects an optimal amount of extent switches, which in this case, would be 8. Both extent scan fragmentation and logical scan fragmentation should be as close to 0 as possible. These measurements reflect the ordering of pages and extents in the index allocation map. Running either of these commands on this table will defragment the table.

dbcc indexdefrag (sde,'vtest.s5',s5_pk)
dbcc dbreindex('hydro.s5')
DBCC SHOWCONTIG scanning 's5' table...
Table: 's5' (1622296839); index ID: 1, database ID: 7
TABLE level scan performed.
 Pages Scanned................................: 41
 Extents Scanned..............................: 6
 Extent Switches..............................: 5
 Avg. Pages per Extent........................: 6.8
 Scan Density [Best Count:Actual Count].......: 100.00% [6:6]
 Logical Scan Fragmentation ..................: 4.88%
 Extent Scan Fragmentation ...................: 33.33%
 Avg. Bytes Free per Page.....................: 2096.6
 Avg. Page Density (full).....................: 74.10%

DBCC DBREINDEX and DBCC INDEXDEFRAG differ in that INDEXDEFRAG can be run online. It will not block queries or updates. For relatively unfragmented tables, DBCC INDEXDEFRAG should run faster than DBREINDEX. For more severely fragmented data, use DBREINDEX.

You only need to worry about business table fragmentation after running a full compress, when the delta tables have been flushed into the business table. Therefore, you should rebuild your business tables' clustered indexes after running a full compress.

Indexes in geodatabases in SQL Server

Indexes in geodatabases in SQL Server

B-tree diagram of a clustered index

Index maintenance

See Also