BLOB data storage in Oracle geodatabases
BLOB data storage in Oracle geodatabases
|
Release 9.3 |
|
Note:This topic was updated for 9.3.1.
BLOB is a database management system (DBMS) industry acronym for binary large object. BLOB columns were implemented several years ago by Oracle Corporation to replace LONG RAW technology for storing binary data.
The architecture of the BLOB data type is divided into three basic components: the BLOB column, the LOB segment, and the LOB index. The BLOB column stores the LOB locator (36 bytes) and binary data in row if it is less than 3,965 bytes and in-row storage has not been disabled for the column.
NOTE: Tests by ESRI have shown that allowing storage in row provides the best performance, so you are advised not to disable in-row storage.
If the binary data exceeds 3,964 bytes, the in-row storage space of the BLOB column is not allocated, and the LOB locator references the binary data stored in the LOB segment.
Therefore, a value stored in a BLOB column with in-row storage enabled is always at least 36 bytes (the space allocated to the LOB locator) and may be as large as 4,000 bytes (the combined space allocated to the LOB locator and the maximum space that can be allocated to binary data stored in row).
The LOB segment is divided into chunks. Chunks must be a multiple of the Oracle data block size. For example, if the data block size is 8K, the LOB segment can be created with a minimum chunk size of 8K. If the length of the data stored within the LOB segment is 5,000 bytes, it is stored in the LOB segment since it exceeds 3,964 bytes and the chunk size is 8K or 8,192 bytes. In this case, 3,192 bytes of the LOB segment chunk remains unused. Transferring data from LONG RAW to BLOB can result in more space being required—perhaps as much as a 30 percent increase due to the unused space in the LOB segment. This is unavoidable if your data exceeds the 3,964-byte in-row storage threshold of the BLOB column.
The 8K chunk size experiences the best I/O performance while wasting the least amount of space. The 16K chunk size wastes more space than an 8K chunk size. Therefore, to avoid the loss of space, you are advised to either re-create the database that currently has a 16K data block size with an 8K data block size or, if that is not possible, create LOB segments in table spaces that have been created with an 8K block size. To do this, you need to allocate an 8K buffer cache in the Oracle System Global Area (SGA).
Chunk sizes of 4K and 2K have been found to waste less space, but the increase in I/O cost does not warrant using them.
The LOB index is only used if the number of chunks addressed by the LOB locator exceeds 12; otherwise, the first 12 chunks are addressed by the LOB locator.
The following three figures illustrate the three possible storage cases of binary data stored in a BLOB column. In the first case, 3,000 bytes of binary data are stored in row, since 3,000 bytes is less than the 3,965-byte in-row storage threshold. If in-row storage is not disabled for the BLOB column, the LOB segment and the LOB index are not used. Typically, this results in a faster fetch of the BLOB data due to the reduced number of I/O operations since Oracle does not need to access the LOB segment or the LOB index.
The next figure illustrates the second case, in which the binary data is larger than 3,964 bytes (in this case, the data is 81,920 bytes) and cannot fit in row. Therefore, the LOB locator references the binary data that is stored in the LOB segment. Since the binary data does not occupy more than 12 chunks in the LOB segment, the LOB locator stores its addresses. In this case, the LOB index is not used.
In the final illustration, the binary data is so large (106,496 bytes) that the LOB index is required. In this case, the binary data exceeds the in-row storage plus requires more than 12 chunks within the LOB segment to store it. For data this large, the LOB locator references the LOB index to obtain the location of the chunks within the LOB segment. This case is extremely rare for vector data and can be avoided for raster data.
Setting the DBTUNE parameters to store BLOB columns
The storage parameters of the DBTUNE table control how ArcGIS creates tables and indexes in Oracle. Some of the storage parameters also determine which data type is used when a table is created. For details on the DBTUNE table, see The dbtune file and the DBTUNE table. For general information on DBTUNE storage parameters, see DBTUNE configuration parameter name-configuration string pairs.
The ArcSDE DBTUNE storage parameters, GEOMETRY_STORAGE, RASTER_STORAGE, and ATTRIBUTE_BINARY, determine which Oracle data type is used to store ArcSDE data.
Note that beginning with ArcSDE 9.2, the RASTER_BINARY_TYPE storage parameter was replaced with the RASTER_STORAGE parameter. If you currently have ArcSDE 9.1 installed, substitute all references to RASTER_STORAGE with RASTER_BINARY_TYPE. The RASTER_BINARY_TYPE parameter accepts only two values: LONGRAW and BLOB.
The GEOMETRY_STORAGE parameter controls the storage of vector data that is stored in a feature class. The RASTER_STORAGE parameter controls the storage of raster data that is stored in a raster dataset, raster catalog, or raster attribute. Finally, the ATTRIBUTE_BINARY parameter controls the storage of all other binary data that is not vector or raster.
To create BLOB columns, the parameters must be set as follows within a given DBTUNE keyword:
GEOMETRY_STORAGE SDELOB
RASTER_STORAGE BLOB
ATTRIBUTE_BINARY BLOB
ESRI recommends the following LOB storage parameters for vector and raster data:
- Always enable in-row storage because most geographic information system (GIS) data fits within the 3,964-byte in-row threshold. Performance is best when data is stored in row.
- Enable cache since ArcSDE data is frequently read.
- Since ArcSDE does not perform updates on BLOB data but instead performs only inserts and deletes, set the PCT_VERSION to 0 as there is no need to maintain older versions of the data within the LOB segment.
- You should not use a chunk size less than 8K. Chunk sizes of 2K and 4K increase the amount of I/O because the Oracle server process must fetch more chunks. You will probably find that an 8K chunk size wastes less space than 16K. If you use a chunk size of 2K or 4K, you will find that it wastes less space, but tests have found that the display time for most raster and vector data increases dramatically over storing in an 8K chunk size. Since the chunk size must always be a multiple of the data block size, the best data block size to use for storing GIS data in BLOBs is 8K.
The following is an example of how the raster DBTUNE storage parameters have been modified to accommodate a raster blocks table stored as a BLOB data type.
RASTER_STORAGE "BLOB"
BLK_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE RASTER
LOB (BLOCK_DATA) STORE AS
(TABLESPACE RASTER_LOB_SEGMENT
CACHE PCTVERSION 0)"
AUX_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE RASTER
LOB (OBJECT) STORE AS
(TABLESPACE RASTER
CACHE PCTVERSION 0)"
If the raster block pixel data is less than 3,965 bytes, it is stored within the BLOCK_DATA column in the RASTER table space. However, if it exceeds this threshold, it is stored in the LOB segment in the RASTER_LOB_SEGMENT table space. The LOB index is only used if the number of chunks exceeds 12. This is unlikely to happen for ArcSDE data. Consider a LOB segment with a chunk size of 8K. Before the LOB index is used, the ArcSDE binary data needs to exceed 96K.
The following is an example of how the vector DBTUNE storage parameters have been modified to accommodate the feature table stored in a BLOB data type:
GEOMETRY_STORAGE "SDELOB"
F_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE VECTOR
LOB (POINTS) STORE AS
(TABLESPACE VECTOR_LOB_SEGMENT
CACHE PCTVERSION 0)"
If the feature's binary data is less than 3,965 bytes, it is stored within the POINTS column in the VECTOR tablespace. However, if it exceeds this threshold, it is stored in the LOB segment in the VECTOR_LOB_SEGMENT tablespace.
ATTRIBUTE_BINARY "BLOB"
B_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE BIZZTABS
LOB (DOCUMENT) STORE AS
(TABLESPACE BIZZ_LOB_SEGMENT
CACHE PCTVERSION 0)"
A_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE BIZZTABS
LOB (DOCUMENT) STORE AS
(TABLESPACE BIZZ_LOB_SEGMENT
CACHE PCTVERSION 0)"
In this example, if the business table's binary data is less than 3,965 bytes, it is stored within the business table's BLOB column in the BIZZTABS table space. However, if it exceeds this threshold, it is stored in the LOB segment in the BIZZ_LOB_SEGMENT table space. The BLOB column in this example is DOCUMENT. If the above B_STORAGE DBTUNE parameter is used to create a table that does not have a DOCUMENT column, the following error is returned by Oracle:
ORA-00904: "DOCUMENT": invalid identifier
Therefore, it is not wise to add B_STORAGE or A_STORAGE parameters referencing a specific BLOB column to the DEFAULTS keyword, since the business table must contain these columns. Instead, create separate DBTUNE keywords and add these storage parameters to the keywords. The keyword that contains the storage parameter is referenced during the creation of the table. It should also be noted that storage parameters of the DEFAULTS keyword are used if they are not included with a specific keyword. Due to this fact, it is not necessary to add a particular storage parameter within a keyword if its configuration string is identical to the storage parameter under the DEFAULTS keyword. For instance, if all the storage parameters except B_STORAGE and A_STORAGE of a new keyword, ROADS, have the same configuration string as those of the DEFAULTS keyword, you only need to create the B_STORAGE and A_STORAGE parameters under the ROADS keyword. All other storage parameters are read from the DEFAULTS keyword since they are not found in the ROADS keyword.
For information on DBTUNE keywords and the DEFAULTS keyword, see DBTUNE configuration keywords and The DEFAULTS keyword, respectively.