Previous | Table of Contents | Next

Page 348

FIG. 14.2
Conventional and direct path load methods.

Using Conventional Path Load

The conventional path uses SQL INSERT statements and in-memory bond array buffers to load data into Oracle tables. This process competes with all other processes for memory resources within the SGA. This can slow down loader performance if the database already has the overhead of supporting many concurrent processes.

Another overhead of using conventional path is that the loader process must scan the database for partially filled blocks of the table being loaded and attempt to fill those blocks. This is efficient for daily transactional processing, but it is an additional overhead that conventional path loading has.

Page 349

There are cases when conventional path is the preferred and sometimes mandatory method over the direct path:

Using Direct Path Load

Instead of using SQL INSERT statements and bind array buffers, direct path formats the input data into Oracle data blocks and writes them directly into the database. It is important to note that direct path loading always inserts data above the table's high-water mark. This eliminates the time spent scanning for partially filled blocks. If direct path is being used and does not complete, and loader is restarted from the beginning, the data that is loaded should be truncated or the table dropped and re-created. Simply deleting the data does not reset the high-water mark, and you will be using more storage than is necessary.

An understanding of what happens to a table's indexes using the direct path method is important to maximize the efficiency of direct path loading. In Oracle8, indexes are marked unusable when the loaded data becomes visible (the high-water mark is moved and committed), as this is the point that the indexes are out-of-date with respect to the data that it indexes. As each index is brought up-to-date, the index unusable state is cleared. This allows queries to take place against the table while direct path loading is in progress. However, a query executing while the load is in the finishing stage (moving the high-water mark and updating indexes) may fail if the query requires an index that is in an unusable state.

During the load, the new index values are written to temporary segments, and at the end of the load they are sorted and merged with the old index values, at which time the index becomes usable again. This can significantly increase the need for temp space. In order to avoid the additional need for temp space to do the sort, you can presort the data in the order of the index and use the keywords SORTED INDEXES in the control file.

In the previous section on conventional path loading, the restrictions that would prevent the use of direct path were listed. Loading a relatively small number of rows into a large table with indexes and/or referential and check constraints is not a restriction against using direct path, but it will probably be more efficient to use conventional path. In the case of the indexes, it would probably be quicker for conventional path to update the indexes as it is loading the data, rather than doing a large sort/merge to create the new indexes. With referential and check

Page 350

constraints, direct path requires that these constraints be disabled before the load. After all the data is loading and constraints are re-enabled, the entire data's table is checked against these constraints, not only the data that was loaded.

Using SQL*Loader Performance Tips

Below are some additional tips on increasing the performance of SQL*Loader:

  1. Use positional fields over delimited fields, which require Loader to scan the data to search for the delimiters. Positional fields are quicker because Loader only has to do simple pointer arithmetic.
  2. Specify maximum lengths for terminated by fields to make each bind array insert more efficient.
  3. Pre-allocate enough storage. As data is being loaded and more space is required in the table, Oracle allocates more extents to hold the data. This operation can be expensive if it is being done constantly during the load. Calculating or estimating the storage requirements prior to the load enables you to create the necessary storage up front.
  4. Avoid using NULLIF and DEFAULTIF clauses in the control file if possible. Each clause causes the column to be evaluated for every row being loaded.
  5. Split data files and run concurrent conventional path loads.
  6. Reduce the number of commits by using the command-line parameter ROWS.
  7. Avoid unnecessary character set conversions. Ensure that the client's NLS_LANG environment is the same as the server's.
  8. Use direct path whenever possible.
  9. When using direct path, presort data for the largest index of the table and use the SORTED INDEXES clause.
  10. When using direct path, use the parallel direct path option when possible.
  11. Minimize the use of redo logs during direct path loads. There are three ways of doing this with different levels of control:
    Disable archiving of the database.
    Use the keyword UNRECOVERABLE in the control file.
    Alter the table and/or index with the NOLOG attribute.

Previous | Table of Contents | Next