But what about the first search? Can we pre-load the database cache in order to make sure that most of what we need is already in memory before the first search?
This paper explores some techniques which make this possible.
This paper assumes a good understanding of the various underlying tables of an Oracle Text index. Even if you are not familiar with the functions of the various tables ($I, $K, etc.), the paper should still offer some useful techniques.
A table or index may be forced to reside in the keep cache using the commands
ALTER TABLE <tablename> STORAGE (BUFFER_POOL KEEP);
ALTER INDEX <indexname> STORAGE (BUFFER_POOL KEEP);
If we had unlimited memory available, we would want to put all of the following into the keep pool: the base table, the $I (token) table, the $X index on it, and the $R table.
If our table uses local partition indexes, we would need to store these tables for each partition. Additionally, if our application uses functional lookups in the text table we would need the $K table, and if we're using prefix indexing we would need the $P table.
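For a hypothetical non-partitioned index called MYINDEX on a table MYTABLE (these names are illustrative, not the test case used later in this paper), assigning the full set of candidate objects to the keep pool might look like this:

```sql
-- Base table and the main Text index tables (names assumed for illustration)
ALTER TABLE mytable STORAGE (BUFFER_POOL KEEP);
ALTER TABLE DR$MYINDEX$I STORAGE (BUFFER_POOL KEEP);  -- token table
ALTER INDEX DR$MYINDEX$X STORAGE (BUFFER_POOL KEEP);  -- index on $I
ALTER TABLE DR$MYINDEX$R STORAGE (BUFFER_POOL KEEP);  -- docid-to-rowid table
-- If prefix indexing is enabled, there is also a $P table:
ALTER TABLE DR$MYINDEX$P STORAGE (BUFFER_POOL KEEP);
```

(The $K table is index-organized, so its storage attributes may need to be handled differently; check the actual segment names for your own index in USER_SEGMENTS before running statements like these.)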
However, in real life it is very likely that all of these tables will exceed the amount of memory we can afford to allocate to the keep pool. We would therefore need to select only certain tables. If we are using local partition indexes, and our queries are normally sorted by the partition key, it might make sense to cache just the tables for the first few partitions.
Alternatively, if our table isn't partitioned, it might be better to cache certain tables and not others.
In decreasing order of importance, we could rank them as follows: $I, the base table, and $K are perhaps equally important, though their sizes are clearly not comparable, and different usage scenarios favor different approaches. In the normal case (with little or no functional invocation), we would suggest base table, $I, $K, in that order. If the index is badly fragmented, but we're still not using functional invocation, then we'd swap the base table and $I. If there's a large amount of functional invocation, $K goes to the head of the list.
For testing purposes, I built a local partition index on a table called TEST with one million rows - each row consisting of just the unique key ID, and a filename pointing to an external XML file.
The output of CTX_REPORT for this table is as follows:
SQL> set long 50000
SQL> select ctx_report.index_size('test_idx') from dual;
CTX_REPORT.INDEX_SIZE('TEST_IDX')
--------------------------------------------------------------------------------
===========================================================================
INDEX SIZE FOR TESTUSER.TEST_IDX
===========================================================================
...
TOTALS FOR INDEX TESTUSER.TEST_IDX
---------------------------------------------------------------------------
CTX_REPORT.INDEX_SIZE('TEST_IDX')
--------------------------------------------------------------------------------
TOTAL BLOCKS ALLOCATED: 206608
TOTAL BLOCKS USED: 198235
TOTAL BYTES ALLOCATED: 1,692,532,736 (1,614.13 MB)
TOTAL BYTES USED: 1,623,941,120 (1,548.71 MB)
So we can see that the total index size is about 1.6GB. Since the machine has "only" 2GB of memory, we can't really afford to put all of that into memory, so we're going to cache a subset.
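As a quick sanity check on those CTX_REPORT numbers (assuming the default 8K block size), 206608 allocated blocks should indeed come to about 1.6GB:

```shell
# 206608 allocated blocks at an assumed 8K block size,
# expressed in bytes and (integer) megabytes
blocks=206608
bytes=$((blocks * 8192))
mb=$((bytes / 1024 / 1024))
echo "$bytes bytes = $mb MB"
```

This agrees with the 1,692,532,736 bytes (1,614 MB) reported above.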
SQL> connect sys as sysdba
SQL> alter system set db_cache_size = 1G;
SQL> shutdown
SQL> startup
SQL> connect testuser/testuser
SQL> select /*+ FIRST_ROWS */ * from (
select id, filename from test
where contains (filename, 'district and region') > 0
order by id desc )
where rownum <= 20;
...
After running this query, we can find out what's in the buffer cache by running the following SQL as a dba:
COLUMN OBJECT_NAME FORMAT A30
COLUMN OBJECT_TYPE FORMAT A15
COLUMN OWNER FORMAT A15
COLUMN NUMBER_OF_BLOCKS FORMAT 999,999,999,999
SELECT o.OBJECT_NAME, o.OBJECT_TYPE, o.OWNER, COUNT(*) NUMBER_OF_BLOCKS
FROM DBA_OBJECTS o, V$BH bh
WHERE o.DATA_OBJECT_ID = bh.OBJD
AND o.OWNER NOT IN
('SYS', 'SYSTEM', 'SYSMAN', 'XDB', 'IX', 'WMSYS', 'CTXSYS')
GROUP BY o.OBJECT_NAME, o.OWNER, o.OBJECT_TYPE
ORDER BY COUNT(*);
After running the previous CONTAINS query, this gives me:
OBJECT_NAME OBJECT_TYPE OWNER NUMBER_OF_BLOCKS
------------------------------ --------------- --------------- ----------------
DR#TEST_IDX0009$R TABLE TESTUSER 2
DR#TEST_IDX0010$R TABLE TESTUSER 2
SYS_LOB0000056864C00002$$ LOB TESTUSER 4
DR#TEST_IDX0009$X INDEX TESTUSER 12
DR#TEST_IDX0010$X INDEX TESTUSER 12
SYS_LOB0000056876C00002$$ LOB TESTUSER 17
TEST TABLE PARTITION TESTUSER 22
DR#TEST_IDX0010$I TABLE TESTUSER 459
DR#TEST_IDX0009$I TABLE TESTUSER 494
9 rows selected.
Firstly, we can see from the last two lines that the majority of blocks are from two $I tables - those for the last two partitions of the table (since it has ten partitions). Then, moving up a bit, we see 22 blocks from the base table TEST. Note that this doesn't tell us which partitions of the table these blocks come from, but it's reasonable to assume that they come from the last two partitions as well. There are also a few blocks from the $R table.
We'll be covering the SYS_LOB... objects in a moment. For now we're going to ignore them; tracking them down will be a useful exercise later on.
We need to load both the tables and the indexes on them. We can do this by running queries with appropriate hints. Rather than actually fetching every row, which would require lots of traffic between the kernel and the client, we can make use of aggregate functions so that the rows are fetched only as far as the kernel:
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#TEST_IDX0009$I ITAB;
SELECT /*+ INDEX(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ INDEX(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#TEST_IDX0009$I ITAB;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0010$R;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0009$R;
SELECT /*+ FULL(BTAB) */ SUM(ID) FROM TEST BTAB WHERE ID >= 900000;
Also note that the $R loading is incomplete. This is related to those system LOB objects we saw earlier - and we'll be coming back to that later.
Running these queries on my system took just over a minute. That's to load one-fifth of the partitions for a million-row table. Here's the output after rebooting the machine to avoid any effects of disk caching:
SUM(TOKEN_COUNT) SUM(LENGTH(TOKEN_INFO))
---------------- -----------------------
6963707 37472389
Elapsed: 00:00:27.31
SUM(TOKEN_COUNT) SUM(LENGTH(TOKEN_INFO))
---------------- -----------------------
6587719 35349528
Elapsed: 00:00:27.77
SUM(LENGTH(TOKEN_TEXT))
-----------------------
8994078
Elapsed: 00:00:03.74
SUM(LENGTH(TOKEN_TEXT))
-----------------------
8855116
Elapsed: 00:00:03.21
SUM(ID)
----------
9.5001E+10
Elapsed: 00:00:01.74
SUM(ROW_NO)
-----------
231
Elapsed: 00:00:00.12
SUM(ROW_NO)
-----------
231
OBJECT_NAME OBJECT_TYPE OWNER NUMBER_OF_BLOCKS
------------------------------ --------------- --------------- ----------------
DR#TEST_IDX0009$R TABLE TESTUSER 2
DR#TEST_IDX0010$R TABLE TESTUSER 2
TEST TABLE PARTITION TESTUSER 612
DR#TEST_IDX0009$X INDEX TESTUSER 5,905
DR#TEST_IDX0010$X INDEX TESTUSER 6,018
DR#TEST_IDX0009$I TABLE TESTUSER 14,607
DR#TEST_IDX0010$I TABLE TESTUSER 15,090
7 rows selected.
We're going to try to put all those tables into the keep pool, so we need to know how much space they're taking to decide how big to make our keep pool. We can sum up the NUMBER_OF_BLOCKS column above, or use:
SELECT COUNT(*) TOTAL_BLOCKS, COUNT(*)*8/1024 MEGABYTES -- assumes 8K block size
FROM DBA_OBJECTS o, V$BH bh
WHERE o.DATA_OBJECT_ID = bh.OBJD
AND o.OWNER NOT IN
('SYS', 'SYSTEM', 'SYSMAN', 'XDB', 'IX', 'WMSYS', 'CTXSYS');
TOTAL_BLOCKS MEGABYTES
------------ ----------
49503 386.742188
ALTER SYSTEM SET DB_CACHE_SIZE = 500M SCOPE=SPFILE;
-- Down from 1G before
ALTER SYSTEM SET DB_KEEP_CACHE_SIZE = 400M SCOPE=SPFILE;
And then of course I'll bounce the database.
alter table DR#TEST_IDX0010$I storage (buffer_pool keep);
alter table DR#TEST_IDX0009$I storage (buffer_pool keep);
alter index DR#TEST_IDX0010$X storage (buffer_pool keep);
alter index DR#TEST_IDX0009$X storage (buffer_pool keep);
alter table DR#TEST_IDX0010$R storage (buffer_pool keep);
alter table DR#TEST_IDX0009$R storage (buffer_pool keep);
alter table TEST modify partition p10 storage (buffer_pool keep);
alter table TEST modify partition p9 storage (buffer_pool keep);
We'll run these statements, bounce our database, then run our pre-loading queries again.
It would be handy if we could use the query against V$BH to make sure that the tables ARE in fact in the keep pool, but this information is not stored in V$BH - the various buffer caches are considered together.
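While V$BH won't tell us which pool a given buffer lives in, we can at least confirm that the segments are *assigned* to the keep pool, since the segment views report the buffer pool attribute:

```sql
-- Confirm which of our segments are assigned to the keep pool
-- (this shows the assignment, not the current cache contents)
SELECT segment_name, segment_type, buffer_pool
FROM   user_segments
WHERE  buffer_pool = 'KEEP'
ORDER  BY segment_name;
```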
@?/rdbms/admin/utlxplan.sql -- avoid tkprof plan table error
alter session set sql_trace=true;
-- Use a bind variable to cut down on recursive SQL from optimizer:
variable bnd varchar2(4000)
exec :bnd := 'district and region'
select /*+ FIRST_ROWS */ * from (
select id, filename from test
where contains (filename, :bnd) > 0
order by id desc )
where rownum <= 20;
tkprof mysid_ora_01234.trc traceout.txt sort=exeela table=testuser.plan_table
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.02 0.03 0 402 0 0
Fetch 3 0.00 0.11 34 65 0 20
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.02 0.15 34 467 0 20
Now if we look at the end of the file, we will see:
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 3 0.03 0.06 0 402 0 1
Fetch 3 0.00 0.11 34 65 0 20
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.03 0.17 34 467 0 21
Misses in library cache during parse: 2
Misses in library cache during execute: 2
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 41 0.02 0.01 0 0 0 0
Execute 66 0.08 0.08 0 543 0 7
Fetch 59 0.00 0.02 2 190 0 61
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 166 0.10 0.12 2 733 0 68
select /*+ rule */ bucket, endpoint, col#, epvalue
from
histgrm$ where obj#=:1 and intcol#=:2 and row#=:3 order by bucket
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.01 2 6 0 11
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.00 0.01 2 6 0 11
That leaves us with 32 unexplained disk reads from our main SQL statement.
To figure out where these come from, we need to dig a bit deeper.
We'll bounce the database again, re-preload the cache, but this time set an event flag before running our query:
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
To extract just the reads, we could do:
grep WAIT sid_ora_1234.trc | grep read
WAIT #15: nam='db file sequential read' ela= 53 p1=1 p2=55741 p3=1
WAIT #15: nam='db file sequential read' ela= 29 p1=1 p2=55740 p3=1
WAIT #1: nam='db file sequential read' ela= 35 p1=7 p2=168890 p3=1
WAIT #1: nam='db file scattered read' ela= 63 p1=7 p2=191027 p3=2
WAIT #1: nam='db file scattered read' ela= 53 p1=7 p2=191030 p3=3
WAIT #1: nam='db file scattered read' ela= 68 p1=7 p2=191034 p3=4
WAIT #1: nam='db file scattered read' ela= 50 p1=7 p2=191038 p3=3
WAIT #1: nam='db file sequential read' ela= 7 p1=7 p2=191067 p3=1
WAIT #1: nam='db file sequential read' ela= 6 p1=7 p2=206475 p3=1
WAIT #1: nam='db file sequential read' ela= 23 p1=7 p2=206506 p3=1
WAIT #1: nam='db file scattered read' ela= 59 p1=7 p2=206552 p3=3
WAIT #1: nam='db file scattered read' ela= 104 p1=7 p2=206555 p3=6
WAIT #1: nam='db file sequential read' ela= 16 p1=7 p2=150156 p3=1
WAIT #1: nam='db file sequential read' ela= 8 p1=7 p2=150162 p3=1
WAIT #1: nam='db file sequential read' ela= 8 p1=7 p2=169084 p3=1
WAIT #1: nam='db file sequential read' ela= 9 p1=7 p2=184356 p3=1
WAIT #1: nam='db file sequential read' ela= 9 p1=7 p2=184461 p3=1
WAIT #1: nam='db file sequential read' ela= 23 p1=7 p2=77 p3=1
WAIT #1: nam='db file sequential read' ela= 21 p1=7 p2=5434 p3=1
WAIT #1: nam='db file sequential read' ela= 19 p1=7 p2=5545 p3=1
WAIT #1: nam='db file sequential read' ela= 20 p1=7 p2=5614 p3=1
In order to translate these file and block numbers into useful information, we need to check DBA_EXTENTS, like this:
column owner format a30
column segment_name format a30
select owner, segment_name
from dba_extents
where &1 = file_id
and &2 between block_id and block_id + blocks - 1
/
We save this script as seg.sql (together with SET VERIFY OFF and any necessary column formatting) and then do:
cat $ORACLE_HOME/admin/$ORACLE_SID/udump/`ls -tr $ORACLE_HOME/admin/$ORACLE_SID/udump | tail -1` \
| sed -n -e 's/^WAIT.*read.*p1=\(.*\).*p2=\(.*\) .*/\1 \2/p' \
| xargs -l1 sqlplus -s "/ as sysdba" @seg.sql
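As a sanity check on the sed expression, we can run it over a single sample WAIT line (this one is made up, but has the same shape as the trace lines above) and confirm it emits the file number and block number:

```shell
# A representative 10046 WAIT line (sample data, same shape as a real trace)
line="WAIT #1: nam='db file sequential read' ela= 35 p1=7 p2=168890 p3=1"
# Should emit the file number (7) and the block number (168890)
echo "$line" | sed -n -e 's/^WAIT.*read.*p1=\(.*\).*p2=\(.*\) .*/\1 \2/p'
```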
The start of the output might look like this:
OWNER SEGMENT_NAME
---------- ------------------------------
SYS C_OBJ#_INTCOL#
OWNER SEGMENT_NAME
---------- ------------------------------
TESTUSER SYS_IL0000056876C00002$$
OWNER SEGMENT_NAME
---------- ------------------------------
TESTUSER SYS_LOB0000056876C00002$$
select segment_name, segment_type from dba_segments
where segment_name in ('SYS_IL0000056876C00002$$', 'SYS_LOB0000056876C00002$$');
SEGMENT_NAME SEGMENT_TYPE
------------------------------ ------------------
SYS_IL0000056876C00002$$ LOBINDEX
SYS_LOB0000056876C00002$$ LOBSEGMENT
select table_name from dba_indexes where owner = 'TESTUSER' and index_name = 'SYS_IL0000056876C00002$$';
TABLE_NAME
------------------------------
DR#TEST_IDX0010$R
select table_name from dba_lobs where owner = 'TESTUSER' and segment_name = 'SYS_LOB0000056876C00002$$';
TABLE_NAME
------------------------------
DR#TEST_IDX0010$R
In order to understand that, we need to know a bit more about LOBs. LOBs can be either in-line or out-of-line, meaning that the actual LOB data is stored either in the same segment as the rest of the table blocks, or in its own separate segment. By default, LOBs are stored in-line. However, if the LOB size exceeds 4K, it will automatically be stored out-of-line. You can force all LOBs to be stored out-of-line using the LOB storage clause "DISABLE STORAGE IN ROW".
In the case of the $I tables, the LOB lengths are restricted by the Text code to 4K, to make sure they can always be stored in-line to minimize I/O requirements. Hence there should never be any requirement for separate LOB segments for the $I tables.
However, in the case of $R tables, things are different. The $R table consists of a BLOB (or BLOBs) which is basically a list of ROWIDs for the base table, ordered by DOCID. In a small table with fewer than about 220 rows, this BLOB will be less than 4K and will be stored in-line. However, when it exceeds that size, it will be moved to out-of-line storage in a new segment.
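We can check which of the Text tables' LOB columns are eligible for in-line storage by querying USER_LOBS, whose IN_ROW column reflects the ENABLE/DISABLE STORAGE IN ROW setting:

```sql
-- IN_ROW = 'YES' means small LOBs may be stored in-line with the row;
-- the actual placement of each LOB still depends on its size (~4K threshold)
SELECT table_name, column_name, segment_name, in_row
FROM   user_lobs
WHERE  table_name LIKE 'DR#TEST_IDX%';
```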
To ensure that we get the $R LOB segments pre-cached in the keep pool, we need to do two things. First, we need to change our ALTER TABLE statements for the $R tables:
alter table DR#TEST_IDX0010$R storage (buffer_pool keep) modify lob (data) (storage (buffer_pool keep));
alter table DR#TEST_IDX0009$R storage (buffer_pool keep) modify lob (data) (storage (buffer_pool keep));
Secondly, we need to force the LOB blocks to be loaded, as well as the main table blocks for the $R. This is a little more complicated. We might think we could do:
SELECT SUM(ROW_NO), SUM(LENGTH(DATA)) FROM DR#TEST_IDX0010$R; /* WRONG!! */
But this doesn't work: the LENGTH of a LOB comes from the LOB locator's metadata, so the query does not actually read the out-of-line LOB blocks into the cache. Instead, we have to read through each LOB explicitly with DBMS_LOB.READ, using a procedure like this:
set serveroutput on size 1000000
create or replace procedure loadAllDollarR (idx_name varchar2) is
  v_idx_name varchar2(30) := upper(idx_name);
  type c_type is ref cursor;
  c2   c_type;
  s    varchar2(2000);
  b    blob;
  buff raw(100);     -- the read buffer for a BLOB must be RAW, not VARCHAR2
  siz  number;
  off  number;
  cntr number;
begin
  -- loop over the $R tables for each partition of the index
  for c1 in (select table_name t from user_tables
             where table_name like 'DR_'||v_idx_name||'%$R') loop
    dbms_output.put_line('loading from table '||c1.t);
    s := 'select data from '||c1.t;
    open c2 for s;
    loop
      fetch c2 into b;
      exit when c2%notfound;
      siz  := 10;
      off  := 1;
      cntr := 0;
      if dbms_lob.getlength(b) > 0 then
        begin
          -- read a few bytes from each 4K chunk, so that every
          -- LOB block gets fetched into the buffer cache
          loop
            dbms_lob.read(b, siz, off, buff);
            cntr := cntr + 1;
            off  := off + 4096;
          end loop;
        exception when no_data_found then
          -- reading past the end of the LOB terminates the loop
          if cntr > 0 then
            dbms_output.put_line('4K chunks fetched: '||cntr);
          end if;
        end;
      end if;
    end loop;
    close c2;
  end loop;
end;
/
exec LoadAllDollarR('test_idx')
So now we have everything we need to preload the last two partitions of our index.
-- This script preloads the last two partitions from a 10-way
-- partitioned table and its text index. The table is called TEST,
-- the index is called TEST_IDX. It is owned by user TESTUSER.
connect testuser/testuser
--------------------------------------------------------
-- First make sure all tables go into the keep buffer --
--------------------------------------------------------
alter table DR#TEST_IDX0010$I storage (buffer_pool keep);
alter table DR#TEST_IDX0009$I storage (buffer_pool keep);
alter index DR#TEST_IDX0010$X storage (buffer_pool keep);
alter index DR#TEST_IDX0009$X storage (buffer_pool keep);
alter table DR#TEST_IDX0010$R storage (buffer_pool keep)
modify lob (data) (storage (buffer_pool keep));
alter table DR#TEST_IDX0009$R storage (buffer_pool keep)
modify lob (data) (storage (buffer_pool keep));
alter table TEST modify partition p10 storage (buffer_pool keep);
alter table TEST modify partition p9 storage (buffer_pool keep);
set timing on
--------------------------------------------------------------
-- Then perform the necessary queries to preload the tables --
--------------------------------------------------------------
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO))
FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO))
FROM DR#TEST_IDX0009$I ITAB;
SELECT /*+ INDEX(ITAB) */ SUM(LENGTH(TOKEN_TEXT))
FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ INDEX(ITAB) */ SUM(LENGTH(TOKEN_TEXT))
FROM DR#TEST_IDX0009$I ITAB;
SELECT /*+ FULL(BTAB) */ SUM(ID)
FROM TEST BTAB WHERE ID >= 900000;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0010$R;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0009$R;
exec LoadAllDollarR('test_idx')
If we run this after bouncing the database and before running our test query, we should find that now there are NO reads from the $R tables at all. In fact, in my testcase I found the system reads went away as well, and my query now completed without any physical I/O at all.
The author welcomes feedback and corrections on this paper. Email him at roger.ford@oracle.com
Last modified: 31 August 2004 by Roger Ford