Oracle Database 12c Attribute Clusters and Zone Maps

Overview

Purpose

This tutorial covers Oracle Database 12c attribute clusters and zone maps, features designed to offer significant IO reduction for queries that would otherwise perform full table scans.

Time to Complete

Approximately 45 Minutes

Introduction

Attribute clustering is a table-level directive that clusters data in close physical proximity based on the content of certain columns. Storing data that logically belongs together in close physical proximity can greatly reduce the amount of data to be processed and can lead to better performance of certain queries in the workload.

A zone map is a independent access structure that can be built for a table. During table and index scans, zone maps enable you to prune disk blocks of a table and (potentially full partitions of a partitioned table) based on predicates on table columns. Zone maps do this by maintaining a list of minimum and maximum column values for each zone (or range) of blocks in a table. Zone maps do this for partitions and sub-partitions too. If columns with common attributes are clustered together, it becomes possible to minimize the number of zones that need to be scanned in order to find a particular predicate match. For this reason, the effectiveness of zone maps is improved if rows are clustered together using attribute clustering or if they are manually sorted on load (using, for example, an ETL process that includes a sort). Zone maps can be used with or without attribute clustering.

In contrast to traditional clustering methods, attribute clusters have the capability to cluster data in fact tables based on dimension table attribute values. This has wide application, but it is particularly useful in Data Warehousing, star schema environments. It is possible to reduce significantly the number of fact table blocks that need to be scanned during joins that filter on dimension attribute values, including dimension attribute value hierarchies. Zone maps can be used as an alternative to bitmap indexes.

Scenario

This tutorial implements a simple scenario to demonstrate how the features in Oracle Database 12c are designed to offer significant IO reduction for queries that would otherwise perform full table scans. You can find below a brief overview of tasks to be performed in this tutorial:

Create a SALES_SOURCE fact (this is used as a source of data for the fact tables created later in this tutorial).
Create two dimension tables: PRODUCTS and LOCATIONS.
Create a join attribute clustered table called SALES_AC, clustering it using product and location attribute values.
For comparative purposes, create a non-attribute clustered table called SALES.
Examine the behavior of index range scans on the attribute clustered fact table without using zone maps.
Remove fact table indexes and observe IO pruning using zone maps.
Create a partitioned table, SALES_P, and observe zone and partition pruning (for example, using predicates on columns that are not included in the partition key).
Invalidating and refreshing zone maps.

Important: Note that the generated data used in the tutorial is pseudo-random, so your query results will not match the example output exactly. A 16K block size was used when creating the sample output so the database statistic values you see will reflect differences in proportion to the block size you are using.

Hardware and Software Requirements

Oracle Database 12c
Zone maps require Oracle Exadata

Prerequisites

Before starting this tutorial, you should:

Have access to or have Installed Oracle Database 12c with a sample ORCL database, the SYS user with SYSDBA privilege and OS authentication (so that you can execute the sqlplus / as sysdba command).
Have adequate space allocated to USERS tablespace. This example uses the USERS tablespace, included in Oracle Database 12c. To demonstrate these features adequately, reasonably large tables are required
so approximately 1GB is required in the USERS tablespace.
Have downloaded and unzipped the 12c_aczm.zip file (which is in the files subdirectory of this tutorial) into a working directory.

Assumptions

This tutorial assumes that when you begin executing steps for a topic, you complete the entire topic before going to another one.

You can also re-execute the tutorial setup from step number one (01_setup12c.sql): it will drop the aczm12c user and all its objects and then recreates it.

Setup

Set up the tutorial by running the following scripts. These scripts do not form part of the tutorial, they create the database user and the source table data.

01_setup12c.sql
02_table_create.sql
03_dim_fill.sql
04_source_fill.sql

Attribute Clustering

In this section, you will create a couple of fact tables. SALES fact table will not have attribute clustering or a zone map. It will be used to compare against SALES_AC table, which will have attribute clustering and/or zone maps. Accomplish the task by following the steps below or you can use automated script 05_create_fact.sql.

The first step involves connecting to the Attribute Clusters/Zone Map Schema and creating the SALES fact table.

SQL> connect aczm12c/oracle_4U
Connected.
SQL>

SQL> CREATE TABLE sales
  2  AS
  3  SELECT * FROM sales_source
  4  WHERE 1 = -1
  5  /

Table created.

The second step involves creating SALES_AC fact table. The data will be the same as SALES but it will be used to demonstrate attribute clustering and zone maps in comparison to the standard SALES table.

View Code
```
SQL> CREATE TABLE sales_ac
  2  AS
  3  SELECT * FROM sales_source
  4  WHERE 1 = -1
  5  /

Table created. 
```
The third step involves the enabling of linear ordered attribute clustering on SALES_AC table. The code will simply order rows by location_id, product_id to see the effects of attribute clustering in isolation, the code will not create a zone map.

View Code
```
SQL> ALTER TABLE sales_ac
  2  ADD CLUSTERING BY LINEAR ORDER (location_id, product_id)
  3  WITHOUT MATERIALIZED ZONEMAP
  4  /

Table altered. 
```

Next step involves the inserting of data into standard table SALES. The insert execution plan is a simple insert.

SQL> INSERT /*+ APPEND */ INTO sales SELECT * FROM sales_source
  2  /

1952120 rows created.

Elapsed: 00:00:04.17
SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor)
  2  /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6magg9dfwqvjt, child number 0
-------------------------------------
INSERT /*+ APPEND */ INTO sales SELECT * FROM sales_source

Plan hash value: 1422891252

-------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |              |       |       |   132 (100)|          |
|   1 |  LOAD AS SELECT                  |              |       |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |              |  1952K|   100M|   132  (16)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL     | SALES_SOURCE |  1952K|   100M|   132  (16)| 00:00:01 |
-------------------------------------------------------------------------------------------------


15 rows selected.

Elapsed: 00:00:00.12
SQL> COMMIT
  2  /

Commit complete.

Elapsed: 00:00:00.02
SQL>

The next step involves the Inserting data into attribute clustered table. You must use a direct path operation to make use of attribute clustering. In real systems you will probably insert in multiple batches: each batch of inserts will be ordered appropriately. Later on, if you want to re-order all rows into tightly grouped zones you can, for example, use partitioning and MOVE PARTITION to do this. Increased elapsed time is likely due to the sort that is transparently performed to cluster the data as it is inserted into the SALES_AC table.

View Code
```
 SQL> INSERT /*+ APPEND */ INTO sales_ac SELECT * FROM sales_source
  2  /

1952120 rows created.

Elapsed: 00:00:05.64
SQL>
```

Observe the addition of "SORT ORDER BY" in the execution plan.

 SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor)
  2  /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  8wzqwqgwwynya, child number 0
-------------------------------------
INSERT /*+ APPEND */ INTO sales_ac SELECT * FROM sales_source

Plan hash value: 3351020411

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |              |       |       |       | 12803 (100)|          |
|   1 |  LOAD AS SELECT                  |              |       |       |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |              |  1952K|   100M|       | 12803   (1)| 00:00:02 |
|   3 |    SORT ORDER BY                 |              |  1952K|   100M|   149M| 12803   (1)| 00:00:02 |
|   4 |     TABLE ACCESS STORAGE FULL    | SALES_SOURCE |  1952K|   100M|       |   132  (16)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


16 rows selected.

Elapsed: 00:00:00.03
SQL> COMMIT
  2  /

Commit complete.

Elapsed: 00:00:00.02
SQL>

The final step involves the gathering of table statistics.

SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>NULL,tabname=>'sales');

PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>NULL,tabname=>'sales_ac');

PL/SQL procedure successfully completed.

Attribute clusters can be used without zone maps. By themselves, there is no scan IO pruning (other than via Exadata storage indexes). However, index range scans can benefit from improved performance where index columns match attribute cluster columns. In many cases, attribute clusters bring common data values together and make them local to one another. This can benefit compression ratios for row-based compression in particular.

The full potential of attribute clusters are realized when used in conjunction with zone maps, Exadata storage indexes and In-Memory min/max pruning. However, they also improve index clustering. This is demonstrated here. The steps can be performed manually or the script 06_ac_only.sql can be used.

The first step involves connecting to the Attribute Clusters/Zone Map Schema.

View Code
```
SQL> connect aczm12c/oracle_4U
Connected.
SQL> 
```

The second step involves the creating of indexes on location id for the standard SALES table and the attribute clustered SALES_AC table.

SQL> 
SQL> CREATE INDEX sales_loc_i ON sales (location_id)
  2  /

Index created.

SQL> 
SQL> CREATE INDEX sales_ac_loc_i ON sales_ac (location_id)
  2  /

Index created.

SQL>

Observe the improved value of "Average Data Blocks Per Key" for the attribute clustered SALES_AC table. This will result in fewer consistent gets for table lookups from index range scans.

SQL> 
SQL> SELECT index_name, clustering_factor,avg_data_blocks_per_key
  2  FROM   user_indexes
  3  WHERE  index_name LIKE 'SALES%LOC%'
  4  ORDER BY index_name
  5  /

INDEX_NAME                               CLUSTERING_FACTOR AVG_DATA_BLOCKS_PER_KEY
---------------------------------------- ----------------- -----------------------
SALES_AC_LOC_I                                        7950                       2
SALES_LOC_I                                        1877968                     597

2 rows selected.

SQL>

Confirm that index range scans are occurring in both query examples Hints are used in this case because the table is relatively small so Exadata may choose a bloom filter plan.

SQL> SELECT /*+ INDEX(sales sales_loc_i) */ SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  brscsf8u3nmym, child number 0
-------------------------------------
SELECT /*+ INDEX(sales sales_loc_i) */ SUM(amount) FROM   sales JOIN
locations  ON (sales.location_id = locations.location_id) WHERE
locations.state  = 'California' AND    locations.county = 'Alpine
County'

Plan hash value: 1895572434

---------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |       |       |   602 (100)|          |
|   1 |  SORT AGGREGATE               |             |     1 |    39 |            |          |
|   2 |   NESTED LOOPS                |             |   621 | 24219 |   602   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |             |   621 | 24219 |   602   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL | LOCATIONS   |     1 |    30 |     2   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN          | SALES_LOC_I |   621 |       |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID| SALES       |   621 |  5589 |   600   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - storage(("LOCATIONS"."COUNTY"='Alpine County' AND
              "LOCATIONS"."STATE"='California'))
       filter(("LOCATIONS"."COUNTY"='Alpine County' AND
              "LOCATIONS"."STATE"='California'))
   5 - access("SALES"."LOCATION_ID"="LOCATIONS"."LOCATION_ID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=7)
   - this is an adaptive plan


35 rows selected.

SQL> 
SQL> SELECT /*+ INDEX(sales_ac sales_ac_loc_i) */ SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dn138c8mn4h90, child number 0
-------------------------------------
SELECT /*+ INDEX(sales_ac sales_ac_loc_i) */ SUM(amount) FROM
sales_ac JOIN   locations  ON (sales_ac.location_id =
locations.location_id) WHERE  locations.state  = 'California' AND
locations.county = 'Alpine County'

Plan hash value: 918547280

------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                |       |       |     7 (100)|          |
|   1 |  SORT AGGREGATE               |                |     1 |    39 |            |          |
|   2 |   NESTED LOOPS                |                |   621 | 24219 |     7   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |                |   621 | 24219 |     7   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL | LOCATIONS      |     1 |    30 |     2   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN          | SALES_AC_LOC_I |   621 |       |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID| SALES_AC       |   621 |  5589 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - storage(("LOCATIONS"."COUNTY"='Alpine County' AND
              "LOCATIONS"."STATE"='California'))
       filter(("LOCATIONS"."COUNTY"='Alpine County' AND
              "LOCATIONS"."STATE"='California'))
   5 - access("SALES_AC"."LOCATION_ID"="LOCATIONS"."LOCATION_ID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=7)
   - this is an adaptive plan


35 rows selected.

SQL>

Run two test queries to cache all relevant data.

SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL>

Run queries again and observe the reduced number of consistent gets for the attribute cluster example.

SQL> SET AUTOTRACE ON STATISTICS
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7979  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         45  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SET AUTOTRACE OFF
SQL>

Drop the test indexes.

SQL> DROP INDEX sales_loc_i
  2  /

Index dropped.

SQL> 
SQL> DROP INDEX sales_ac_loc_i
  2  /

Index dropped.

SQL>

Linear and Interleaved Zone Maps

In this section you will explore how the zone maps prune IO without the use of indexes. You will also discover the benifits of using zone maps over conventional methods. Perform the below mentioned steps manually or you can use the automated script 07_zm_lin_inter.sql.

Connect to the Attribute Clusters/Zone Map Schema and clean up the indexes, zonemaps and clustering if any from the previous runs.

SQL> connect aczm12c/oracle_4U
Connected.
SQL> 
SQL> DROP INDEX sales_loc_i
  2  /
DROP INDEX sales_loc_i
           *
ERROR at line 1:
ORA-01418: specified index does not exist


SQL> DROP INDEX sales_ac_loc_i
  2  /
DROP INDEX sales_ac_loc_i
           *
ERROR at line 1:
ORA-01418: specified index does not exist


SQL> DROP MATERIALIZED ZONEMAP sales_ac_zmap
  2  /
DROP MATERIALIZED ZONEMAP sales_ac_zmap
*
ERROR at line 1:
ORA-12003: materialized view or zonemap "ACZM12C"."SALES_AC_ZMAP" does not exist


SQL> DROP MATERIALIZED ZONEMAP zmap$_sales_ac
  2  /
DROP MATERIALIZED ZONEMAP zmap$_sales_ac
*
ERROR at line 1:
ORA-12003: materialized view or zonemap "ACZM12C"."ZMAP$_SALES_AC" does not exist


SQL> ALTER TABLE sales_ac DROP CLUSTERING
  2  /

Table altered.

Create a zone map on the SALES_AC table using the same attribute clustering as before. We do not need to re-organize or move the data in the table as we are using the same clustering as before.

View Code
```
SQL> ALTER TABLE sales_ac
  2  ADD CLUSTERING BY LINEAR ORDER (location_id, product_id)
  3  WITH MATERIALIZED ZONEMAP
  4  /

Table altered.

SQL>
```

Observe the differences in the plan between the conventional SALES table and the attribute clusterd table SALES_AC with a zone map.

SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  location_id = 50
  4  /

SUM(AMOUNT)
-----------
  629560.92

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d4538r3rcjr6t, child number 0
-------------------------------------
SELECT SUM(amount) FROM   sales WHERE  location_id = 50

Plan hash value: 1047182207

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |       |       |   134 (100)|          |
|   1 |  SORT AGGREGATE            |       |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| SALES |   621 |  5589 |   134  (17)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("LOCATION_ID"=50)
       filter("LOCATION_ID"=50)


20 rows selected.

SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  WHERE  location_id = 50
  4  /

SUM(AMOUNT)
-----------
  629560.92

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  319c3616hqy3f, child number 0
-------------------------------------
SELECT SUM(amount) FROM   sales_ac WHERE  location_id = 50

Plan hash value: 1269548508

----------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |          |       |       |   134 (100)|          |
|   1 |  SORT AGGREGATE                         |          |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL WITH ZONEMAP| SALES_AC |   621 |  5589 |   134  (17)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("LOCATION_ID"=50)
       filter((SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_1_LOCATION_ID" > :1 OR
              zm."MAX_1_LOCATION_ID" < :2) THEN 3 ELSE 2 END END FROM "ACZM12C"."ZMAP$_SALES_AC" zm WHERE
              zm."ZONE_LEVEL$"=0 ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),50,50)<3 AND
              "LOCATION_ID"=50))


24 rows selected.

SQL>

Observe the IO differences for the zone mapped table. You may see on Exadata that storage indexes eliminate some of the IO too since this feature can work in combination with zone maps. Use an attribute cluster with LINEAR ordering. Since we are using attribute cluster with LINEAR ordering we should use predicates on location_id or location_id and product_id.

**Conventional scan**
SQL> SET AUTOTRACE ON STATISTICS
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  location_id = 50
  4  /

SUM(AMOUNT)
-----------
  629560.92

1 row selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7164  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  WHERE  location_id = 50
  4  /

SUM(AMOUNT)
-----------
  629560.92

1 row selected.


Statistics
----------------------------------------------------------
        257  recursive calls
          0  db block gets
        441  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         21  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**Conventional scan**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  location_id = 50
  4  AND    product_id = 10
  5  /

SUM(AMOUNT)
-----------
   15760.58

1 row selected.


Statistics
----------------------------------------------------------
          9  recursive calls
          0  db block gets
       7401  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  WHERE  location_id = 50
  4  AND    product_id = 10
  5  /

SUM(AMOUNT)
-----------
   15760.58

1 row selected.


Statistics
----------------------------------------------------------
        209  recursive calls
          0  db block gets
        652  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          9  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

A predicate on product_id alone will not prune effectively. INTERLEAVED ordering removes this limitation.


**Conventional scan**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  product_id = 10
  4  /

SUM(AMOUNT)
-----------
 72137333.1

1 row selected.


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       7164  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map - but no pruning on product_id alone**

SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  WHERE  product_id = 10
  4  /

SUM(AMOUNT)
-----------
 72137333.1

1 row selected.


Statistics
----------------------------------------------------------
        139  recursive calls
          0  db block gets
       7311  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          9  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SET AUTOTRACE OFF
SQL>

Drop the attribute cluster because we created the zone map at the same time as the cluster, the zone map will be dropped too.

View Code
```
SQL> ALTER TABLE sales_ac DROP CLUSTERING
  2  /

Table altered.

SQL> TRUNCATE TABLE sales_ac
  2  /

Table truncated.

SQL>  
```

Enable interleaved join attribute clustering on SALES_AC table.

SQL> ALTER TABLE sales_ac
  2  ADD CLUSTERING sales_ac
  3  JOIN locations ON (sales_ac.location_id = locations.location_id)
  4  JOIN products  ON (sales_ac.product_id = products.product_id)
  5  BY INTERLEAVED ORDER ((locations.state, locations.county),products.product_name)
  6  WITHOUT MATERIALIZED ZONEMAP
  7  /

Table altered.

Create the zone map sales_ac_zmap.

SQL> CREATE MATERIALIZED ZONEMAP sales_ac_zmap
  2  AS
  3  SELECT SYS_OP_ZONE_ID(s.rowid),
  4         MIN(l.state) min_state,
  5         MAX(l.state) max_state,
  6         MIN(l.county) min_county,
  7         MAX(l.county) max_county,
  8         MIN(p.product_name) min_prod,
  9         MAX(p.product_name) max_prod
 10  FROM sales_ac s,
 11       locations l,
 12       products p
 13  WHERE s.location_id = l.location_id(+)
 14  AND   s.product_id = p.product_id(+)
 15  GROUP BY SYS_OP_ZONE_ID(s.rowid)
 16  /

Materialized zonemap created.

SQL>

Insert data and observe that the sorts and joins are performed to cluster data in the SALES_AC table. The direct path insert operation will maintain the zone map for us.

SQL> INSERT /*+ APPEND */ INTO sales_ac SELECT * FROM sales_source
  2  /

1952120 rows created.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor)
  2  /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  8wzqwqgwwynya, child number 0
-------------------------------------
INSERT /*+ APPEND */ INTO sales_ac SELECT * FROM sales_source

Plan hash value: 2808338825

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |              |       |       |       | 21290 (100)|          |
|   1 |  LOAD AS SELECT                  |              |       |       |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |              |  1952K|   180M|       | 21290   (1)| 00:00:02 |
|   3 |    SORT ORDER BY                 |              |  1952K|   180M|   202M| 21290   (1)| 00:00:02 |
|*  4 |     HASH JOIN RIGHT OUTER        |              |  1952K|   180M|       |   147  (22)| 00:00:01 |
|   5 |      TABLE ACCESS STORAGE FULL   | LOCATIONS    |  3143 | 94290 |       |     2   (0)| 00:00:01 |
|*  6 |      HASH JOIN RIGHT OUTER       |              |  1952K|   124M|       |   140  (19)| 00:00:01 |
|   7 |       TABLE ACCESS STORAGE FULL  | PRODUCTS     |    28 |   364 |       |     2   (0)| 00:00:01 |
|   8 |       TABLE ACCESS STORAGE FULL  | SALES_SOURCE |  1952K|   100M|       |   132  (16)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("SALES_SOURCE"."LOCATION_ID"="LOCATIONS"."LOCATION_ID")
   6 - access("SALES_SOURCE"."PRODUCT_ID"="PRODUCTS"."PRODUCT_ID")


26 rows selected.

SQL> COMMIT
  2  /

Commit complete.

SQL> 
SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>NULL,tabname=>'sales_ac');

PL/SQL procedure successfully completed.

SQL> 
 
SQL>

Compare the number of consistent gets of the zone map table against the standard table.


**Conventional**
SQL> SET AUTOTRACE ON STATISTICS
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  /

SUM(AMOUNT)
-----------
 35782231.8

1 row selected.


Statistics
----------------------------------------------------------
         29  recursive calls
          0  db block gets
       7222  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  /

SUM(AMOUNT)
-----------
 35782231.8

1 row selected.


Statistics
----------------------------------------------------------
        150  recursive calls
          0  db block gets
       4833  consistent gets
          0  physical reads
        124  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**Conventional**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
         61  recursive calls
          0  db block gets
       7281  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
        152  recursive calls
          0  db block gets
       4851  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Since interleaved ordering was used to cluster the table, predicates can be used in various combinations. In particular, pruning is still effective if product_name is used alone. Predicates for location dimensions do not need to be included.


**Conventional**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   products  ON (sales.product_id = products.product_id)
  4  WHERE  products.product_name  = 'DATEPALM'
  5  /

SUM(AMOUNT)
-----------
 36446695.9

1 row selected.


Statistics
----------------------------------------------------------
         29  recursive calls
          0  db block gets
       7207  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   products  ON (sales_ac.product_id = products.product_id)
  4  WHERE  products.product_name  = 'DATEPALM'
  5  /

SUM(AMOUNT)
-----------
 36446695.9

1 row selected.


Statistics
----------------------------------------------------------
        149  recursive calls
          0  db block gets
       2729  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SET AUTOTRACE OFF

Join and Index Pruning

A join zone map is defined on a table that has an outer join to one or more other tables and maintains the minimum and maximum values of some columns in the other tables; these join conditions are common in master-detail relationships as well as in star schemas between fact and dimension tables.

Joins are defined between a fact table and dimension tables with frequently-used predicates on the dimension hierarchy columns. Fact table rows can be ordered by dimension attribute values, pruning zones that are excluded by predicates on attribute values.

In this section you will explore the benefits of join zone map over conventional non-zone map methods. Perform the below mentioned steps manually or you can use automated script 08_zm_prune.sql

Connect to the Attribute Clusters/Zone Map Schema and clean up the zonemaps and clustering (if any) from the previous runs.


SQL> connect aczm12c/oracle_4U
Connected.
SQL> 
SQL> ALTER TABLE sales_ac DROP CLUSTERING
  2  /

Table altered.

SQL> DROP MATERIALIZED ZONEMAP sales_ac_zmap
  2  /

Materialized zonemap dropped.

SQL> DROP MATERIALIZED ZONEMAP  zmap$_sales_ac
  2  /
DROP MATERIALIZED ZONEMAP  zmap$_sales_ac
*
ERROR at line 1:
ORA-12003: materialized view or zonemap "ACZM12C"."ZMAP$_SALES_AC" does not exist


SQL>

Cluster the SALES_AC table again.

SQL> ALTER TABLE sales_ac
  2  ADD CLUSTERING sales_ac
  3  JOIN locations ON (sales_ac.location_id = locations.location_id)
  4  JOIN products  ON (sales_ac.product_id = products.product_id)
  5  BY INTERLEAVED ORDER ((locations.state, locations.county), products.product_name, sales_ac.location_id)
  6  WITHOUT MATERIALIZED ZONEMAP
  7  /

Table altered.

SQL>

Since you have changed the clustering columns, you need to re-organize the table. This can be achieved using a move operation.

View Code
```
SQL> ALTER TABLE sales_ac MOVE
  2  /

Table altered. 
```

Create the zone map sales_ac_zmap.

SQL> CREATE MATERIALIZED ZONEMAP sales_ac_zmap
  2  AS
  3  SELECT SYS_OP_ZONE_ID(s.rowid),
  4         MIN(l.state) min_state,
  5         MAX(l.state) max_state,
  6         MIN(l.county) min_county,
  7         MAX(l.county) max_county,
  8         MIN(p.product_name) min_prod,
  9         MAX(p.product_name) max_prod,
 10         MIN(s.location_id) min_loc,
 11         MAX(s.location_id) max_loc
 12  FROM sales_ac s,
 13       locations l,
 14       products p
 15  WHERE s.location_id = l.location_id(+)
 16  AND   s.product_id = p.product_id(+)
 17  GROUP BY SYS_OP_ZONE_ID(s.rowid)
 18  /

Materialized zonemap created.

SQL>

Reduced IO should be observed with zone mapped table when compared againest the non-zone mapped table (sales).

SQL> SET AUTOTRACE ON STATISTICS
SQL> 

**Conventional**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  JOIN   locations  ON (sales.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
        218  recursive calls
          0  db block gets
      11884  consistent gets
          0  physical reads
        124  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.


Statistics
----------------------------------------------------------
        952  recursive calls
          0  db block gets
       7143  consistent gets
          0  physical reads
        328  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         13  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**Conventional**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  location_id = 1000
  4  /

SUM(AMOUNT)
-----------
  621174.93

1 row selected.


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       7164  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  WHERE  location_id = 1000
  4  /

SUM(AMOUNT)
-----------
  621174.93

1 row selected.


Statistics
----------------------------------------------------------
        112  recursive calls
          0  db block gets
       5603  consistent gets
          0  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SET AUTOTRACE OFF
SQL>

Verify the plan table for Scan and join pruning.

SQL> SELECT SUM(amount)
  2  FROM   sales_ac
  3  JOIN   locations  ON (sales_ac.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d6p2mb4qnacur, child number 4
-------------------------------------
SELECT SUM(amount) FROM   sales_ac JOIN   locations  ON
(sales_ac.location_id = locations.location_id) WHERE  locations.state
= 'California' AND    locations.county = 'Alpine County'

Plan hash value: 3649541064

-------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |           |       |       |   127 (100)|          |
|   1 |  SORT AGGREGATE                           |           |     1 |    39 |            |          |
|*  2 |   HASH JOIN                               |           |   612 | 23868 |   127  (19)| 00:00:01 |
|   3 |    JOIN FILTER CREATE                     | :BF0000   |     1 |    30 |     2   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL             | LOCATIONS |     1 |    30 |     2   (0)| 00:00:01 |
|   5 |    JOIN FILTER USE                        | :BF0000   |  1952K|    16M|   120  (16)| 00:00:01 |
|*  6 |     TABLE ACCESS STORAGE FULL WITH ZONEMAP| SALES_AC  |  1952K|    16M|   120  (16)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SALES_AC"."LOCATION_ID"="LOCATIONS"."LOCATION_ID")
   4 - storage(("LOCATIONS"."COUNTY"='Alpine County' AND "LOCATIONS"."STATE"='California'))
       filter(("LOCATIONS"."COUNTY"='Alpine County' AND "LOCATIONS"."STATE"='California'))
   6 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"SALES_AC"."LOCATION_ID"))
       filter((SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN ((ORA_RAWCOMPARE(zm."MIN_LOC",:1,8)>0 OR
              ORA_RAWCOMPARE(zm."MAX_LOC",:2,8)<0)) THEN 3 ELSE 2 END END FROM "ACZM12C"."SALES_AC_ZMAP" zm
              WHERE zm."ZONE_LEVEL$"=0 ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),SYSVARCOL,SYSVARCOL)<3
              AND SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_STATE" > :1 OR zm."MAX_STATE" < :2
              OR zm."MIN_COUNTY" > :3 OR zm."MAX_COUNTY" < :4) THEN 3 ELSE 2 END END FROM
              "ACZM12C"."SALES_AC_ZMAP" zm WHERE zm."ZONE_LEVEL$"=0 ORDER BY
              zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),'California','California','Alpine County','Alpine
              County')<3 AND SYS_OP_BLOOM_FILTER(:BF0000,"SALES_AC"."LOCATION_ID")))

Note
-----
   - dynamic statistics used: dynamic sampling (level=7)
   - statistics feedback used for this statement


43 rows selected.

Create index on SALES_AC(LOCATION_ID) and SALES(LOCATION_ID).

SQL> CREATE  INDEX sales_ac_loc_i on sales_ac(location_id)
  2  /

Index created.

SQL> CREATE  INDEX sales_loc_i on sales(location_id)
  2  /

Index created.

Index rowids can be pruned by zone. Execute the following statements and verify index pruning using the plan table.

SQL> SELECT sum(amount)
  2  FROM   sales_ac
  3  WHERE  location_id = 1000
  4  AND    order_item_number = 1
  5  /

SUM(AMOUNT)
-----------
   62181.06

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c17umdz5bsb78, child number 0
-------------------------------------
SELECT sum(amount) FROM   sales_ac WHERE  location_id = 1000 AND
order_item_number = 1

Plan hash value: 1816803117

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                |       |       |    31 (100)|          |
|   1 |  SORT AGGREGATE                              |                |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED WITH ZO| SALES_AC       |    30 |   360 |    31   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                          | SALES_AC_LOC_I |   621 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter((SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_LOC" > :1 OR zm."MAX_LOC" < :2) THEN 3
              ELSE 2 END END FROM "ACZM12C"."SALES_AC_ZMAP" zm WHERE zm."ZONE_LEVEL$"=0 ORDER BY
              zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),1000,1000)<3 AND "ORDER_ITEM_NUMBER"=1))
   3 - access("LOCATION_ID"=1000)


25 rows selected.

SQL> 
SQL> DROP  INDEX sales_ac_loc_i
  2  /

Index dropped.

SQL> DROP  INDEX sales_loc_i
  2  /

Index dropped.

SQL>

Zone Maps on Partitioned Tables

Zone maps keep partition-level information as well as zone-level information. This makes it possible to partition elimination on columns that are not included in partition keys or even on dimension attribute values. The likelihood of partition elimination is dependent on the level of correlation between zone map column values and the values in the partition key column (or columns). Perform the following tasks as illustration of the Zone Maps on Partitioned Tables or you can use automated script 09_part_zm.sql.

Connect to the Attribute Clusters/Zone Map Schema and drop the partitioned sales table (SALES_P) if it exists.

View Code
```
SQL> connect aczm12c/oracle_4U
Connected.
SQL> 
SQL> DROP TABLE sales_p
  2  /

Table dropped.

SQL>  
```

Create a range partitioned sales_p table with join attribute clustering and a zone map.

SQL> CREATE TABLE sales_p (
  2    order_id       number(20)     not null
  3  , order_item_number  number(3)  not null
  4  , sale_date      date       not null
  5  , delivered      date
  6  , sale_agent     varchar2(100)  not null
  7  , product_id     number(10)     not null
  8  , amount         number(10,2)   not null
  9  , quantity       number(5)  not null
 10  , location_id    number(20)     not null
 11  , warehouse      varchar2(100)  not null
 12  )
 13  CLUSTERING sales_p
 14  JOIN locations ON (sales_p.location_id = locations.location_id)
 15  JOIN products  ON (sales_p.product_id = products.product_id)
 16  BY INTERLEAVED ORDER ((locations.state, locations.county),products.product_name, sales_p.delivered)
 17  WITH MATERIALIZED ZONEMAP
 18  PARTITION BY RANGE(sale_date) (
 19   PARTITION p1 VALUES LESS THAN (to_date('2005-01-01','YYYY-MM-DD'))
 20  ,PARTITION p2 VALUES LESS THAN (to_date('2010-01-01','YYYY-MM-DD'))
 21  )
 22  /

Table created.

SQL>

Insert data from sales_source into sales_p table.

SQL> INSERT /*+ APPEND */ INTO sales_p SELECT * FROM sales_source
  2  /

1952120 rows created.

SQL> COMMIT
  2  /

Commit complete.

SQL>

Gather statistics on the sales_p table and confirm that the query plan includes a zone map filter.

SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>NULL,tabname=>'sales_p');

PL/SQL procedure successfully completed.

SQL> 

SQL> SELECT SUM(amount)
  2  FROM   sales_p
  3  JOIN   locations  ON (sales_p.location_id = locations.location_id)
  4  WHERE  locations.state  = 'California'
  5  AND    locations.county  = 'Alpine County'
  6  /

SUM(AMOUNT)
-----------
  590776.51

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  apmzythd75fjb, child number 0
-------------------------------------
SELECT SUM(amount) FROM   sales_p JOIN   locations  ON
(sales_p.location_id = locations.location_id) WHERE  locations.state  =
'California' AND    locations.county  = 'Alpine County'

Plan hash value: 421480268

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |           |       |       |   139 (100)|          |       |       |
|   1 |  SORT AGGREGATE                            |           |     1 |    39 |            |          |       |       |
|*  2 |   HASH JOIN                                |           |   621 | 24219 |   139  (18)| 00:00:01 |       |       |
|   3 |    JOIN FILTER CREATE                      | :BF0000   |     1 |    30 |     2   (0)| 00:00:01 |       |       |
|*  4 |     TABLE ACCESS STORAGE FULL              | LOCATIONS |     1 |    30 |     2   (0)| 00:00:01 |       |       |
|   5 |    JOIN FILTER USE                         | :BF0000   |  1952K|    16M|   131  (15)| 00:00:01 |       |       |
|   6 |     PARTITION RANGE ITERATOR               |           |  1952K|    16M|   131  (15)| 00:00:01 |KEY(ZM)|KEY(ZM)|
|*  7 |      TABLE ACCESS STORAGE FULL WITH ZONEMAP| SALES_P   |  1952K|    16M|   131  (15)| 00:00:01 |KEY(ZM)|KEY(ZM)|
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SALES_P"."LOCATION_ID"="LOCATIONS"."LOCATION_ID")
   4 - storage(("LOCATIONS"."COUNTY"='Alpine County' AND "LOCATIONS"."STATE"='California'))
       filter(("LOCATIONS"."COUNTY"='Alpine County' AND "LOCATIONS"."STATE"='California'))
   7 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"SALES_P"."LOCATION_ID"))
       filter((SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN BITAND(zm."ZONE_STATE$",1)=1 THEN
              1 ELSE CASE WHEN (zm."MIN_1_STATE" > :1 OR zm."MAX_1_STATE" < :2 OR zm."MIN_2_COUNTY" > :3 OR zm."MAX_2_COUNTY"
              < :4) THEN 3 ELSE 2 END END FROM "ACZM12C"."ZMAP$_SALES_P" zm WHERE zm."ZONE_LEVEL$"=0 ORDER BY
              zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),'California','California','Alpine County','Alpine County')<3 AND
              SYS_OP_BLOOM_FILTER(:BF0000,"SALES_P"."LOCATION_ID")))

Note
-----
   - dynamic statistics used: dynamic sampling (level=7)


38 rows selected.

SQL>

Observe the zone id and the min and max order_id for each zone. The zone map state for each zone will be "0", which equates to "valid". Zone level "1" represents partitions and "0" represents zones. "Delivered" date correlates well with the partition key: "Sale Date". This is because we can expect a delivery to occur soon after a sale. So, because the delivered date correlates well with the partition key, each partition will contain a subset of "delivered" values. We should expect to be able to prune partitions from queries that filter on the "delivered" date. Execute the following statement.

SQL> SELECT zone_id$ ,
  2    min_4_delivered ,
  3    max_4_delivered ,
  4    zone_level$,
  5    zone_state$ ,
  6    zone_rows$
  7  FROM ZMAP$_SALES_P;

       ZONE_ID$ MIN_4_DELI MAX_4_DELI ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
--------------- ---------- ---------- ----------- ----------- ----------
   383850123339 2000-01-04 2002-01-29           0           0     188631
   383854317641 2009-01-04 2010-01-30           0           0      62699
   383854317640 2009-01-04 2010-01-30           0           0      62740
   383854317645 2009-01-03 2010-01-30           0           0     125392
   383854317642 2009-01-04 2010-01-30           0           0      62834
   383850123342 2000-01-04 2002-01-29           0           0     194035
   383850123335 2000-01-04 2002-01-29           0           0     141472
   383850123340 2000-01-04 2002-01-29           0           0     204370
   383850123325 2000-01-06 2002-01-26           0           0       5677
   383850123341 2000-01-04 2002-01-29           0           0     125737
   383850123337 2000-01-04 2002-01-29           0           0     188637
   383850123336 2000-01-04 2002-01-29           0           0     188748
   383854317629 2009-01-05 2010-01-27           0           0       1727
   383850123329 2000-01-04 2002-01-29           0           0      25707
   383854317643 2009-01-04 2010-01-29           0           0      62676
   383850123338 2000-01-04 2002-01-29           0           0     189105
   383854317639 2009-01-03 2010-01-30           0           0      47011
   383854317647 2009-01-04 2010-01-28           0           0      14120
   383854317633 2009-01-03 2010-01-29           0           0      13766
   383854317644 2009-01-04 2010-01-29           0           0      47036
          91517 2000-01-04 2002-01-29           1           0    1452119
          91518 2009-01-03 2010-01-30           1           0     500001

22 rows selected.

SQL>

Observe that Pstart, Pstop shows, KEY(ZM) are indicating the potential to prune partitions. Execute the following the statement.

SQL> SELECT SUM(amount)
  2  FROM   sales_p
  3  WHERE  delivered  between TO_DATE('18-SEP-00', 'DD-MON-YY') and TO_DATE('19-SEP-00', 'DD-MON-YY')
  4  /

SUM(AMOUNT)
-----------
 2800312.59

1 row selected.

SQL> 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gk1za5gqcsaa5, child number 0
-------------------------------------
SELECT SUM(amount) FROM   sales_p WHERE  delivered  between
TO_DATE('18-SEP-00', 'DD-MON-YY') and TO_DATE('19-SEP-00', 'DD-MON-YY')

Plan hash value: 1397730817

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |         |       |       |   155 (100)|          |       |       |
|   1 |  SORT AGGREGATE                           |         |     1 |    13 |            |          |       |       |
|*  2 |   FILTER                                  |         |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR               |         |  1972 | 25636 |   155  (28)| 00:00:01 |KEY(ZM)|KEY(ZM)|
|*  4 |     TABLE ACCESS STORAGE FULL WITH ZONEMAP| SALES_P |  1972 | 25636 |   155  (28)| 00:00:01 |KEY(ZM)|KEY(ZM)|
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_DATE('19-SEP-00','DD-MON-YY')>=TO_DATE('18-SEP-00','DD-MON-YY'))
   4 - storage(("DELIVERED"<=TO_DATE('19-SEP-00','DD-MON-YY') AND
              "DELIVERED">=TO_DATE('18-SEP-00','DD-MON-YY')))
       filter((SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN BITAND(zm."ZONE_STATE$",1)=1
              THEN 1 ELSE CASE WHEN (zm."MAX_4_DELIVERED" < :1 OR zm."MIN_4_DELIVERED" > :2) THEN 3 ELSE 2 END END FROM
              "ACZM12C"."ZMAP$_SALES_P" zm WHERE zm."ZONE_LEVEL$"=0 ORDER BY
              zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),TO_DATE('18-SEP-00','DD-MON-YY'),TO_DATE('19-SEP-00','DD-MON-YY'))<3
              AND "DELIVERED"<=TO_DATE('19-SEP-00','DD-MON-YY') AND "DELIVERED">=TO_DATE('18-SEP-00','DD-MON-YY')))

Note
-----
   - dynamic statistics used: dynamic sampling (level=7)


33 rows selected.

SQL>

Execute the following piece of code to observe the effects of IO pruning. Exadata storage indexes may effect the actual number of blocks read from storage cells. However, using zone maps will ensure that pruning can occur in all appropriate circumstances.

SQL> SET AUTOTRACE ON STATISTICS
SQL> 

**Conventional table**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales
  3  WHERE  delivered  between TO_DATE('18-SEP-00', 'DD-MON-YY') and TO_DATE('19-SEP-00', 'DD-MON-YY')
  4  /

SUM(AMOUNT)
-----------
 2800312.59

1 row selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7165  consistent gets
          0  physical reads
        124  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

**With zone map**
SQL> 
SQL> SELECT SUM(amount)
  2  FROM   sales_p
  3  WHERE  delivered  between TO_DATE('18-SEP-00', 'DD-MON-YY') and TO_DATE('19-SEP-00', 'DD-MON-YY')
  4  /

SUM(AMOUNT)
-----------
 2800312.59

1 row selected.


Statistics
----------------------------------------------------------
         61  recursive calls
          0  db block gets
       5972  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> SET AUTOTRACE OFF
SQL>

Zone Map Maintenance

Zone maps are based on tables and, therefore, any changes to the underlying tables impacts the state of the zone map. Depending on the operation performed on the table, some or all zones of a zone map are impacted.

When there is a change in the structure of base tables on which a zone map is based, for example, dropping a table column whose minimum and maximum values are maintained by the zone map, then the zone map becomes invalid.

Certain operations will invalidate zones. Maintenance operations are required to refresh zone maps.

Zone maps containing some stale zones can still be used by queries. Query data will continue to be returned as expected. Perform the below mentioned steps manually or you can use automated script 10_zm_maint.sql

Connect to the Attribute Clusters/Zone Map Schema

View Code
```
SQL> connect aczm12c/oracle_4U
Connected.
SQL>  
```

Delete operations do not invalidate zones or partitions because they will not invalidate MIN/MAX value ranges. Try deleting a row from sales_p table. You will observe that the state remains "0" for all zones and partitions.

SQL> DELETE FROM sales_p WHERE order_id = 10
  2  /

1 rows deleted.

SQL> COMMIT
  2  /

Commit complete.

SQL> 
SQL> 
SQL> SELECT zone_id$ ,
  2    zone_level$,
  3    zone_state$ ,
  4    zone_rows$
  5  FROM ZMAP$_SALES_P;

  ZONE_ID$ ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- ----------- ----------- ----------
     91557           1           0    1452090
3.8385E+11           0           0      62699
3.8385E+11           0           0      62740
3.8385E+11           0           0     125392
3.8385E+11           0           0      62834
3.8402E+11           0           0     141469
3.8402E+11           0           0     188750
3.8402E+11           0           0       5677
3.8402E+11           0           0     194043
3.8402E+11           0           0     204359
3.8402E+11           0           0     125740
3.8402E+11           0           0     188627
3.8385E+11           0           0       1727
3.8402E+11           0           0      25707
3.8385E+11           0           0      62676
3.8402E+11           0           0     188633
3.8385E+11           0           0      47011
3.8385E+11           0           0      14120
3.8385E+11           0           0      13766
3.8385E+11           0           0      47036
3.8402E+11           0           0     189085
     91518           1           0     500001

22 rows selected.

Check the state of zone map by querying the user_zonemaps table.

SQL> 
SQL> SELECT stale
  2  FROM   user_zonemaps
  3  WHERE  zonemap_name = 'ZMAP$_SALES_P'
  4  /

STALE
-------
NO

1 row selected.

SQL>

Note that the zone map is not made stale.

Conventional path insert will invalidate relevant zones and partitions unless the zone map is set to refresh on commit. Insert a row into the sales_p table.

SQL> INSERT INTO sales_p
  2  SELECT 10,1,TO_DATE('01-JAN-2000','DD-MON-YYYY'),TO_DATE('02-JAN-2000','DD-MON-YYYY'),'JANE',23,20,2,67,'WINSTON SALEM'
  3  FROM   dual
  4  /

1 row created.

SQL> COMMIT
  2  /

Commit complete.

SQL>

Querying ZMAP$_SALES_P will show you that individual zones are now invalidated.

SQL> SELECT zone_id$ ,
  2    zone_level$,
  3    zone_state$ ,
  4    zone_rows$
  5  FROM ZMAP$_SALES_P;

  ZONE_ID$ ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- ----------- ----------- ----------
     91557           1           1    1452090
3.8385E+11           0           0      62699
3.8385E+11           0           0      62740
3.8385E+11           0           0     125392
3.8385E+11           0           0      62834
3.8402E+11           0           0     141469
3.8402E+11           0           0     188750
3.8402E+11           0           0       5677
3.8402E+11           0           1     194043
3.8402E+11           0           0     204359
3.8402E+11           0           0     125740
3.8402E+11           0           0     188627
3.8385E+11           0           0       1727
3.8402E+11           0           0      25707
3.8385E+11           0           0      62676
3.8402E+11           0           0     188633
3.8385E+11           0           0      47011
3.8385E+11           0           0      14120
3.8385E+11           0           0      13766
3.8385E+11           0           0      47036
3.8402E+11           0           0     189085
     91518           1           0     500001

22 rows selected.

SQL>

Check the state of zone map by querying the user_zonemaps table again.

View Code
```
SQL> SELECT stale
  2  FROM   user_zonemaps
  3  WHERE  zonemap_name = 'ZMAP$_SALES_P'
  4  /

STALE
-------
NO

1 row selected.

SQL>  
```
As observed earlier,individual zones get invalidated after the insertion of the row but the zone map is not stale.
If zone map itself is not stale, a fast refresh is possible. Only stale zones are scanned to refresh the zone map. Refresh the zone map.

View Code
```
SQL> EXECUTE dbms_mview.refresh('ZMAP$_SALES_P', 'f');

PL/SQL procedure successfully completed.

SQL>  
```

Querying ZMAP$_SALES_P will show you that the zones are valid again.

SQL> SELECT zone_id$ ,
  2    zone_level$,
  3    zone_state$ ,
  4    zone_rows$
  5  FROM ZMAP$_SALES_P;

  ZONE_ID$ ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- ----------- ----------- ----------
     91557           1           0    1452091
3.8385E+11           0           0      62699
3.8385E+11           0           0      62740
3.8385E+11           0           0     125392
3.8385E+11           0           0      62834
3.8402E+11           0           0     141469
3.8402E+11           0           0     188750
3.8402E+11           0           0       5677
3.8402E+11           0           0     194044
3.8402E+11           0           0     204359
3.8402E+11           0           0     125740
3.8402E+11           0           0     188627
3.8385E+11           0           0       1727
3.8402E+11           0           0      25707
3.8385E+11           0           0      62676
3.8402E+11           0           0     188633
3.8385E+11           0           0      47011
3.8385E+11           0           0      14120
3.8385E+11           0           0      13766
3.8385E+11           0           0      47036
3.8402E+11           0           0     189085
     91518           1           0     500001

22 rows selected.

SQL>

Remove the "test" row. Zones will remain valid.

SQL> DELETE FROM sales_p WHERE order_id = 10
  2  /

1 row deleted.

SQL> COMMIT
  2  /

Commit complete.

SQL>

Direct path operations will maintain the zone map. Insert a row into sales_p.

SQL> INSERT /*+ APPEND */ INTO sales_p
  2  SELECT 10,1,TO_DATE('01-JAN-2000','DD-MON-YYYY'),TO_DATE('02-JAN-2000','DD-MON-YYYY'),'JANE',23,20,2,67,'WINSTON SALEM'
  3  FROM   dual
  4  /

1 row created.

SQL> COMMIT
  2  /

Commit complete.

SQL>

Querying ZMAP$_SALES_P will show you that the zones are still valid.

SQL> SELECT zone_id$ ,
  2    zone_level$,
  3    zone_state$ ,
  4    zone_rows$
  5  FROM ZMAP$_SALES_P;

  ZONE_ID$ ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- ----------- ----------- ----------
     91557           1           0    1452091
3.8385E+11           0           0      62699
3.8385E+11           0           0      62740
3.8385E+11           0           0     125392
3.8385E+11           0           0      62834
3.8402E+11           0           0     141469
3.8402E+11           0           0     188750
3.8402E+11           0           0       5677
3.8402E+11           0           0     194044
3.8402E+11           0           0     204359
3.8402E+11           0           0     125740
3.8402E+11           0           0     188627
3.8385E+11           0           0       1727
3.8402E+11           0           0      25707
3.8385E+11           0           0      62676
3.8402E+11           0           0     188633
3.8385E+11           0           0      47011
3.8385E+11           0           0      14120
3.8385E+11           0           0      13766
3.8385E+11           0           0      47036
3.8402E+11           0           0     189085
     91518           1           0     500001

22 rows selected.

SQL>

Updates to non-zone map columns (and columns not used to join with dimension tables) do not invalidate the zones or partitions (unless there is row movement when a partition key is updated). Perform an update operation sales_p table.

View Code
```
SQL> UPDATE sales_p SET amount = amount + 100
  2  WHERE location_id < 20
  3  /

11609 rows updated.

SQL> COMMIT
  2  /

Commit complete.

SQL>  
```