Skip to content

Low Cardinality Index Analysis (dlowcardind.sql)

This script identifies problematic low cardinality indexes by:

  • Finding indexes where the leading column has low selectivity
  • Calculating filter factor (distinct values / total rows) for each index
  • Filtering to show only indexes with < 15% selectivity and < 50 distinct values
  • Focusing on single-column indexes (position 1) for analysis
  • Highlighting potential candidates for index removal or redesign
rem dindex.sql
rem
rem
rem
ttitle 'Index Storage Parameters'
rem
col table_owner format a8 heading 'OWNER'
col table_type format a1 heading 'T'
col table_name format a20 heading 'TABLE NAME'
col status format a1 heading 'S'
col u format a1 heading 'U'
col index_name format a18 heading 'INDEX NAME'
col tablespace_name format a10 heading 'TABLESPACE'
col pct_free format 999 heading 'PCT|FREE'
col ini_trans format 99 heading 'INI|TRX'
col initial_extent format 999999 heading 'INITIAL|EXTENT|(K)'
col next_extent format 999999 heading 'NEXT|EXTENT|(K)'
col pct_increase format 999 heading 'PCT|INC'
col min_extents format 99 heading 'MIN|EXT'
col max_extents format 999 heading 'MAX|EXT'
col freelists format 999 heading 'FREE|LIST'
col BLEVEL format 9999 head 'BLEVEL'
col LEAF_BLOCKS format 999,999 heading 'LEAF|BOCKS'
col DISTINCT_KEYS format 999,999,999 heading 'DISTINCT|KEYS'
col AVG_LEAF_BLOCKS_PER_KEY format 9,999 heading 'AVG|LEAF|BLOCKS|PER|KEY'
col AVG_DATA_BLOCKS_PER_KEY format 9,999,999 heading 'AVG|DATA|BLOCKS|PER|KEY'
col CLUSTERING_FACTOR format 99,999,999 heading 'CLUSTERING|FACTOR'
col NUM_ROWS format 999,999,999 heading 'NUM ROWS'
col SAMPLE_SIZE format 999,999 heading 'SAMPLE'
col LAST_ANALYZED heading 'LAST|ANALYZED'
col FILTER_FACTOR format 9.999 HEADING 'FILTER|FACTOR'
-- col DEGREE VARCHAR2(40)
-- col INSTANCES VARCHAR2(40)
-- col PARTITIONED VARCHAR2(3)
-- col TEMPORARY VARCHAR2(1)
-- col GENERATED VARCHAR2(1)
-- col BUFFER_POOL VARCHAR2(7)
rem
break on table_owner skip 1 on table_name on table_type
rem
select i.table_owner,
i.table_name,
i.index_name, c.column_position, c.column_name, COL.NUM_DISTINCT, TAB.NUM_ROWS, (COL.NUM_DISTINCT / TAB.NUM_ROWS) FILTER_FACTOR
from sys.dba_ind_columns c, sys.dba_indexes i, DBA_TAB_COLUMNS COL, DBA_TABLES TAB
where i.table_owner like nvl( upper('&owner'), '%')
and i.table_name like nvl(upper('&table'),'%')
AND I.OWNER = TAB.OWNER
AND I.TABLE_NAME = TAB.TABLE_NAME
AND C.TABLE_OWNER = TAB.OWNER
AND C.TABLE_NAME = TAB.TABLE_NAME
and i.owner = c.index_owner
AND I.owner = COL.OWNER
AND I.TABLE_NAME = COL.TABLE_NAME
AND C.COLUMN_NAME = COL.COLUMN_NAME
and i.index_name = c.index_name
AND c.column_position = 1
AND TAB.NUM_ROWS > 0
AND (COL.NUM_DISTINCT / TAB.NUM_ROWS) < 0.15
AND COL.NUM_DISTINCT < 50
order by i.table_owner, i.table_name, 2, i.index_name, c.column_position
;
rem
SQL> @dlowcardind.sql
Enter value for owner: HR
Enter value for table: %
  • owner: Schema owner (use % for all schemas or specific schema name)
  • table: Table name pattern (use % for all tables or specific table name)
  • SELECT on SYS.DBA_IND_COLUMNS
  • SELECT on SYS.DBA_INDEXES
  • SELECT on SYS.DBA_TAB_COLUMNS
  • SELECT on SYS.DBA_TABLES
Index Storage Parameters
OWNER TABLE NAME INDEX NAME COLUMN_POSITION COLUMN_NAME NUM_DISTINCT NUM ROWS FILTER FACTOR
-------- -------------------- ------------------ --------------- --------------- ------------- ---------- -------------
HR EMPLOYEES EMP_STATUS_IDX 1 STATUS 3 107,000 0.000
EMPLOYEES EMP_GENDER_IDX 1 GENDER 2 107,000 0.000
SALES ORDERS ORD_PRIORITY_IDX 1 PRIORITY 5 45,000 0.000
ORDERS ORD_STATUS_IDX 1 ORDER_STATUS 8 45,000 0.000
INV PRODUCTS PROD_CATEGORY_IDX 1 CATEGORY 12 25,000 0.000
PRODUCTS PROD_ACTIVE_IDX 1 ACTIVE_FLAG 2 25,000 0.000
  • OWNER: Schema that owns the table and index
  • TABLE NAME: Name of the table being indexed
  • INDEX NAME: Name of the index with low cardinality
  • COLUMN_POSITION: Position in index (script focuses on position 1)
  • COLUMN_NAME: Name of the column being indexed
  • NUM_DISTINCT: Number of distinct values in the column
  • NUM ROWS: Total number of rows in the table
  • FILTER FACTOR: Selectivity ratio (NUM_DISTINCT / NUM_ROWS)

Why Low Cardinality Indexes Are Problematic

Section titled “Why Low Cardinality Indexes Are Problematic”
  1. Poor Selectivity

    • Oracle may choose full table scan over index scan
    • Index provides little filtering benefit
    • Wasted storage and maintenance overhead
  2. Performance Impact

    • Slower DML operations (INSERT, UPDATE, DELETE)
    • Increased buffer cache usage
    • Unnecessary index maintenance
  3. Storage Waste

    • Index consumes space without providing benefit
    • Additional I/O for index maintenance
    • Backup and recovery overhead
  • < 0.01 (1%): Extremely poor selectivity
  • 0.01 - 0.05 (1-5%): Very poor selectivity
  • 0.05 - 0.15 (5-15%): Poor selectivity (script threshold)
  • > 0.15 (15%+): Acceptable selectivity
-- Status flags (Y/N, T/F)
ACTIVE_FLAG: 2 distinct values (ACTIVE, INACTIVE)
-- Gender columns
GENDER: 2-3 distinct values (M, F, U)
-- Priority levels
PRIORITY: 3-5 distinct values (LOW, MEDIUM, HIGH)
-- Boolean indicators
IS_DELETED: 2 distinct values (0, 1)
SELECT i.owner, i.table_name, i.index_name,
s.last_analyzed, s.num_rows,
NVL(u.total_access_count, 0) access_count
FROM dba_indexes i
LEFT JOIN dba_index_usage u ON i.owner = u.owner AND i.index_name = u.index_name
JOIN dba_ind_statistics s ON i.owner = s.owner AND i.index_name = s.index_name
WHERE i.uniqueness = 'NONUNIQUE'
ORDER BY access_count, s.num_rows DESC;
SELECT i.owner, i.table_name, i.index_name,
ROUND(s.leaf_blocks * ts.block_size / 1024 / 1024, 2) size_mb,
s.distinct_keys, s.num_rows,
ROUND(s.distinct_keys / s.num_rows * 100, 2) selectivity_pct
FROM dba_indexes i, dba_ind_statistics s, dba_tablespaces ts
WHERE i.owner = s.owner
AND i.index_name = s.index_name
AND i.tablespace_name = ts.tablespace_name
AND s.distinct_keys < 50
AND s.num_rows > 1000
ORDER BY size_mb DESC;
  • Filter Factor < 0.05 AND Never Used AND Not Unique/Primary Key
  • Distinct Values < 10 AND Table Size > 10,000 rows
  • Boolean columns in large tables
  • Low cardinality leading column BUT Used in multi-column predicates
  • Status + Date combinations (e.g., STATUS = ‘ACTIVE’ AND CREATED_DATE > DATE)
  • Small tables (< 1000 rows) - overhead is minimal
  • Frequently updated status columns with specific query patterns
  • Columns used in ORDER BY with small result sets
  1. Composite Indexes
-- Instead of: CREATE INDEX idx_status ON orders(status);
-- Consider: CREATE INDEX idx_status_date ON orders(status, order_date);
  1. Function-Based Indexes
-- For specific value searches
CREATE INDEX idx_active_customers ON customers(CASE WHEN status = 'ACTIVE' THEN customer_id END);
  1. Partial Indexes (Oracle 12c+)
-- Index only specific values
CREATE INDEX idx_pending_orders ON orders(order_id) WHERE status = 'PENDING';
SELECT 'DROP INDEX ' || owner || '.' || index_name || ';' as drop_statement
FROM (
SELECT DISTINCT i.owner, i.index_name,
c.num_distinct, t.num_rows,
ROUND(c.num_distinct / t.num_rows, 4) filter_factor
FROM dba_indexes i, dba_ind_columns ic, dba_tab_columns c, dba_tables t
WHERE i.owner = ic.index_owner
AND i.index_name = ic.index_name
AND ic.column_position = 1
AND i.owner = c.owner
AND i.table_name = c.table_name
AND ic.column_name = c.column_name
AND i.owner = t.owner
AND i.table_name = t.table_name
AND i.uniqueness = 'NONUNIQUE'
AND t.num_rows > 1000
AND c.num_distinct < 20
AND c.num_distinct / t.num_rows < 0.05
AND i.owner NOT IN ('SYS', 'SYSTEM')
);
  1. Identify dependent queries
  2. Test with index invisible (Oracle 11g+)
  3. Monitor execution plans
  4. Check performance metrics
-- Make index invisible for testing
ALTER INDEX hr.emp_status_idx INVISIBLE;
-- Monitor for impact, then decide
-- Either make visible again or drop
  1. Regular Analysis

    • Run monthly on large schemas
    • Focus on tables with > 10,000 rows
    • Prioritize by space savings potential
  2. Before Removal

    • Check index usage statistics
    • Review application query patterns
    • Test performance impact
  3. Documentation

    • Document removed indexes
    • Track space savings
    • Monitor for performance regression