Skip to content

Composite Health Check Suite (health.sql)

This composite health check script provides comprehensive database analysis by:

  • Executing a series of specialized diagnostic scripts
  • Analyzing random I/O performance patterns
  • Checking scattered read (full table scan) performance
  • Monitoring CPU utilization and health
  • Assessing RAC interconnect performance (if applicable)
  • Examining enqueue (lock) wait statistics
set pause off;
set pagesize 200;
@random_io
@scattered_io
@cpu_health
@interconnect_health
@enqueue
SQL> @health.sql

This script requires the following individual scripts to be available:

  • random_io.sql - Random I/O analysis
  • scattered_io.sql - Scattered read analysis
  • cpu_health.sql - CPU utilization monitoring
  • interconnect_health.sql - RAC interconnect diagnostics
  • enqueue.sql - Enqueue wait analysis
  • SELECT on V$SYSTEM_EVENT
  • SELECT on V$SESSION_EVENT
  • SELECT on V$SYSSTAT
  • SELECT on V$OSSTAT
  • SELECT on V$INSTANCE
  • SELECT on GV$ views (for RAC environments)

Purpose: Analyze single block read performance

  • Key Metrics: db file sequential read waits
  • Indicators: Average wait time, total waits
  • Targets: < 10ms average for good performance

Purpose: Analyze multi-block read performance

  • Key Metrics: db file scattered read waits
  • Indicators: Multi-block read efficiency
  • Targets: Consistent with storage capabilities

Purpose: Assess CPU utilization patterns

  • Key Metrics: CPU time, wait ratios
  • Indicators: CPU bottlenecks, scheduling delays
  • Targets: CPU utilization < 80% sustained

Purpose: Monitor cluster interconnect performance

  • Key Metrics: gc cr/current block receive times
  • Indicators: Network latency, cluster efficiency
  • Targets: < 5ms for local interconnect

Purpose: Identify locking and contention issues

  • Key Metrics: Enqueue wait events
  • Indicators: Lock contention, blocking sessions
  • Targets: Minimal enqueue waits

Each component produces specific diagnostic output:

=== Random I/O Analysis ===
Event: db file sequential read
Total Waits: 1,234,567
Total Time: 5,678 seconds
Average Wait: 4.6 ms
=== Scattered I/O Analysis ===
Event: db file scattered read
Total Waits: 234,567
Total Time: 1,234 seconds
Average Wait: 5.3 ms
=== CPU Health Analysis ===
CPU Used by This Session: 45,678 seconds
DB Time: 123,456 seconds
CPU Ratio: 37%
=== Interconnect Health ===
gc cr block receive time: 2.1 ms avg
gc current block receive time: 1.8 ms avg
=== Enqueue Analysis ===
TX - Transaction Lock: 45 waits
TM - DML Lock: 12 waits
  • Random I/O: 5-15ms typical for disk storage
  • Scattered I/O: Should be higher than random I/O
  • CPU Ratio: 50-80% optimal for OLTP systems
  • Interconnect: < 5ms for properly configured RAC
  • High Random I/O times: Storage performance issues
  • Excessive CPU waits: CPU bottleneck or scheduling issues
  • High interconnect times: Network or cluster problems
  • Frequent enqueue waits: Locking contention
  1. Daily Health Checks

    • Run as part of morning diagnostics
    • Identify overnight performance issues
    • Establish baseline performance metrics
  2. Performance Troubleshooting

    • First-line diagnostic for performance issues
    • Quickly identify major bottleneck areas
    • Prioritize detailed analysis efforts
  3. Capacity Planning

    • Monitor resource utilization trends
    • Identify growing performance pressures
    • Plan infrastructure upgrades
-- Enhanced version with additional diagnostics
set pause off;
set pagesize 200;
@random_io
@scattered_io
@cpu_health
@interconnect_health
@enqueue
@memory_health
@tablespace_health
@session_health
-- Wrapper for automated execution
SPOOL /path/to/logs/health_check_$(date +%Y%m%d_%H%M).log
@health.sql
SPOOL OFF

Consider running before/after AWR snapshots:

EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
@health.sql
-- Wait for activity
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();

Create alerts based on output thresholds:

-- Example threshold monitoring
SELECT CASE
WHEN avg_wait > 20 THEN 'CRITICAL: High I/O waits'
WHEN avg_wait > 10 THEN 'WARNING: Elevated I/O waits'
ELSE 'OK'
END status
FROM (SELECT time_waited/total_waits avg_wait
FROM v$system_event
WHERE event = 'db file sequential read');

If individual component scripts are missing:

-- Create minimal replacement scripts
-- For missing random_io.sql:
SELECT 'Random I/O Health Check' title FROM dual;
SELECT event, total_waits, time_waited,
ROUND(time_waited/total_waits, 2) avg_wait_ms
FROM v$system_event
WHERE event = 'db file sequential read';

Ensure proper privileges for all V$ views:

GRANT SELECT ON V$SYSTEM_EVENT TO monitoring_user;
GRANT SELECT ON V$SYSSTAT TO monitoring_user;
GRANT SELECT ON V$OSSTAT TO monitoring_user;
#!/bin/bash
# Daily health check script
export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
cd /path/to/scripts
sqlplus -s / as sysdba @health.sql > /logs/daily_health_$(date +%Y%m%d).log
  • Schedule as Database Control job
  • Set up email notifications for critical thresholds
  • Store results in repository for trending
  1. Regular Execution

    • Run daily during off-peak hours
    • Compare results over time
    • Document baseline performance
  2. Threshold Management

    • Establish environment-specific thresholds
    • Adjust for different workload patterns
    • Document expected performance ranges
  3. Response Procedures

    • Create runbooks for common issues
    • Define escalation procedures
    • Maintain contact lists for critical issues