Composite Health Check Suite (health.sql)
What This Script Does
Section titled “What This Script Does”This composite health check script provides comprehensive database analysis by:
- Executing a series of specialized diagnostic scripts
- Analyzing random I/O performance patterns
- Checking scattered read (full table scan) performance
- Monitoring CPU utilization and health
- Assessing RAC interconnect performance (if applicable)
- Examining enqueue (lock) wait statistics
Script
Section titled “Script”set pause off;set pagesize 200;@random_io@scattered_io@cpu_health@interconnect_health@enqueue
SQL> @health.sql
Prerequisites
Section titled “Prerequisites”This script requires the following individual scripts to be available:
random_io.sql
- Random I/O analysisscattered_io.sql
- Scattered read analysiscpu_health.sql
- CPU utilization monitoringinterconnect_health.sql
- RAC interconnect diagnosticsenqueue.sql
- Enqueue wait analysis
Required Privileges
Section titled “Required Privileges”- SELECT on V$SYSTEM_EVENT
- SELECT on V$SESSION_EVENT
- SELECT on V$SYSSTAT
- SELECT on V$OSSTAT
- SELECT on V$INSTANCE
- SELECT on GV$ views (for RAC environments)
Component Analysis
Section titled “Component Analysis”Random I/O Analysis
Section titled “Random I/O Analysis”Purpose: Analyze single block read performance
- Key Metrics: db file sequential read waits
- Indicators: Average wait time, total waits
- Targets: < 10ms average for good performance
Scattered I/O Analysis
Section titled “Scattered I/O Analysis”Purpose: Analyze multi-block read performance
- Key Metrics: db file scattered read waits
- Indicators: Multi-block read efficiency
- Targets: Consistent with storage capabilities
CPU Health Monitoring
Section titled “CPU Health Monitoring”Purpose: Assess CPU utilization patterns
- Key Metrics: CPU time, wait ratios
- Indicators: CPU bottlenecks, scheduling delays
- Targets: CPU utilization < 80% sustained
Interconnect Health (RAC)
Section titled “Interconnect Health (RAC)”Purpose: Monitor cluster interconnect performance
- Key Metrics: gc cr/current block receive times
- Indicators: Network latency, cluster efficiency
- Targets: < 5ms for local interconnect
Enqueue Analysis
Section titled “Enqueue Analysis”Purpose: Identify locking and contention issues
- Key Metrics: Enqueue wait events
- Indicators: Lock contention, blocking sessions
- Targets: Minimal enqueue waits
Sample Output Structure
Section titled “Sample Output Structure”Each component produces specific diagnostic output:
=== Random I/O Analysis ===Event: db file sequential readTotal Waits: 1,234,567Total Time: 5,678 secondsAverage Wait: 4.6 ms
=== Scattered I/O Analysis ===Event: db file scattered readTotal Waits: 234,567Total Time: 1,234 secondsAverage Wait: 5.3 ms
=== CPU Health Analysis ===CPU Used by This Session: 45,678 secondsDB Time: 123,456 secondsCPU Ratio: 37%
=== Interconnect Health ===gc cr block receive time: 2.1 ms avggc current block receive time: 1.8 ms avg
=== Enqueue Analysis ===TX - Transaction Lock: 45 waitsTM - DML Lock: 12 waits
Interpretation Guidelines
Section titled “Interpretation Guidelines”Performance Baselines
Section titled “Performance Baselines”- Random I/O: 5-15ms typical for disk storage
- Scattered I/O: Should be higher than random I/O
- CPU Ratio: 50-80% optimal for OLTP systems
- Interconnect: < 5ms for properly configured RAC
Warning Indicators
Section titled “Warning Indicators”- High Random I/O times: Storage performance issues
- Excessive CPU waits: CPU bottleneck or scheduling issues
- High interconnect times: Network or cluster problems
- Frequent enqueue waits: Locking contention
Common Use Cases
Section titled “Common Use Cases”-
Daily Health Checks
- Run as part of morning diagnostics
- Identify overnight performance issues
- Establish baseline performance metrics
-
Performance Troubleshooting
- First-line diagnostic for performance issues
- Quickly identify major bottleneck areas
- Prioritize detailed analysis efforts
-
Capacity Planning
- Monitor resource utilization trends
- Identify growing performance pressures
- Plan infrastructure upgrades
Customization Options
Section titled “Customization Options”Adding Additional Checks
Section titled “Adding Additional Checks”-- Enhanced version with additional diagnosticsset pause off;set pagesize 200;@random_io@scattered_io@cpu_health@interconnect_health@enqueue@memory_health@tablespace_health@session_health
Automated Scheduling
Section titled “Automated Scheduling”-- Wrapper for automated executionSPOOL /path/to/logs/health_check_$(date +%Y%m%d_%H%M).log@health.sqlSPOOL OFF
Integration with Monitoring
Section titled “Integration with Monitoring”AWR Integration
Section titled “AWR Integration”Consider running before/after AWR snapshots:
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();@health.sql-- Wait for activityEXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
Alert Integration
Section titled “Alert Integration”Create alerts based on output thresholds:
-- Example threshold monitoringSELECT CASE WHEN avg_wait > 20 THEN 'CRITICAL: High I/O waits' WHEN avg_wait > 10 THEN 'WARNING: Elevated I/O waits' ELSE 'OK'END statusFROM (SELECT time_waited/total_waits avg_wait FROM v$system_event WHERE event = 'db file sequential read');
Troubleshooting Component Failures
Section titled “Troubleshooting Component Failures”Missing Scripts
Section titled “Missing Scripts”If individual component scripts are missing:
-- Create minimal replacement scripts-- For missing random_io.sql:SELECT 'Random I/O Health Check' title FROM dual;SELECT event, total_waits, time_waited, ROUND(time_waited/total_waits, 2) avg_wait_msFROM v$system_eventWHERE event = 'db file sequential read';
Permission Issues
Section titled “Permission Issues”Ensure proper privileges for all V$ views:
GRANT SELECT ON V$SYSTEM_EVENT TO monitoring_user;GRANT SELECT ON V$SYSSTAT TO monitoring_user;GRANT SELECT ON V$OSSTAT TO monitoring_user;
Automation and Scheduling
Section titled “Automation and Scheduling”Cron Integration
Section titled “Cron Integration”#!/bin/bash# Daily health check scriptexport ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1export ORACLE_SID=ORCLcd /path/to/scriptssqlplus -s / as sysdba @health.sql > /logs/daily_health_$(date +%Y%m%d).log
Enterprise Manager Integration
Section titled “Enterprise Manager Integration”- Schedule as Database Control job
- Set up email notifications for critical thresholds
- Store results in repository for trending
Best Practices
Section titled “Best Practices”-
Regular Execution
- Run daily during off-peak hours
- Compare results over time
- Document baseline performance
-
Threshold Management
- Establish environment-specific thresholds
- Adjust for different workload patterns
- Document expected performance ranges
-
Response Procedures
- Create runbooks for common issues
- Define escalation procedures
- Maintain contact lists for critical issues