I/O Wait Analysis for Current Sessions (VIO.SQL)
What This Script Does
Section titled “What This Script Does”This script provides focused I/O performance analysis by:
- Monitoring the two most critical I/O wait events across all RAC instances
- Calculating average wait times in milliseconds for accurate performance assessment
- Aggregating I/O statistics from all current sessions
- Comparing performance between instances in RAC environments
- Focusing specifically on database file read operations
Script
Section titled “Script”COLUMN "total_waits" FORMAT 999,999,999column "avg_wait_ms" format 9,999.9ttitle "IO Wait current sessions"selecte.inst_id,e.event event,--e.total_waits twaits,sum(e.total_waits) total_waits ,sum(e.time_waited_micro) / sum(e.total_waits) / 1000 avg_wait_msfrom Gv$session_event ewhere e.event like '%db file sequential%'or e.event like '%db file scat%'group by inst_id, eventorder by inst_id,event;
SQL> @VIO.SQL
Required Privileges
Section titled “Required Privileges”- SELECT on GV$SESSION_EVENT
Sample Output
Section titled “Sample Output”IO Wait current sessions
INST_ID EVENT TOTAL_WAITS AVG_WAIT_MS------- --------------------------------- ----------- ----------- 1 db file sequential read 5,234,567 8.7 1 db file scattered read 1,456,789 16.1 2 db file sequential read 4,987,432 10.5 2 db file scattered read 1,234,567 16.8 3 db file sequential read 5,123,456 8.0 3 db file scattered read 1,345,678 15.9
Key Output Columns
Section titled “Key Output Columns”- INST_ID: RAC instance number
- EVENT: I/O wait event name
- TOTAL_WAITS: Total number of I/O wait occurrences across all sessions
- AVG_WAIT_MS: Average wait time per I/O operation in milliseconds
Understanding I/O Events
Section titled “Understanding I/O Events”db file sequential read
Section titled “db file sequential read”- Purpose: Single block reads (index range scans, ROWID access)
- Common Operations: Index lookups, primary key access, nested loop joins
- Performance Target: < 10ms for traditional disk, < 5ms for SSD
db file scattered read
Section titled “db file scattered read”- Purpose: Multi-block reads (full table scans, fast full index scans)
- Common Operations: Table scans, hash joins, bulk operations
- Performance Target: < 20ms for traditional disk, < 10ms for SSD
Performance Analysis
Section titled “Performance Analysis”Baseline Performance Expectations
Section titled “Baseline Performance Expectations”- Excellent: Sequential reads < 5ms, Scattered reads < 10ms
- Good: Sequential reads 5-10ms, Scattered reads 10-20ms
- Acceptable: Sequential reads 10-15ms, Scattered reads 20-30ms
- Poor: Sequential reads > 15ms, Scattered reads > 30ms
Cross-Instance Comparison
Section titled “Cross-Instance Comparison”- Balanced Performance: Similar wait times across instances
- Instance Imbalance: Investigate storage configuration differences
- Consistent Ratios: Scattered reads should be higher than sequential reads
Common Performance Issues
Section titled “Common Performance Issues”High Sequential Read Times
Section titled “High Sequential Read Times”- Storage Performance: Slow disk subsystem
- Hot Blocks: Contention for popular data blocks
- Buffer Cache: Insufficient memory causing cache misses
High Scattered Read Times
Section titled “High Scattered Read Times”- Storage Configuration: Poor multi-block read performance
- Parameter Tuning: DB_FILE_MULTIBLOCK_READ_COUNT optimization needed
- Application Design: Excessive full table scans
Instance Performance Variance
Section titled “Instance Performance Variance”- Storage Differences: Uneven storage allocation across nodes
- Network Issues: Cluster interconnect problems affecting storage access
- Workload Distribution: Unbalanced application connections
Advanced Analysis
Section titled “Advanced Analysis”Detailed I/O Breakdown
Section titled “Detailed I/O Breakdown”-- Get more detailed I/O analysisSELECT inst_id, event, total_waits, ROUND(time_waited_micro/1000000, 2) total_wait_secs, ROUND(time_waited_micro/total_waits/1000, 2) avg_wait_ms, ROUND(total_waits * 100 / SUM(total_waits) OVER (), 2) pct_of_total_waitsFROM ( SELECT inst_id, event, SUM(total_waits) total_waits, SUM(time_waited_micro) time_waited_micro FROM gv$session_event WHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%' GROUP BY inst_id, event)ORDER BY inst_id, total_wait_secs DESC;
Compare with System Events
Section titled “Compare with System Events”-- Compare session events with system-wide statisticsSELECT se.inst_id, se.event, se.session_total_waits, se.session_avg_ms, sye.system_total_waits, sye.system_avg_msFROM ( SELECT inst_id, event, SUM(total_waits) session_total_waits, ROUND(SUM(time_waited_micro)/SUM(total_waits)/1000, 2) session_avg_ms FROM gv$session_event WHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%' GROUP BY inst_id, event) se,( SELECT inst_id, event, total_waits system_total_waits, ROUND(time_waited_micro/total_waits/1000, 2) system_avg_ms FROM gv$system_event WHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%') syeWHERE se.inst_id = sye.inst_idAND se.event = sye.eventORDER BY se.inst_id, se.event;
Historical Trend Analysis
Section titled “Historical Trend Analysis”-- Create table for trending (one-time setup)CREATE TABLE io_wait_history ( check_time DATE, inst_id NUMBER, event VARCHAR2(64), total_waits NUMBER, avg_wait_ms NUMBER);
-- Insert current snapshotINSERT INTO io_wait_historySELECT SYSDATE, inst_id, event, SUM(total_waits), SUM(time_waited_micro)/SUM(total_waits)/1000FROM gv$session_eventWHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%'GROUP BY inst_id, event;
Troubleshooting High I/O Waits
Section titled “Troubleshooting High I/O Waits”Immediate Investigation Steps
Section titled “Immediate Investigation Steps”- Check Storage Metrics: Disk utilization, queue depth, response times
- Review Current SQL: Identify statements causing high I/O
- Analyze Object Access: Find hot tables and indexes
Storage Analysis
Section titled “Storage Analysis”-- Check file-level I/O statisticsSELECT f.inst_id, ts.name tablespace_name, f.name file_name, f.phyrds, f.phywrts, ROUND(f.readtim/GREATEST(f.phyrds, 1)*10, 2) avg_read_ms, ROUND(f.writetim/GREATEST(f.phywrts, 1)*10, 2) avg_write_msFROM gv$filestat f, gv$tablespace tsWHERE f.ts# = ts.ts#AND f.inst_id = ts.inst_idAND f.phyrds > 0ORDER BY f.inst_id, avg_read_ms DESC;
SQL Analysis
Section titled “SQL Analysis”-- Find SQL statements with high I/OSELECT sql_id, child_number, executions, disk_reads, buffer_gets, ROUND(disk_reads/GREATEST(executions, 1), 2) disk_reads_per_exec, ROUND(buffer_gets/GREATEST(executions, 1), 2) buffer_gets_per_execFROM v$sqlWHERE disk_reads > 10000ORDER BY disk_reads DESC;
RAC-Specific Considerations
Section titled “RAC-Specific Considerations”Instance Load Balancing
Section titled “Instance Load Balancing”- Even Distribution: I/O should be balanced across instances
- Connection Affinity: Check application connection patterns
- Storage Affinity: Consider storage-aware connection routing
Cluster Performance
Section titled “Cluster Performance”- Interconnect Impact: Monitor cluster wait events
- Cache Fusion: Check for excessive block transfers
- Shared Storage: Ensure consistent performance across nodes
Performance Optimization
Section titled “Performance Optimization”Storage Level Optimizations
Section titled “Storage Level Optimizations”- Upgrade Storage: Consider SSD or faster storage tiers
- I/O Configuration: Optimize stripe size and RAID configuration
- ASM Tuning: Balance disk groups and allocation units
Database Level Optimizations
Section titled “Database Level Optimizations”- Buffer Cache: Increase size if hit ratios are low
- Multiblock Read Count: Tune DB_FILE_MULTIBLOCK_READ_COUNT
- Asynchronous I/O: Ensure DISK_ASYNCH_IO is enabled
Application Level Optimizations
Section titled “Application Level Optimizations”- SQL Tuning: Optimize queries to reduce I/O
- Indexing: Add appropriate indexes to reduce full scans
- Connection Pooling: Balance load across instances
Monitoring and Alerting
Section titled “Monitoring and Alerting”Performance Thresholds
Section titled “Performance Thresholds”-- Set up alerting for poor I/O performanceSELECT inst_id, event, avg_wait_ms, CASE WHEN avg_wait_ms > 30 THEN 'CRITICAL' WHEN avg_wait_ms > 20 THEN 'WARNING' WHEN avg_wait_ms > 10 THEN 'CAUTION' ELSE 'OK' END alert_levelFROM ( SELECT inst_id, event, SUM(time_waited_micro)/SUM(total_waits)/1000 avg_wait_ms FROM gv$session_event WHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%' GROUP BY inst_id, event)WHERE avg_wait_ms > 10;
Automated Monitoring
Section titled “Automated Monitoring”-- Script for automated monitoringSELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD HH24:MI:SS') check_time, inst_id, event, total_waits, avg_wait_msFROM ( SELECT inst_id, event, SUM(total_waits) total_waits, ROUND(SUM(time_waited_micro)/SUM(total_waits)/1000, 2) avg_wait_ms FROM gv$session_event WHERE event LIKE '%db file sequential%' OR event LIKE '%db file scat%' GROUP BY inst_id, event);
Best Practices
Section titled “Best Practices”-
Regular Monitoring
- Check I/O performance during peak hours
- Compare patterns across different time periods
- Track trends for capacity planning
-
Baseline Establishment
- Document normal I/O wait times
- Establish performance expectations by workload
- Monitor for deviations from baseline
-
Proactive Management
- Set up automated alerts for performance degradation
- Regular storage health checks
- Plan capacity upgrades before issues occur