SQL*Loader Basics
SQL*Loader Basics
Section titled “SQL*Loader Basics”Oracle SQL*Loader is a high-performance bulk data loading utility that efficiently loads data from external files into Oracle database tables. It’s designed for loading large volumes of data with minimal overhead.
What is SQL*Loader?
Section titled “What is SQL*Loader?”SQL*Loader is Oracle’s premier bulk loading tool that provides:
- High-performance loading - Direct path and conventional path options
- Flexible data formats - Fixed, delimited, and variable record formats
- Data transformation - Built-in functions and expressions
- Error handling - Comprehensive logging and bad record management
- Loading modes - INSERT, APPEND, REPLACE, TRUNCATE
- Parallel loading - Multiple concurrent SQL*Loader sessions
SQL*Loader Architecture
Section titled “SQL*Loader Architecture”┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐│ Data Files │ │ SQL*Loader │ │ Oracle Database ││ (CSV, Fixed, │───►│ Process │───►│ Tables ││ Variable) │ │ │ │ │└─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ ▼ ▼ ▼┌─────────────────────────────────────────────────────────────────┐│ Control & Output Files ││ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ││ │ Control File│ │ Log File │ │ Bad/Discard │ ││ │ (Metadata & │ │ (Execution │ │ Files │ ││ │ Commands) │ │ Log) │ │ (Errors) │ ││ └─────────────┘ └─────────────┘ └─────────────┘ │└─────────────────────────────────────────────────────────────────┘
Core Components
Section titled “Core Components”1. Input Data File
Section titled “1. Input Data File”Contains the actual data to be loaded:
101,John,Doe,MANAGER,50000,2023-01-15102,Jane,Smith,ANALYST,45000,2023-02-01103,Bob,Johnson,CLERK,35000,2023-03-01
2. Control File
Section titled “2. Control File”Defines how to load the data:
-- employees.ctlLOAD DATAINFILE 'employees.csv'INTO TABLE employeesFIELDS TERMINATED BY ','OPTIONALLY ENCLOSED BY '"'( employee_id, first_name, last_name, job_title, salary, hire_date DATE "YYYY-MM-DD")
3. SQL*Loader Command
Section titled “3. SQL*Loader Command”sqlldr userid=hr/password control=employees.ctl log=employees.log
Basic Loading Examples
Section titled “Basic Loading Examples”1. Simple CSV Load
Section titled “1. Simple CSV Load”Data File (employees.csv):
101,John,Doe,50000102,Jane,Smith,45000103,Bob,Johnson,35000
Control File (load_employees.ctl):
LOAD DATAINFILE 'employees.csv'INTO TABLE employeesFIELDS TERMINATED BY ','( employee_id INTEGER EXTERNAL, first_name CHAR(20), last_name CHAR(20), salary INTEGER EXTERNAL)
Command:
sqlldr hr/password control=load_employees.ctl log=load_employees.log
2. Fixed-Width Data Load
Section titled “2. Fixed-Width Data Load”Data File (fixed_employees.dat):
101John Doe 50000102Jane Smith 45000103Bob Johnson 35000
Control File (fixed_load.ctl):
LOAD DATAINFILE 'fixed_employees.dat'INTO TABLE employees( employee_id POSITION(1:3) INTEGER EXTERNAL, first_name POSITION(4:13) CHAR, last_name POSITION(14:23) CHAR, salary POSITION(24:28) INTEGER EXTERNAL)
3. Loading with Data Transformation
Section titled “3. Loading with Data Transformation”Control File (transform_load.ctl):
LOAD DATAINFILE 'employees.csv'INTO TABLE employeesFIELDS TERMINATED BY ','( employee_id, first_name UPPER(:first_name), last_name UPPER(:last_name), full_name ":first_name || ' ' || :last_name", salary, hire_date DATE "YYYY-MM-DD", created_date SYSDATE)
Loading Modes
Section titled “Loading Modes”1. INSERT (Default)
Section titled “1. INSERT (Default)”Loads data into empty tables:
LOAD DATAINFILE 'data.csv'INSERT INTO TABLE employeesFIELDS TERMINATED BY ','(employee_id, first_name, last_name)
2. APPEND
Section titled “2. APPEND”Adds data to existing table data:
LOAD DATAINFILE 'new_employees.csv'APPEND INTO TABLE employeesFIELDS TERMINATED BY ','(employee_id, first_name, last_name)
3. REPLACE
Section titled “3. REPLACE”Deletes existing data and loads new data:
LOAD DATAINFILE 'all_employees.csv'REPLACE INTO TABLE employeesFIELDS TERMINATED BY ','(employee_id, first_name, last_name)
4. TRUNCATE
Section titled “4. TRUNCATE”Truncates table and loads new data:
LOAD DATAINFILE 'complete_employees.csv'TRUNCATE INTO TABLE employeesFIELDS TERMINATED BY ','(employee_id, first_name, last_name)
Data Types and Formatting
Section titled “Data Types and Formatting”1. Numeric Data
Section titled “1. Numeric Data”LOAD DATAINFILE 'numbers.csv'INTO TABLE test_numbersFIELDS TERMINATED BY ','( id INTEGER EXTERNAL, price DECIMAL EXTERNAL, quantity NUMBER, percentage FLOAT EXTERNAL)
2. Date and Time Data
Section titled “2. Date and Time Data”LOAD DATAINFILE 'dates.csv'INTO TABLE test_datesFIELDS TERMINATED BY ','( id, birth_date DATE "MM/DD/YYYY", hire_timestamp TIMESTAMP "YYYY-MM-DD HH24:MI:SS", modified_date DATE "YYYY-MM-DD" DEFAULTIF modified_date=BLANKS)
3. Character Data with Special Handling
Section titled “3. Character Data with Special Handling”LOAD DATAINFILE 'text_data.csv'INTO TABLE test_textFIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'TRAILING NULLCOLS( id, description CHAR(100), notes CHAR(500) "UPPER(:notes)", status CHAR(10) DEFAULTIF status=BLANKS)
Error Handling
Section titled “Error Handling”1. Bad Records Management
Section titled “1. Bad Records Management”LOAD DATAINFILE 'employees.csv'BADFILE 'employees.bad' -- Records that couldn't be loadedDISCARDFILE 'employees.dis' -- Records that didn't meet criteriaINTO TABLE employeesFIELDS TERMINATED BY ','TRAILING NULLCOLS( employee_id, first_name, last_name, salary "DECODE(:salary, '', NULL, :salary)")
2. Error Limits
Section titled “2. Error Limits”LOAD DATAINFILE 'employees.csv'BADFILE 'employees.bad'ERRORS=100 -- Stop after 100 bad recordsROWS=1000 -- Commit every 1000 rowsINTO TABLE employeesFIELDS TERMINATED BY ','(employee_id, first_name, last_name, salary)
Advanced Features
Section titled “Advanced Features”1. Loading Multiple Tables
Section titled “1. Loading Multiple Tables”LOAD DATAINFILE 'employee_data.csv'INTO TABLE employeesWHEN (1:1) = 'E' -- Employee records start with 'E'FIELDS TERMINATED BY ','( record_type FILLER, -- Don't load this field employee_id, first_name, last_name)INTO TABLE departmentsWHEN (1:1) = 'D' -- Department records start with 'D'FIELDS TERMINATED BY ','( record_type FILLER, dept_id, dept_name, location)
2. Loading LOB Data
Section titled “2. Loading LOB Data”LOAD DATAINFILE 'documents.csv'INTO TABLE document_storeFIELDS TERMINATED BY ','( doc_id, title, content LOBFILE(CONSTANT '/docs/') TERMINATED BY EOF, created_date SYSDATE)
3. Using SQL Functions
Section titled “3. Using SQL Functions”LOAD DATAINFILE 'employees.csv'INTO TABLE employeesFIELDS TERMINATED BY ','( employee_id, first_name "INITCAP(:first_name)", last_name "UPPER(:last_name)", email ":first_name || '.' || :last_name || '@company.com'", hire_date "TO_DATE(:hire_date, 'YYYY-MM-DD')", department_id "DECODE(:department_id, 'HR', 10, 'IT', 20, 'FIN', 30, 99)")
Performance Optimization
Section titled “Performance Optimization”1. Direct Path Loading
Section titled “1. Direct Path Loading”# Enable direct path for better performancesqlldr hr/password control=employees.ctl direct=true
2. Parallel Loading
Section titled “2. Parallel Loading”#!/bin/bash# parallel_load.sh - Split large file and load in parallel
# Split large file into chunkssplit -l 100000 large_file.csv chunk_
# Load chunks in parallelfor chunk in chunk_*; do sqlldr hr/password control=load_template.ctl data=$chunk direct=true &done
# Wait for all processes to completewait
3. Optimal Settings
Section titled “3. Optimal Settings”LOAD DATAINFILE 'large_file.csv'INTO TABLE large_tableOPTIONS ( SKIP=1, -- Skip header row LOAD=1000000, -- Load only first 1M rows ERRORS=1000, -- Allow up to 1000 errors ROWS=10000, -- Commit every 10K rows BINDSIZE=1048576, -- 1MB bind array READSIZE=1048576 -- 1MB read buffer)FIELDS TERMINATED BY ','(employee_id, first_name, last_name, salary)
Monitoring and Logging
Section titled “Monitoring and Logging”1. Log File Analysis
Section titled “1. Log File Analysis”# Check load statisticsgrep -A 5 "Total logical records" employees.log
# Check for errorsgrep -i "error\|rejected\|ora-" employees.log
# Performance metricsgrep "elapsed time\|rows per second" employees.log
2. Database Monitoring
Section titled “2. Database Monitoring”-- Check current SQL*Loader sessionsSELECT sid, serial#, username, program, statusFROM v$sessionWHERE program LIKE '%sqlloader%';
-- Monitor table loading progressSELECT table_name, num_rowsFROM user_tablesWHERE table_name = 'EMPLOYEES';
Common Control File Templates
Section titled “Common Control File Templates”1. Standard CSV Template
Section titled “1. Standard CSV Template”LOAD DATAINFILE '*' -- Use same name as control fileAPPEND INTO TABLE <table_name>FIELDS TERMINATED BY ','OPTIONALLY ENCLOSED BY '"'TRAILING NULLCOLS( column1, column2, column3 DATE "YYYY-MM-DD")
2. Tab-Delimited Template
Section titled “2. Tab-Delimited Template”LOAD DATAINFILE 'data.txt'INTO TABLE <table_name>FIELDS TERMINATED BY X'09' -- Tab characterTRAILING NULLCOLS( column1, column2, column3)
3. Pipe-Delimited with NULL Handling
Section titled “3. Pipe-Delimited with NULL Handling”LOAD DATAINFILE 'data.txt'INTO TABLE <table_name>FIELDS TERMINATED BY '|'TRAILING NULLCOLS( column1 NULLIF column1 = 'NULL', column2 DEFAULTIF column2 = BLANKS, column3 "NVL(:column3, 'DEFAULT_VALUE')")
Best Practices
Section titled “Best Practices”1. Pre-Loading Preparation
Section titled “1. Pre-Loading Preparation”-- Disable constraints during loadALTER TABLE employees DISABLE CONSTRAINT emp_pk;ALTER TABLE employees DISABLE CONSTRAINT emp_fk;
-- Drop indexes for large loadsDROP INDEX emp_name_idx;
-- Load data with SQL*Loader
-- Re-enable constraintsALTER TABLE employees ENABLE CONSTRAINT emp_pk;ALTER TABLE employees ENABLE CONSTRAINT emp_fk;
-- Recreate indexesCREATE INDEX emp_name_idx ON employees(last_name, first_name);
2. Data Validation Script
Section titled “2. Data Validation Script”#!/bin/bashDATA_FILE="employees.csv"CONTROL_FILE="employees.ctl"LOG_FILE="employees.log"
# Count input recordsINPUT_COUNT=$(wc -l < $DATA_FILE)echo "Input records: $INPUT_COUNT"
# Run SQL*Loadersqlldr hr/password control=$CONTROL_FILE log=$LOG_FILE
# Check resultsLOADED_COUNT=$(grep "successfully loaded" $LOG_FILE | awk '{print $1}')ERROR_COUNT=$(grep "rejected" $LOG_FILE | awk '{print $1}')
echo "Records loaded: $LOADED_COUNT"echo "Records rejected: $ERROR_COUNT"
# Validate counts matchif [ $((LOADED_COUNT + ERROR_COUNT)) -ne $INPUT_COUNT ]; then echo "WARNING: Record count mismatch!" exit 1fi
Troubleshooting Common Issues
Section titled “Troubleshooting Common Issues”1. ORA-00001: Unique constraint violated
Section titled “1. ORA-00001: Unique constraint violated”-- Use APPEND mode or handle duplicatesLOAD DATAINFILE 'data.csv'APPEND INTO TABLE employeesWHEN employee_id NOT IN (SELECT employee_id FROM employees)FIELDS TERMINATED BY ','(employee_id, first_name, last_name)
2. Character set issues
Section titled “2. Character set issues”# Specify character setsqlldr hr/password control=data.ctl characterset=UTF8
3. Performance problems
Section titled “3. Performance problems”# Use direct path and increase buffer sizessqlldr hr/password control=data.ctl direct=true \ bindsize=1048576 readsize=1048576
Integration Examples
Section titled “Integration Examples”1. Automated Daily Load
Section titled “1. Automated Daily Load”#!/bin/bashDATE=$(date +%Y%m%d)DATA_DIR="/data/incoming"ARCHIVE_DIR="/data/archive"
for file in $DATA_DIR/*.csv; do if [ -f "$file" ]; then filename=$(basename "$file" .csv)
# Load data sqlldr hr/password control=${filename}.ctl data="$file" \ log="${filename}_${DATE}.log"
# Archive processed file mv "$file" "$ARCHIVE_DIR/${filename}_${DATE}.csv" fidone
2. Error Notification
Section titled “2. Error Notification”#!/bin/bashLOG_FILE="employees.log"
sqlldr hr/password control=employees.ctl log=$LOG_FILE
# Check for errorsif grep -i "ora-\|error" $LOG_FILE > /dev/null; then mail -s "SQL*Loader Error Detected" $EMAIL < $LOG_FILEfi
Quick Reference
Section titled “Quick Reference”Essential Commands
Section titled “Essential Commands”# Basic loadsqlldr userid=user/password control=file.ctl
# Direct path loadsqlldr userid=user/password control=file.ctl direct=true
# Parallel load with multiple sessionssqlldr userid=user/password control=file.ctl parallel=true
# Skip header rowssqlldr userid=user/password control=file.ctl skip=1
Common Control File Clauses
Section titled “Common Control File Clauses”Clause | Purpose | Example |
---|---|---|
INFILE | Specify data file | INFILE 'data.csv' |
BADFILE | Bad records file | BADFILE 'data.bad' |
FIELDS TERMINATED BY | Field delimiter | FIELDS TERMINATED BY ',' |
TRAILING NULLCOLS | Handle missing columns | TRAILING NULLCOLS |
WHEN | Conditional loading | WHEN (1:1) = 'A' |
This foundation will help you implement effective bulk data loading with SQL*Loader in your Oracle environment.