ORA-00600: internal error code, arguments: [4194] 数据库恢复
ORA-00600: internal error code, arguments: 数据库恢复
记录下当时数据库诊断和恢复大致步骤 oracle 10g rac
大致情况如下:
半夜3点钟,驻地工程师电话打给我说是RAC数据库服务器突然重启了,起来后oracle进程占用CPU 99%-100%。我SSH远程登录上查看了下,原来是oracle系统进程想smon、dbwn等一些进程很不正常。
这个时候,第一要去查看的是alert文件,发现有以下错误:
Sat Apr 7 03:42:54 2012
alter database open
Sat Apr 7 03:42:54 2012
This instance was first to open
Sat Apr 7 03:42:58 2012
Beginning crash recovery of 1 threads
parallel recovery started with 3 processes
Sat Apr 7 03:42:59 2012
Started redo scan
Sat Apr 7 03:42:59 2012
Completed redo scan
142 redo blocks read, 99 data blocks need recovery
Sat Apr 7 03:42:59 2012
Started redo application at
Thread 2: logseq 78462, block 332
Sat Apr 7 03:42:59 2012
Recovery of Online Redo Log: Thread 2 Group 14 Seq 78462 Reading mem 0
Mem# 0: +ASM_VAS_STORAGE/ora9i/onlinelog/redo1401
Sat Apr 7 03:42:59 2012
Completed redo application
Sat Apr 7 03:42:59 2012
Completed crash recovery at
Thread 2: logseq 78462, block 474, scn 12417856572307
99 data blocks read, 99 data blocks written, 142 redo blocks read
Picked broadcast on commit scheme to generate SCNs
Sat Apr 7 03:43:00 2012
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=33, OS id=13815
ARC1 started with pid=34, OS id=13817
ARC2 started with pid=35, OS id=13844
ARC3 started with pid=36, OS id=13846
Sat Apr 7 03:43:00 2012
ARC0: Archival started
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC4 started with pid=37, OS id=13848
Sat Apr 7 03:43:00 2012
Thread 2 advanced to log sequence 78463 (thread open)
Thread 2 opened at log sequence 78463
Current log# 15 seq# 78463 mem# 0: +ASM_VAS_STORAGE/ora9i/onlinelog/redo1501
Successful open of redo thread 2
Sat Apr 7 03:43:00 2012
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Apr 7 03:43:00 2012
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Apr 7 03:43:00 2012
ARC2: Becoming the heartbeat ARCH
Sat Apr 7 03:43:00 2012
SMON: enabling cache recovery
Sat Apr 7 03:43:01 2012
Successfully onlined Undo Tablespace 5.
Sat Apr 7 03:43:01 2012
SMON: enabling tx recovery
Sat Apr 7 03:43:04 2012
Thread 1 advanced to log sequence 30000 (archiver wakeup)
Sat Apr 7 03:43:04 2012
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 4
replication_dependency_tracking turned off (no async multimaster replication found)
Sat Apr 7 03:43:05 2012
Errors in file /opt/oracle/admin/ora9i/bdump/ora9i2_smon_12689.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], []
Starting background process QMNC
QMNC started with pid=41, OS id=14000
Sat Apr 7 03:43:08 2012
Completed: alter database open
Sat Apr 7 03:43:09 2012
Errors in file /opt/oracle/admin/ora9i/bdump/ora9i2_mmon_12695.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], []
一看到到 600 4194 错误,就推断数据库服务器在重启的时候,undo在处理事务的时候 oracle异常关闭才出现的问题。
(幸好2节点只有B节点出现问题)
找到问题根源就容易解决问题了
上下翻看了log,找到以下一段:
Errors in file /opt/oracle/admin/ora9i/bdump/ora9i2_smon_25366.trc:
ORA-01595: error freeing extent (48) of rollback segment (12))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: , , , [], [], [], [], []
回滚段异常,数据库是不能正常关闭的,不然会出现以下错误:
SQL> shutdown immediate;
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: , , , [], [], [], [],
[]
解决方法如下:
SQL> create pfile='/home/oracle/pfile.ora' from spfile;
SQL> shutdown abort;
/home/oracle/pfile.ora 文件添加以下两行
ora9i2.undo_management='MANUAL'
ora9i2._corrupted_rollback_segments='_SYSSMU12$'
SQL> startup pfile=/home/oracle/pfile.ora --到这一步,是能正常启动的
--重建undo表空间
SQL> drop tablespace undotbs2 including contents and datafiles;
SQL> create undo tablespace undotbs2 datafile '+ASM_VAS_STORAGE/ora9i/datafile/undotbs2.dbf' size 4096m autoextend on next 50m maxsize 10240m;
SQL> shutdown immediate; --能正常关闭
SQL> startup --启动正常
恢复完毕
页:
[1]