Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖
1#
发表于 2012-2-7 20:14:42
1. 什么存储环境?

2.

ODM finding :
  1. ORA-600 [kcratr_nab_less_than_odr] during Instance Recovery after Database Crash [ID 1299564.1]

  2. Applies to:
  3. Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.2 - Release: 11.2 to 11.2
  4. Information in this document applies to any platform.
  5. Symptoms
  6. Trying to open a Database after a Crash caused by Storage Problems the Instance Recovery fails with :
  7. ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [219], [25020], [25021], []

  8. The Database can't open at this Point. In the corresponding Tracefile we can find this Error Callstack:


  9. dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
  10. ----- Current SQL Statement for this session (sql_id=1h50ks4ncswfn) -----
  11. ALTER DATABASE OPEN

  12. ----- Call Stack Trace -----
  13. ksedst1 <- ksedst <- dbkedDefDump <- ksedmp <- dbgexPhaseII <- dbgexProcessError <- dbgePostErrorKGE  <- kgeasnmierr <- kcratr_odr_check  <- kcratr <- kctrec <- kcvcrv <- kcfopd <- adbdrv <- opiexe <- opiosq0 <- kpoal8 <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr <- opidrv <- sou2o <- opimai_real <- ssthrdmain <- main <- start


  14. Cause
  15. This Problem is caused by Storage Problem of the Database Files. The Subsystem (eg. SAN) crashed while the Database was open. The Database then crashed since the Database Files were not accessible anymore. This caused a lost Write into the Online RedoLogs and so Instance Recovery is not possible and raising the ORA-600.
  16. Solution
  17. There are two possible Solutions:

  18. 1. If you could restore your Storage Environment and the Online RedoLogs from the Time of the crash you can try a manual Recovery followed by a RESETLOGS:

  19. SQL> startup mount;

  20. SQL> recover database until cancel using backup controlfile;

  21. -> manually provide Online RedoLog containing the last (current) Sequence when asked, eg.

  22. ORA-00279: change 100000 generated at xx/xx/xxxx xx:xx:xx needed for thread 1
  23. ORA-00289: suggestion :
  24. /flash_recovery/archivelog/xxxx_xx_xx/o1_mf_1_100_%u_.arc
  25. ORA-00280: change 100000 for thread 1 is in sequence #100

  26. Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
  27. /ora/oradata/dbtest/redo04_1.rdo
  28. Log applied.
  29. Media recovery complete.

  30. SQL> alter database open resetlogs;

  31. 2. If  step1. fails or you don't have the full Set of Files you have to restore and recover the Database from a recent Backup.


  32. Alter database open fails with ORA-00600 kcratr_nab_less_than_odr [ID 1296264.1]

  33. Applies to:
  34. Oracle Server - Standard Edition - Version: 11.2.0.1 and later   [Release: 11.2 and later ]
  35. Information in this document applies to any platform.
  36. Symptoms
  37. After Power Fail Alter database open fails with

  38. ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr]

  39. Changes
  40. Power failure
  41. Cause
  42. There was a power failure causing logical corruption in controlfile
  43. Solution

  44. Option a
  45. ------------

  46. SQL>Startup mount ;

  47. SQL>Show parameter control_files

  48. Query 1
  49. ------------

  50. sql>select a.member,a.group#,b.status from v$logfile a ,v$log b where a.group#=b.group# and b.status='CURRENT'

  51. Note down the name of the redo log

  52. SQL>Shutdown abort ;

  53. Take a OS Level back of the controlfile (This is to ensure we have a backup of current state of controlfile)

  54. SQL>Startup mount ;

  55. SQL>recover database using backup controlfile until cancel ;

  56. Enter location of redo log shown as current in Query 1 when prompted for recovery

  57. Hit Enter

  58. SQL>Alter database open resetlogs ;





  59. Option b
  60. -----------


  61. Recreate the controlfile using the Controlfile recreation script

  62. With database in mount stage

  63. rman target /

  64. rman> spool log to '/tmp/rman.log';

  65. Rman> list backup ;

  66. Rman > exit

  67. Keep this log handy

  68. Go to sqlplus

  69. SQL> Show parameter control_files

  70. Keep this location handy.

  71. SQL>oradebug setmypid

  72. SQL>Alter session set tracefile_identifier='controlfilerecreate' ;

  73. SQL>Alter database backup controlfile to trace ;

  74. SQL>Oradebug tracefile_name ; --> This command will give the path and name of the trace file


  75. Go to this location ,Open this trace file and select the controlfile recreation script with NO Resetlogs option


  76. SQL>Shutdown immediate;


  77. Rename the existing controlfile to <originalname>_old ---> This is Important as we need to have a backup of existing controlfile since we plan to recreate it

  78. SQL>Startup nomount

  79. Now run the Controlfile recreation script with NO Resetlogs mode

  80. SQL>Alter database open ;

  81. For database version 10g and above

  82. Once database is opened you can recatalog the rman backup information present in the list /tmp/rman.log using

  83. Rman> Catalog start with '<location of backupiece>' ;


  84. Once the database has been opened using the option a or option b its recommended to take a hot backup of the database.
  85. Same Steps are applicable to Rac if all instance are down with same error.
复制代码

回复 显示全部楼层 道具 举报

2#
发表于 2012-2-7 20:39:23
1.kcratr_nab_less_than_odr 可能因为 存储问题引发:

Trying to open a Database after a Crash caused by Storage Problems the Instance Recovery fails with


2.  分析trace:


Dump continued from file: /orasys/diag/rdbms/wimng2/wimng2/trace/wimng2_ora_29785.trc
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1468], [57304], [57605], [], [], [], [], [], [], []

========= Dump for incident 16953 (ORA 600 [kcratr_nab_less_than_odr]) ========

*** 2012-02-07 13:40:54.447
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=a01hp0psv0rrh) -----
alter database open

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            BFFE7388 ? 2 ?
ksedst1()+77         call     skdstdst()           BFFE7388 ? 0 ? 1 ? AB8E3A8 ?
                                                   853C46E ? AB8E3A8 ?
ksedst()+33          call     ksedst1()            0 ? 1 ?
dbkedDefDump()+2699  call     ksedst()             0 ? 5AF911 ? BFFE74AC ?
                                                   1007B40C ? BFFE7794 ? 0 ?
ksedmp()+47          call     dbkedDefDump()       3 ? 2 ?
ksfdmp()+59          call     ksedmp()             3EB ? BFFE92D0 ? DFBE5A3 ?
                                                   106AD160 ? 3EB ? 106AD160 ?
dbgexPhaseII()+1725  call     00000000             106AD160 ? 3EB ?
dbgexProcessError()  call     dbgexPhaseII()       B7FEB598 ? B7DBC888 ?
+2089                                              BFFECBA4 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   106AD160 ? B7FDD0D4 ? 258 ?
47
kgeadse()+286        call     00000000             106AD160 ? B7FDD0D4 ? 258 ?
kgerinv_internal()+  call     kgeadse()            106AD160 ? B7FDD0D4 ? 258 ?
47                                                 FD8DC58 ? 0 ? 4 ? BFFED45C ?
kgerinv()+41         call     kgerinv_internal()   106AD160 ? B7FDD0D4 ?
                                                   FD8DC58 ? 258 ? 0 ? 4 ?
                                                   BFFED45C ?
kgeasnmierr()+47     call     kgerinv()            106AD160 ? B7FDD0D4 ?
                                                   FD8DC58 ? 4 ? BFFED45C ?
kcratr_odr_check()+  call     kgeasnmierr()        106AD160 ? B7FDD0D4 ?
204                                                FD8DC58 ? 4 ? 0 ? 1 ?
kcratr()+1806        call     kcratr_odr_check()   BFFED6EC ? 0 ? F386D53 ? 0 ?
                                                   9 ? F386D53 ?
kctrec()+9311        call     kcratr()             BFFED6EC ? BFFF45D0 ? 0 ?
kcvcrv()+5906        call     kctrec()             BFFF5868 ? 0 ? B7FD0BD0 ?
                                                   B7FD122C ? B7E1BE00 ? 0 ?


Kernel function kcratr 是forward recovery algorithm的起点 kcrfr.c Kernel Cache Redo

[kcratr_nab_less_than_odr], [1], [1468], [57304], [57605], [], [], [], [], [], [], []的argument 定义

(a) redo thread id
(b) redo log sequence
(c) NAB
(d) on-disk rda  block number


这个trace 日志里有一个过程 是比较理想的 rolling forward 前滚的教学演示:

  1. 2012-02-07 13:40:53.366569 :80000687:KFNU:kfn.c@2200:kfnPrepareASM(): kfnPrepareASM force=0 state_kfnsg=0x7
  2. 2012-02-07 13:40:53.366569*:80000688:CACHE_RCV:kcv.c@16365:kcvcrv(): kcvcrv: Calling kctrec()
  3. 2012-02-07 13:40:53.366569*:80000689:CACHE_RCV:kct.c@4163:kctrec(): kctrec: Entering kctrec()
  4. 2012-02-07 13:40:53.413557*:8000068A:CACHE_RCV:kct.c@4271:kctrec(): kctrec: thread 1 cf thread ckpt: logseq 1468, block 2,scn 25917106
  5. 2012-02-07 13:40:53.413557*:8000068B:CACHE_RCV:kct.c@4285:kctrec(): kctrec: Checkpoint progress record contents
  6. 2012-02-07 13:40:53.413557*:8000068C:CACHE_RCV:kct.c@4287:kctrec(): kctrec:    kcccpsta 2, kcccpflg 0, kcccpdrt 48, kcccplrba 0x0005bc.0000dfd8.0000 kcccpodr 0x0005bc.0000e105.0000
  7. 2012-02-07 13:40:53.413557*:8000068D:CACHE_RCV:kct.c@4299:kctrec(): kctrec:    kcccpods 0x0000.018be694, kcccpodt 773934914, kcccprlc 753362405, kcccprls 0x0000.00000001, kcccphbt 774572255, kcccpmid 1635578584
  8. 2012-02-07 13:40:53.413557*:8000068E:CACHE_RCV:kct.c@4311:kctrec(): kctrec:    kcccpsdr 0x0005bc.00000001.0000, kcccpfbend (krfbafln 0, krfbathr 0, krfbaseq 0, krfbabno 0 krfbabof 0), kcccprsv 0
  9. 2012-02-07 13:40:53.413557*:8000068F:CACHE_RCV:kct.c@4360:kctrec(): kctrec: cache-low rba: logseq 1468, block 57304
  10. 2012-02-07 13:40:53.413557*:80000690:CACHE_RCV:kct.c@4374:kctrec(): kctrec: on-disk rba: logseq 1468, block 57605, scn 25945748
  11. 2012-02-07 13:40:53.413557*:80000691:CACHE_RCV:kct.c@4450:kctrec(): kctrec: Current ckpt RBA < cache-low RBA, adjusted ckpt RBA to cache low RBA, zeroed ckpt SCN and timestamp to 0
  12. 2012-02-07 13:40:53.413557*:80000692:CACHE_RCV:kct.c@4604:kctrec(): kctrec: Recovery starting point for thread 1 -  logseq 1468, block 57304, scn 0
  13. 2012-02-07 13:40:53.449498*:80000693:CACHE_RCV:kct.c@4664:kctrec(): kctrec: Do thread recovery, calling kcratr()
  14. 2012-02-07 13:40:53.456376 :80000694:CACHE_RCV:kcra.c@1517:kcratr(): kcratr: Entering kcratr()
  15. 2012-02-07 13:40:53.458293 :80000695:CACHE_RCV:kcra.c@1541:kcratr(): kcratr: Started redo scan
  16. 2012-02-07 13:40:53.458293*:80000696:CACHE_RCV:kcra.c@1862:kcratr_scan(): kcratr_scan: Entering kcratr_scan()
  17. 2012-02-07 13:40:53.458293*:80000697:CACHE_RCV:kcra.c@2000:kcratr_scan(): kcratr_scan: Log not open, opening online log for thread 1, RBA 0x0005bc.0000dfd8.0000, SCN 0x0000.00000000
  18. 2012-02-07 13:40:53.694427*:800006A4:CACHE_RCV:kcra.c@2036:kcratr_scan(): kcratr_scan: End of curr thread reached
  19. 2012-02-07 13:40:53.694427*:800006A5:CACHE_RCV:kcra.c@2038:kcratr_scan(): kcratr_scan:    end rcv RBA 0x0005bc.0000dfd8.   0, end rcv SCN 0x0000.018b76b3 end SCN timestamp 773895659, NAB 57304
  20. 2012-02-07 13:40:53.694427*:800006A6:CACHE_RCV:kcra.c@2048:kcratr_scan(): kcratr_scan:   (Previous) highest SCN seen in the redo stream 0x0000.00000000
  21. 2012-02-07 13:40:53.694427*:800006A7:CACHE_RCV:kcra.c@2162:kcratr_scan(): kcratr_scan: Exiting kcratr_scan()
  22. 2012-02-07 13:40:53.702245 :800006A8:CACHE_RCV:kcra.c@1559:kcratr(): kcratr: Completed redo scan, read 0 KB redo, 0 data blocks need recovery
复制代码


这里可以看到 kcratr_scan 负责scan redo log 读取了 redo logfile header ,发现NAB =57304 这个值小与 odr( on disk rba redo block adress) ,
说明 redo logfile header存在讹误,于是报 600 kcratr_nab_less_than_odr错误

回复 显示全部楼层 道具 举报

3#
发表于 2012-2-7 20:42:18
1.
尝试用MOS的方案1解决该问题:


SQL> startup mount;

SQL> recover database until cancel using backup controlfile;

SQL> alter database open resetlogs;

回复 显示全部楼层 道具 举报

您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-5-4 05:13 , Processed in 0.049245 second(s), 22 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569