Oracle Database Data Recovery and Performance Tuning

1#
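Posted 2012-3-4 10:34:39 | Views: 3641 | Replies: 1
10g RAC: I restarted one node (OS, CRS, and DB were all restarted) and the instance stayed in the mount state, with the OPEN never completing. After an hour it still had not started normally, and no error was reported. Only after restarting the instances on both RAC nodes did it come up. The alert.log, CRS log, and cluster logs contain no error messages. The alert log on one node has one message that may be related: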
Thread 1 advanced to log sequence 189311
  Current log# 2 seq# 189311 mem# 0: /iomdata1/iomdb/redo13.log
  Current log# 2 seq# 189311 mem# 1: /iomdata1/iomdb/redo14.log
Sun Mar  4 06:59:59 2012
GES: Potential blocker (pid=6558972) on resource TT-00000000-00000000;
enqueue info in file /oracle/app/oracle/admin/iomdb/bdump/iomdb1_lmd0_2119230.trc and DIAG trace file
Sun Mar  4 07:01:24 2012

Why was the database unable to open?
2#
Posted 2012-3-5 22:03:22
Please upload /oracle/app/oracle/admin/iomdb/bdump/iomdb1_lmd0_2119230.trc and the DIAG trace from that time.
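If it helps to locate the files, here is a minimal Python sketch that pulls the pid, resource name, and LMD trace path out of the blocker message in the alert log. The regular expression is based only on the two-line message quoted in this thread, and the name find_potential_blockers is made up for illustration; other versions may word the message differently.

```python
import re

# Pattern for the "GES: Potential blocker" message as quoted in this thread.
# Assumption: the message spans two lines, split after the semicolon.
BLOCKER_RE = re.compile(
    r"GES: Potential blocker \(pid=(?P<pid>\d+)\) on resource (?P<resource>\S+?);\s+"
    r"enqueue info in file (?P<trace>\S+)"
)

def find_potential_blockers(alert_text):
    """Return a (pid, resource, lmd_trace_path) tuple for each blocker message."""
    return [
        (int(m.group("pid")), m.group("resource"), m.group("trace"))
        for m in BLOCKER_RE.finditer(alert_text)
    ]

# Sample taken from the alert log excerpt in the first post.
sample = """Sun Mar  4 06:59:59 2012
GES: Potential blocker (pid=6558972) on resource TT-00000000-00000000;
enqueue info in file /oracle/app/oracle/admin/iomdb/bdump/iomdb1_lmd0_2119230.trc and DIAG trace file
"""

for pid, resource, trace in find_potential_blockers(sample):
    print(pid, resource, trace)
```

The message names only the LMD trace explicitly; the DIAG trace path is not given in the alert log, so it still has to be found in the same bdump directory by hand.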

A possible bug:
Hdr: 5192411 10.2.0.1 RDBMS 10.2.0.1 RAC PRODID-5 PORTID-197
Abstract: DATABASE OPEN HANGS ON TT ENQUEUE

*** 04/27/06 08:28 am ***

=========================
PROBLEM:

1. Clear description of the problem encountered:

   Instance open can hang waiting on TT enqueue.

2. Pertinent configuration information (MTS/OPS/distributed/etc)

   3-node RAC on HP/UX Itanium.

3. Indication of the frequency and predictability of the problem

   Seems intermittent, but ct. has given a sequence of events below to repeat this:

4. Sequence of events leading to the problem

   step 1. Instance 1,2,3 : open
   step 2. Instance 1 : shutdown abort
   step 3. Instance 2 : startup --> this will hang attempting to get TT enqueue.

5. Technical impact on the customer. Include persistent after effects.

   Unable to start instance.

=========================
DIAGNOSTIC ANALYSIS:

- From alert_node1.log :

Tue Apr 25 13:24:39 2006
GES: Potential blocker (pid=29817) on resource TT-0x0-0x0;
enqueue info in file /db/trace/bdump/ocoredb1_lmd0_21047.trc and DIAG trace file

- The corresponding ocoredb1_diag_20660.trc :

[---cut---]
Dumping process info for ospid 29817:
...
#4  0x4000000002e2deb0:0 in ksliwat () at ksl.c:7238
#5  0x4000000002d76c20:0 in kslwaitns () at ksl.c:7435
#6  0x4000000002d76d80:0 in kskthbwt () at ksk.c:1907
#7  0x4000000002d75a00:0 in kslwait () at ksl.c:7419
#8  0x4000000005653e40:0 in kjsdrmwt () at /project/qa/reza/5036736/kjs.c:4129
#9  0x40000000056519a0:0 in kjsmbesmi () at /project/qa/reza/5036736/kjs.c:4495
#10 0x40000000056f35e0:0 in kjbenterdlm ()
#11 0x4000000005477260:0 in kclusnaff () at /project/qa/reza/5034539/kcl.c:9683
#12 0x40000000054b1b90:0 in kcblus () at kcb.c:16342
#13 0x40000000029c19e0:0 in $cold_ktuswr+0x1d90 () at ksl.c:3964
#14 0x40000000021d37a0:0 in ktusmous_online_undoseg () at ktusm.c:649
#15 0x4000000001e34710:0 in ktusmout_online_ut () at ktusm.c:1339
#16 0x4000000001da8100:0 in ktusmiut_init_ut () at ktusm.c:1010
#17 0x4000000001dae380:0 in ktuini () at ktu.c:2944
#18 0x400000000209da80:0 in adbdrv () at dbsdrv.c:4436
#19 0x4000000002e71090:0 in opiexe () at opiexe.c:2634
#20 0x4000000002e97460:0 in opiosq0 () at opiosq.c:598
#21 0x400000000238eb60:0 in kpooprx () at kpoal8.c:1607
#22 0x4000000002485c20:0 in kpoal8 () at kpoal8.c:657
#23 0x4000000002e3cd80:0 in opiodr () at opiodr.c:615
#24 0x4000000002ecbe90:0 in ttcpip () at ttcpip.c:946
#25 0x4000000002eca660:0 in opitsk () at opitsk.c:835
#26 0x40000000021c0890:0 in opiino () at opiino.c:1108
#27 0x4000000002e3cd80:0 in opiodr () at opiodr.c:615
#28 0x400000000214d1d0:0 in opidrv () at opidrv.c:793
#29 0x400000000214c6f0:0 in sou2o () at sou2o.c:128
#30 0x400000000213c340:0 in opimai_real () at opimai.c:257
#31 0x4000000002043a20:0 in main () at opimai.c:173
...
Current wait and wait history for this process is 'gcs drm freeze in enter server mode'
Current SQL statement 'ALTER DATABASE OPEN'
...
Then we have a dump of the blocking process, followed by a systemstate for this instance. In the systemstate MMON also holds a TT lock but on TT-7FFFFFFF-00000001.

- The systemstate from node 3 shows a process waiting on TT-0
- The systemstate from node 2 shows a process holding TT-0 , but this systemstate is dumped a little later than the process dump and is for another 'alter database open' attempt.

- Based on the stack reference to kjsdrmwt asked ct. to disable DRM i.e. set:

   _gc_affinity_time=0
   _gc_undo_affinity=false

- The problem reproduces with _gc_affinity_time=0.
- Still waiting for ct. to test with _gc_undo_affinity=false.
- Attempted to get systemstates level 266 , but the stacks didn't get printed.

=========================
WORKAROUND:

None.

=========================
RELATED BUGS:

A couple of similar bugs, but fixed in 10.2.0.1:

bug:3523133
bug:3838679

=========================
REPRODUCIBILITY:
1. State if the problem is reproducible; indicate where and predictability

- Seems to be reproducible.

2. List the versions in which the problem has reproduced

3. List any versions in which the problem has not reproduced

There is no 10.2.0.2 available yet for this platform.
GES: Potential blocker (pid=29817) on resource TT-0x0-0x0;
enqueue info in file /db/trace/bdump/ocoredb1_lmd0_21047.trc and DIAG trace
file

The call stack of the potential blocker (pid=29817 in the message above) is needed to confirm this problem.
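Once the DIAG trace is available, a rough way to compare the blocker's stack with the one in the bug is to extract the function names from the frame lines. This is only a sketch: it assumes frames look like the HP-UX ones quoted in the bug ("#N address in function ()"), and other platforms print stacks differently. The names stack_functions, matches_signature, and SIGNATURE are hypothetical, and the choice of functions used as a signature is an assumption taken from that stack, not an official criterion.

```python
import re

# Matches frame lines such as "#8  0x4000000005653e40:0 in kjsdrmwt () at ..."
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+(?::\d+)?\s+in\s+(\S+)\s*\(")

# Assumption: these functions, all present in the bug's stack, are treated
# as the signature of "open waiting on DRM while onlining undo".
SIGNATURE = ("kjsdrmwt", "kclusnaff", "ktuini", "adbdrv")

def stack_functions(trace_text):
    """Return function names from the frame lines, in the order they appear."""
    funcs = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            funcs.append(m.group(2))
    return funcs

def matches_signature(funcs, signature=SIGNATURE):
    """True if every signature function appears somewhere in the stack."""
    return all(name in funcs for name in signature)

# A few frames copied from the bug text, including its wrapped path lines.
sample = """#7  0x4000000002d75a00:0 in kslwait () at ksl.c:7419
#8  0x4000000005653e40:0 in kjsdrmwt () at
/project/qa/reza/5036736/kjs.c:4129
#11 0x4000000005477260:0 in kclusnaff () at
/project/qa/reza/5034539/kcl.c:9683
#17 0x4000000001dae380:0 in ktuini () at ktu.c:2944
#18 0x400000000209da80:0 in adbdrv () at dbsdrv.c:4436
"""

funcs = stack_functions(sample)
print(funcs)
print(matches_signature(funcs))
```

A match would only suggest the same code path as the bug; the wait event and the TT enqueue holder in the systemstate still need to be checked in the trace itself.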
