Oracle Database Data Recovery and Performance Tuning

1#
Posted on 2012-6-11 12:56:21 | Views: 6262 | Replies: 7
crsd-log.rar (122.31 KB, downloads: 936)
Thank you very, very much!!
The incident happened after 23:00 on the evening of June 10.
2#
Posted on 2012-6-11 12:59:21
Please upload ocssd.log rather than crsd.log. It is in the $ORA_CRS_HOME/log/$hostname/cssd directory.
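For anyone unsure where that file sits, here is a minimal sketch that builds the path from the directory layout given above. The function name `ocssd_log_path` and the example home `/u01/crs` are hypothetical, and taking the short host name as the directory name is an assumption.

```python
import os
import posixpath
import socket


def ocssd_log_path(crs_home=None, hostname=None):
    """Build $ORA_CRS_HOME/log/$hostname/cssd/ocssd.log (layout as stated in the post)."""
    # Fall back to the environment variable when no home is passed in.
    crs_home = crs_home or os.environ.get("ORA_CRS_HOME", "")
    # Assumption: the log directory is named after the short host name.
    hostname = hostname or socket.gethostname().split(".")[0]
    return posixpath.join(crs_home, "log", hostname, "cssd", "ocssd.log")


# Hypothetical CRS home, used only for illustration.
print(ocssd_log_path("/u01/crs", "rac2"))
```

Called with no arguments, it reads `ORA_CRS_HOME` from the environment and uses the local host name.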


3#
Posted on 2012-6-11 14:05:36

cssd.log

cssd-log.rar (271.69 KB, downloads: 1009)
Sorry, I was in a hurry and uploaded the wrong file.


4#
Posted on 2012-6-11 14:42:29
KEYWORDS: 10.2.0.1 + OCFS


NODE 2
[    CSSD]2012-06-10 23:06:22.507 [3054844816] >TRACE:   clssgmReconfigThread:  completed for reconfig(5), with status(1)
[    CSSD]2012-06-10 23:06:22.945 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x948f690) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:06:23.498 [3065334672] >TRACE:   clssnmWaitForAcks: done, msg type(15)
[    CSSD]2012-06-10 23:06:23.498 [3065334672] >TRACE:   clssnmDoSyncUpdate: Sync Complete!
[    CSSD]2012-06-10 23:06:24.298 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94b8700) proc(0x949d480) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:07:29.241 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x9487758) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:07:34.770 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94935c0) proc(0x949d480) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:08:35.623 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948fb98) proc(0x94878c8) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:08:38.701 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94935c0) proc(0x949d480) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:08:45.646 [68623248] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(1) wrtcnt(1) LATS(378924) Disk lastSeqNo(1)
[    CSSD]2012-06-10 23:08:47.014 [89602960] >TRACE:   clssnmConnComplete: probe from node 1
[    CSSD]2012-06-10 23:08:47.014 [89602960] >TRACE:   clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[    CSSD]2012-06-10 23:08:47.015 [89602960] >TRACE:   clssnmConnComplete: connected to node 1 (con 0x948fb98), state 1 birth 0, unique 1339384124/1339384124  prevConuni(0)
[    CSSD]2012-06-10 23:08:47.712 [3065334672] >TRACE:   clssnmDoSyncUpdate: Initiating sync 6
[    CSSD]2012-06-10 23:08:47.712 [3065334672] >TRACE:   clssnmSetupAckWait: Ack message type (11)
[    CSSD]2012-06-10 23:08:47.713 [3065334672] >TRACE:   clssnmSetupAckWait: node(1) is ALIVE
[    CSSD]2012-06-10 23:08:47.713 [3065334672] >TRACE:   clssnmSetupAckWait: node(2) is ALIVE
[    CSSD]2012-06-10 23:08:47.713 [3065334672] >TRACE:   clssnmSendSync: syncSeqNo(6)
[    CSSD]2012-06-10 23:08:47.713 [3065334672] >TRACE:   clssnmWaitForAcks: Ack message type(11), ackCount(2)
[    CSSD]2012-06-10 23:08:47.713 [89602960] >TRACE:   clssnmHandleSync: Acknowledging sync: src[2] srcName[rac2] seq[5] sync[6]
[    CSSD]2012-06-10 23:08:47.789 [1480512] >USER:    NMEVENT_SUSPEND [00][00][00][04]
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmWaitForAcks: done, msg type(11)
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmDoSyncUpdate: node(0) missCount(755) state(0)
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmDoSyncUpdate: node(1) is transitioning from joining state to active state
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmSetupAckWait: Ack message type (13)
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmSetupAckWait: node(1) is ACTIVE
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmSetupAckWait: node(2) is ACTIVE
[    CSSD]2012-06-10 23:08:48.716 [3065334672] >TRACE:   clssnmSendVote: syncSeqNo(6)
[    CSSD]2012-06-10 23:08:48.717 [3065334672] >TRACE:   clssnmWaitForAcks: Ack message type(13), ackCount(2)
[    CSSD]2012-06-10 23:08:48.717 [89602960] >TRACE:   clssnmSendVoteInfo: node(2) syncSeqNo(6)
[    CSSD]2012-06-10 23:08:49.718 [3065334672] >TRACE:   clssnmWaitForAcks: done, msg type(13)
[    CSSD]2012-06-10 23:08:49.718 [3065334672] >TRACE:   clssnmCheckDskInfo: Checking disk info...
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmEvict: Start
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmWaitOnEvictions: Start
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmWaitOnEvictions: Node(0) down, LATS(0),timeout(383994)
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmSetupAckWait: Ack message type (15)
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmSetupAckWait: node(1) is ACTIVE
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmSetupAckWait: node(2) is ACTIVE
[    CSSD]2012-06-10 23:08:50.719 [3065334672] >TRACE:   clssnmSendUpdate: syncSeqNo(6)
[    CSSD]2012-06-10 23:08:50.721 [3065334672] >TRACE:   clssnmWaitForAcks: Ack message type(15), ackCount(2)
[    CSSD]2012-06-10 23:08:50.722 [89602960] >TRACE:   clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[    CSSD]2012-06-10 23:08:50.722 [89602960] >TRACE:   clssnmDeactivateNode: node 0 () left cluster
[    CSSD]2012-06-10 23:08:50.722 [89602960] >TRACE:   clssnmUpdateNodeState: node 1, state (2/2) unique (1339384124/1339384124) prevConuni(0) birth (6/6) (old/new)
[    CSSD]2012-06-10 23:08:50.722 [89602960] >TRACE:   clssnmUpdateNodeState: node 2, state (3/3) unique (1339383371/1339383371) prevConuni(0) birth (4/4) (old/new)
[    CSSD]2012-06-10 23:08:50.722 [89602960] >USER:    clssnmHandleUpdate: SYNC(6) from node(2) completed
[    CSSD]2012-06-10 23:08:50.722 [89602960] >USER:    clssnmHandleUpdate: NODE 1 (rac1) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-06-10 23:08:50.722 [89602960] >USER:    clssnmHandleUpdate: NODE 2 (rac2) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-06-10 23:08:50.731 [3054844816] >TRACE:   clssgmReconfigThread:  started for reconfig (6)
[    CSSD]2012-06-10 23:08:50.731 [3054844816] >USER:    NMEVENT_RECONFIG [00][00][00][06]
[    CSSD]2012-06-10 23:08:50.731 [3054844816] >TRACE:   clssgmEstablishConnections: 2 nodes in cluster incarn 6
[    CSSD]2012-06-10 23:08:50.819 [125442960] >TRACE:   clssgmInitialRecv: (0x94935c0) accepted a new connection from node 1 born at 6 active (2, 2), vers (10,3,1,2)
[    CSSD]2012-06-10 23:08:50.819 [125442960] >TRACE:   clssgmInitialRecv: conns done (2/2)
[    CSSD]2012-06-10 23:08:50.819 [3054844816] >TRACE:   clssgmEstablishMasterNode: MASTER for 6 is node(2) birth(4)
[    CSSD]2012-06-10 23:08:50.819 [3054844816] >TRACE:   clssgmMasterCMSync: Synchronizing group/lock status
[    CSSD]2012-06-10 23:08:50.828 [3054844816] >TRACE:   clssgmMasterSendDBDone: group/lock status synchronization complete
[    CSSD]CLSS-3000: reconfiguration successful, incarnation 6 with 2 nodes
[    CSSD]CLSS-3001: local node number 2, master node number 2
[    CSSD]2012-06-10 23:08:50.832 [3054844816] >TRACE:   clssgmReconfigThread:  completed for reconfig(6), with status(1)
[    CSSD]2012-06-10 23:08:51.723 [3065334672] >TRACE:   clssnmWaitForAcks: done, msg type(15)
[    CSSD]2012-06-10 23:08:51.723 [3065334672] >TRACE:   clssnmDoSyncUpdate: Sync Complete!
[    CSSD]2012-06-10 23:08:54.141 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x949c380) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:09:42.122 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:10:48.531 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:11:55.014 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:12:50.928 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:13:01.353 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fdb8) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:13:04.177 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x94b6948) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:14:07.876 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:15:14.465 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:16:21.076 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:17:27.628 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:18:34.073 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:19:40.592 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:20:47.143 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:21:53.814 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x94b9358) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:22:51.812 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:23:00.323 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fdb8) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:23:04.915 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x94b6948) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:24:06.618 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:25:13.000 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:26:19.550 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:27:26.027 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x94a6e80) proc(0x949fe20) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:28:16.154 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x93386c0) proc(0x949a138) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:28:16.244 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x9389c20) proc(0x94ae2c8) pid() proto(10:2:1:1)
[    CSSD]2012-06-10 23:28:18.580 [104463248] >TRACE:   clssgmClientConnectMsg: Connect from con(0x948e4b8) proc(0x9389468) pid() proto(10:2:1:1)
At 23:08:45.646, NODE 2 logged "[68623248] >TRACE:   clssnmReadDskHeartbeat: node(1) is down." NODE 2 found that NODE 1's disk heartbeat had failed and initiated the eviction of NODE 1.
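To spot this kind of entry without reading the whole file, a small filter over ocssd.log lines can help. The following is only a sketch: the regular expression is inferred from the log lines quoted in this thread, not from any documented CSSD log format, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the lines quoted in this thread, e.g.
# [    CSSD]2012-06-10 23:08:45.646 [68623248] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. ...
HEARTBEAT_DOWN = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\[\d+\]\s+>TRACE:\s+clssnmReadDskHeartbeat: node\((?P<node>\d+)\) is down"
)


def find_disk_heartbeat_failures(lines):
    """Return (timestamp, node_number) for every 'node(N) is down' disk heartbeat line."""
    events = []
    for line in lines:
        match = HEARTBEAT_DOWN.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, int(match.group("node"))))
    return events


# Two lines copied from the NODE 2 excerpt above.
sample = [
    "[    CSSD]2012-06-10 23:08:38.701 [104463248] >TRACE:   clssgmClientConnectMsg: "
    "Connect from con(0x94935c0) proc(0x949d480) pid() proto(10:2:1:1)",
    "[    CSSD]2012-06-10 23:08:45.646 [68623248] >TRACE:   clssnmReadDskHeartbeat: "
    "node(1) is down. rcfg(1) wrtcnt(1) LATS(378924) Disk lastSeqNo(1)",
]
events = find_disk_heartbeat_failures(sample)
for ts, node in events:
    print(ts.isoformat(sep=" ", timespec="milliseconds"), "node", node, "reported down")
```

To scan a real file, pass an open file handle instead of the sample list, since the function only iterates over lines.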




Similarly, NODE 1 did the same thing:
[    CSSD]2012-06-10 23:08:47.185 [90983312] >TRACE:   clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[    CSSD]2012-06-10 23:08:47.185 [90983312] >TRACE:   clssnmClusterListener: Probing node(2)
[    CSSD]2012-06-10 23:08:47.188 [90983312] >TRACE:   clssnmConnComplete: connected to node 2 (con 0x8c006c8), state 3 birth 0, unique 1339383371/1339383371  prevConuni(0)
[    CSSD]2012-06-10 23:08:47.212 [55368592] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(6) wrtcnt(693) LATS(0) Disk lastSeqNo(693)
[    CSSD]2012-06-10 23:08:47.234 [3065797520] >TRACE:   clssnmPollingThread: Connection complete
[    CSSD]2012-06-10 23:08:47.234 [3055307664] >TRACE:   clssnmSendingThread: Connection complete
[    CSSD]2012-06-10 23:08:47.234 [3044817808] >TRACE:   clssnmRcfgMgrThread: Connection complete
[    CSSD]2012-06-10 23:08:47.234 [3044817808] >TRACE:   clssnmRcfgMgrThread: Local Join
[    CSSD]2012-06-10 23:08:47.234 [3044817808] >TRACE:   clssnmLocalJoinEvent: set node(2) inactive
[    CSSD]2012-06-10 23:08:47.234 [3044817808] >WARNING: clssnmLocalJoinEvent: takeover aborted due to UNKNOWN nodes
[    CSSD]2012-06-10 23:08:47.234 [101473168] >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_mycrs_1))
[    CSSD]2012-06-10 23:08:47.234 [101473168] >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_mycrs))
[    CSSD]2012-06-10 23:08:47.887 [90983312] >TRACE:   clssnmHandleSync: Acknowledging sync: src[2] srcName[rac2] seq[5] sync[6]
[    CSSD]2012-06-10 23:08:48.235 [3044817808] >TRACE:   clssnmRcfgMgrThread: lastleader(2) unique(1339384124)
[    CSSD]2012-06-10 23:08:48.890 [90983312] >TRACE:   clssnmSendVoteInfo: node(2) syncSeqNo(6)
[    CSSD]2012-06-10 23:08:50.895 [90983312] >TRACE:   clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[    CSSD]2012-06-10 23:08:50.895 [90983312] >TRACE:   clssnmDeactivateNode: node 0 () left cluster
[    CSSD]2012-06-10 23:08:50.895 [90983312] >TRACE:   clssnmUpdateNodeState: node 1, state (1/2) unique (1339384124/1339384124) prevConuni(0) birth (0/6) (old/new)
[    CSSD]2012-06-10 23:08:50.895 [90983312] >TRACE:   clssnmUpdateNodeState: node 2, state (4/3) unique (1339383371/1339383371) prevConuni(0) birth (0/4) (old/new)
[    CSSD]2012-06-10 23:08:50.896 [90983312] >USER:    clssnmHandleUpdate: SYNC(6) from node(2) completed
[    CSSD]2012-06-10 23:08:50.896 [90983312] >USER:    clssnmHandleUpdate: NODE 1 (rac1) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-06-10 23:08:50.896 [90983312] >USER:    clssnmHandleUpdate: NODE 2 (rac2) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-06-10 23:08:50.987 [2058448] >USER:    NMEVENT_SUSPEND [00][00][00][00]
[    CSSD]2012-06-10 23:08:50.988 [3034327952] >TRACE:   clssgmReconfigThread:  started for reconfig (6)
[    CSSD]2012-06-10 23:08:50.988 [3034327952] >USER:    NMEVENT_RECONFIG [00][00][00][06]
[    CSSD]2012-06-10 23:08:50.989 [3034327952] >TRACE:   clssgmEstablishConnections: 2 nodes in cluster incarn 6
[    CSSD]2012-06-10 23:08:50.992 [3076287376] >TRACE:   clssgmInitialRecv: (0x8d4d7b8) accepted a new connection from node 2 born at 4 active (2, 2), vers (10,3,1,2)
[    CSSD]2012-06-10 23:08:50.992 [3076287376] >TRACE:   clssgmInitialRecv: conns done (2/2)
[    CSSD]2012-06-10 23:08:50.993 [3034327952] >TRACE:   clssgmEstablishMasterNode: MASTER for 6 is node(2) birth(4)
[    CSSD]2012-06-10 23:08:50.993 [3034327952] >TRACE:   clssgmChangeMasterNode: requeued 0 RPCs
[    CSSD]2012-06-10 23:08:51.001 [3076287376] >TRACE:   clssgmHandleDBDone(): src/dest (2/65535) size(68) incarn 6
[    CSSD]CLSS-3000: reconfiguration successful, incarnation 6 with 2 nodes
[    CSSD]CLSS-3001: local node number 1, master node number 2
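The disk heartbeat failure reported by clssnmReadDskHeartbeat caused a split brain on 10.2.0.1, which triggered the STONITH algorithm, and both hosts rebooted.


Recommendations

Do not use the 10.2.0.1 version of CRS, whether for a test or a production environment!!

Do not use OCFS as the shared storage solution!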
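When each node declares the other one down, it helps to view both ocssd.log files on a single timeline. Below is a minimal sketch that merges timestamped CSSD lines from several hosts and sorts them. The line pattern is inferred from the excerpts in this thread, the function name is hypothetical, and it assumes the node clocks are reasonably in sync.

```python
import re
from datetime import datetime

# Pattern inferred from the quoted excerpts: "[    CSSD]<timestamp> [<thread>] ><LEVEL>: <message>"
CSSD_LINE = re.compile(
    r"\[\s*CSSD\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+\[\d+\]\s+>(\w+):\s+(.*)"
)


def merge_timelines(logs_by_host):
    """Merge per-host log lines into one list of (timestamp, host, message), oldest first."""
    merged = []
    for host, lines in logs_by_host.items():
        for line in lines:
            match = CSSD_LINE.search(line)
            if match is None:
                continue  # lines such as CLSS-3000 carry no timestamp and are skipped
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            merged.append((ts, host, match.group(3).strip()))
    merged.sort(key=lambda entry: entry[0])
    return merged


# Lines copied from the two excerpts above (rac2 is NODE 2, rac1 is NODE 1).
logs = {
    "rac2": [
        "[    CSSD]2012-06-10 23:08:45.646 [68623248] >TRACE:   clssnmReadDskHeartbeat: "
        "node(1) is down. rcfg(1) wrtcnt(1) LATS(378924) Disk lastSeqNo(1)",
        "[    CSSD]2012-06-10 23:08:47.712 [3065334672] >TRACE:   clssnmDoSyncUpdate: Initiating sync 6",
    ],
    "rac1": [
        "[    CSSD]2012-06-10 23:08:47.212 [55368592] >TRACE:   clssnmReadDskHeartbeat: "
        "node(2) is down. rcfg(6) wrtcnt(693) LATS(0) Disk lastSeqNo(693)",
        "[    CSSD]CLSS-3000: reconfiguration successful, incarnation 6 with 2 nodes",
    ],
}
timeline = merge_timelines(logs)
for ts, host, message in timeline:
    print(ts.time().isoformat(timespec="milliseconds"), host, message)
```

In the sample, rac2 reports node(1) down first, rac1 reports node(2) down about 1.6 seconds later, and rac2 then initiates sync 6.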


5#
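Posted on 2012-6-11 22:43:12

Thanks..

Thank you for your help, and sorry for the trouble!
Yes, this is indeed a test environment. The goal was not really to test RAC; I wanted to use the existing RAC to manage a third-party application, so I tried it on virtual machines. The real environment, however, is also 10g and also on ocfs2. In the end it comes down to my own recklessness and ignorance.
Do 10201 and ocfs each have problems of their own, and are they worse when combined, or is it something else? Please let me know when it is convenient for you; I would like to understand.
Thanks again.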


6#
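Posted on 2012-6-11 22:50:53

Reply to post 5#

Whether it is for testing or any other purpose, use at least the 10.2.0.5 CRS clusterware. 10.2.0.1 CRS has a great many bugs; this has been said many times before!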
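As a small illustration of that rule, here is a sketch that compares a version string against the 10.2.0.5 minimum recommended above. The helper names are hypothetical, and how the version string is obtained is left open (crsctl query crs softwareversion is one place to look, but check its output format on your own system).

```python
def parse_version(text):
    """Turn a dotted version such as '10.2.0.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum taken from the recommendation in this post.
MINIMUM_CRS = parse_version("10.2.0.5")


def meets_minimum(version, minimum=MINIMUM_CRS):
    """True when the given CRS version is at or above the recommended minimum."""
    return parse_version(version) >= minimum


for candidate in ("10.2.0.1", "10.2.0.5"):
    status = "ok" if meets_minimum(candidate) else "below the recommended minimum"
    print(candidate, status)
```

Comparing tuples of integers avoids the trap of comparing version strings character by character.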


7#
Posted on 2012-6-11 23:30:42
thx, thank you


8#
Posted on 2012-6-25 16:04:43
His public NIC failed, so how did that lead to the disk heartbeat being judged as failed?

