Oracle Database Data Recovery and Performance Optimization


1#
Posted on 2012-8-11 08:32:16 | Views: 7430 | Replies: 7
Oracle 11.2.0.3 on Linux — one node cannot start
OS: Linux 5.5

On the first node, according to the on-site account, Oracle went down right after Kaspersky was installed on the Linux host. The engineer then uninstalled Kaspersky, but CRS on the first node would no longer start.
The customer has already rebooted the host.
The attachment is the ocssd log.
2012-08-09 17:48:26.465: [    CSSD][1103505728]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386129, wrtcnt, 750118, LATS 5171474, lastSeqNo 750117, uniqueness 1344499854, timestamp 1344505705/5231454
The heartbeat is getting through.
The heartbeat network recorded in CRS also looks fine.
MOS suggested this could be a multicast problem; I tested with the tool from MOS and saw the following:
[oracle@node-rac1 mcasttest]$  perl mcasttest.pl -n node-rac1,node-rac2 -i eth1
###########  Setup for node node-rac1  ##########
Checking node access 'node-rac1'
Checking node login 'node-rac1'
Checking/Creating Directory /tmp/mcasttest for binary on node 'node-rac1'
Distributing mcast2 binary to node 'node-rac1'
###########  Setup for node node-rac2  ##########
Checking node access 'node-rac2'
Checking node login 'node-rac2'
Checking/Creating Directory /tmp/mcasttest for binary on node 'node-rac2'
Distributing mcast2 binary to node 'node-rac2'
###########  testing Multicast on all nodes  ##########
Test for Multicast address 230.0.1.0
Aug 10 22:23:42 | Multicast Succeeded for eth1 using address 230.0.1.0:42000
Test for Multicast address 224.0.0.251
Aug 10 22:23:43 | Multicast Succeeded for eth1 using address 224.0.0.251:42001

The firewall is also disabled.
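The mechanics of the multicast check that mcasttest.pl performs can be sketched in a few lines of Python. This is only an illustration, not the tool itself: it runs sender and receiver in a single process over the loopback interface so it is self-contained, using the same group and port (230.0.1.0:42000) shown in the output above. On a real cluster, IFACE would be the eth1 private-interconnect address and the receive side would run on the other node.

```python
import socket

GROUP, PORT = "230.0.1.0", 42000   # same group/port the mcasttest.pl run used
IFACE = "127.0.0.1"                # loopback here; on a cluster, the eth1 address

# Receiver: bind the port and join the multicast group on the chosen interface.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
# struct ip_mreq: 4-byte group address followed by 4-byte interface address
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
              socket.inet_aton(GROUP) + socket.inet_aton(IFACE))
rx.settimeout(5)

# Sender: route the datagram out of the same interface and loop it back
# so this host's own receiver sees it.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(IFACE))
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
tx.sendto(b"multicast-ok", (GROUP, PORT))

data, _ = rx.recvfrom(64)
print(data.decode())
```

If the datagram never arrives, the receive times out after 5 seconds — the single-host analogue of mcasttest.pl reporting a multicast failure.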

crslog.rar

1.44 MB, downloads: 1048

2#
Posted on 2012-8-11 10:30:44
Why was this thread posted in the tuning board? From now on I will simply move misplaced threads without reading them!!


3#
Posted on 2012-8-11 18:14:23
traceroute -r -F -m1  <node1_priv>   # -r: target must be on a directly attached network, -F: don't fragment, -m1: max TTL of 1 (one hop)
traceroute -r -F -m1  <node2_priv>


4#
Posted on 2012-8-11 19:12:19
Thank you, thank you, thank you, thank you.


5#
Posted on 2012-8-11 21:39:15
2012-08-10 17:45:05.213: [    CSSD][1075632448]clssgmJoinGrock: global grock CRF- new client 0x2aaab004ada0 with con 0x8568, requested num -1, flags 0x4000e00
2012-08-10 17:45:05.213: [    CSSD][1075632448]clssgmJoinGrock: ignoring grock join for client not requiring fencing until group information has been received from the master; group name CRF-, member number -1, flags 0x4000e00
2012-08-10 17:45:05.213: [    CSSD][1075632448]clssgmDiscEndpcl: gipcDestroy 0x8568
2012-08-10 17:45:05.214: [    CSSD][1075632448]clssgmDeadProc: proc 0x2aaab002d570
2012-08-10 17:45:05.214: [    CSSD][1075632448]clssgmDestroyProc: cleaning up proc(0x2aaab002d570) con(0x8539) skgpid  ospid 15244 with 0 clients, refcount 0
2012-08-10 17:45:05.214: [    CSSD][1075632448]clssgmDiscEndpcl: gipcDestroy 0x8539
2012-08-10 17:45:05.267: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2012-08-10 17:45:05.590: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833242, LATS 6165464, lastSeqNo 833241, uniqueness 1344566603, timestamp 1344591904/24678294
2012-08-10 17:45:06.269: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2012-08-10 17:45:06.593: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833243, LATS 6166464, lastSeqNo 833242, uniqueness 1344566603, timestamp 1344591905/24679304
2012-08-10 17:45:06.732: [    CSSD][1108273472]clssnmSendingThread: sending join msg to all nodes
2012-08-10 17:45:06.732: [    CSSD][1108273472]clssnmSendingThread: sent 4 join msgs to all nodes
2012-08-10 17:45:07.271: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2012-08-10 17:45:07.613: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833244, LATS 6167484, lastSeqNo 833243, uniqueness 1344566603, timestamp 1344591906/24680304
2012-08-10 17:45:07.722: [    CSSD][1109850432]clssnmRcfgMgrThread: Local Join
2012-08-10 17:45:07.722: [    CSSD][1109850432]clssnmLocalJoinEvent: begin on node(1), waittime 193000
2012-08-10 17:45:07.722: [    CSSD][1109850432]clssnmLocalJoinEvent: set curtime (6167594) for my node
2012-08-10 17:45:07.722: [    CSSD][1109850432]clssnmLocalJoinEvent: scanning 32 nodes
2012-08-10 17:45:07.722: [    CSSD][1109850432]clssnmLocalJoinEvent: Node node-rac2, number 2, is in an existing cluster with disk state 3
2012-08-10 17:45:07.723: [    CSSD][1109850432]clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
2012-08-10 17:45:08.124: [    CSSD][1075632448]clssscSelect: cookie accept request 0x2aaaac029ee0
2012-08-10 17:45:08.124: [    CSSD][1075632448]clssgmAllocProc: (0x2aaab0072100) allocated
2012-08-10 17:45:08.125: [    CSSD][1075632448]clssgmClientConnectMsg: properties of cmProc 0x2aaab0072100 - 1,2,3,4,5
2012-08-10 17:45:08.125: [    CSSD][1075632448]clssgmClientConnectMsg: Connect from con(0x85c4) proc(0x2aaab0072100) pid(15501) version 11:2:1:4, properties: 1,2,3,4,5
2012-08-10 17:45:08.125: [    CSSD][1075632448]clssgmClientConnectMsg: msg flags 0x0000
2012-08-10 17:45:08.127: [    CSSD][1075632448]clssscSelect: cookie accept request 0x2aaab0072100
2012-08-10 17:45:08.127: [    CSSD][1075632448]clssscevtypSHRCON: getting client with cmproc 0x2aaab0072100
2012-08-10 17:45:08.127: [    CSSD][1075632448]clssgmRegisterClient: proc(4/0x2aaab0072100), client(1/0x2aaab0074960)
2012-08-10 17:45:08.127: [    CSSD][1075632448]clssgmJoinGrock: global grock CRF- new client 0x2aaab0074960 with con 0x85f3, requested num -1, flags 0x4000e00
2012-08-10 17:45:08.128: [    CSSD][1075632448]clssgmJoinGrock: ignoring grock join for client not requiring fencing until group information has been received from the master; group name CRF-, member number -1, flags 0x4000e00
2012-08-10 17:45:08.128: [    CSSD][1075632448]clssgmDiscEndpcl: gipcDestroy 0x85f3
2012-08-10 17:45:08.129: [    CSSD][1075632448]clssgmDeadProc: proc 0x2aaab0072100
2012-08-10 17:45:08.129: [    CSSD][1075632448]clssgmDestroyProc: cleaning up proc(0x2aaab0072100) con(0x85c4) skgpid  ospid 15501 with 0 clients, refcount 0
2012-08-10 17:45:08.129: [    CSSD][1075632448]clssgmDiscEndpcl: gipcDestroy 0x85c4
2012-08-10 17:45:08.273: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2012-08-10 17:45:08.616: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833245, LATS 6168494, lastSeqNo 833244, uniqueness 1344566603, timestamp 1344591907/24681304
2012-08-10 17:45:09.276: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2012-08-10 17:45:09.618: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833246, LATS 6169494, lastSeqNo 833245, uniqueness 1344566603, timestamp 1344591908/24682304
2012-08-10 17:45:09.655: [    CSSD][1075632448]clssgmExecuteClientRequest: MAINT recvd from proc 2 (0x18e0da70)
2012-08-10 17:45:09.655: [    CSSD][1075632448]clssgmShutDown: Received abortive shutdown request from client.
2012-08-10 17:45:09.655: [    CSSD][1075632448]###################################
2012-08-10 17:45:09.655: [    CSSD][1075632448]clssscExit: CSSD aborting from thread GMClientListener
2012-08-10 17:45:09.655: [    CSSD][1075632448]###################################
2012-08-10 17:45:09.655: [    CSSD][1075632448](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
2012-08-10 17:45:09.656: [    CSSD][1075632448]clssgmUpdateEventValue: CmInfo State  val 0, changes 1
2012-08-10 17:45:09.727: [    CSSD][1106696512]clssnmPollingThread: state(1) clusterState(0) exit
2012-08-10 17:45:09.727: [    CSSD][1106696512]clssscExit: abort already set 0
2012-08-10 17:45:10.278: [    CSSD][1105119552]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 0 waited 380
2012-08-10 17:45:10.278: [    CSSD][1105119552]clssgmPeerListener: terminating at incarn(0)
2012-08-10 17:45:10.620: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833247, LATS 6170494, lastSeqNo 833246, uniqueness 1344566603, timestamp 1344591909/24683304
2012-08-10 17:45:10.742: [    CSSD][1108273472]clssnmSendingThread: sending join msg to all nodes
2012-08-10 17:45:10.742: [    CSSD][1108273472]clssnmSendingThread: sent 4 join msgs to all nodes
2012-08-10 17:45:11.622: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833248, LATS 6171494, lastSeqNo 833247, uniqueness 1344566603, timestamp 1344591910/24684314
2012-08-10 17:45:12.625: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833249, LATS 6172504, lastSeqNo 833248, uniqueness 1344566603, timestamp 1344591911/24685314
2012-08-10 17:45:13.629: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833250, LATS 6173504, lastSeqNo 833249, uniqueness 1344566603, timestamp 1344591912/24686314
2012-08-10 17:45:14.632: [    CSSD][1096227136]clssnmvDHBValidateNCopy: node 2, node-rac2, has a disk HB, but no network HB, DHB has rcfg 234386141, wrtcnt, 833251, LATS 6174504, lastSeqNo 833250, uniqueness 1344566603, timestamp 1344591913/24687314
2012-08-10 17:45:14.750: [    CSSD][1108273472]clssnmSendingThread: sending join msg to all nodes
2012-08-10 17:45:14.750: [    CSSD][1108273472]clssnmSendingThread: sent 4 join msgs to all nodes
Have you checked the heartbeat network?


cat /etc/hosts

2012-08-09 17:55:08.455: [ CLSINET][1093237056] # 0 Interface 'eth1',ip='10.0.0.1',mac='00-e0-81-c8-da-da',mask='255.255.255.128',net='10.0.0.0',use='cluster_interconnect'
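For comparison, a two-node RAC /etc/hosts would normally carry entries along these lines. All hostnames and the public/VIP addresses below are illustrative; only the 10.0.0.x private subnet is taken from the CLSINET line above, and node-rac2's private address is an assumption:

```
# Public
192.168.1.101   node-rac1
192.168.1.102   node-rac2
# Private interconnect (eth1, 10.0.0.0/25 per the CLSINET entry)
10.0.0.1        node-rac1-priv
10.0.0.2        node-rac2-priv
# Virtual IPs
192.168.1.111   node-rac1-vip
192.168.1.112   node-rac2-vip
```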



Try a ping first; if it pings normally, then try restarting the problem node.


6#
Posted on 2012-8-11 21:51:38
Try a ping first; if it pings normally, then try restarting the problem node.
Ping works fine.
The problem node has already been restarted; it made no difference.


7#
Posted on 2012-8-11 21:53:12
How to ask questions the smart way: just list what you did and what you saw; don't push your own opinions and conclusions.


8#
Posted on 2013-2-19 16:01:47
whiterain posted on 2012-8-11 21:51:
Try a ping first; if it pings normally, then try restarting the problem node.
Ping works fine.
The problem node has already been restarted; it made no difference. ...

Same version here, and the same result: ping works but restarting the node has no effect. How did you solve it?

