WMLM posted on 2014-5-22 12:08:08

Node 1 is fine, but CSS will not start on node 2

AIX6100-07-04-1216 + HACMP + ORACLE10.2.0.1 RAC

The cluster had been running normally. For the past few days node 1 has been fine, but CSS will not start on node 2.


Excerpt from crsd.log:
2014-05-22 09:35:36.310: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:36.310: [ CSSCLNT]clsssInitNative: connect failed, rc 9
2014-05-22 09:35:36.311: [  CRSRTI]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-05-22 09:35:37.644: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:37.644: [ CSSCLNT]clsssInitNative: connect failed, rc 9
2014-05-22 09:35:37.644: [  CRSRTI]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-05-22 09:35:38.978: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:38.978: [ CSSCLNT]clsssInitNative: connect failed, rc 9
2014-05-22 09:35:38.978: [  CRSRTI]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
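
For anyone reading the excerpt above, here is a minimal sketch that pulls out the failed connection attempts and the IPC key crsd is trying to reach. The helper names (failed_connects, retry_gaps) are hypothetical and the sample holds only lines copied from the excerpt. It is a reading aid, not a diagnosis. Note that the ocssd.log excerpt in this post shows ocssd starting to listen on the same key at 09:35:39.502, just after the last failed attempt shown here.

```python
import re
from datetime import datetime

# Lines copied from the crsd.log excerpt in the post.
CRSD_SAMPLE = """\
2014-05-22 09:35:36.310: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:37.644: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:38.978: [ COMMCRS]clsc_connect: (11098bd70) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
"""

# Timestamp at the start of the line, IPC key inside the ADDRESS string.
NO_LISTENER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*no listener at .*\(KEY=([^)]+)\)"
)

def failed_connects(text):
    """Return (timestamp, ipc_key) for every 'no listener' line."""
    events = []
    for line in text.splitlines():
        m = NO_LISTENER.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, m.group(2)))
    return events

def retry_gaps(events):
    """Seconds between consecutive failed attempts."""
    times = [ts for ts, _ in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

if __name__ == "__main__":
    events = failed_connects(CRSD_SAMPLE)
    print("keys tried:", sorted({key for _, key in events}))
    print("gaps between retries (s):", retry_gaps(events))
```

Running the same functions over the full crsd.log from the attachment would show whether crsd ever gets past the "no listener" stage or whether ocssd keeps restarting underneath it.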


Excerpt from ocssd.log:
2014-05-22 09:35:37.329 >TRACE:   clssnmInitNMInfo: misscount set to 600
2014-05-22 09:35:37.331 >TRACE:   clssnmDiskStateChange: state from 1 to 2 disk (0//dev/rrac_vote)
2014-05-22 09:35:39.333 >TRACE:   clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rrac_vote)
2014-05-22 09:35:39.336 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459468) LATS(0) Disk lastSeqNo(1459468)
2014-05-22 09:35:39.433 >TRACE:   clssnmFatalInit: fatal mode enabled
2014-05-22 09:35:39.439 >TRACE:   clssnmconnect: connecting to node 2, flags 0x0001, connector 1
2014-05-22 09:35:39.448 >TRACE:   clssnmconnect: connecting to node 1, flags 0x0001, connector 0
2014-05-22 09:35:39.502 >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_2))
2014-05-22 09:35:39.502 >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac2_crs))
2014-05-22 09:35:40.339 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459469) LATS(1470324929) Disk lastSeqNo(1459469)
2014-05-22 09:35:41.349 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459470) LATS(1470325939) Disk lastSeqNo(1459470)
2014-05-22 09:35:42.359 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459471) LATS(1470326949) Disk lastSeqNo(1459471)
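
One thing that can be checked directly from the excerpt above is whether the wrtcnt value reported for node 1 keeps increasing. The sketch below assumes wrtcnt is the disk heartbeat write counter that node 2 reads from the voting disk. That is a reading of the field name, not something confirmed in this thread. The helper names (wrtcnt_by_node, is_advancing) are hypothetical. If the counter advances on every read, node 2 is at least able to read the voting disk and sees node 1 writing to it, which would point away from a plain "cannot read the shared disk" failure.

```python
import re

# clssnmReadDskHeartbeat lines copied from the ocssd.log excerpt in the post.
OCSSD_SAMPLE = """\
2014-05-22 09:35:39.336 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459468) LATS(0) Disk lastSeqNo(1459468)
2014-05-22 09:35:40.339 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459469) LATS(1470324929) Disk lastSeqNo(1459469)
2014-05-22 09:35:41.349 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459470) LATS(1470325939) Disk lastSeqNo(1459470)
2014-05-22 09:35:42.359 >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(4) wrtcnt(1459471) LATS(1470326949) Disk lastSeqNo(1459471)
"""

# Capture the node number and the wrtcnt value from each heartbeat line.
HEARTBEAT = re.compile(r"clssnmReadDskHeartbeat: node\((\d+)\).*?wrtcnt\((\d+)\)")

def wrtcnt_by_node(text):
    """Map node number -> list of wrtcnt values in log order."""
    counts = {}
    for line in text.splitlines():
        m = HEARTBEAT.search(line)
        if m:
            counts.setdefault(int(m.group(1)), []).append(int(m.group(2)))
    return counts

def is_advancing(values):
    """True if there are at least two samples and each is larger than the last."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))

if __name__ == "__main__":
    for node, values in sorted(wrtcnt_by_node(OCSSD_SAMPLE).items()):
        state = "advancing" if is_advancing(values) else "not advancing"
        print(f"node {node}: {values[0]} .. {values[-1]} ({state})")
```

In the four lines quoted here the counter goes up by one per read. If that holds across the full log, the network heartbeat between the nodes (the clssnmconnect lines) would be the next place to look.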

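HACMP starts normally, and the concurrent volume group datavg is brought online normally.
On both nodes, lsvg datavg and lsvg -l datavg report a normal state.
The public IPs and the heartbeat (interconnect) addresses can all ping each other.
Pinging the gateway from the public IPs also works.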

The OCR disk and the voting disk are two LVs, and both show state "open" on both nodes.

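The complete logs from the failing node 2 are packed in the attachment.
I suspect a problem with the shared disk, but how can it be resolved?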