Posted on 2012-7-19 20:29:33
Question from a forum user:
Server model:
IBM P740
Database version:
CRS: 10.2.0.1
Operating system version:
$ oslevel -s
6100-05-06-1119
HACMP version:
# lslpp -l |grep cluster
cluster.adt.es.client.include
cluster.adt.es.client.samples.clinfo
cluster.adt.es.client.samples.clstat
cluster.adt.es.client.samples.libcl
cluster.adt.es.java.demo.monitor
cluster.doc.en_US.assist.db2.html
cluster.doc.en_US.assist.db2.pdf
cluster.doc.en_US.assist.oracle.html
cluster.doc.en_US.assist.oracle.pdf
cluster.doc.en_US.assist.websphere.html
cluster.doc.en_US.assist.websphere.pdf
cluster.doc.en_US.es.html 6.1.0.0 COMMITTED HAES Web-based HTML
cluster.doc.en_US.es.pdf 6.1.0.0 COMMITTED HAES PDF Documentation - U.S.
cluster.es.assist.common 6.1.0.0 COMMITTED HACMP Smart Assist Common
cluster.es.assist.db2 6.1.0.3 COMMITTED HACMP Smart Assist for DB2
cluster.es.assist.oracle 6.1.0.2 COMMITTED HACMP Smart Assist for Oracle
cluster.es.assist.sap 6.1.0.0 COMMITTED HACMP Smart Assist for SAP
cluster.es.assist.websphere
cluster.es.cfs.rte 6.1.0.1 COMMITTED ES Cluster File System Support
cluster.es.client.clcomd 6.1.0.4 COMMITTED ES Cluster Communication
cluster.es.client.lib 6.1.0.3 COMMITTED ES Client Libraries
cluster.es.client.rte 6.1.0.4 COMMITTED ES Client Runtime
cluster.es.client.utils 6.1.0.2 COMMITTED ES Client Utilities
cluster.es.client.wsm 6.1.0.3 COMMITTED Web based Smit
cluster.es.cspoc.cmds 6.1.0.5 COMMITTED ES CSPOC Commands
cluster.es.cspoc.dsh 6.1.0.0 COMMITTED ES CSPOC dsh
cluster.es.cspoc.rte 6.1.0.5 COMMITTED ES CSPOC Runtime Commands
cluster.es.nfs.rte 6.1.0.2 COMMITTED ES NFS Support
cluster.es.plugins.dhcp 6.1.0.0 COMMITTED ES Plugins - dhcp
cluster.es.plugins.dns 6.1.0.0 COMMITTED ES Plugins - Name Server
cluster.es.plugins.printserver
cluster.es.server.cfgast 6.1.0.0 COMMITTED ES Two-Node Configuration
cluster.es.server.diag 6.1.0.4 COMMITTED ES Server Diags
cluster.es.server.events 6.1.0.5 COMMITTED ES Server Events
cluster.es.server.rte 6.1.0.5 COMMITTED ES Base Server Runtime
cluster.es.server.testtool
cluster.es.server.utils 6.1.0.5 COMMITTED ES Server Utilities
cluster.es.worksheets 6.1.0.1 COMMITTED Online Planning Worksheets
cluster.license 6.1.0.0 COMMITTED HACMP Electronic License
cluster.msg.en_US.assist 6.1.0.0 COMMITTED HACMP Smart Assist Messages -
cluster.msg.en_US.cspoc 6.1.0.0 COMMITTED HACMP CSPOC Messages - U.S.
cluster.msg.en_US.es.client
cluster.msg.en_US.es.server
cluster.es.assist.db2 6.1.0.0 COMMITTED HACMP Smart Assist for DB2
cluster.es.assist.oracle 6.1.0.0 COMMITTED HACMP Smart Assist for Oracle
cluster.es.assist.sap 6.1.0.0 COMMITTED HACMP Smart Assist for SAP
cluster.es.assist.websphere
cluster.es.client.clcomd 6.1.0.4 COMMITTED ES Cluster Communication
cluster.es.client.lib 6.1.0.3 COMMITTED ES Client Libraries
cluster.es.client.rte 6.1.0.4 COMMITTED ES Client Runtime
cluster.es.client.wsm 6.1.0.0 COMMITTED Web based Smit
cluster.es.cspoc.rte 6.1.0.0 COMMITTED ES CSPOC Runtime Commands
cluster.es.nfs.rte 6.1.0.2 COMMITTED ES NFS Support
cluster.es.server.diag 6.1.0.0 COMMITTED ES Server Diags
cluster.es.server.events 6.1.0.0 COMMITTED ES Server Events
cluster.es.server.rte 6.1.0.5 COMMITTED ES Base Server Runtime
cluster.es.server.utils 6.1.0.5 COMMITTED ES Server Utilities
cluster.man.en_US.assist.data
cluster.man.en_US.es.data 6.1.0.2 COMMITTED ES Man Pages - U.S. English
----------------------------------------------------------------------------------
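Symptom description:
While installing a 10.2.0.1 CRS environment, root.sh failed with the same error on both nodes: "Failure at final check of Oracle CRS stack"
Checking the CRS processes shows that the cssd process has not started: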
ps -ef|grep init
root 1 0 0 15:27:02 - 0:00 /etc/init
root 10485970 1 0 13:16:56 - 0:00 /bin/sh /etc/init.cssd oclsmon
root 11534362 1 0 13:22:51 - 0:00 /bin/sh /etc/init.evmd run
root 13828156 1 0 13:17:04 - 0:00 /bin/sh /etc/init.cssd fatal
root 14548996 1 0 13:18:09 - 0:00 /bin/sh /etc/init.crsd run
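The listing above shows only the `/etc/init.*` wrapper scripts. In a 10.2 CRS install those wrappers normally spawn `ocssd.bin`, `crsd.bin` and `evmd.bin`, so one quick way to confirm which daemons are missing is to look for those binaries. A minimal sketch, where the function name `check_crs_daemons` is made up for illustration and the process listing is read from stdin:

```shell
#!/bin/sh
# Sketch: report which CRS daemon binaries appear in `ps -ef` output.
# Reads the process listing from stdin so it also works on saved output.
# check_crs_daemons is a hypothetical helper name, not an Oracle tool.
check_crs_daemons() {
    ps_out=$(cat)
    for daemon in ocssd.bin crsd.bin evmd.bin; do
        # -F matches the name literally (the dot is not a wildcard)
        if printf '%s\n' "$ps_out" | grep -qF "$daemon"; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT running"
        fi
    done
}

# Typical use on a live node:
#   ps -ef | grep -v grep | check_crs_daemons
```

Fed the `ps` output from this post, it should flag `ocssd.bin` as not running, which matches the symptom described.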
Checking ocssd.log shows the following error:
ERROR: clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)
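To pull lines like this out of a long ocssd.log, the ERROR entries can be filtered and counted. This is only a sketch: `cssd_error_summary` is a hypothetical helper name, and it assumes the `[ CSSD]<date> <time> [<thread>] >ERROR: <message>` layout seen in this thread.

```shell
#!/bin/sh
# Sketch: print each distinct ERROR message in a CSSD log with its count.
# Assumes the line layout seen in this thread:
#   [ CSSD]<date> <time> [<thread>] >ERROR: <message>
# $1 = path to the log file (the caller supplies it).
cssd_error_summary() {
    grep '>ERROR:' "$1" |
        sed 's/^.*>ERROR: *//' |   # drop timestamp and thread id
        sort | uniq -c | sort -rn  # most frequent message first
}

# Typical use:
#   cssd_error_summary /path/to/ocssd.log
```

Stripping the timestamp before counting means repeated start attempts that fail the same way collapse into one line, which makes it easier to see that only a few distinct errors are involved.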
A Metalink search turned up a note with the same error message as this one, but its operating system is AIX 5.3:
bug 7373395
Checking the system environment:
$ id
uid=1003(ora10g) gid=101(dba) groups=1(staff),220(hagsuser),301(oinstall)
$ id root
uid=0(root) gid=0(system) groups=2(bin),3(sys),7(security),8(cron),10(audit),11(lp),220(hagsuser)
$ lsnodes -n
p740brac1 2
p740arac1 1
Cluster: ibeavrac_cluster (1083654549)
Tue Jul 17 14:12:17 GMT+08:00 2012
State: UP Nodes: 2
SubState: STABLE
Node: p740arac1 State: UP
Interface: p740arac1 (0) Address: 10.6.184.148
State: UP
Interface: p740arac1_pri (1) Address: 1.184.100.148
State: UP
Resource Group: ibeavrac_rg State: On line
Node: p740brac1 State: UP
Interface: p740brac1 (0) Address: 10.6.184.151
State: UP
Interface: p740brac1_pri (1) Address: 1.184.100.151
State: UP
Resource Group: ibeavrac_rg State: On line
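The `id` output in the environment check above shows `hagsuser` in the group list of both `ora10g` and `root`, presumably because the poster was verifying membership in the HACMP group-services group. A small sketch for repeating that check on each node follows; `id_has_group` is a hypothetical helper that only parses a line of `id` output and does not query the system itself.

```shell
#!/bin/sh
# Sketch: test whether a line of `id` output lists a given group name.
# $1 = full line printed by `id` or `id <user>`
# $2 = group name to look for
# id_has_group is a hypothetical helper name used for illustration.
id_has_group() {
    # Drop the leading "uid=...(user) " field so a user name is never
    # mistaken for a group; gid= and groups= remain.
    rest=${1#uid=* }
    case "$rest" in
        *"($2)"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Typical use:
#   id_has_group "$(id ora10g)" hagsuser && echo "ora10g is in hagsuser"
```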
Errors in the cssd log:
[ CSSD]2012-07-17 10:34:42.607 [1029] >ERROR: clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)
[ CSSD]2012-07-17 10:34:42.609 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
[ CSSD]2012-07-17 10:34:42.612 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/rrac_voting1)
[ CSSD]2012-07-17 10:34:42.615 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/rrac_voting2)
[ CSSD]2012-07-17 10:34:42.617 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/rrac_voting3)
[ CSSD]2012-07-17 10:34:44.614 [1031] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rrac_voting1)
[ CSSD]2012-07-17 10:34:44.617 [1544] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/rrac_voting2)
[ CSSD]2012-07-17 10:34:44.618 [1801] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/rrac_voting3)
[ CSSD]2012-07-17 10:34:44.626 [1] >TRACE: clssscSclsFatal: read value of disable
[ CSSD]2012-07-17 10:34:44.626 [2315] >TRACE: clssnmFatalThread: spawned
[ CSSD]2012-07-17 10:34:44.626 [1] >TRACE: clssscSclsFatal: read value of disable
[ CSSD]2012-07-17 10:34:44.626 [2572] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2012-07-17 10:34:44.673 [2829] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_ibeav_1))
[ CSSD]2012-07-17 10:34:44.673 [2829] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_p740arac1_ibeav))
[ CSSD]2012-07-17 10:34:44.682 [3857] >TRACE: clssnmPollingThread: Connection complete
[ CSSD]2012-07-17 10:34:44.682 [4114] >TRACE: clssnmSendingThread: Connection complete
[ CSSD]2012-07-17 10:34:44.682 [4371] >TRACE: clssnmRcfgMgrThread: Connection complete
[ CSSD]2012-07-17 10:34:44.682 [4371] >TRACE: clssnmRcfgMgrThread: Local Join
[ CSSD]2012-07-17 10:34:44.682 [4371] >TRACE: clssnmDoSyncUpdate: Initiating sync 1
[ CSSD]2012-07-17 10:34:44.682 [4371] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2012-07-17 10:34:44.682 [4371] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2012-07-17 10:34:44.683 [4371] >TRACE: clssnmSendSync: syncSeqNo(1)
[ CSSD]2012-07-17 10:34:44.683 [4371] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2012-07-17 10:34:44.683 [2572] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[p740arac1] seq[1] sync[1]
[ CSSD]2012-07-17 10:34:44.782 [1] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2012-07-17 10:34:45.683 [4371] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2012-07-17 10:34:45.683 [4371] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2012-07-17 10:34:45.683 [4371] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2012-07-17 10:34:45.683 [4371] >TRACE: clssnmSendVote: syncSeqNo(1)
[ CSSD]2012-07-17 10:34:45.683 [4371] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[ CSSD]2012-07-17 10:34:45.683 [2572] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(1)
[ CSSD]2012-07-17 10:34:46.683 [4371] >TRACE: clssnmWaitForAcks: done, msg type(13)
[ CSSD]2012-07-17 10:34:46.683 [4371] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2012-07-17 10:34:47.683 [4371] >ERROR: clssnmCheckDskInfo: We appear to be dead skgxn 0
[ CSSD]2012-07-17 10:34:47.683 [4371] >ERROR: clssnmDoSyncUpdate: checkDskInfo signaled shutdown
[ CSSD]2012-07-17 10:34:42.607 [1029] >ERROR: clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)
[ CSSD]2012-07-17 10:34:47.683 [4371] >ERROR: clssnmCheckDskInfo: We appear to be dead skgxn 0
[ CSSD]2012-07-17 10:34:47.683 [4371] >ERROR: clssnmDoSyncUpdate: checkDskInfo signaled shutdown
I do not really understand these errors.
This does not look quite the same as the earlier thread: http://t.askmaclean.com/thread-233-1-1.html