Oracle Database Data Recovery and Performance Tuning

1#
Posted 2012-7-19 20:29:33 | Views: 4425 | Replies: 1
Question from a forum user:

Server model:
IBM P740

Database version:
CRS:10.2.0.1

Operating system version:
$ oslevel -s
6100-05-06-1119

HACMP version:
# lslpp -l |grep cluster
  cluster.adt.es.client.include
  cluster.adt.es.client.samples.clinfo
  cluster.adt.es.client.samples.clstat
  cluster.adt.es.client.samples.libcl
  cluster.adt.es.java.demo.monitor
  cluster.doc.en_US.assist.db2.html
  cluster.doc.en_US.assist.db2.pdf
  cluster.doc.en_US.assist.oracle.html
  cluster.doc.en_US.assist.oracle.pdf
  cluster.doc.en_US.assist.websphere.html
  cluster.doc.en_US.assist.websphere.pdf
  cluster.doc.en_US.es.html  6.1.0.0  COMMITTED  HAES Web-based HTML
  cluster.doc.en_US.es.pdf   6.1.0.0  COMMITTED  HAES PDF Documentation - U.S.
  cluster.es.assist.common   6.1.0.0  COMMITTED  HACMP Smart Assist Common
  cluster.es.assist.db2      6.1.0.3  COMMITTED  HACMP Smart Assist for DB2
  cluster.es.assist.oracle   6.1.0.2  COMMITTED  HACMP Smart Assist for Oracle
  cluster.es.assist.sap      6.1.0.0  COMMITTED  HACMP Smart Assist for SAP
  cluster.es.assist.websphere
  cluster.es.cfs.rte         6.1.0.1  COMMITTED  ES Cluster File System Support
  cluster.es.client.clcomd   6.1.0.4  COMMITTED  ES Cluster Communication
  cluster.es.client.lib      6.1.0.3  COMMITTED  ES Client Libraries
  cluster.es.client.rte      6.1.0.4  COMMITTED  ES Client Runtime
  cluster.es.client.utils    6.1.0.2  COMMITTED  ES Client Utilities
  cluster.es.client.wsm      6.1.0.3  COMMITTED  Web based Smit
  cluster.es.cspoc.cmds      6.1.0.5  COMMITTED  ES CSPOC Commands
  cluster.es.cspoc.dsh       6.1.0.0  COMMITTED  ES CSPOC dsh
  cluster.es.cspoc.rte       6.1.0.5  COMMITTED  ES CSPOC Runtime Commands
  cluster.es.nfs.rte         6.1.0.2  COMMITTED  ES NFS Support
  cluster.es.plugins.dhcp    6.1.0.0  COMMITTED  ES Plugins - dhcp
  cluster.es.plugins.dns     6.1.0.0  COMMITTED  ES Plugins - Name Server
  cluster.es.plugins.printserver
  cluster.es.server.cfgast   6.1.0.0  COMMITTED  ES Two-Node Configuration
  cluster.es.server.diag     6.1.0.4  COMMITTED  ES Server Diags
  cluster.es.server.events   6.1.0.5  COMMITTED  ES Server Events
  cluster.es.server.rte      6.1.0.5  COMMITTED  ES Base Server Runtime
  cluster.es.server.testtool
  cluster.es.server.utils    6.1.0.5  COMMITTED  ES Server Utilities
  cluster.es.worksheets      6.1.0.1  COMMITTED  Online Planning Worksheets
  cluster.license            6.1.0.0  COMMITTED  HACMP Electronic License
  cluster.msg.en_US.assist   6.1.0.0  COMMITTED  HACMP Smart Assist Messages -
  cluster.msg.en_US.cspoc    6.1.0.0  COMMITTED  HACMP CSPOC Messages - U.S.
  cluster.msg.en_US.es.client
  cluster.msg.en_US.es.server
  cluster.es.assist.db2      6.1.0.0  COMMITTED  HACMP Smart Assist for DB2
  cluster.es.assist.oracle   6.1.0.0  COMMITTED  HACMP Smart Assist for Oracle
  cluster.es.assist.sap      6.1.0.0  COMMITTED  HACMP Smart Assist for SAP
  cluster.es.assist.websphere
  cluster.es.client.clcomd   6.1.0.4  COMMITTED  ES Cluster Communication
  cluster.es.client.lib      6.1.0.3  COMMITTED  ES Client Libraries
  cluster.es.client.rte      6.1.0.4  COMMITTED  ES Client Runtime
  cluster.es.client.wsm      6.1.0.0  COMMITTED  Web based Smit
  cluster.es.cspoc.rte       6.1.0.0  COMMITTED  ES CSPOC Runtime Commands
  cluster.es.nfs.rte         6.1.0.2  COMMITTED  ES NFS Support
  cluster.es.server.diag     6.1.0.0  COMMITTED  ES Server Diags
  cluster.es.server.events   6.1.0.0  COMMITTED  ES Server Events
  cluster.es.server.rte      6.1.0.5  COMMITTED  ES Base Server Runtime
  cluster.es.server.utils    6.1.0.5  COMMITTED  ES Server Utilities
  cluster.man.en_US.assist.data
  cluster.man.en_US.es.data  6.1.0.2  COMMITTED  ES Man Pages - U.S. English
  
----------------------------------------------------------------------------------
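Symptom:
While installing a 10.2.0.1 CRS environment, root.sh failed on both nodes with the error "Failure at final check of Oracle CRS stack".
Checking the CRS processes showed that the cssd daemon had not started: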
ps -ef|grep init
root 1 0 0 15:27:02 - 0:00 /etc/init
root 10485970 1 0 13:16:56 - 0:00 /bin/sh /etc/init.cssd oclsmon
root 11534362 1 0 13:22:51 - 0:00 /bin/sh /etc/init.evmd run
root 13828156 1 0 13:17:04 - 0:00 /bin/sh /etc/init.cssd fatal
root 14548996 1 0 13:18:09 - 0:00 /bin/sh /etc/init.crsd run

The ocssd.log contains the following error:
ERROR: clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)

A Metalink search turned up a document whose error message matches this one, although it applies to AIX 5.3:
bug 7373395


System environment check:
$ id
uid=1003(ora10g) gid=101(dba) groups=1(staff),220(hagsuser),301(oinstall)
$ id root
uid=0(root) gid=0(system) groups=2(bin),3(sys),7(security),8(cron),10(audit),11(lp),220(hagsuser)
$ lsnodes -n
p740brac1 2
p740arac1 1

Cluster: ibeavrac_cluster (1083654549)
Tue Jul 17 14:12:17 GMT+08:00 2012
State: UP Nodes: 2
SubState: STABLE

Node: p740arac1 State: UP
Interface: p740arac1 (0) Address: 10.6.184.148
State: UP
Interface: p740arac1_pri (1) Address: 1.184.100.148
State: UP
Resource Group: ibeavrac_rg State: On line

Node: p740brac1 State: UP
Interface: p740brac1 (0) Address: 10.6.184.151
State: UP
Interface: p740brac1_pri (1) Address: 1.184.100.151
State: UP
Resource Group: ibeavrac_rg State: On line


Errors reported in the cssd log:
[    CSSD]2012-07-17 10:34:42.607 [1029] >ERROR:   clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)
[    CSSD]2012-07-17 10:34:42.609 [1] >TRACE:   clssnmInitNMInfo: misscount set to 600
[    CSSD]2012-07-17 10:34:42.612 [1] >TRACE:   clssnmDiskStateChange: state from 1 to 2 disk (0//dev/rrac_voting1)
[    CSSD]2012-07-17 10:34:42.615 [1] >TRACE:   clssnmDiskStateChange: state from 1 to 2 disk (1//dev/rrac_voting2)
[    CSSD]2012-07-17 10:34:42.617 [1] >TRACE:   clssnmDiskStateChange: state from 1 to 2 disk (2//dev/rrac_voting3)
[    CSSD]2012-07-17 10:34:44.614 [1031] >TRACE:   clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rrac_voting1)
[    CSSD]2012-07-17 10:34:44.617 [1544] >TRACE:   clssnmDiskStateChange: state from 2 to 4 disk (1//dev/rrac_voting2)
[    CSSD]2012-07-17 10:34:44.618 [1801] >TRACE:   clssnmDiskStateChange: state from 2 to 4 disk (2//dev/rrac_voting3)
[    CSSD]2012-07-17 10:34:44.626 [1] >TRACE:   clssscSclsFatal: read value of disable
[    CSSD]2012-07-17 10:34:44.626 [2315] >TRACE:   clssnmFatalThread: spawned
[    CSSD]2012-07-17 10:34:44.626 [1] >TRACE:   clssscSclsFatal: read value of disable
[    CSSD]2012-07-17 10:34:44.626 [2572] >TRACE:   clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[    CSSD]2012-07-17 10:34:44.673 [2829] >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_ibeav_1))
[    CSSD]2012-07-17 10:34:44.673 [2829] >TRACE:   clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_p740arac1_ibeav))
[    CSSD]2012-07-17 10:34:44.682 [3857] >TRACE:   clssnmPollingThread: Connection complete
[    CSSD]2012-07-17 10:34:44.682 [4114] >TRACE:   clssnmSendingThread: Connection complete
[    CSSD]2012-07-17 10:34:44.682 [4371] >TRACE:   clssnmRcfgMgrThread: Connection complete
[    CSSD]2012-07-17 10:34:44.682 [4371] >TRACE:   clssnmRcfgMgrThread: Local Join
[    CSSD]2012-07-17 10:34:44.682 [4371] >TRACE:   clssnmDoSyncUpdate: Initiating sync 1
[    CSSD]2012-07-17 10:34:44.682 [4371] >TRACE:   clssnmSetupAckWait: Ack message type (11)
[    CSSD]2012-07-17 10:34:44.682 [4371] >TRACE:   clssnmSetupAckWait: node(1) is ALIVE
[    CSSD]2012-07-17 10:34:44.683 [4371] >TRACE:   clssnmSendSync: syncSeqNo(1)
[    CSSD]2012-07-17 10:34:44.683 [4371] >TRACE:   clssnmWaitForAcks: Ack message type(11), ackCount(1)
[    CSSD]2012-07-17 10:34:44.683 [2572] >TRACE:   clssnmHandleSync: Acknowledging sync: src[1] srcName[p740arac1] seq[1] sync[1]
[    CSSD]2012-07-17 10:34:44.782 [1] >USER:    NMEVENT_SUSPEND [00][00][00][00]
[    CSSD]2012-07-17 10:34:45.683 [4371] >TRACE:   clssnmWaitForAcks: done, msg type(11)
[    CSSD]2012-07-17 10:34:45.683 [4371] >TRACE:   clssnmSetupAckWait: Ack message type (13)
[    CSSD]2012-07-17 10:34:45.683 [4371] >TRACE:   clssnmSetupAckWait: node(1) is ACTIVE
[    CSSD]2012-07-17 10:34:45.683 [4371] >TRACE:   clssnmSendVote: syncSeqNo(1)
[    CSSD]2012-07-17 10:34:45.683 [4371] >TRACE:   clssnmWaitForAcks: Ack message type(13), ackCount(1)
[    CSSD]2012-07-17 10:34:45.683 [2572] >TRACE:   clssnmSendVoteInfo: node(1) syncSeqNo(1)
[    CSSD]2012-07-17 10:34:46.683 [4371] >TRACE:   clssnmWaitForAcks: done, msg type(13)
[    CSSD]2012-07-17 10:34:46.683 [4371] >TRACE:   clssnmCheckDskInfo: Checking disk info...
[    CSSD]2012-07-17 10:34:47.683 [4371] >ERROR:   clssnmCheckDskInfo: We appear to be dead skgxn 0
[    CSSD]2012-07-17 10:34:47.683 [4371] >ERROR:   clssnmDoSyncUpdate:  checkDskInfo signaled shutdown


[    CSSD]2012-07-17 10:34:42.607 [1029] >ERROR:   clssnm_skgxnmon: Failure 0 registering.(3/sskgxn_gs_join (pb)/skgxnreg)
[    CSSD]2012-07-17 10:34:47.683 [4371] >ERROR:   clssnmCheckDskInfo: We appear to be dead skgxn 0
[    CSSD]2012-07-17 10:34:47.683 [4371] >ERROR:   clssnmDoSyncUpdate:  checkDskInfo signaled shutdown
I don't really understand these ERROR entries.
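
To pull entries like these out of a long ocssd.log, a filter along the following lines may help. This is only a minimal sketch: the helper name css_errors is hypothetical, it relies on nothing more than the ">ERROR:" tag visible in the log above, and the log path in the usage comment is a placeholder rather than a real location.

```shell
# css_errors: read an ocssd.log stream on stdin and print only the
# ERROR-level entries, dropping the TRACE and USER lines.
# (Hypothetical helper name; it matches the ">ERROR:" tag seen in the log.)
css_errors() {
    grep '>ERROR:'
}

# Usage (placeholder path, adjust to your own CRS home):
#   css_errors < /path/to/ocssd.log
```

Run against the log above, this should leave just the clssnm_skgxnmon, clssnmCheckDskInfo and clssnmDoSyncUpdate lines.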


This case is not quite the same as the earlier thread: http://t.askmaclean.com/thread-233-1-1.html
2#
Posted 2012-7-19 20:31:08
as maclean said:

ODM FINDING:

CSSD does not Start on AIX (Oracle Clusterware with HACMP)

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.1 - Release: 10.2 to 10.2
IBM AIX on POWER Systems (64-bit)
AIX5L Based Systems (64-bit)
CRS
RAC
HACMP
Symptoms

Oracle Clusterware installed on top of HACMP cluster, but cssd startup fails with following error message in the ocssd.log:

    [ CSSD]2008-05-27 15:09:43.456 [1029] >TRACE: clssnm_skgxninit: initialized skgxn version (2/0/IBM AIX skgxn)
    [ CSSD]2008-05-27 15:09:43.475 [1029] >ERROR: clssnm_skgxnmon: Failure 0 registering.(1/1 [HA_GS_NOT_OK]/sskgxn_gs_in)
    ..............

    [ CSSD]2008-05-27 15:09:47.554 [3600] >TRACE: clssnmCheckDskInfo: Checking disk info...
    [ CSSD]2008-05-27 15:09:48.554 [3600] >ERROR: clssnmCheckDskInfo: We appear to be dead skgxn 0
    [ CSSD]2008-05-27 15:09:48.554 [3600] >ERROR: clssnmDoSyncUpdate: checkDskInfo signaled shutdown
    [ CSSD]2008-05-27 15:09:48.554 [3600] >TRACE: clssscctx: dump of 0x11000ddf0, len 3752
    [ CSSD]2008-05-27 15:09:48.554 [3600] >TRACE: 0x11000ddf0 00 00 00 01 10 9a 33 90 - 00 00 00 01 10 95 a2 b0

    ............
    [ CSSD]2008-05-28 15:09:58.508 [3600] >TRACE: clssscctx->nmctx->nmnode[002]->nodeData: dump of 0x0, len 0
    [ CSSD]2008-05-28 15:09:58.508 [3600] >TRACE: clssscctx->nmctx->nmnode[002]->con: dump of 0x0, len 976
    [ CSSD]--- DUMP GROCK STATE DB ---
    [ CSSD]--- END OF GROCK STATE DUMP ---

Cause
The CSSD daemon is not able to communicate with HACMP.
Solution

Oracle cssd must be able to communicate with the HACMP cluster. Therefore the oracle user must be a member of the hagsuser group.

Make sure that both the oracle user and the root user are members of the hagsuser group.
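
The membership requirement above can be verified from the shell. The following is a minimal sketch, assuming the standard `id -Gn` option pair is available. The helper names has_group and check_hagsuser are hypothetical, and ora10g is simply the Oracle software owner shown in the question.

```shell
# has_group GROUP LIST: succeed if GROUP appears as a whole word in the
# space-separated LIST (the format printed by `id -Gn`).
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_hagsuser USER: report whether USER belongs to the hagsuser group.
# (Hypothetical helper; an unknown user yields an empty list and is
# reported as not being a member.)
check_hagsuser() {
    if has_group hagsuser "$(id -Gn "$1" 2>/dev/null)"; then
        echo "$1: member of hagsuser"
    else
        echo "$1: NOT a member of hagsuser"
    fi
}

# Usage, with the accounts from the question:
#   check_hagsuser ora10g
#   check_hagsuser root
```

Matching on whole words avoids a false positive from a group whose name merely contains "hagsuser" as a substring.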

Reference:

Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Installation Guide
10g Release 2 (10.2) for AIX

2.3.4 Creating a HAGSUSER Group (Optional)

http://download.oracle.com/docs/ ... reaix.htm#sthref228


Node-monitoring consistency with the vendor clusterware is ensured via skgxn.

"dead skgxn" indicates a communication problem between CSS and HACMP.

I suggest you refer to the note above.
