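11g RAC instance restart

baolei posted on 2014-1-20 15:54:25

Hi ML,

This database on our production system has had its instance restart twice in a row.

The relevant error logs:

From 09:02:31 ncxdb11 wrote nothing to its alert log until the database began starting up at 09:08. Before that it kept reporting "has a disk HB, but no network HB". The oswbb data shows some latency on the heartbeat network, but it had been just as high before the incident.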

ncxdb11 oswprvtnet:

zzz ***Mon Jan 20 09:01:29 GMT+08:00 2014
trying to get source for ncxdb11-pri
source should be 172.32.204.29
traceroute to ncxdb11-pri (172.32.204.29) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb11-pri (172.32.204.29)  58 ms  0 ms  0 ms
trying to get source for ncxdb12-pri
source should be 172.32.204.29
traceroute to ncxdb12-pri (172.32.204.30) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb12-pri (172.32.204.30)  58 ms  0 ms  1 ms
zzz ***Mon Jan 20 09:02:01 GMT+08:00 2014
trying to get source for ncxdb11-pri
source should be 172.32.204.29
traceroute to ncxdb11-pri (172.32.204.29) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb11-pri (172.32.204.29)  45 ms  0 ms  0 ms
trying to get source for ncxdb12-pri
source should be 172.32.204.29
traceroute to ncxdb12-pri (172.32.204.30) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb12-pri (172.32.204.30)  46 ms  0 ms  0 ms
zzz ***Mon Jan 20 09:10:29 GMT+08:00 2014
trying to get source for ncxdb11-pri
source should be 172.32.204.29
traceroute to ncxdb11-pri (172.32.204.29) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb11-pri (172.32.204.29)  33 ms  0 ms  0 ms
trying to get source for ncxdb12-pri
source should be 172.32.204.29
traceroute to ncxdb12-pri (172.32.204.30) from 172.32.204.29 (172.32.204.29), 30 hops max
outgoing MTU = 1500
1  ncxdb12-pri (172.32.204.30)  32 ms  0 ms  0 ms
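
The spacing of the oswprvtnet snapshots above is worth a quick check. The following is a minimal Python sketch, assuming the `zzz ***` header format shown in the excerpt; `parse_header` and `sample_gaps` are hypothetical helper names, not part of OSWatcher.

```python
import re
from datetime import datetime

# Snapshot headers copied from the oswprvtnet excerpt above.
HEADERS = [
    "zzz ***Mon Jan 20 09:01:29 GMT+08:00 2014",
    "zzz ***Mon Jan 20 09:02:01 GMT+08:00 2014",
    "zzz ***Mon Jan 20 09:10:29 GMT+08:00 2014",
]

# Assumed layout: "zzz ***<weekday> <month> <day> <HH:MM:SS> <tz> <year>".
HEADER_RE = re.compile(
    r"^zzz \*\*\*\w{3} (\w{3}) +(\d{1,2}) (\d\d:\d\d:\d\d) \S+ (\d{4})$"
)


def parse_header(line):
    """Return the snapshot time as a naive datetime (time zone token ignored)."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"not a snapshot header: {line!r}")
    month, day, clock, year = match.groups()
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def sample_gaps(lines):
    """Return the gap in whole seconds between consecutive snapshots."""
    stamps = [parse_header(line) for line in lines]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(sample_gaps(HEADERS))  # [32, 508]
```

The second gap (508 seconds, from 09:02:01 to 09:10:29) spans the period in which ncxdb11 went down and came back, so this excerpt contains no private-network samples for the incident itself.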

baolei posted on 2014-1-20 15:54:52

ncxdb11 ocssd.log:

2014-01-20 09:02:20.226: [ CSSD]clssnmPollingThread: node ncxdb12 (2) at 50% heartbeat fatal, removal in 14.903 seconds
2014-01-20 09:02:20.226: [ CSSD]clssnmPollingThread: node ncxdb12 (2) is impending reconfig, flag 2229260, misstime 15097
2014-01-20 09:02:20.227: [ CSSD]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
2014-01-20 09:02:20.227: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434934, LATS 670757723, lastSeqNo 93427805, uniqueness 1382516743, timestamp 1390179739/670661093
2014-01-20 09:02:20.227: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434936, LATS 670757723, lastSeqNo 93427803, uniqueness 1382516743, timestamp 1390179740/670661392
2014-01-20 09:02:20.366: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670757862/1390179740
2014-01-20 09:02:20.736: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758232/1390179740
2014-01-20 09:02:20.776: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758272/1390179740
2014-01-20 09:02:20.876: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758372/1390179740
2014-01-20 09:02:21.227: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434939, LATS 670758723, lastSeqNo 93434936, uniqueness 1382516743, timestamp 1390179741/670662392
2014-01-20 09:02:21.236: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758732/1390179741
2014-01-20 09:02:21.277: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758772/1390179741
2014-01-20 09:02:21.376: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670758872/1390179741
2014-01-20 09:02:21.729: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434941, LATS 670759224, lastSeqNo 93405380, uniqueness 1382516743, timestamp 1390179741/670662648
2014-01-20 09:02:21.737: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434942, LATS 670759233, lastSeqNo 93434939, uniqueness 1382516743, timestamp 1390179741/670662893
2014-01-20 09:02:21.738: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759233/1390179741
2014-01-20 09:02:21.778: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759273/1390179741
2014-01-20 09:02:21.879: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759375/1390179741
2014-01-20 09:02:22.231: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434944, LATS 670759727, lastSeqNo 93434941, uniqueness 1382516743, timestamp 1390179741/670663148
2014-01-20 09:02:22.231: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434945, LATS 670759727, lastSeqNo 93434942, uniqueness 1382516743, timestamp 1390179742/670663399
2014-01-20 09:02:22.240: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759736/1390179742
2014-01-20 09:02:22.280: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759775/1390179742
2014-01-20 09:02:22.380: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670759876/1390179742
2014-01-20 09:02:22.480: [ CSSD]clssnmSendingThread: sending status msg to all nodes
2014-01-20 09:02:22.480: [ CSSD]clssnmSendingThread: sent 4 status msgs to all nodes
2014-01-20 09:02:22.734: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434947, LATS 670760230, lastSeqNo 93434944, uniqueness 1382516743, timestamp 1390179742/670663651
2014-01-20 09:02:22.734: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434948, LATS 670760230, lastSeqNo 93434945, uniqueness 1382516743, timestamp 1390179742/670663900
2014-01-20 09:02:22.744: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760240/1390179742
2014-01-20 09:02:22.781: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760277/1390179742
2014-01-20 09:02:22.881: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760377/1390179742
2014-01-20 09:02:23.232: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434951, LATS 670760727, lastSeqNo 93434948, uniqueness 1382516743, timestamp 1390179743/670664402
2014-01-20 09:02:23.245: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760741/1390179743
2014-01-20 09:02:23.282: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760777/1390179743
2014-01-20 09:02:23.386: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670760881/1390179743
2014-01-20 09:02:23.733: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434953, LATS 670761229, lastSeqNo 93434947, uniqueness 1382516743, timestamp 1390179743/670664657
2014-01-20 09:02:23.735: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434954, LATS 670761231, lastSeqNo 93434951, uniqueness 1382516743, timestamp 1390179743/670664907
2014-01-20 09:02:23.746: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670761242/1390179743
2014-01-20 09:02:23.784: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670761279/1390179743
2014-01-20 09:02:23.887: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670761382/1390179743
2014-01-20 09:02:24.235: [ CSSD]clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB, DHB has rcfg 254422787, wrtcnt, 93434956, LATS 670761730, lastSeqNo 93434953, uniqueness 1382516743, timestamp 1390179743/670665157

In ncxdb11's ocssd.log, the "no network HB" errors begin at 09:02:20.
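
The first two clssnmPollingThread lines in the ocssd.log excerpt carry enough numbers for a small worked calculation. This is a rough Python sketch, assuming that "removal in N seconds" is the time remaining and that "misstime" is the time already missed, in milliseconds; that reading is an assumption, and `removal_deadline` and `implied_timeout_seconds` are hypothetical helper names.

```python
import re
from datetime import datetime, timedelta

# Two lines copied from the ncxdb11 ocssd.log excerpt.
POLL_LINE = (
    "2014-01-20 09:02:20.226: [ CSSD]clssnmPollingThread: node ncxdb12 (2) "
    "at 50% heartbeat fatal, removal in 14.903 seconds"
)
IMPENDING_LINE = (
    "2014-01-20 09:02:20.226: [ CSSD]clssnmPollingThread: node ncxdb12 (2) "
    "is impending reconfig, flag 2229260, misstime 15097"
)

POLL_RE = re.compile(
    r"^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d{3}): .*removal in ([\d.]+) seconds$"
)
MISS_RE = re.compile(r"misstime (\d+)$")


def removal_deadline(poll_line):
    """Log timestamp plus the remaining time announced in the message."""
    match = POLL_RE.match(poll_line)
    logged_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return logged_at + timedelta(seconds=float(match.group(2)))


def implied_timeout_seconds(poll_line, impending_line):
    """Time already missed plus time remaining (assumed interpretation)."""
    remaining_s = float(POLL_RE.match(poll_line).group(2))
    missed_ms = int(MISS_RE.search(impending_line).group(1))
    return round(remaining_s + missed_ms / 1000, 3)


if __name__ == "__main__":
    print(removal_deadline(POLL_LINE))                         # 2014-01-20 09:02:35.129000
    print(implied_timeout_seconds(POLL_LINE, IMPENDING_LINE))  # 30.0
```

Under that reading the two values add up to 30 seconds, which would be consistent with a 30-second CSS misscount (the thread never states the configured value), and the announced removal time falls at about 09:02:35.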

baolei posted on 2014-1-20 15:57:30

2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: My cohort: 2
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: Surviving cohort: 1
2014-01-20 09:02:35.463: [ CSSD](:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes with leader 2, ncxdb12, is smaller than cohort of 1 nodes led by node 1, ncxdb11, based on map type 2
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(IGCXDB1SYS$USERS) count(2) master(1) event(2), incarn 8, mbrc 2, to member 2, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]###################################
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(crs_version) count(3) master(1) event(2), incarn 15, mbrc 3, to member 0, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread

baolei posted on 2014-1-20 15:59:11

ncxdb12 :

2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: My cohort: 2
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: Surviving cohort: 1
2014-01-20 09:02:35.463: [ CSSD](:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes with leader 2, ncxdb12, is smaller than cohort of 1 nodes led by node 1, ncxdb11, based on map type 2
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(IGCXDB1SYS$USERS) count(2) master(1) event(2), incarn 8, mbrc 2, to member 2, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]###################################
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(crs_version) count(3) master(1) event(2), incarn 15, mbrc 3, to member 0, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread

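Why would ncxdb12's node number be lower than ncxdb11's node number? And why was ncxdb11 the one evicted?

Also, the LMON trace on ncxdb12 contains: kjxggpoll: change db group poll time to 50 ms

How should this message be interpreted?

baolei posted on 2014-1-20 15:59:52

OS version: AIX 6.1
DB: 11.2.0.3

Liu Maclean (刘相兵) posted on 2014-1-20 16:21:41

I need the ocssd.log files. Please archive them and upload them.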

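baolei posted on 2014-1-20 17:06:10

I have uploaded the ocssd.log, the alert log, the LMON trace and so on. I hope we can find the root cause from this case. IBM, the original vendor, is now analyzing it with us as well. Node 2 went down once before, and the SR concluded that the access path to the storage was at fault, but the errors this time are not the same as last time.

baolei posted on 2014-1-20 17:08:18

This post was last edited by baolei on 2014-1-20 17:10

Here is the advice from one of our engineers. Personally I think it is a bit casual. I will not post the analysis because it is very long. The conclusions are:

1. A split brain occurred at 09:02:20, and node 1 was evicted from the cluster at 09:02:35, which caused the node 1 host to crash.
2. After the split brain at 09:02:35, because node 1 had been evicted, node 2 took over from node 1 at 09:02:36. However, the ASM LMON process took more than 50 ms on the takeover vote, so the ASM PMON process wrongly concluded that LMON was hung. PMON then terminated abnormally, which finally crashed the database on node 2.

Recommendation:
This incident was mainly a chain of problems triggered by an abnormal heartbeat network. The host team should check why the network heartbeat latency was far above its normal value.

Liu Maclean (刘相兵) posted on 2014-1-20 20:30:09

The earliest record in the logs you provided is from January 14.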

timeline:

node 1 2014-01-20 09:02:20.227: clssnmvDHBValidateNcopy: node 2, ncxdb12, has a disk HB, but no network HB,

node 2 2014-01-20 09:02:20.769  clssnmvDHBValidateNcopy: node 1, ncxdb11, has a disk HB, but no network HB

node 1  2014-01-20 09:02:31.423: [ CSSD]clssnmvDiskPing: Writing with status 0x3, timestamp 670768918/1390179751       (last log entry before the reboot)

node 2 2014-01-20 09:02:46.401: [ CSSD]clssgmClientShutdown: sending shutdown, fence_done 1 (CSSD shuts down after I/O fencing)

node 2 2014-01-20 09:02:54.091: [ CSSD]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (clustered) mode with uniqueness value 1390179774 (CSS starts after the CRS shutdown)

node 1 2014-01-20 09:08:22.972: [ CSSD]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (clustered) mode with uniqueness value 1390180102 (CSS starts after the reboot)

Node 1 does not offer much usable information.

On node 2 you can see that its original intent was "Aborting local node to avoid splitbrain", because this sub-cluster carries less weight than node 1's.
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: Checking disk info...
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckSplit: Node 1, ncxdb11, is alive, DHB (1390179755, 670772496) more than disk timeout of 27000 after the last NHB (1390179725, 670742947)
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: My cohort: 2
2014-01-20 09:02:35.463: [ CSSD]clssnmCheckDskInfo: Surviving cohort: 1
2014-01-20 09:02:35.463: [ CSSD](:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes with leader 2, ncxdb12, is smaller than cohort of 1 nodes led by node 1, ncxdb11, based on map type 2
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(IGCXDB1SYS$USERS) count(2) master(1) event(2), incarn 8, mbrc 2, to member 2, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]###################################
2014-01-20 09:02:35.463: [ CSSD]clssgmQueueGrockEvent: groupName(crs_version) count(3) master(1) event(2), incarn 15, mbrc 3, to member 0, events 0x0, state 0x0
2014-01-20 09:02:35.463: [ CSSD]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
2014-01-20 09:02:35.464: [ CSSD]###################################
2014-01-20 09:02:35.464: [ CSSD](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
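
The clssnmCheckSplit line in this excerpt compares node 1's latest disk heartbeat (DHB) with the last network heartbeat (NHB) received from it. Below is a small Python sketch of that comparison, assuming the first number in each pair is epoch seconds and the second is a millisecond tick counter (the format is inferred from the log, not documented here), and assuming the GMT+08:00 zone shown in the OSWatcher headers. `gap_ms` and `local_clock` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the clssnmCheckSplit line on node 2.
DHB = (1390179755, 670772496)       # latest disk heartbeat seen from node 1
LAST_NHB = (1390179725, 670742947)  # last network heartbeat seen from node 1
DISK_TIMEOUT_MS = 27000             # "disk timeout of 27000" in the same line

# Assumption: the hosts run at GMT+08:00, as the oswbb headers indicate.
LOCAL_TZ = timezone(timedelta(hours=8))


def gap_ms(dhb, nhb):
    """Difference between the millisecond counters of the two heartbeats."""
    return dhb[1] - nhb[1]


def local_clock(epoch_seconds):
    """Render an epoch value as HH:MM:SS in the assumed local zone."""
    return datetime.fromtimestamp(epoch_seconds, tz=LOCAL_TZ).strftime("%H:%M:%S")


if __name__ == "__main__":
    print(local_clock(LAST_NHB[0]))                 # 09:02:05
    print(local_clock(DHB[0]))                      # 09:02:35
    print(gap_ms(DHB, LAST_NHB))                    # 29549
    print(gap_ms(DHB, LAST_NHB) > DISK_TIMEOUT_MS)  # True
```

Under these assumptions, node 2 last heard from node 1 over the network at about 09:02:05, while node 1's disk heartbeat was still advancing at about 09:02:35. The 29549 ms gap exceeds the 27000 ms disk timeout, which is the condition the log line reports.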

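The odd part is that node 1 rebooted without its ocssd.log ever showing a clssnmCheckDskInfo section. In principle, if both nodes had run clssnmCheckDskInfo, node 1 would have won and survived.

But node 1 simply rebooted at around 09:02:31, and at that point the two nodes had not yet settled through the voting disk which of them would survive.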
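
The survivor rule being described here can be sketched in a few lines of Python. This assumes the commonly described 11.2 behaviour: the larger cohort survives, and when the cohorts are the same size, the one containing the lowest node number survives. It is an illustration of that rule only, not Oracle's implementation, and `surviving_cohort` is a hypothetical name.

```python
def surviving_cohort(cohorts):
    """Pick the survivor among disjoint cohorts of node numbers.

    Assumed rule: more nodes wins; on a tie, the cohort that contains
    the lowest node number wins.
    """
    return max(cohorts, key=lambda cohort: (len(cohort), -min(cohort)))


if __name__ == "__main__":
    # Two-node cluster split into {1} (ncxdb11) and {2} (ncxdb12).
    print(surviving_cohort([{2}, {1}]))     # {1}
    # A larger cohort beats a lower node number.
    print(surviving_cohort([{1}, {2, 3}]))  # {2, 3}
```

With one node on each side, the tie-break favours node 1 (ncxdb11), which matches the message on node 2 that its own cohort "is smaller than" the cohort led by node 1.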

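Question: are the clocks on the two nodes in sync?

What does crsctl query css votedisk return?

Liu Maclean (刘相兵) posted on 2014-1-20 20:40:35

PS: In the logs you provided, the "no network HB" symptom appears only on 2014-01-20. I do not see it at any other time.

baolei posted on 2014-1-21 12:33:23

Liu Maclean (刘相兵) posted on 2014-1-20 20:30
The earliest record in the logs you provided is from January 14.

timeline:

I find this very strange too. In principle, with disk heartbeats present on both sides, ncxdb12 should have been the one to go down. How is it that ncxdb11's node number is treated as larger than ncxdb12's? Also, node 1 has no log entries after 09:02:31, and I do not know whether this reboot was triggered by a person or by the system. I asked everyone around at the time, and nobody would have performed such an operation.
I will post the crsctl query css votedisk output a little later. I believe I was the one who built this database.

baolei posted on 2014-1-21 12:33:56

Liu Maclean (刘相兵) posted on 2014-1-20 20:40
PS: In the logs you provided, the "no network HB" symptom appears only on 2014-01-20. I do not see it at any other time. ...

Right, that is all the logs contain. I did not find anything else either.

baolei posted on 2014-1-21 17:49:44

Liu Maclean (刘相兵) posted on 2014-1-20 20:30
The earliest record in the logs you provided is from January 14.

timeline: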

grid@ncxdb11:/home/grid$ crsctl query css votedisk
##  STATE File Universal Id             File Name Disk group
--  ----- -----------------             --------- ---------
1. ONLINE 5e380d62b6cb4f6ebf13846fa0e0f0c8 (/dev/rhdiskpower6001)
2. ONLINE 4397d72c00174fe7bf13acd712071024 (/dev/rhdiskpower6002)
3. ONLINE 9b1cb71a838c4ffdbfb93565d5c1cb4c (/dev/rhdiskpower6003)
Located 3 voting disk(s).

ncxdb12:[/]#crsctl query css votedisk
##  STATE File Universal Id             File Name Disk group
--  ----- -----------------             --------- ---------
1. ONLINE 5e380d62b6cb4f6ebf13846fa0e0f0c8 (/dev/rhdiskpower6001)
2. ONLINE 4397d72c00174fe7bf13acd712071024 (/dev/rhdiskpower6002)
3. ONLINE 9b1cb71a838c4ffdbfb93565d5c1cb4c (/dev/rhdiskpower6003)

Liu Maclean (刘相兵) posted on 2014-1-21 19:47:54

Question: are the clocks on the two nodes in sync? See post #9.

baolei posted on 2014-1-21 20:10:11

Liu Maclean (刘相兵) posted on 2014-1-21 19:47
Question: are the clocks on the two nodes in sync? See post #9.

oracle@ncxdb11:/home/oracle$ ssh ncxdb12 date;date;
Tue Jan 21 20:09:44 GMT+08:00 2014
Tue Jan 21 20:09:44 GMT+08:00 2014

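They are in sync.

baolei posted on 2014-1-23 11:34:04

Liu Maclean (刘相兵) posted on 2014-1-21 19:47
Question: are the clocks on the two nodes in sync? See post #9.

ML, have you found anything new?

Liu Maclean (刘相兵) posted on 2014-1-23 15:40:51

1. Clearly this kind of network failure has to be brought under control, including by using techniques such as bind (interface bonding).

2. Run crsctl set log css CSSD:2 so that the next time this happens, node 1 leaves logs behind when it reboots.

3. I need the errpt data from both nodes.

baolei posted on 2014-1-23 16:24:58

Node 1: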
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
ECCE4018 0122181714 T S fcs4          SOFTWARE PROGRAM ERROR
ECCE4018 0122181714 T S fcs6          SOFTWARE PROGRAM ERROR
ECCE4018 0122181414 T S fcs4          SOFTWARE PROGRAM ERROR
ECCE4018 0122181314 T S fcs6          SOFTWARE PROGRAM ERROR
ECCE4018 0122181314 T S fcs6          SOFTWARE PROGRAM ERROR
ECCE4018 0122181314 T S fcs6          SOFTWARE PROGRAM ERROR
ECCE4018 0122181314 T S fcs6          SOFTWARE PROGRAM ERROR
ECCE4018 0122181314 T S fcs8          SOFTWARE PROGRAM ERROR
A6DF45AA 0120090714 I O RMCdaemon    The daemon is started.
2BFA76F6 0120090314 T S SYSPROC       SYSTEM SHUTDOWN BY USER
9DBCFDEE 0120090614 T O errdemon    ERROR LOGGING TURNED ON
E87EF1BE 0119150014 P O dumpcheck    The largest dump device is too small.
E87EF1BE 0118150014 P O dumpcheck    The largest dump device is too small.

ncxdb11:[/]#lsattr -El ent17
adapter_names ent4          EtherChannel Adapters                         True
alt_addr       0x000000000000 Alternate EtherChannel Address                True
auto_recovery yes          Enable automatic recovery after failover       True
backup_adapter  ent10       Adapter used when whole channel fails          True
hash_mode    default       Determines how outgoing adapter is chosen    True
interval       long          Determines interval value for IEEE 802.3ad mode True
mode          standard    EtherChannel mode of operation                True
netaddr       0             Address to ping                               True
noloss_failover yes          Enable lossless failover after ping failure    True
num_retries    3             Times to retry ping before failing             True
retry_time    1             Wait time (in seconds) between pings          True
use_alt_addr no          Enable Alternate EtherChannel Address          True
use_jumbo_frame no          Enable Gigabit Ethernet Jumbo Frames          True
Node 2:

IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
E87EF1BE 0120150014 P O dumpcheck    The largest dump device is too small.
A924A5FC 0120090214 P S SYSPROC       SOFTWARE PROGRAM ABNORMALLY TERMINATED
E87EF1BE 0119150014 P O dumpcheck    The largest dump device is too small.
E87EF1BE 0118150014 P O dumpcheck    The largest dump device is too small.

ncxdb12:[/]#lsattr -El ent17
adapter_names ent4          EtherChannel Adapters                         True
alt_addr       0x000000000000 Alternate EtherChannel Address                True
auto_recovery yes          Enable automatic recovery after failover       True
backup_adapter  ent10       Adapter used when whole channel fails          True
hash_mode    default       Determines how outgoing adapter is chosen    True
interval       long          Determines interval value for IEEE 802.3ad mode True
mode          standard    EtherChannel mode of operation                True
netaddr       0             Address to ping                               True
noloss_failover yes          Enable lossless failover after ping failure    True
num_retries    3             Times to retry ping before failing             True
retry_time    1             Wait time (in seconds) between pings          True
use_alt_addr no          Enable Alternate EtherChannel Address          True
use_jumbo_frame no          Enable Gigabit Ethernet Jumbo Frames          True

I do not see anything abnormal.
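
The errpt summary rows above can be lined up against the CSS timeline by decoding the TIMESTAMP column. Here is a minimal Python sketch, assuming the usual AIX summary format MMDDhhmmYY. The rows are copied from the two listings; `parse_errpt_timestamp` and `events_between` are hypothetical helper names.

```python
from datetime import datetime

# (node, identifier, timestamp, resource, description), copied from the errpt output.
ROWS = [
    ("node 1", "A6DF45AA", "0120090714", "RMCdaemon", "The daemon is started."),
    ("node 1", "2BFA76F6", "0120090314", "SYSPROC", "SYSTEM SHUTDOWN BY USER"),
    ("node 1", "9DBCFDEE", "0120090614", "errdemon", "ERROR LOGGING TURNED ON"),
    ("node 2", "A924A5FC", "0120090214", "SYSPROC",
     "SOFTWARE PROGRAM ABNORMALLY TERMINATED"),
]


def parse_errpt_timestamp(stamp):
    """Decode MMDDhhmmYY (assumed errpt summary format) into a datetime."""
    if len(stamp) != 10 or not stamp.isdigit():
        raise ValueError(f"unexpected errpt timestamp: {stamp!r}")
    month, day, hour, minute, year = (int(stamp[i:i + 2]) for i in range(0, 10, 2))
    return datetime(2000 + year, month, day, hour, minute)


def events_between(rows, start, end):
    """Return (time, node, description) tuples inside [start, end], oldest first."""
    decoded = [(parse_errpt_timestamp(r[2]), r[0], r[4]) for r in rows]
    return sorted(e for e in decoded if start <= e[0] <= end)


if __name__ == "__main__":
    window = events_between(
        ROWS, datetime(2014, 1, 20, 9, 0), datetime(2014, 1, 20, 9, 10)
    )
    for when, node, description in window:
        print(when.strftime("%Y-%m-%d %H:%M"), node, description)
```

Decoded this way, node 2 logged SOFTWARE PROGRAM ABNORMALLY TERMINATED at 09:02 and node 1 logged SYSTEM SHUTDOWN BY USER at 09:03 on January 20. The errpt summary has only minute resolution, so it cannot be ordered more finely against the ocssd.log entries.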

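baolei posted on 2014-1-23 16:27:51

Liu Maclean (刘相兵) posted on 2014-1-23 15:40
1. Clearly this kind of network failure has to be brought under control, including by using techniques such as bind (interface bonding).

2. Run crsctl set log css CSSD:2 so that the next time this happens ...

So my colleague's conclusion does not look right either, does it?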
