2#
Posted on 2013-11-17 16:54:45
1. gipcd.log at cluster startup
2013-07-17 12:28:28.071: [ default][3041003216]gipcd START pid=22337 Oracle Grid IPC Daemon
2013-07-17 12:28:28.072: [ GIPCD][3041003216] gipcdMain: gipcd Started <<<<<< The gipcd daemon has started.
……
2013-07-17 12:28:29.046: [ GPNP][3041003216]clsgpnp_getCachedProfileEx: [at clsgpnp.c:613] Result: (26) CLSGPNP_NO_PROFILE. Can't get offline GPnP service profile: local gpnpd is up and running. Use getProfile instead.
2013-07-17 12:28:29.046: [ GPNP][3041003216]clsgpnp_getCachedProfileEx: [at clsgpnp.c:623] Result: (26) CLSGPNP_NO_PROFILE. Failed to get offline GPnP service profile.
2013-07-17 12:28:29.066: [ GPNP][3041003216]clsgpnpm_newWiredMsg: [at clsgpnpm.c:741] Msg-reply has soap fault 10 (Operation returned Retry (error CLSGPNP_CALL_AGAIN)) [uri "http://www.grid-pnp.org/2005/12/gpnp-errors#"] <<<< gipcd tried to access the GPnP profile but did not succeed. Because this log was captured during GI startup, these messages can be ignored: the cause is simply that gpnpd had not finished starting yet.
……
2013-07-17 12:28:39.342: [ CLSINET][3023027088] # 0 Interface 'eth1',ip='192.168.254.30',mac='00-0c-29-a8-14-65',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 12:28:39.342: [ CLSINET][3023027088] # 1 Interface 'eth2',ip='192.168.254.31',mac='00-0c-29-a8-14-6f',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect' <<<<< gipcd has discovered the local node's private-network interfaces; in this cluster, two NICs serve as the cluster private network.
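If you want to pull the private NIC list out of a gipcd.log programmatically rather than by eye, something like the following rough sketch may help. It assumes each wrapped log entry has already been joined onto a single line, and the helper name `parse_clsinet_interfaces` is made up for illustration; it is not part of any Oracle tool.

```python
import re

# Sample entries copied from the log above, each joined onto one line.
SAMPLE = """\
2013-07-17 12:28:39.342: [ CLSINET][3023027088] # 0 Interface 'eth1',ip='192.168.254.30',mac='00-0c-29-a8-14-65',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 12:28:39.342: [ CLSINET][3023027088] # 1 Interface 'eth2',ip='192.168.254.31',mac='00-0c-29-a8-14-6f',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
"""

# Interface name followed by one or more ,key='value' attributes.
IFACE_RE = re.compile(r"Interface\s+'(?P<name>[^']+)'(?P<attrs>(?:,\w+='[^']*')+)")
ATTR_RE = re.compile(r"(\w+)='([^']*)'")


def parse_clsinet_interfaces(text):
    """Return one dict per CLSINET 'Interface' entry (hypothetical helper)."""
    interfaces = []
    for line in text.splitlines():
        if "CLSINET" not in line:
            continue
        match = IFACE_RE.search(line)
        if match:
            entry = {"name": match.group("name")}
            # findall yields (key, value) pairs such as ("ip", "192.168.254.30").
            entry.update(ATTR_RE.findall(match.group("attrs")))
            interfaces.append(entry)
    return interfaces


# Keep only the interfaces that gipcd reports as cluster interconnects.
private_nics = [
    nic for nic in parse_clsinet_interfaces(SAMPLE)
    if nic.get("use") == "cluster_interconnect"
]
for nic in private_nics:
    print(nic["name"], nic["ip"], nic["net"])
```

Comparing the result between two points in the log is a quick way to see whether a private NIC has disappeared from the list.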
……
2013-07-17 12:28:39.344: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local bootstrap interface for node 'single1', haName 'gipcd_ha_name', inf 'mcast://230.0.1.0:42424/192.168.254.30'
2013-07-17 12:28:39.344: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local interface for node 'single1', haName 'gipcd_ha_name', inf '192.168.254.30:46782'
2013-07-17 12:28:39.345: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local bootstrap interface for node 'single1', haName 'gipcd_ha_name', inf 'mcast://230.0.1.0:42424/192.168.254.31'
2013-07-17 12:28:39.345: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local interface for node 'single1', haName 'gipcd_ha_name', inf '192.168.254.31:39332' <<<<<<< The endpoints that gipcd uses for cluster data communication (the first type of data communication mentioned earlier) have been created.
……
2013-07-17 12:28:56.767: [GIPCHGEN][3023027088] gipchaNodeCreate: adding new node 0x9c107d8 { host 'single2', haName 'gipcd_ha_name', srcLuid 465fb26d-8b46eb95, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], createTime 797327224, flags 0x0 } <<<<< The remote node has been discovered.
……
2013-07-17 12:28:58.415: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created remote interface for node 'single2', haName 'gipcd_ha_name', inf 'udp://192.168.254.33:16663'
2013-07-17 12:28:58.415: [GIPCHGEN][3025128336] gipchaWorkerAttachInterface: Interface attached inf 0x9c0bb60 { host 'single2', haName 'gipcd_ha_name', local 0xb4c4e590, ip '192.168.254.33:16663', subnet '192.168.254.0', mask '255.255.255.0', numRef 0, numFail 0, flags 0x6 }
2013-07-17 12:28:58.415: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created remote interface for node 'single2', haName 'gipcd_ha_name', inf 'udp://192.168.254.32:17578'
2013-07-17 12:28:58.415: [GIPCHGEN][3025128336] gipchaWorkerAttachInterface: Interface attached inf 0x9c0a900 { host 'single2', haName 'gipcd_ha_name', local 0xb4cb8eb8, ip '192.168.254.32:17578', subnet '192.168.254.0', mask '255.255.255.0', numRef 0, numFail 0, flags 0x6 } <<<<<< gipcd has discovered the remote node's private-network interfaces.
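To get a per-node overview of the endpoints that gipcd reports, the "created ... interface for node" entries can be grouped by node name. The sketch below is only an illustration under the assumption that each wrapped entry has been joined onto one line; `endpoints_by_node` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Entries copied from the log above, each joined onto one line.
SAMPLE = """\
2013-07-17 12:28:39.344: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local interface for node 'single1', haName 'gipcd_ha_name', inf '192.168.254.30:46782'
2013-07-17 12:28:39.345: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local interface for node 'single1', haName 'gipcd_ha_name', inf '192.168.254.31:39332'
2013-07-17 12:28:58.415: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created remote interface for node 'single2', haName 'gipcd_ha_name', inf 'udp://192.168.254.33:16663'
2013-07-17 12:28:58.415: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created remote interface for node 'single2', haName 'gipcd_ha_name', inf 'udp://192.168.254.32:17578'
"""

# "local bootstrap" is listed first so it is not mistaken for plain "local".
CREATED_RE = re.compile(
    r"created (?P<kind>local bootstrap|local|remote) interface "
    r"for node '(?P<node>[^']+)', haName '[^']+', inf '(?P<inf>[^']+)'"
)


def endpoints_by_node(text):
    """Map node name -> list of (kind, endpoint) tuples (hypothetical helper)."""
    result = defaultdict(list)
    for line in text.splitlines():
        match = CREATED_RE.search(line)
        if match:
            result[match.group("node")].append(
                (match.group("kind"), match.group("inf"))
            )
    return dict(result)


for node, endpoints in endpoints_by_node(SAMPLE).items():
    for kind, inf in endpoints:
        print(node, kind, inf)
```

With two private NICs on each node, you would expect to see two endpoints per node, as in this excerpt.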
……
2013-07-17 12:29:36.120: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 6.326531 [ 257 / 250 / 245 ]
2013-07-17 12:29:36.120: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 1] eth2 - rank 99, avgms 5.182186 [ 259 / 250 / 247 ] <<<<< gipcd checks the status of the local private NICs.
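The periodic gipcdMonitorSaveInfMetrics entries are easy to extract if you want to watch which interfaces are still being checked over time. Here is a minimal sketch, assuming joined one-line entries; `parse_metrics` is a hypothetical name, and the code only extracts the `rank` and `avgms` fields as they appear, without interpreting them or the three bracketed numbers, since the post does not explain their meaning.

```python
import re

# Entries copied from the log above, each joined onto one line.
SAMPLE = """\
2013-07-17 12:29:36.120: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 6.326531 [ 257 / 250 / 245 ]
2013-07-17 12:29:36.120: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 1] eth2 - rank 99, avgms 5.182186 [ 259 / 250 / 247 ]
"""

METRIC_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}):.*"
    r"gipcdMonitorSaveInfMetrics: inf\[\s*(?P<idx>\d+)\]\s+(?P<name>\S+) - "
    r"rank (?P<rank>\d+), avgms (?P<avgms>[\d.]+)"
)


def parse_metrics(text):
    """Return one dict per metrics entry (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = METRIC_RE.search(line)
        if match:
            rows.append({
                "ts": match.group("ts"),
                "name": match.group("name"),
                "rank": int(match.group("rank")),
                "avgms": float(match.group("avgms")),
            })
    return rows


for row in parse_metrics(SAMPLE):
    print(row["ts"], row["name"], row["rank"], row["avgms"])
```

If an interface name stops appearing in these entries, that is a hint that gipcd no longer considers it part of the private network.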
……
2. gipcd.log when one of the cluster's private networks goes down
2013-07-17 13:23:20.346: [ CLSINET][3027229584] Returning NETDATA: 2 interfaces
2013-07-17 13:23:20.346: [ CLSINET][3027229584] # 0 Interface 'eth1',ip='192.168.254.30',mac='00-0c-29-a8-14-65',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 13:23:20.346: [ CLSINET][3027229584] # 1 Interface 'eth2',ip='192.168.254.31',mac='00-0c-29-a8-14-6f',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 13:23:20.359: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 0] eth1 - rank 99, avgms 1.560694 [ 171 / 173 / 173 ]
2013-07-17 13:23:20.359: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 1] eth2 - rank 99, avgms 1.802326 [ 172 / 172 / 172 ] <<<<<<<< gipcd is still performing its private-network checks.
……
+++ Disable one of the cluster's private NICs with the command "ifconfig eth1 down".
……
2013-07-17 13:23:44.397: [ CLSINET][3027229584] # 0 Interface 'eth2',ip='192.168.254.31',mac='00-0c-29-a8-14-6f',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 13:23:44.397: [GIPCDMON][3027229584] gipcdMonitorUpdate: interface went down - [ ip 192.168.254.30, subnet 192.168.254.0, mask 255.255.255.0 ]
2013-07-17 13:23:44.397: [GIPCDMON][3027229584] gipcdMonitorUpdate: msg sent to client thread (([update(ip: 192.168.254.30, mask: 255.255.255.0, subnet 192.168.254.0), state(gipcdadapterstateDown)])) <<<<<<<< gipcd detects that private interface eth1 has gone down and sends a message to its clients (for example, ocssd.bin).
……
2013-07-17 13:23:44.426: [GIPCHGEN][3025128336] gipchaInterfaceDisable: disabling interface 0xb4c4e590 { host '', haName 'gipcd_ha_name', local (nil), ip '192.168.254.30', subnet '192.168.254.0', mask '255.255.255.0', numRef 0, numFail 1, flags 0x1cd }
2013-07-17 13:23:44.428: [GIPCHGEN][3025128336] gipchaInterfaceDisable: disabling interface 0x9c0bb60 { host 'single2', haName 'gipcd_ha_name', local 0xb4c4e590, ip '192.168.254.33:16663', subnet '192.168.254.0', mask '255.255.255.0', numRef 0, numFail 0, flags 0x86 }
2013-07-17 13:23:44.428: [GIPCHALO][3025128336] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x9c0bb60 { host 'single2', haName 'gipcd_ha_name', local 0xb4c4e590, ip '192.168.254.33:16663', subnet '192.168.254.0', mask '255.255.255.0', numRef 0, numFail 0, flags 0xa6 } <<<<<<<< gipcd starts cleaning up the information for the local private interface eth1, and also cleans up the information for the corresponding private interface on the remote node.
……
2013-07-17 13:24:08.747: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 0] eth2 - rank 99, avgms 1.955307 [ 204 / 181 / 179 ] <<<<<<< gipcd continues to check the remaining healthy private NIC.
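Note: throughout this process, cluster consistency is preserved and no node leaves the cluster. The HAIP that was running on eth1 fails over to eth2, and the database and ASM instances keep running normally.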
3. After NIC eth1 is restored
++ Restore NIC eth1 with the command "ifconfig eth1 up".
2013-07-17 13:36:31.260: [GIPCDMON][3027229584] gipcdMonitorUpdate: New Interface found - [ ip 192.168.254.30, subnet 192.168.254.0, mask 255.255.255.0 ]
2013-07-17 13:36:31.260: [GIPCDMON][3027229584] gipcdMonitorUpdate: msg sent to client thread (([update(ip: 192.168.254.30, mask: 255.255.255.0, subnet 192.168.254.0), state(gipcdadapterstateUp)])) <<<<< gipcd has found the new private NIC.
……
2013-07-17 13:36:31.471: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local bootstrap interface for node 'single1', haName 'gipcd_ha_name', inf 'mcast://230.0.1.0:42424/192.168.254.30'
2013-07-17 13:36:31.471: [GIPCHTHR][3025128336] gipchaWorkerUpdateInterface: created local interface for node 'single1', haName 'gipcd_ha_name', inf '192.168.254.30:55548' <<<<<< The local communication endpoint has been created.
……
2013-07-17 13:37:11.493: [ CLSINET][3027229584] Returning NETDATA: 2 interfaces
2013-07-17 13:37:11.493: [ CLSINET][3027229584] # 0 Interface 'eth1',ip='192.168.254.30',mac='00-0c-29-a8-14-65',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 13:37:11.493: [ CLSINET][3027229584] # 1 Interface 'eth2',ip='192.168.254.31',mac='00-0c-29-a8-14-6f',mask='255.255.255.0',net='192.168.254.0',use='cluster_interconnect'
2013-07-17 13:37:11.510: [GIPCDMON][3027229584] gipcdMonitorSaveInfMetrics: inf[ 0] eth2 - rank 99, avgms 6.141304 [ 307 / 184 / 184 ] <<<<<<<< gipcd performs its private-network checks.
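Note: throughout this process, cluster consistency is again preserved and no node leaves the cluster. The HAIP that previously failed over to eth2 moves back to eth1, and the database and ASM instances keep running normally.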
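When reviewing an incident, it can be handy to list the interface down/up events from gipcd.log and see how long a private NIC was unavailable. The following is a rough sketch under the assumption that wrapped entries have been joined onto one line; `interface_events` is a hypothetical helper, and mapping "New Interface found" to "up" is my own labeling, based on the gipcdadapterstateUp message that accompanies it in the log.

```python
import re
from datetime import datetime

# Entries copied from sections 2 and 3 above, each joined onto one line.
SAMPLE = """\
2013-07-17 13:23:44.397: [GIPCDMON][3027229584] gipcdMonitorUpdate: interface went down - [ ip 192.168.254.30, subnet 192.168.254.0, mask 255.255.255.0 ]
2013-07-17 13:36:31.260: [GIPCDMON][3027229584] gipcdMonitorUpdate: New Interface found - [ ip 192.168.254.30, subnet 192.168.254.0, mask 255.255.255.0 ]
"""

EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}):.*gipcdMonitorUpdate: "
    r"(?P<what>interface went down|New Interface found) - \[ ip (?P<ip>[\d.]+),"
)


def interface_events(text):
    """Return (timestamp, ip, state) tuples in log order (hypothetical helper)."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            state = "down" if match.group("what") == "interface went down" else "up"
            events.append((ts, match.group("ip"), state))
    return events


events = interface_events(SAMPLE)
last_down = {}
for ts, ip, state in events:
    if state == "down":
        last_down[ip] = ts
    elif ip in last_down:
        # Time between the down event and the matching rediscovery.
        print(ip, "was down for", ts - last_down.pop(ip))
```

Keep in mind this only measures the time between the two gipcd messages, not the moment the NIC physically failed or recovered.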
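Finally, it is worth stressing that although gipcd manages the cluster's private network, it still cannot work properly if the private NICs themselves, or the private links between nodes (or their performance), have problems. In addition, if you use HAIP together with third-party NIC aggregation software (for example, Linux bonding or EtherChannel), it is best to run the latest version of that software and make sure it is configured correctly.
I hope this explanation helps you understand gipcd, the new daemon in 11gR2, and keeps you pointed in the right direction when troubleshooting related problems.
To join the follow-up discussion on this topic, visit our Chinese-language community and reply to the thread "共享:11gR2新特性---gipc守护进程" ("Sharing: 11gR2 New Feature, the gipc Daemon").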