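Oracle RAC 10g, two nodes: one node is evicted at irregular intervals, looking for the cause
Last edited by WMLM on 2014-12-2 12:56
Environment:
Two IBM 550 midrange servers + one Inspur storage array
AIX 6100-08-02
Oracle 10g RAC 10.2.0.4
Symptom:
CRS is unstable; one node is evicted at irregular intervals.
Goal:
I hope Liu or other experts here can find time to look at the uploaded logs and offer some advice. Much appreciated.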
node1:
[ CSSD]2014-12-01 07:21:14.172 >WARNING: clssnmPollingThread: node node2 (2) at 50 2.481040e-265artbeat fatal, eviction in 14.435 seconds
[ CSSD]2014-12-01 07:21:14.172 >TRACE: clssnmPollingThread: node node2 (2) is impending reconfig, flag 1, misstime 15565
[ CSSD]2014-12-01 07:21:14.172 >TRACE: clssnmPollingThread: diskTimeout set to (27000)ms impending reconfig status(1)
[ CSSD]2014-12-01 07:21:21.202 >WARNING: clssnmPollingThread: node node2 (2) at 75 2.481040e-265artbeat fatal, eviction in 7.405 seconds
[ CSSD]2014-12-01 07:21:22.202 >WARNING: clssnmPollingThread: node node2 (2) at 75 2.481040e-265artbeat fatal, eviction in 6.405 seconds
[ CSSD]2014-12-01 07:21:26.228 >WARNING: clssnmPollingThread: node node2 (2) at 90 2.481040e-265artbeat fatal, eviction in 2.379 seconds
[ CSSD]2014-12-01 07:21:27.232 >WARNING: clssnmPollingThread: node node2 (2) at 90 2.481040e-265artbeat fatal, eviction in 1.375 seconds
[ CSSD]2014-12-01 07:21:28.239 >WARNING: clssnmPollingThread: node node2 (2) at 90 2.481040e-265artbeat fatal, eviction in 0.368 seconds
[ CSSD]2014-12-01 07:21:28.612 >TRACE: clssnmPollingThread: Eviction started for node node2 (2), flags 0x0001, state 3, wt4c 0
[ CSSD]2014-12-01 07:21:28.612 >TRACE: clssnmDoSyncUpdate: Initiating sync 3
[ CSSD]2014-12-01 07:21:28.612 >TRACE: clssnmDoSyncUpdate: diskTimeout set to (27000)ms
node2:
[ CSSD]2014-12-01 09:26:52.478 >TRACE: clssnmPollingThread: node node1 (1) is impending reconfig, flag 1039, misstime 15026
[ CSSD]2014-12-01 09:26:52.478 >TRACE: clssnmPollingThread: diskTimeout set to (27000)ms impending reconfig status(1)
[ CSSD]2014-12-01 09:26:53.480 >WARNING: clssnmPollingThread: node node1 (1) at 50 2.481040e-265artbeat fatal, eviction in 13.973 seconds
[ CSSD]2014-12-01 09:27:00.478 >WARNING: clssnmPollingThread: node node1 (1) at 75 2.481040e-265artbeat fatal, eviction in 6.975 seconds
[ CSSD]2014-12-01 09:27:04.482 >WARNING: clssnmPollingThread: node node1 (1) at 90 2.481040e-265artbeat fatal, eviction in 2.971 seconds
[ CSSD]2014-12-01 09:27:05.480 >WARNING: clssnmPollingThread: node node1 (1) at 90 2.481040e-265artbeat fatal, eviction in 1.973 seconds
[ CSSD]2014-12-01 09:27:06.478 >WARNING: clssnmPollingThread: node node1 (1) at 90 2.481040e-265artbeat fatal, eviction in 0.975 seconds
[ CSSD]2014-12-01 09:27:07.455 >TRACE: clssnmPollingThread: Eviction started for node node1 (1), flags 0x040f, state 3, wt4c 0
[ CSSD]2014-12-01 09:27:07.456 >TRACE: clssnmDoSyncUpdate: Initiating sync 7
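The countdown in the two excerpts above can be cross-checked with a little arithmetic: the TRACE line reports how long the heartbeat has been missing (misstime, in milliseconds), and the WARNING lines report how long remains until eviction. Adding the two gives the total window CSSD is applying. The following is only a sketch in Python: the regular expressions are written against the lines quoted in this thread, and the names `implied_timeout` and `SAMPLE` are hypothetical, not part of any Oracle tool.

```python
import re
from datetime import datetime, timedelta

# Patterns are assumptions derived from the CSSD lines quoted in this thread.
TS = r"\[\s*CSSD\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
MISS_RE = re.compile(
    TS + r" >TRACE: clssnmPollingThread: node (\S+) \(\d+\) "
    r"is impending reconfig, flag \d+, misstime (\d+)"
)
WARN_RE = re.compile(
    TS + r" >WARNING: clssnmPollingThread: node (\S+) \(\d+\) "
    r".*eviction in ([\d.]+) seconds"
)


def parse_ts(text):
    """Parse a CSSD timestamp such as '2014-12-01 07:21:14.172'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def implied_timeout(lines):
    """Return {node: seconds from last heartbeat to projected eviction}."""
    last_seen = {}  # node -> time of the last heartbeat (timestamp - misstime)
    evict_at = {}   # node -> projected eviction time (timestamp + countdown)
    for line in lines:
        m = MISS_RE.search(line)
        if m:
            ts, node, miss_ms = m.groups()
            last_seen[node] = parse_ts(ts) - timedelta(milliseconds=int(miss_ms))
            continue
        m = WARN_RE.search(line)
        if m:
            ts, node, secs = m.groups()
            # Keep only the first warning seen for each node.
            evict_at.setdefault(node, parse_ts(ts) + timedelta(seconds=float(secs)))
    return {
        node: (evict_at[node] - last_seen[node]).total_seconds()
        for node in last_seen
        if node in evict_at
    }


# Two lines copied from the node1 excerpt above.
SAMPLE = [
    "[ CSSD]2014-12-01 07:21:14.172 >WARNING: clssnmPollingThread: node node2 (2) "
    "at 50 2.481040e-265artbeat fatal, eviction in 14.435 seconds",
    "[ CSSD]2014-12-01 07:21:14.172 >TRACE: clssnmPollingThread: node node2 (2) "
    "is impending reconfig, flag 1, misstime 15565",
]

if __name__ == "__main__":
    for node, seconds in implied_timeout(SAMPLE).items():
        print(node, round(seconds, 3))
```

For the node1 excerpt, 15565 ms of missed heartbeat plus 14.435 s remaining adds up to 30 seconds. The node2 excerpt gives practically the same window. Whether that window corresponds to the configured CSS misscount on this cluster is an assumption that should be checked against the actual setting.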
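A single piece of evidence is not enough to draw a conclusion. Deploy OSW and a script that pings the private network so that this can be confirmed the next time it happens.
I also noticed "node node2 (2) at 50 2.481040e-265artbeat fatal, eviction in 14.435 seconds", but I don't know whether this points to a problem with the heartbeat disk or with the heartbeat network.
Since Liu has pointed this out, I will set up OSW and the private network ping right away and post the logs later. Thanks.
This looks similar to a case I ran into.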
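For the private network ping script suggested above, a minimal sketch in Python might look like the following. It sends one ICMP echo per interval with the system `ping -c 1` command and appends a timestamped OK/FAIL record to a file, so gaps can later be lined up against the CSSD log timestamps. The host name `node2-priv`, the log path, the function names, and the one-second interval are all placeholders chosen for illustration. This is not the script that ships with OSWatcher, and it requires a Python 3 interpreter on the node.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2, runner=subprocess.run):
    """Send a single ICMP echo to host; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        proc = runner(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        ok = proc.returncode == 0
    except subprocess.TimeoutExpired:
        # No reply within timeout_s counts as a failure.
        ok = False
    return ok, time.monotonic() - start


def format_record(when, host, ok, elapsed):
    """Build one log line with millisecond timestamps, like the CSSD log."""
    stamp = when.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
    status = "OK" if ok else "FAIL"
    return "%s %s %s %.3fs" % (stamp, host, status, elapsed)


def monitor(host, logfile, interval_s=1.0, count=None):
    """Ping host every interval_s seconds; run forever if count is None."""
    sent = 0
    with open(logfile, "a") as fh:
        while count is None or sent < count:
            ok, elapsed = ping_once(host)
            fh.write(format_record(datetime.now(), host, ok, elapsed) + "\n")
            fh.flush()  # keep the file current in case the node reboots
            sent += 1
            time.sleep(interval_s)
```

It could be started on each node against the other node's private address, for example `monitor("node2-priv", "/tmp/priv_ping.log")` on node1, and left running until the next eviction.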
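After the node is evicted, what state is the host in? Does it hang, or does it reboot automatically?
After the node is evicted, the listener stops. The fault has not recurred over the past few days, so I have not collected OSW logs yet. If it happens again, I will collect the OSW logs and upload them. Thanks for following this.
WMLM posted on 2014-12-8 10:03:
After the node is evicted, the listener stops. The fault has not recurred over the past few days, so I have not collected OSW logs yet. If it happens again, I will collect the OSW ...
Is it necessary to follow
http://t.askmaclean.com/thread-3551-1-1.html
and set Diagwait to 13, as is done for clusters on versions earlier than 11gR2?
Setting Diagwait to 13 is a best-practice configuration; it was already set when the database was originally installed.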