Oracle Database Data Recovery and Performance Tuning

1#
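Posted 2012-2-17 13:00:58 | Views: 7938 | Replies: 5
Environment:
Two hosts + SAN storage + Cisco 7609 + Oracle 11.2.0.2

Symptoms:
When the public interface link on one host goes down, the VIP fails over to the other host and database connections remain normal.
After the cable is plugged back in and the public interface resumes communication, the VIP moves back to its original node, but the VIP can no longer be pinged and database connections are lost.
Investigation showed that the MAC address for the VIP on the 7609 had not been updated. After clearing the ARP entries for that interface, the VIP's MAC was updated and everything returned to normal.

What is the cause? Is this a configuration problem or an Oracle bug?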
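The stale-entry check described above (the cached MAC for the VIP differs from the MAC of the node that now owns it) can be sketched in Python. This is only an illustration under stated assumptions: it parses text in the layout of Linux's /proc/net/arp, so it applies to a Linux host on the same subnet, not to the Cisco 7609 itself, and the addresses in SAMPLE are hypothetical documentation values, not taken from this environment.

```python
from typing import Optional


def lookup_arp_mac(arp_table: str, ip: str) -> Optional[str]:
    """Return the MAC cached for `ip` in /proc/net/arp-style text, or None."""
    for line in arp_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: IP address, HW type, Flags, HW address, Mask, Device
        if len(fields) >= 4 and fields[0] == ip:
            return fields[3].lower()
    return None


def is_stale(arp_table: str, vip: str, expected_mac: str) -> bool:
    """True when a MAC is cached for the VIP but differs from the expected owner."""
    cached = lookup_arp_mac(arp_table, vip)
    return cached is not None and cached != expected_mac.lower()


# Hypothetical sample in the /proc/net/arp layout (addresses are made up).
SAMPLE = (
    "IP address       HW type     Flags       HW address            Mask     Device\n"
    "192.0.2.10       0x1         0x2         00:11:22:33:44:55     *        eth0\n"
    "192.0.2.1        0x1         0x2         00:aa:bb:cc:dd:ee     *        eth0\n"
)

if __name__ == "__main__":
    # The VIP 192.0.2.10 is assumed to be owned now by a NIC ending in :66.
    print(is_stale(SAMPLE, "192.0.2.10", "00:11:22:33:44:66"))
```

On a real host the first argument would be the contents of /proc/net/arp, and the expected MAC would come from the interface that currently holds the VIP.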
2#
Posted 2012-2-17 13:19:29
What operating system is this? Windows?

$CRS_HOME/log/<nodename>/crsd/*.log
$CRS_HOME/log/<nodename>/racg/*.log

Please upload the logs from these two directories.


3#
Posted 2012-2-17 13:43:04
Both hosts run Red Hat 5.4. The logs are attached; I ran a test at around 13:24.

Log.rar

361.77 KB, downloads: 1469


4#
Posted 2012-2-17 14:28:12
ODM Data:
Linux: ARP cache issues with Red Hat "balance-alb (mode 6)" bonding driver [ID 756259.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.7 - Release: 10.1 to 11.1
Linux x86
Linux x86-64
Oracle Clusterware when using bonding on Public network
Oracle Clusterware when using bonding on Private network

Symptoms
When using the "balance-alb (mode 6)" bonding driver in an Oracle Clusterware setup, if the VIP fails over to another node, ping requests for the relocated VIP fail unless the ARP cache is manually cleared.

This causes sessions to hang: the session retries its connection to the VIP, but the attempt fails because of the stale ARP cache entries.

Cause
This hang is due to the failover mode configured for NIC bonding. Specifically, the "balance-alb (mode 6)" bonding driver requires the ARP cache to be cleared manually. This is not an Oracle Clusterware issue. It is recommended to use another mode if the OS issue is difficult to fix.
Additionally, on failover the "balance-alb" driver issues ARP requests every two seconds. This is how the "balance-alb" bonding driver was designed to work in failover situations; however, it may cause sessions to hang, rendering the Oracle Clusterware VIP failover useless.

Solution
This is not an Oracle Clusterware issue. Different NIC bonding modes may be incompatible with certain switches, interface cards and drivers. Customers should test VIP failover in their environment with different modes and choose the one that fits their requirements.
For more information on bonding modes, refer to bonding.txt, which is supplied by the Linux vendor.
Check whether your environment uses the "balance-alb (mode 6)" bonding driver.
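One way to run that check is to read the mode line the bonding driver exposes under /proc/net/bonding/. The sketch below is a minimal example, assuming the driver reports mode 6 as "adaptive load balancing" on its "Bonding Mode:" line (verify against the bonding.txt shipped with your kernel). The sample text and the interface name bond0 are illustrative, not taken from the poster's hosts.

```python
import re
from typing import Optional


def bonding_mode(proc_text: str) -> Optional[str]:
    """Extract the value of the 'Bonding Mode:' line, or None if absent."""
    match = re.search(r"^Bonding Mode:\s*(.+)$", proc_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def is_balance_alb(proc_text: str) -> bool:
    """True if the reported mode looks like balance-alb (mode 6).

    Assumption: the driver prints 'adaptive load balancing' for mode 6.
    """
    mode = bonding_mode(proc_text)
    return mode is not None and mode.startswith("adaptive load balancing")


# Illustrative sample of /proc/net/bonding/bond0 content (abridged, made up).
SAMPLE_ALB = (
    "Ethernet Channel Bonding Driver\n"
    "\n"
    "Bonding Mode: adaptive load balancing\n"
    "MII Status: up\n"
)

SAMPLE_ACTIVE_BACKUP = (
    "Ethernet Channel Bonding Driver\n"
    "\n"
    "Bonding Mode: fault-tolerance (active-backup)\n"
    "MII Status: up\n"
)

if __name__ == "__main__":
    # On a real host, read the file instead of using the samples, e.g.:
    #   text = open("/proc/net/bonding/bond0").read()
    print(is_balance_alb(SAMPLE_ALB), is_balance_alb(SAMPLE_ACTIVE_BACKUP))
```

If the public NIC is not bonded at all, no file exists under /proc/net/bonding/ for it and the note above does not apply.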




Another similar case:
Continuous Ping Fails For a Short Time Following SCAN Relocation [ID 1061722.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1.0 to 11.2.0.1.0 - Release: 11.2 to 11.2
Information in this document applies to any platform.

Symptoms
After SCAN VIP relocation from one node to another by any means (manual relocation, CRS, or node shutdown), the relocated SCAN VIP is not pingable for a short time.

Running a packet analyzer on the failed ping packets sent to the relocated SCAN VIP should show something like the following:

Frame 760 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 00:19:bb:cf:7b:dc (00:19:bb:cf:7b:dc), Dst: ff:ff:ff:ff:ff:ff (ff:ff:ff:ff:ff:ff)
Address Resolution Protocol (reply/gratuitous ARP)
Hardware type: Ethernet (0x0001)
Protocol type: IP (0x0800)
Hardware size: 6
Protocol size: 4
Opcode: reply (0x0002)
Sender MAC address: 00:19:bb:cf:7b:dc (00:19:bb:cf:7b:dc)
Sender IP address: 10.3.2.74 (10.3.2.74)
Target MAC address: ff:ff:ff:ff:ff:ff (ff:ff:ff:ff:ff:ff)
Target IP address: 10.3.2.74 (10.3.2.74)

The Target MAC address field is ff:ff:ff:ff:ff:ff and not 00:19:bb:cf:7b:dc as it should be.

Changes
11.2.0.1 Grid Infrastructure (a.k.a. CRS) version, with the public interface connected to layer 3 switches or routers that are RFC compliant, specifically those that are unable to violate RFC2002.

Cause
After the VIP starts on a node, the vipagent sends a gratuitous ARP request; there was a problem in the packet sent as the gratuitous ARP.

RFC2002:

A Gratuitous ARP is an ARP packet sent by a node in order to
spontaneously cause other nodes to update an entry in their ARP
cache. A gratuitous ARP MAY use either an ARP Request or an ARP
Reply packet. In either case, the ARP Sender Protocol Address
and ARP Target Protocol Address are both set to the IP address
of the cache entry to be updated, and the ARP Sender Hardware
Address is set to the link-layer address to which this cache
entry should be updated. When using an ARP Reply packet, the
Target Hardware Address is also set to the link-layer address to
which this cache entry should be updated (this field is not used
in an ARP Request packet).

Solution
The fix for this code defect, Bug:9109880, has been included in the GRID Infrastructure PSU2 patch for version 11.2.0.1 and in version 11.2.0.2.
However, Oracle states that the 1061722.1 issue above is already fixed in 11.2.0.2.
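To make the defect in note 1061722.1 concrete, here is a small Python sketch that packs the 28-byte Ethernet/IPv4 ARP payload and applies the rule quoted from RFC2002 for a gratuitous ARP Reply: sender and target protocol addresses are equal, and the target hardware address equals the sender hardware address. The helper names are mine, not from Oracle or any library. The field values come from the capture in the note, and nothing is put on the wire.

```python
import struct

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FMT = "!HHBBH6s4s6s4s"  # 28 bytes for Ethernet/IPv4
ARP_REPLY = 2


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def ip_bytes(ip: str) -> bytes:
    """Convert dotted-quad IPv4 text to 4 raw bytes."""
    return bytes(int(part) for part in ip.split("."))


def build_arp(opcode: int, sender_mac: str, sender_ip: str,
              target_mac: str, target_ip: str) -> bytes:
    """Pack an ARP payload (Ethernet hardware type, IPv4 protocol type)."""
    return struct.pack(ARP_FMT, 0x0001, 0x0800, 6, 4, opcode,
                       mac_bytes(sender_mac), ip_bytes(sender_ip),
                       mac_bytes(target_mac), ip_bytes(target_ip))


def gratuitous_reply_ok(payload: bytes) -> bool:
    """Check a gratuitous ARP Reply against the RFC2002 wording quoted above."""
    _, _, _, _, op, sha, spa, tha, tpa = struct.unpack(ARP_FMT, payload[:28])
    return op == ARP_REPLY and spa == tpa and tha == sha


# Packet as shown in the capture: target MAC is the broadcast address.
captured = build_arp(ARP_REPLY, "00:19:bb:cf:7b:dc", "10.3.2.74",
                     "ff:ff:ff:ff:ff:ff", "10.3.2.74")
# What the note says the target MAC should have been.
expected = build_arp(ARP_REPLY, "00:19:bb:cf:7b:dc", "10.3.2.74",
                     "00:19:bb:cf:7b:dc", "10.3.2.74")

if __name__ == "__main__":
    print(gratuitous_reply_ok(captured), gratuitous_reply_ok(expected))
```

A strictly RFC-compliant router may ignore the first form. A capture taken on the public interface during a VIP relocation can be checked the same way to see which form your 11.2.0.2 nodes actually send.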


5#
Posted 2012-2-17 16:11:20
Regarding case 1:
The public NIC here is not bonded, so this issue does not apply.

Case 2: I am on 11.2.0.2, so this issue does not apply either.

Frustrating.


6#
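Posted 2012-2-17 16:26:18
I suggest checking the logs related to ora.vip; those have not been uploaded.

Also, in releases after 10.2.0.3 and in 11gR1, the VIP by default no longer fails over automatically when the public network goes down.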

Starting from 10.2.0.4 and 11.1, VIP does not fail-over back to the original node even after the public network problem is resolved.  This behavior is the default behavior in 10.2.0.4 and 11.1 and is different from that of 10.2.0.3

http://www.oracledatabase12g.com ... original%20node.htm

