Oracle Database Data Recovery and Performance Tuning

1#
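Posted on 2012-3-7 13:18:04 | Views: 6896 | Replies: 8
1> At 16:50 I logged on to the RAC and found the listeners down on both nodes, and I am looking for the cause. The VIPs had all gone down at 16:34:28.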



Relevant logs:
CRS.log:
2012-03-06 16:34:28.058: [  CRSAPP][11050]32CheckResource error for ora.power740a.vip error code = 1
2012-03-06 16:34:28.071: [  CRSRES][11050]32In stateChanged, ora.power740a.vip target is ONLINE
2012-03-06 16:34:28.071: [  CRSRES][11050]32ora.power740a.vip on power740a went OFFLINE unexpectedly
2012-03-06 16:34:28.072: [  CRSRES][11050]32StopResource: setting CLI values
2012-03-06 16:34:28.078: [  CRSRES][11050]32Attempting to stop `ora.power740a.vip` on member `power740a`
2012-03-06 16:34:28.618: [  CRSRES][11050]32Stop of `ora.power740a.vip` on member `power740a` succeeded.
2012-03-06 16:34:28.619: [  CRSRES][11050]32ora.power740a.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2012-03-06 16:34:28.635: [  CRSRES][11050]32ora.power740a.vip failed on power740a relocating.
2012-03-06 16:34:28.675: [  CRSRES][11050]32StopResource: setting CLI values
2012-03-06 16:34:28.681: [  CRSRES][11050]32Attempting to stop `ora.power740a.LISTENER_POWER740A.lsnr` on member `power740a`
2012-03-06 16:35:40.785: [  CRSRES][12423]32startRunnable: setting CLI values
2012-03-06 16:35:45.991: [  CRSRES][11050]32Stop of `ora.power740a.LISTENER_POWER740A.lsnr` on member `power740a` succeeded.
2012-03-06 16:35:46.012: [  CRSRES][11050]32Attempting to start `ora.power740a.vip` on member `power740b`
2012-03-06 16:35:47.669: [  CRSRES][11050]32Start of `ora.power740a.vip` on member `power740b` succeeded.


The error messages reported by the database are:
Tue Mar 06 10:21:35 GMT+08:00 2012
Thread 2 advanced to log sequence 23 (LGWR switch)
  Current log# 6 seq# 23 mem# 0: +ORADATA2/wzgs/redo6.log
Tue Mar 06 14:16:45 GMT+08:00 2012
Global Enqueue Services Deadlock detected. More info in file
/oracle/admin/wzgs/bdump/wzgs2_lmd0_4915654.trc.
Tue Mar 06 16:35:05 GMT+08:00 2012
ALTER SYSTEM SET service_names='' SCOPE=MEMORY SID='wzgs2';
Tue Mar 06 16:35:05 GMT+08:00 2012
Immediate Kill Session#: 1430, Serial#: 453
Immediate Kill Session: sess: 7000007875d6660  OS pid: 13304304
Tue Mar 06 16:35:05 GMT+08:00 2012
Process OS id : 13304304 alive after kill
Errors in file
Immediate Kill Session#: 1432, Serial#: 511
Immediate Kill Session: sess: 70000078f60a518  OS pid: 27394364
Tue Mar 06 16:35:05 GMT+08:00 2012
Process OS id : 27394364 alive after kill
Errors in file /oracle/admin/wzgs/udump/wzgs2_ora_5439644.trc
Immediate Kill Session#: 1433, Serial#: 114
Immediate Kill Session: sess: 7000007855e4928  OS


The AWR top events for the database are:
Event                          Waits       Time (s)  Avg Wait (ms)  % Total Call Time  Wait Class
enq: TX - row lock contention  2,680       1,308     488            40.9               Application
CPU time                                   1,208                    37.7
process terminate              1,076       53        49             1.6                Other
ksdxexeotherwait               33,758,484  40        0              1.2                Other
DFS lock handle                16,100      7         0              .2                 Other

Attachment: asky.rar (368.81 KB, downloads: 1064)

2#
Posted on 2012-3-7 13:23:06
2012-03-06 16:34:28.058: [  CRSAPP][11050]32CheckResource error for ora.power740a.vip error code = 1
2012-03-06 16:34:28.071: [  CRSRES][11050]32In stateChanged, ora.power740a.vip target is ONLINE
2012-03-06 16:34:28.071: [  CRSRES][11050]32ora.power740a.vip on power740a went OFFLINE unexpectedly

2012-03-06 16:35:45.991: [  CRSRES][11050]32Stop of `ora.power740a.LISTENER_POWER740A.lsnr` on member `power740a` succeeded.

==> The VIP going OFFLINE caused ora.power740a.LISTENER_POWER740A.lsnr to be stopped.
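To line up the VIP and listener events from a crsd log like the one quoted above, a small filter can help. This is only a sketch: `resource_events` and `first_unexpected_offline` are hypothetical helper names, not Oracle tools, and the sketch assumes the log lines start with a 23-character timestamp as in the excerpts in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): print the crsd log lines that
# mention a given resource name, in file order.
# Usage: resource_events <resource-name> <crsd-log-file>
resource_events() {
    # -F treats the resource name as a fixed string, so the dots are literal
    grep -F -- "$1" "$2"
}

# Hypothetical helper: print the timestamp of the first line reporting that
# the resource "went OFFLINE unexpectedly"; prints nothing if there is none.
# Assumes the 23-character "YYYY-MM-DD HH:MM:SS.mmm" prefix seen in this thread.
first_unexpected_offline() {
    grep -F -- "$1" "$2" | grep -F 'went OFFLINE unexpectedly' | head -n 1 | cut -c1-23
}
```

For example, `first_unexpected_offline ora.power740a.vip crsd.log` run against the excerpt from post 1 would print the 16:34:28.071 timestamp, which precedes the listener stop attempt at 16:34:28.681.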


Script: Collect VIP resource Diagnostic Information
http://www.oracledatabase12g.com ... ic-information.html


$CRS_HOME/log/<nodename>/*.log
$CRS_HOME/log/<nodename>/crsd/*.log
$CRS_HOME/log/<nodename>/cssd/*.log
$CRS_HOME/log/<nodename>/racg/*.log
$CRS_HOME/log/<nodename>/client/*.log
$CRS_HOME/log/<nodename>/evmd/*.log

Please archive the logs listed above from power740a and upload them.
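One possible way to bundle those files is sketched below. `collect_crs_logs` is a hypothetical helper name, and the sketch assumes a `tar` that accepts `-cf` with file arguments; it only picks up the `*.log` patterns listed above.

```shell
#!/bin/sh
# Hypothetical helper: archive the CRS *.log files listed above for one node.
# Usage: collect_crs_logs <crs-home> <nodename> <absolute-path-to-output.tar>
collect_crs_logs() {
    crs_home=$1 node=$2 out=$3
    (
        # Work relative to <crs-home>/log so archive members start with <nodename>/
        cd "$crs_home/log" || exit 1
        set --
        for f in "$node"/*.log "$node"/crsd/*.log "$node"/cssd/*.log \
                 "$node"/racg/*.log "$node"/client/*.log "$node"/evmd/*.log; do
            # Unmatched globs stay literal and fail the -f test, so they are skipped
            [ -f "$f" ] && set -- "$@" "$f"
        done
        [ "$#" -gt 0 ] || { echo "no log files found for $node" >&2; exit 1; }
        tar -cf "$out" "$@"
    )
}
```

A call such as `collect_crs_logs "$CRS_HOME" power740a /tmp/power740a_crs_logs.tar` would then produce a single file to upload; the output path here is just an example.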


3#
Posted on 2012-3-7 13:38:28
2012-03-06 16:35:45.937: [    RACG][1] [25755696][1][ora.power740a.LISTENER_POWER740A.lsnr]:
LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Production on 06-MAR-2012 16:34:28

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=power740a-vip)(PORT=1521)(IP=FIRST)))

2012-03-06 16:35:45.937: [    RACG][1] [25755696][1][ora.power740a.LISTENER_POWER740A.lsnr]: TNS-12535: TNS:operation timed out
TNS-12560: TNS:protocol adapter error
  TNS-00505: Operation timed out
   IBM/AIX RISC System/6000 Error: 78: Connection timed out


At 2012-03-06 16:35 the check of the POWER740A listener found that the connection to power740a-vip had timed out (Connection timed out).


4#
Posted on 2012-3-7 14:57:29
ora.power740a.vip.log

Version 10.2.0.5.0  AIX

ODM finding:
2012-02-28 16:03:52.850: [    RACG][1] [6095138][1][ora.power740b.vip]: Invalid parameters, or failed to bring up VIP (host=power740a)
2012-02-28 16:03:52.850: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
2012-02-28 16:03:52.850: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e _USR_ORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip start power740b
2012-02-28 16:03:52.850: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: rc = 1, time = 9.515s
2012-02-28 16:03:54.054: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
2012-02-28 16:03:54.054: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e _USR_ORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip check power740b
2012-02-28 16:03:54.054: [    RACG][1] [6095138][1][ora.power740b.vip]: clsrcexecut: rc = 1, time = 1.203s
2012-02-28 16:03:54.054: [    RACG][1] [6095138][1][ora.power740b.vip]: end for resource = ora.power740b.vip, action = start, status = 1, time = 10.904s
2012-02-28 16:21:33.644: [    RACG][1] [7798874][1][ora.power740b.vip]: Invalid parameters, or failed to bring up VIP (host=power740a)




[ora.power740b.vip]: Invalid parameters, or failed to bring up VIP (host=power740a)


5#
Posted on 2012-3-7 15:04:22
Please check the following and post the output:

/usr/bin/entstat -d YOUR_PUBLIC_INTERFACE

Replace YOUR_PUBLIC_INTERFACE with the name of the public network interface, for example en0.



ping power740a

cat /etc/hosts


6#
Posted on 2012-3-7 15:15:01
PING power740a: (10.18.1.91): 56 data bytes
64 bytes from 10.18.1.91: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.18.1.91: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 10.18.1.91: icmp_seq=2 ttl=255 time=0 ms
64 bytes from 10.18.1.91: icmp_seq=3 ttl=255 time=0 ms


127.0.0.1       loopback localhost      # loopback (lo0) name/address

10.18.1.91      power740a
10.18.1.92      power740b
192.168.18.1    power740a-priv
192.168.18.2    power740b-priv
10.18.1.4       power740a-vip
10.18.1.40      power740b-vip
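A quick way to confirm that every node has its public, private and VIP names in a hosts file like the one above is sketched here. `hosts_lookup` and `check_rac_hosts` are hypothetical helper names, and the `-priv` / `-vip` suffixes are simply the naming used in this cluster's /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: print the address a hosts-format file maps to a name.
# Usage: hosts_lookup <name> <hosts-file>
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }   # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$2"
}

# Hypothetical helper: for each node, report the address of the public name,
# <node>-priv and <node>-vip, or MISSING when the file has no such entry.
# Usage: check_rac_hosts <hosts-file> <node>...
check_rac_hosts() {
    hosts_file=$1
    shift
    for node in "$@"; do
        for name in "$node" "$node-priv" "$node-vip"; do
            addr=$(hosts_lookup "$name" "$hosts_file")
            echo "$name ${addr:-MISSING}"
        done
    done
}
```

Running `check_rac_hosts /etc/hosts power740a power740b` on each node and comparing the results would show whether both nodes agree on the VIP addresses.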




ETHERNET STATISTICS (ent0) :
Device Type: 4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)

ETHERNET STATISTICS (ent4) :
Device Type: 4-Port 10/100/1000 Base-TX PCI-Express Adapter (14106803)

ETHERNET STATISTICS (en12) :
Device Type: EtherChannel
Hardware Address: e4:1f:13:fc:96:15


7#
Posted on 2012-3-7 15:20:30
en12 Network   10.18.1     10.18.1.91   10.18.1.4  

PING power740a: (10.18.1.91): 56 data bytes
64 bytes from 10.18.1.91: icmp_seq=0 ttl=255 time=0 ms

public interface is en12

en12 Device Type: EtherChannel


ODM Finding:
RAC on AIX: With Virtual Interfaces Racgvip Fails Even Though Public Interface is Up [ID 567286.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.3 and later   [Release: 10.2 and later ]
IBM AIX on POWER Systems (64-bit)
IBM AIX Based Systems (64-bit)

Symptoms

If the Etherchannel and VLAN devices are used for the public network, racgvip script may fail and cause the VIP to offline as a result.

Cause

Currently, the racgvip script issues the following command to check if the public network is up:

$ENTSTAT -d $_IF | $GREP -iEq '.*lan.*state.*:.*operational.*|.*link.*status.*:.*up.*'

However on some of the new AIX network devices, the output from the following command is different, so the above grep check on the entstat output fails:

entstat -d <interface name>

Bug:6608472 addresses this problem.

The new fix is in 10.2.0.3 patch 6851901 (MLR #16).

Although the bug 6608472 does not say the fix is in the patch 6851901, this patch (6851901) has the new racgvip that now issues the following to check the health of the public network:

$ENTSTAT -d $_IF | $GREP -iEq '.*lan.*state.*:.*operational.*|.*link.*status.*:.*up.*|.*port.*operational.*state.*:.*up.*'

The fix for the bug 6608472 is also included in 10.2.0.4

Solution

To resolve this issue, apply patch 6851901.

Again, this patch provides the new racgvip script that can handle the new / current AIX interface types.

The fix for the bug 6608472 is also in 10.2.0.4, so upgrading the CRS to 10.2.0.4 will resolve the problem produced by the bug 6608472.

References

BUG:6608472 - RACGVIP IN RAC FOR AIX FAILS EVEN THOUGH THE PUBLIC INTERFACE IS UP
If the Etherchannel and VLAN devices are used for the public network, racgvip script may fail and cause the VIP to offline as a result.
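The difference between the two checks quoted in the note can be tried out on saved entstat output. The sketch below wraps the two grep patterns exactly as quoted; the function names are hypothetical, and the sample input lines are illustrative assumptions, not output captured from en12 on this cluster.

```shell
#!/bin/sh
# The two patterns quoted in the MOS note: the original racgvip check and the
# patched one, which adds a third alternative for "port operational state".
OLD_PATTERN='.*lan.*state.*:.*operational.*|.*link.*status.*:.*up.*'
NEW_PATTERN='.*lan.*state.*:.*operational.*|.*link.*status.*:.*up.*|.*port.*operational.*state.*:.*up.*'

# Hypothetical wrappers: read entstat-style text on stdin and succeed when the
# pattern finds a line that reports the interface as up.
public_if_up_old() { grep -iEq "$OLD_PATTERN"; }
public_if_up_new() { grep -iEq "$NEW_PATTERN"; }

# Illustrative input lines (assumed for the demonstration only)
printf 'Link Status : Up\n' | public_if_up_old && echo "old pattern: up"
printf 'Port Operational State: Up\n' | public_if_up_old || echo "old pattern: not detected"
printf 'Port Operational State: Up\n' | public_if_up_new && echo "new pattern: up"
```

On the affected node, something like `entstat -d en12 | public_if_up_new` would show whether the patched pattern recognizes the interface as up; if even that fails, the cause is probably not the grep check described in the note.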


8#
Posted on 2012-3-7 15:42:45
Action plan:

Please post the output of the following commands:

srvctl config nodeapps -n  power740a -a -g -s -l  
srvctl config nodeapps -n  power740b -a -g -s -l  

oifcfg iflist


9#
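Posted on 2012-3-7 15:57:05
This may be caused by Bug:6608472, but MOS describes that bug as already fixed in 10.2.0.4, and this cluster runs 10.2.0.5.0.

The racgvip log contains only the message "Invalid parameters, or failed to bring up VIP".

To confirm further, debugging needs to be enabled on the VIP resources: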

crsctl debug log res "ora.power740a.vip:5"
crsctl debug log res "ora.power740b.vip:5"
                 
Then start the VIP resources manually:

crs_start ora.power740a.vip
crs_start ora.power740b.vip
                                 
Wait until the VIP goes OFFLINE, then upload all logs under $ORA_CRS_HOME/log/racg.
Afterwards, restore the debug level:
crsctl debug log res "ora.power740a.vip:0"
crsctl debug log res "ora.power740b.vip:0"
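For clusters with other node names, the same sequence can be generated rather than typed by hand. The sketch below only prints the commands from this post for review and does not execute anything; `print_vip_debug_plan` is a hypothetical helper name, and the `ora.<node>.vip` resource naming is taken from the logs in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) the debug sequence from this post
# for each node name given. Review the output and run it by hand on the cluster.
print_vip_debug_plan() {
    for node in "$@"; do
        echo "crsctl debug log res \"ora.$node.vip:5\""
    done
    for node in "$@"; do
        echo "crs_start ora.$node.vip"
    done
    echo "# wait for the VIP to go OFFLINE, collect the racg logs, then:"
    for node in "$@"; do
        echo "crsctl debug log res \"ora.$node.vip:0\""
    done
}

print_vip_debug_plan power740a power740b
```

Printing first keeps the level-5 debug window short and makes it harder to forget the final step that sets the level back to 0.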
