Oracle Database Data Recovery and Performance Optimization

1#
Posted 2013-12-27 15:03:14 | Views: 5577 | Replies: 8
System information for the problem node:

CPU info:
  4 Intel(R) Itanium 2 9100 series processors (1.6 GHz, 24 MB)
          533 MT/s bus, CPU version A1
          8 logical processors (2 per socket)

Memory: 16316 MB (15.93 GB)

OS info:
   Nodename:  dqpicc07
   Release:   HP-UX B.11.31
   Version:   U (unlimited-user license)

Database version: Oracle 10.2.0.5
Number of RAC nodes: 4


$ crs_stat -t
   (no response)


$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

$ ocrcheck                                                                                                                        
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :    1048300
         Used space (kbytes)      :       8340
         Available space (kbytes) :    1039960
         ID                       :  941242024
         Device/File Name         : /dev/rdisk/disk67
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

$ crsctl query css votedisk
0.     0    /dev/rdisk/disk68

located 1 votedisk(s).


crsd.log keeps reporting the following errors:

2013-12-27 14:27:37.665: [  CRSEVT][953808] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.bxgrid.db! (timeout=600)
2013-12-27 14:27:37.665: [  CRSAPP][953808] CheckResource error for ora.bxgrid.db error code = -2
2013-12-27 14:27:45.395: [  CRSEVT][953809] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:27:45.395: [  CRSEVT][953809] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check) timed out for ora.bxgrid.srv_bxgrid.cs! (timeout=600)
2013-12-27 14:27:45.395: [  CRSAPP][953809] CheckResource error for ora.bxgrid.srv_bxgrid.cs error code = -2
2013-12-27 14:27:53.175: [  CRSEVT][953810] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:27:53.175: [  CRSEVT][953810] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.gsd! (timeout=600)
2013-12-27 14:27:53.175: [  CRSAPP][953810] CheckResource error for ora.dqpicc07.gsd error code = -2
2013-12-27 14:27:59.425: [  CRSEVT][953811] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:27:59.425: [  CRSEVT][953811] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.ons! (timeout=600)
2013-12-27 14:27:59.425: [  CRSAPP][953811] CheckResource error for ora.dqpicc07.ons error code = -2
2013-12-27 14:28:08.575: [  CRSEVT][953812] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:28:08.575: [  CRSEVT][953812] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check) timed out for ora.dqpicc07.ASM3.asm! (timeout=600)
2013-12-27 14:28:08.576: [  CRSAPP][953812] CheckResource error for ora.dqpicc07.ASM3.asm error code = -2
2013-12-27 14:28:55.955: [  CRSEVT][953831] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:28:55.955: [  CRSEVT][953831] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)
2013-12-27 14:28:55.955: [  CRSAPP][953831] CheckResource error for ora.dqpicc07.vip error code = -2
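
With this many repeated entries it can help to tally which resources are hitting the check timeout. The following is only a minimal sketch, assuming the crsd.log line format shown above; the function name `summarize_timeouts` and the regular expression are illustrative and not part of any Oracle tool.

```python
import re
from collections import Counter

# Matches crsd.log lines of the form shown above, for example:
# "... Action Script <path>(check) timed out for ora.dqpicc07.vip! (timeout=60)"
TIMEOUT_RE = re.compile(
    r"Action Script (?P<script>\S+)\(check\) timed out for "
    r"(?P<resource>\S+)! \(timeout=(?P<timeout>\d+)\)"
)


def summarize_timeouts(lines):
    """Count check-action timeouts per (resource, timeout) pair."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[(m.group("resource"), int(m.group("timeout")))] += 1
    return counts


# Sample lines copied from the log excerpt above.
sample = [
    "2013-12-27 14:27:37.665: [  CRSEVT][953808] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.bxgrid.db! (timeout=600)",
    "2013-12-27 14:27:37.665: [  CRSAPP][953808] CheckResource error for ora.bxgrid.db error code = -2",
    "2013-12-27 14:28:55.955: [  CRSEVT][953831] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)",
]

for (resource, timeout), n in sorted(summarize_timeouts(sample).items()):
    print(f"{resource}\ttimeout={timeout}\tcount={n}")
```

To run it against a real log, pass an open file object instead of `sample`; the function only iterates over lines.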

2#
Posted 2013-12-27 15:06:36
The logs related to the RAC fault are attached.

raclog.rar

2.39 MB, downloads: 846

3#
Posted 2013-12-27 15:19:58
2013-12-27 14:07:12.823: [  CRSEVT][953786] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)
2013-12-27 14:07:12.824: [  CRSAPP][953786] CheckResource error for ora.dqpicc07.vip error code = -2
2013-12-27 14:07:15.314: [  CRSEVT][953787] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:07:15.314: [  CRSAPP][953787] CheckResource error for ora.bxgrid.srv_bxgrid.cs error code = -1
2013-12-27 14:07:23.080: [  CRSEVT][953785] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:07:23.081: [  CRSAPP][953785] CheckResource error for ora.dqpicc07.gsd error code = -1
2013-12-27 14:07:29.345: [  CRSEVT][953783] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:07:29.345: [  CRSAPP][953783] CheckResource error for ora.dqpicc07.ons error code = -1
2013-12-27 14:07:38.486: [  CRSEVT][953776] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/bxgrid/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-12-27 14:07:38.486: [  CRSAPP][953776] CheckResource error for ora.dqpicc07.ASM3.asm error code = -1
2013-12-27 14:08:45.904: [  CRSEVT][953790] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child


It looks like there is a problem with the heartbeat VIP.

4#
Posted 2013-12-27 15:28:43
Possibly: hp-ux: Node Crash Due To Large Amount Of Racgimon Threads or CRS_STAT/SRVCTL COMMAND HANG OS bug ( QX:QXCR1000940361 ) (Doc ID 883801.1)

RAC/cluster problems are complex. Even a shallow, basic analysis generally needs the following information:
System status (CPU, memory, I/O, process, and network resource usage)
OS logs
Clusterware logs ($GRID_HOME/log)
ASM instance logs
Database logs
Listener logs

5#
Posted 2013-12-27 15:29:28
dqpicc07[/]#ping dqpicc07-vip
PING dqpicc07-vip: 64 byte packets
64 bytes from 10.65.99.12: icmp_seq=0. time=0. ms
64 bytes from 10.65.99.12: icmp_seq=1. time=0. ms

dqpicc07[/]#ping dqpicc06-vip
PING dqpicc06-vip: 64 byte packets
64 bytes from 10.65.99.11: icmp_seq=0. time=0. ms
64 bytes from 10.65.99.11: icmp_seq=1. time=0. ms

The VIPs are fine!

6#
Posted 2013-12-27 17:45:50
1.
2013-03-17 17:29:08.018: [  CRSEVT][65918] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.gsd! (timeout=600)
2013-03-17 17:29:08.019: [  CRSAPP][65918] CheckResource error for ora.dqpicc07.gsd error code = -2
2013-04-02 07:25:12.566: [  CRSEVT][156736] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

2013-04-02 07:25:12.573: [  CRSEVT][156736] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)
2013-04-02 07:25:12.573: [  CRSAPP][156736] CheckResource error for ora.dqpicc07.vip error code = -2
2013-04-08 13:38:42.215: [  CRSEVT][193196] CAAMonitorHandler :: 0:Could not join /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check)
category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

This error has been occurring since 2013-03-17.
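
One way to check how far back the timeouts go is to record the earliest timestamp per resource. This is a rough sketch only, assuming the timestamp prefix and message format seen in the excerpts in this thread; `first_timeouts` is a made-up helper name.

```python
import re
from datetime import datetime

# Timestamp prefix plus the "timed out for <resource>!" message, as seen in crsd.log.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*"
    r"timed out for (?P<resource>\S+)! \(timeout=\d+\)"
)


def first_timeouts(lines):
    """Return {resource: earliest timestamp} for check-timeout lines."""
    first = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        res = m.group("resource")
        if res not in first or ts < first[res]:
            first[res] = ts
    return first


# Sample lines copied from the excerpts in this thread.
sample = [
    "2013-12-27 14:28:55.955: [  CRSEVT][953831] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)",
    "2013-03-17 17:29:08.018: [  CRSEVT][65918] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.gsd! (timeout=600)",
    "2013-04-02 07:25:12.573: [  CRSEVT][156736] CAAMonitorHandler :: 0:Action Script /u01/oracle/product/10.2.0.5/crs/bin/racgwrap(check) timed out for ora.dqpicc07.vip! (timeout=60)",
]

for res, ts in sorted(first_timeouts(sample).items(), key=lambda kv: kv[1]):
    print(ts.isoformat(sep=" ", timespec="milliseconds"), res)
```

The comparison on `ts` means the input does not have to be in chronological order, which matters if rotated log files are concatenated.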


2. According to the log, the most recent restart was:

Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved
2013-08-12 19:40:55.391: [ default][1] CRS Daemon Starting
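
The last daemon start can be located by scanning for the "CRS Daemon Starting" marker shown above. A small sketch, assuming that marker text and the 23-character timestamp prefix; `last_crs_start` is an illustrative name, not an Oracle utility.

```python
import re

# Timestamp prefix followed by the startup marker seen in crsd.log.
START_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*CRS Daemon Starting"
)


def last_crs_start(lines):
    """Return the timestamp string of the last startup marker, or None."""
    last = None
    for line in lines:
        m = START_RE.match(line)
        if m:
            last = m.group("ts")  # later lines overwrite earlier ones
    return last


# Sample lines copied from the excerpts in this thread.
sample = [
    "Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved",
    "2013-08-12 19:40:55.391: [ default][1] CRS Daemon Starting",
    "2013-12-27 14:27:37.665: [  CRSAPP][953808] CheckResource error for ora.bxgrid.db error code = -2",
]

print(last_crs_start(sample))
```

This relies on the log being in file order, so the value returned is the last marker encountered rather than the maximum timestamp.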

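This problem may be related to OS resources or to the consistency of the CRS software. I suggest restarting CRS first and then continuing to observe.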

7#
Posted 2014-1-2 11:02:47
The fault was resolved after the restart. Thanks!

8#
Posted 2014-1-3 09:21:19
swgsw posted on 2014-1-2 11:02
The fault was resolved after the restart. Thanks!

Was it resolved by restarting CRS?

9#
Posted 2014-1-3 10:07:54
Yes, the fault disappeared after the node was restarted.
