1#
Posted on 2012-7-27 11:38:15 | Views: 10410 | Replies: 6
OS: AIX  6.1 (6100-05-01-1016) 64bit
ORACLE version: 11.2.0.3

1. Running root.sh on the first node produced the following error:
# /grid01/11.2.0/product/grid/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /grid01/11.2.0/product/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /grid01/11.2.0/product/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
User grid has the required capabilities to run CSSD in realtime mode
OLR initialization - successful
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2012-07-27 07:39:44.714
[client(4456640)]CRS-2101:The OLR was formatted using version 3.

        /grid01/11.2.0/product/grid/perl/bin/perl -I/grid01/11.2.0/product/grid/perl/lib -I/grid01/11.2.0/product/grid/crs/install /grid01/11.2.0/product/grid/crs/install/rootcrs.pl execution failed



2. Later, in the cfgtoollogs (<GRID_HOME>/cfgtoollogs/crsconfig/rootcrs_<nodename>.log), there were the following errors:
2012-07-27 07:52:14: 'ohasd' is now registered
2012-07-27 07:52:14: Starting ohasd
2012-07-27 07:52:14: Checking the status of ohasd
2012-07-27 07:52:14: Executing cmd: /grid01/11.2.0/product/grid/bin/crsctl check has
2012-07-27 07:52:15: Checking the status of ohasd
2012-07-27 07:52:20: Executing cmd: /grid01/11.2.0/product/grid/bin/crsctl check has
2012-07-27 07:52:21: Checking the status of ohasd
2012-07-27 07:52:26: Executing cmd: /grid01/11.2.0/product/grid/bin/crsctl check has
2012-07-27 07:52:26: Checking the status of ohasd
2012-07-27 07:52:31: ohasd is not already running.. will start it now
2012-07-27 07:52:31: itab entries=cssd|evmd|crsd|ohasd
2012-07-27 07:52:31: Executing /usr/sbin/init q
2012-07-27 07:52:31: Executing cmd: /usr/sbin/init q
2012-07-27 07:52:36: Created backup /etc/inittab.no_crs
2012-07-27 07:52:36: Appending to /etc/inittab.tmp:
2012-07-27 07:52:36: h1:2:respawn:/etc/init.ohasd run >/dev/null 2>&1 </dev/null

2012-07-27 07:52:36: Done updating /etc/inittab.tmp
2012-07-27 07:52:36: Saved /etc/inittab.crs
2012-07-27 07:52:36: Installed new /etc/inittab
2012-07-27 07:52:36: Executing /usr/sbin/init q
2012-07-27 07:52:36: Executing cmd: /usr/sbin/init q
2012-07-27 07:52:36: Executing cmd: /grid01/11.2.0/product/grid/bin/crsctl start has
2012-07-27 07:54:37: Command output:
>  CRS-4124: Oracle High Availability Services startup failed.
>  CRS-4000: Command Start failed, or completed with errors.
>End Command output
2012-07-27 07:54:37: Executing /etc/ohasd install
2012-07-27 07:54:37: Executing cmd: /etc/ohasd install
2012-07-27 07:54:38: ohasd failed to start
2012-07-27 07:54:38: ohasd failed to start
2012-07-27 07:54:38: Alert log is /grid01/11.2.0/product/grid/log/node1/alertnode1.log
2012-07-27 07:54:38: Failed to start  service 'ohasd'
2012-07-27 07:54:38: Checking the status of ohasd




3. The detailed logs show the following:
/grid01/11.2.0/product/grid/log/node1/alertnode1.log
$ more /grid01/11.2.0/product/grid/log/node1/alertnode1.log
2012-07-27 07:39:44.714
[client(4456640)]CRS-2101:The OLR was formatted using version 3.
2012-07-27 08:02:43.841
[ohasd(4325408)]CRS-0715:Oracle High Availability Service has timed out waiting for init.ohasd to be started.
2012-07-27 10:10:23.750
[client(4194450)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2012-07-27 10:10:23.756
[client(4194450)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /grid01/11.2.0/product/grid/log/node1/client/crsctl_grid.log.



/grid01/11.2.0/product/grid/log/node1/client/crsctl_grid.log
$ more /grid01/11.2.0/product/grid/log/node1/client/crsctl_grid.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2012-07-27 10:10:18.614: [  OCRMSG][1]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2012-07-27 10:10:18.614: [  OCRMSG][1]GIPC error [29] msg [gipcretConnectionRefused]
2012-07-27 10:10:18.614: [  OCRMSG][1]prom_connect: error while waiting for connection complete [24]



My feeling is that the problem lies with the GPnP profile, but I could not find a solved case on MetaLink or via Google.
Could anyone please take a look?
Thank you.
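For what it's worth, a minimal status-check sketch along those lines (assuming the GI home path from this thread and the standard 11.2 tools; it only inspects state and changes nothing):

export GRID_HOME=/grid01/11.2.0/product/grid

# Are init.ohasd, ohasd and gpnpd actually running?
ps -ef | grep -E 'init.ohasd|ohasd.bin|gpnpd.bin' | grep -v grep

# Is the respawn entry that root.sh added still present?
grep init.ohasd /etc/inittab

# Check the local HA stack; if gpnpd is up, dump the local GPnP profile
$GRID_HOME/bin/crsctl check has
$GRID_HOME/bin/gpnptool get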
7#
Posted on 2012-7-30 14:49:26
Thanks to 北柏 for the pointers.

To share, here is how I handled it:
1. After the problem appeared, I worked through Oracle's <Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]>. It turned out that /etc/inittab does invoke the corresponding init.ohasd, but ohasd apparently exited abnormally. For that case the guide says the troubleshooting has to be handed to the SA, and I am not at that level yet.

2. I removed the whole environment with deinstall, wiped the headers of the corresponding ASM disks (see the sketch after this list), and reinstalled; the error was exactly the same as before.

3. I rebuilt the entire stack from scratch (OS, RAC, etc.); that build went through normally.
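A rough sketch of what wiping the ASM disk headers usually looks like on AIX (assuming /dev/rhdisk2 is one of the candidate disks from this thread; this is destructive, so only aim it at disks that belong to the failed install):

# Overwrite the first 100 MB of the candidate disk, which clears the ASM header
dd if=/dev/zero of=/dev/rhdisk2 bs=1048576 count=100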
That left me suspecting the cause was a GI deinstall that had not been cleaned up completely. Right after the new GI install finished, I used find to record the files that changed during installation; I have not had time to analyze the list yet, but I intend to use it to work out which files have to be touched when uninstalling.
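A minimal sketch of that record-what-changed idea (the marker file and output path below are made up for illustration):

touch /tmp/gi_install_start        # create a timestamp marker before starting the installer
# ... run the GI installer and root.sh ...
find / -xdev -type f -newer /tmp/gi_install_start > /tmp/gi_changed_files.txt 2>/dev/null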

The information 北柏 provided is very valuable; since the problem environment is gone I cannot verify it, and the errors do seem slightly different.

BTW, Oracle does not seem to publish a very complete manual uninstall procedure for 11g, so once an environment has gone wrong, deinstall may not manage to clean everything up.


6#
Posted on 2012-7-30 11:37:47

It's a bug!

AIX 6.1 TL7 11gR2 RAC bug:
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2012-07-05 15:46:54.573
[client(8454270)]CRS-2101:The OLR was formatted using version 3.

Those who know will understand what the following line means:
/bin/dd if=/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
Reference: http://www.itpub.net/thread-1593773-1-1.html
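For context, that dd line is the commonly cited workaround for the ohasd startup race: while root.sh is waiting for ohasd, a second root session reads once from the ohasd named pipe so the handshake with init.ohasd can complete. A sketch of how it is usually applied (assuming the pipe path quoted above):

# Run as root in a second session on the same node while root.sh is executing.
# Wait for the named pipe to appear, then read from it once.
while [ ! -e /tmp/.oracle/npohasd ]; do
    sleep 2
done
/bin/dd if=/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1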

[ This post was last edited by 北柏 at 2012-7-30 11:41 ]


5#
Posted on 2012-7-27 13:47:16
"ohasd failed to start": look for material related to ohasd.


4#
Posted on 2012-7-27 13:06:29
Run the following command as the grid user to run the pre-install checks:
$ cluvfy stage -pre crsinst -n <nodelist>

See which checks do not pass and make targeted adjustments.
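For illustration, a run against the two nodes in this thread might look like this (node names node1 and node2 come from the logs above; the -verbose flag and the follow-up component check are optional):

# Run as the grid user from the staged installer or the GI home
cluvfy stage -pre crsinst -n node1,node2 -verbose

# Re-check a single failed area, e.g. shared storage accessibility
cluvfy comp ssa -n node1,node2 -verbose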


3#
Posted on 2012-7-27 12:32:49
Those attributes are fine: the disks are already set to no_reserve on both nodes, and the PVIDs have been cleared.
$ id grid
uid=2000(grid) gid=1601(oinstall) groups=1602(asmadmin),1603(dba),1604(asmdba),1605(asmoper)
$ id oracle
uid=2001(oracle) gid=1601(oinstall) groups=1603(dba),1604(asmdba)

$ hostname
node1
$ lsattr -El hdisk2 | grep reserve
reserve_policy  no_reserve                     Reserve Policy           True
$ ls -l /dev/rhdisk2
crw-rw----    1 grid     asmadmin     25,  0 Jul 27 06:23 /dev/rhdisk2

$ hostname
node2
$ lsattr -El hdisk2 | grep reserve
reserve_policy    no_reserve                    Reserve Policy           True
$ ls -l /dev/rhdisk2
crw-rw----    1 grid     asmadmin     25,  0 Jul 26 21:31 /dev/rhdisk2
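For reference, a minimal sketch of how these attributes are typically put in place on AIX (assuming hdisk2 is the shared ASM disk shown above; run as root on each node):

# Disable SCSI reservation so both RAC nodes can open the disk concurrently
chdev -l hdisk2 -a reserve_policy=no_reserve
# Clear the PVID so the disk is not treated as an LVM physical volume
chdev -l hdisk2 -a pv=clear
# Hand the character device to the grid owner, as in the listing above
chown grid:asmadmin /dev/rhdisk2
chmod 660 /dev/rhdisk2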
Thank you.

[ This post was last edited by miloluo at 2012-7-27 12:35 ]


2#
Posted on 2012-7-27 11:51:48
Check the disk attributes:
lsattr -E -l xxx | grep reserve_
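A quick way to sweep every disk at once (a sketch; the device list comes from lsdev, nothing here is specific to this thread):

# Print the reserve settings of all hdisks so a misconfigured shared disk stands out
for d in $(lsdev -Cc disk -F name); do
    echo "== $d"
    lsattr -El $d | grep reserve_
done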


