Oracle Database Data Recovery and Performance Tuning

1#
Anonymous  posted 2012-2-23 14:29:25 | Views: 7521 | Replies: 3
Database version: Oracle 10.2.0.4 RAC on AIX 6.1


Current status of the CRS resources
[test1:/oracle10/app/product/crs/10.2.0/log/test1/racg]crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....m1.inst application    ONLINE    ONLINE    test1
ora....m2.inst application    ONLINE    ONLINE    test2
ora....icom.db application    ONLINE    ONLINE    test2
ora....om1.srv application    ONLINE    ONLINE    test2
ora.....dtp.cs application    ONLINE    ONLINE    test1
ora....B1.lsnr application    ONLINE    ONLINE    test1
ora....db1.gsd application    ONLINE    OFFLINE
ora....db1.ons application    ONLINE    OFFLINE
ora....db1.vip application    ONLINE    ONLINE    test1
ora....B2.lsnr application    ONLINE    ONLINE    test2
ora....db2.gsd application    ONLINE    OFFLINE
ora....db2.ons application    ONLINE    ONLINE    test2
ora....db2.vip application    ONLINE    ONLINE    test2

Status on node 1
1.
ONS processes
[test1:/oracle10/app/product/crs/10.2.0/log/test1/racg]ps -ef|grep "ons -d"|grep -v grep
oracle10  5701830  5112098   0   Apr 21      -  7:18 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d
oracle10 20119606        1   0 09:02:27      -  0:00 /oracle10/app/product/db/10.2.0/opmn/bin/ons -d
oracle10  5112098        1   0   Apr 21      -  0:00 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d
oracle10 28836336 20119606   0 09:02:27      -  0:00 /oracle10/app/product/db/10.2.0/opmn/bin/ons -d
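
The listing above shows ons binaries from two different paths (one under crs, one under db). To make that easier to see at a glance, the ps output can be grouped by binary path. This is only a minimal sketch for a POSIX shell with awk; the function name count_ons_by_home is made up for illustration and is not an Oracle tool.

```shell
# Hypothetical helper: read "ps -ef" lines on stdin and count ons
# processes per binary path (i.e. per Oracle home).
count_ons_by_home() {
  awk '{
    # Look for the field that is the ons binary itself.
    for (i = 1; i <= NF; i++)
      if ($i ~ /\/opmn\/bin\/ons$/) count[$i]++
  }
  END { for (p in count) print count[p], p }' | sort -k2
}

# Usage (not run here):
#   ps -ef | grep "ons -d" | grep -v grep | count_ons_by_home
```

Each output line is a count followed by the path of the ons binary, so one line per home that has an ons running.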

2.
Port configuration
[test1:/oracle10/app/product/crs/10.2.0/opmn/conf]more ons.config
localport=6113
remoteport=6200
loglevel=3
useocr=on

3.
Ports currently in use on the system
[test1:/oracle10/app/product/crs/10.2.0/opmn/conf]netstat -Aan|grep 6200|grep -v grep
f1000e00014223b0 tcp4      0      0  134.100.9.12.1521 10.235.5.21.46200  ESTABLISHED
f1000e0000307bb0 tcp4      0      0 *.6200            *.*               LISTEN
f1000e0023216200 udp4      0      0  192.168.16.11.5187 *.*
f1000e00308d6200 udp4      0      0  192.168.16.11.5737 *.*
f1000e00081f6200 udp4      0      0  192.168.16.11.5738 *.*
f1000e00222e6200 udp4      0      0  192.168.16.11.3415 *.*
f1000e0022266200 udp4      0      0  192.168.16.11.6002 *.*
f1000e00231d6200 udp4      0      0  192.168.16.11.6185 *.*
f1000e00360b6200 udp4      0      0  192.168.16.11.3867 *.*
f1000e001acb6200 udp4      0      0  192.168.16.11.3910 *.*
f1000e0008216200 udp4      0      0  192.168.16.11.3963 *.*
f1000e0017436200
f1000e0009854008 dgram      0     0               0 f1000e000004c180               0 f1000e0017436200
[test1:/oracle10/app/product/crs/10.2.0/opmn/conf]rmsock f1000e0000307bb0 tcpcb
The socket 0x307808 is being held by proccess 5701830 (ons).
[test1:/oracle10/app/product/crs/10.2.0/opmn/conf]ps -ef|grep 5701830|grep -v grep
oracle10  5701830  5112098   0   Apr 21      -  7:18 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d

4.
onsctl ping
[test1:/oracle10/app/product/crs/10.2.0/opmn/conf]onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = test1, port = 6200}
Adding remote host test1:6200
onscfg[1]
   {node = test2, port = 6200}
Adding remote host test2:6200
/oracle10/app/product/db/10.2.0/opmn/logs/ons.log.test1: Permission denied
RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

RCV: Permission denied
Communication error with the OPMN server local port.
Check the OPMN log files

ons is not running ...
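
The Permission denied line above names ons.log.test1 under the db home, so it may be worth checking which onsctl is being picked up and whether the current user can write to that log. A rough sketch for a POSIX shell follows; check_log_access is a hypothetical helper written for this thread, not an Oracle utility, and it only reports file permissions, nothing about ONS itself.

```shell
# Hypothetical helper: report whether the current user can write to a
# log file, or create it if it does not exist yet.
check_log_access() {
  f=$1
  if [ -e "$f" ]; then
    if [ -w "$f" ]; then
      echo "writable: $f"
    else
      echo "not writable: $f"
    fi
  elif [ -w "$(dirname "$f")" ]; then
    echo "missing, directory writable: $f"
  else
    echo "missing, directory not writable: $f"
  fi
}

# Usage (not run here; the log path is the one from the error message):
#   id                    # which user and groups am I running as?
#   command -v onsctl     # which home does onsctl resolve to?
#   check_log_access /oracle10/app/product/db/10.2.0/opmn/logs/ons.log.test1
```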
4#
Posted 2012-2-23 14:57:25
If you are interested, you can trace ONS for a deeper analysis. Follow this note:
How to Log or Trace ONS in RAC 11g Release 2 [ID 1270841.1]

Applies to:
JDBC - Version: 10.2.0.1 to 11.2.0.2.0 - Release: 10.2 to 11.2
Information in this document applies to any platform.
Goal
How to switch on ONS logging in RAC version 11.2.0.1 ?
The loglevel parameter, used in earlier versions of RAC, is deprecated in 11.2.0.1.
Solution
ONS version 11.2.0.1 has introduced a new method of logging or tracing ONS activities.

On each node, edit ons.config.<node name> and ons.config  files under /opmn/conf and add the following parameter:

debugcomp=ons[subcomponent]


Valid subcomponents for ons are:

      all - all messages
      local - ONS local information
      listener - ONS listener information
      discover - ONS discover (server or multicast) information
      servers - ONS remote servers currently up and connected to the cluster
      topology - ONS current cluster wide server connection topology
      server - ONS remote server connection information
      client - ONS client connection information
      connect - ONS generic connection information
      subscribe - ONS client subscription information
      message - ONS notification receiving and processing information
      deliver - ONS notification delivery information
      special - ONS special notification processing
      internal - ONS internal resource information
      secure - ONS SSL operation information
      workers - ONS worker threads


For example,  debugcomp=ons[all,!secure]  logs all information except ONS SSL operations.

All the log information is written to the file ons.dbg.<node name> under /opmn/logs.


Note: ONS has to be restarted after editing ons.config.<node name> file :
/bin/onsctl stop/start
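
To illustrate the debugcomp syntax from the note, here is a small sketch that assembles the parameter line from a list of subcomponents. The helper name debugcomp_line is made up; it only prints text and does not touch any ons.config file. Per the note, this parameter belongs to ONS 11.2.0.1 and later, so treat it as an illustration of the format rather than something to apply to a 10.2 cluster.

```shell
# Hypothetical helper: build a debugcomp line from subcomponent names.
# A leading "!" excludes a subcomponent, as in the note's example.
debugcomp_line() {
  list=$(printf '%s,' "$@")                 # "all,!secure,"
  printf 'debugcomp=ons[%s]\n' "${list%,}"  # drop the trailing comma
}

debugcomp_line all '!secure'   # prints: debugcomp=ons[all,!secure]
```

The printed line is what would be added to the ons.config files named in the note, followed by an ONS restart.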

3#
Posted 2012-2-23 14:54:38
ODM data:
Hdr: 6429778 10.2.0.3 ONS 10.2.0.3 PRODID-1032 PORTID-212
Abstract: ONS KEEPS GROWING AND CONSUME HIGH MEMORY

*** 09/16/07 10:28 pm ***
TAR:
----

PROBLEM:
--------
ONS keeps growing and consume high memory. ONS needs to be recycled after
every hour.

DIAGNOSTIC ANALYSIS:
--------------------
s24_ons.log
===========
07/09/14 10:49:36 [2] Passive connection 0,159.181.33.227,6203 invalid
connect server IP format
2679448025,6203,6203,0

ONSinfo: !!2679448025!0!6203!0!6203

hostName: sa4dj025

clusterId: databaseClusterId

clusterName: databaseClusterName

instanceId: databaseInstanceId

instanceName: databaseInstanceName


s25_ons.log
===========
07/09/14 10:49:21 [2] Passive connection 0,159.181.33.217,6203 invalid
connect server IP format
2679448035,6203,6203,0

ONSinfo: !!2679448035!0!6203!0!6203

hostName: sa4dj024

clusterId: databaseClusterId

clusterName: databaseClusterName

instanceId: databaseInstanceId

instanceName: databaseInstanceName

WORKAROUND:
-----------
Recycle ONS after some time.

RELATED BUGS:
-------------

REPRODUCIBILITY:
----------------
continuously occurring in CT's environment


Hdr: 6883669 10.2.0.3 ONS 10.2.0.3 PRODID-1032 PORTID-212
Abstract: IN TWO NODE RAC, ONS.LOG REPORTS "INVALID CONNECT SERVER IP FORMAT" ERROR

*** 03/11/08 06:13 pm ***
TAR:
----

PROBLEM:
--------
The ons.log grows continuously because of an error message that gets output
regularly.

ons.log has the following entries repeatedly:

08/03/05 17:36:43 [2] Passive connection 0,172.16.1.101,6200 invalid connect
server IP format
2886730086,6201,6114,0 ONSinfo: !!2886730086!0!6201!0!6114 hostName: orarac2
clusterId: databaseClusterId clusterName: databaseClusterName instanceId:
databaseInstanceId instanceName: databaseInstanceName
08/03/05 17:36:56 [5] Listener thread 1286: 172.16.1.101:6200 (0x101)
listening

DIAGNOSTIC ANALYSIS:
--------------------

ons.config shows

localport=6113
remoteport=6200
loglevel=5
useocr=on

This is same problem reported in the bug 6429778.

The problem happens on both of the customer's production and development
cluster.
Judging from the note, this is a bug, and one that tends to show up on AIX.

2#
Anonymous  posted 2012-2-23 14:29:41
5. ons.log
[zwq_kfdb1:/oracle10/app/product/crs/10.2.0/opmn/logs]tail -20 ons.log
instanceName: databaseInstanceName

12/02/23 13:20:17 [2] Passive connection 0,134.100.9.11,6200 invalid connect server IP format
2254702859,6201,6114,0
ONSinfo: !!2254702859!0!6201!0!6114
hostName: zwq_kfdb1
clusterId: databaseClusterId
clusterName: databaseClusterName
instanceId: databaseInstanceId
instanceName: databaseInstanceName

12/02/23 13:21:24 [2] Passive connection 0,134.100.9.11,6200 invalid connect server IP format
2254702861,6200,6113,0
ONSinfo: !!2254702861!0!6200!0!6113
hostName: zwq_kfdb2
clusterId: databaseClusterId
clusterName: databaseClusterName
instanceId: databaseInstanceId
instanceName: databaseInstanceName
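
The large number in these entries (2254702859, 2254702861) looks like an IPv4 address packed into a 32-bit integer; that reading is an inference from the arithmetic, not something the log states. Decoding it gives addresses on the same subnet as the ones in the Passive connection lines. A minimal sketch for a POSIX shell with 64-bit arithmetic; ons_int_to_ip is a hypothetical name:

```shell
# Hypothetical helper: convert a 32-bit integer to dotted-quad IPv4.
ons_int_to_ip() {
  n=$1
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

ons_int_to_ip 2254702859   # prints: 134.100.9.11
ons_int_to_ip 2254702861   # prints: 134.100.9.13
```

Read this way, each entry records a connection arriving on port 6200 from an ONS that announces itself with the decoded address and the port pair that follows it (6201/6114 or 6200/6113).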

6. The ora.zwq_kfdb1.ons.log file
[zwq_kfdb1:/oracle10/app/product/crs/10.2.0/log/zwq_kfdb1/racg]ls -l
total 200
-rw-r--r--    1 oracle10 oinstall          0 Apr 30 2011  evtf.log
-rw-r--r--    1 oracle10 oinstall       4169 Apr 29 2011  ora.ahsheet.db.log
-rwxrwxr-x    1 oracle10 oinstall        564 Jan 16 2011  ora.ahunicom.db.log
-rwxrwxr-x    1 oracle10 dba           40887 Apr 29 2011  ora.zwq_kfdb1.gsd.log
-rwxrwxr-x    1 oracle10 dba           20480 Apr 30 2011  ora.zwq_kfdb1.ons.log
-rwxrwxr-x    1 root     system        16384 Apr 30 2011  ora.zwq_kfdb1.vip.log
-rwxrwxr-x    1 root     system         9966 Jan 16 2011  ora.zwq_kfdb2.vip.log
drwxrwxr-x    2 oracle10 dba             256 Jan 15 2011  racgeut
drwxrwxr-x    2 oracle10 dba             256 Jan 15 2011  racgevtf
drwxrwxr-x    2 oracle10 dba             256 Jan 15 2011  racgmain
[zwq_kfdb1:/oracle10/app/product/crs/10.2.0/log/zwq_kfdb1/racg]tail -20 ora.zwq_kfdb1.ons.log

2011-04-21 04:00:56.731: [    RACG][1] [5243026][1][ora.zwq_kfdb1.ons]: Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = zwq_kfdb1, port = 6200}
Adding remote host zwq_kfdb1:6200
onscfg[1]
   {node = zwq_kfdb2, port = 6200}
Adding remote host zwq_kfdb2:6200

2011-04-21 04:00:56.735: [    RACG][1] [5243026][1][ora.zwq_kfdb1.ons]: Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = zwq_kfdb1, port = 6200}
Adding remote host zwq_kfdb1:6200
onscfg[1]
   {node = zwq_kfdb2, port = 6200}
Adding remote host zwq_kfdb2:6200
onsctl: ons started

2011-04-30 18:23:57.514: [ CSSCLNT][1]clsssInitNative: connect failed, rc 2
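This morning I had the customer run onsctl start and onsctl ping, but neither left any record in this log. I cannot run them myself because I lack permission; I get the Permission denied message shown above.


Status on node 2
Process status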
[zwq_kfdb2:/proc]ps -ef|grep "ons -d"
oracle10  7667896  3342636   0   Apr 29      -  6:55 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d
oracle10  3342636        1   0   Apr 29      -  0:00 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d

The ports are the same as on node 1: port 6200 is also held by ons.
The port configuration is also identical to node 1.

onsctl ping
[zwq_kfdb2:/proc]onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = zwq_kfdb1, port = 6200}
Adding remote host zwq_kfdb1:6200
onscfg[1]
   {node = zwq_kfdb2, port = 6200}
Adding remote host zwq_kfdb2:6200
/oracle10/app/product/db/10.2.0/opmn/logs/ons.log.zwq_kfdb2: Permission denied
ons is not running ...

ora.zwq_kfdb2.ons.log
[zwq_kfdb2:/oracle10/app/product/crs/10.2.0/log/zwq_kfdb2/racg]tail -20 ora.zwq_kfdb2.ons.log
Adding remote host zwq_kfdb2:6200
onsctl: ons started

2011-04-29 12:06:45.581: [    RACG][1] [7405762][1][ora.zwq_kfdb2.ons]: Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = zwq_kfdb1, port = 6200}
Adding remote host zwq_kfdb1:6200
onscfg[1]
   {node = zwq_kfdb2, port = 6200}
Adding remote host zwq_kfdb2:6200

2011-04-29 12:06:45.586: [    RACG][1] [7405762][1][ora.zwq_kfdb2.ons]: Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
   {node = zwq_kfdb1, port = 6200}
Adding remote host zwq_kfdb1:6200
onscfg[1]
   {node = zwq_kfdb2, port = 6200}
Adding remote host zwq_kfdb2:6200
onsctl: ons started

[zwq_kfdb2:/oracle10/app/product/crs/10.2.0/opmn/logs]tail -20 ons.log
instanceName: databaseInstanceName

12/02/23 13:29:45 [2] Passive connection 0,134.100.9.13,6200 invalid connect server IP format
2254702859,6200,6113,0
ONSinfo: !!2254702859!0!6200!0!6113
hostName: zwq_kfdb1
clusterId: databaseClusterId
clusterName: databaseClusterName
instanceId: databaseInstanceId
instanceName: databaseInstanceName

12/02/23 13:30:12 [2] Passive connection 0,134.100.9.13,6200 invalid connect server IP format
2254702859,6201,6114,0
ONSinfo: !!2254702859!0!6201!0!6114
hostName: zwq_kfdb1
clusterId: databaseClusterId
clusterName: databaseClusterName
instanceId: databaseInstanceId
instanceName: databaseInstanceName

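I know the ONS resource is not very important, but I would like to understand why I cannot bring it online. ML, could you please help diagnose this?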
