Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖

87

积分

0

好友

3

主题
1#
发表于 2012-3-22 15:42:30 | 查看: 10067| 回复: 7
PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                 
20298 oracle    25   0 1689m 145m 137m R 99.8  0.6   1410:49 oracle

这个进程已经跑了很久了,所以对此做了 oradebug ,trace文件太大了,所以只截了一段
SQL>oradebug setospid 20298
SQL>oradebug dump processstate 10


  1. /u01/app/oracle/admin/ecpdisk/bdump/orcl11_m000_20298.trc
  2. Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
  3. With the Partitioning, Real Application Clusters, OLAP, Data Mining
  4. and Real Application Testing options
  5. ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
  6. System name: Linux
  7. Node name: disk-db1
  8. Release: 2.6.18-194.el5
  9. Version: #1 SMP Tue Mar 16 21:52:39 EDT 2010
  10. Machine: x86_64
  11. Instance name: orcl11
  12. Redo thread mounted by this instance: 1
  13. Oracle process number: 194
  14. Unix process pid: 20298, image: oracle@disk-db1 (m000)
  15. *** ACTION NAME:(Auto-Flush Slave Action) 2012-03-21 16:00:55.474
  16. *** MODULE NAME:(MMON_SLAVE) 2012-03-21 16:00:55.473
  17. *** SERVICE NAME:(SYS$BACKGROUND) 2012-03-21 16:00:55.473
  18. *** SESSION ID:(1063.50673) 2012-03-21 16:00:55.473
  19. WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0
  20. WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65504
  21. WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0
  22. ksfdgo()+1488<-ksfdaio1()+9848<-kfkUfsIO()+594<-kfkDoIO()+631<-kfkIOPriv()+616<-kfdIOPriv()+95<-kfioSubmitIO()+503<-kfioRequestPriv()+166<-kfioRequest()+689<-ksfd_osmgo()+1286<-ksfdgo()+1488<-ksfdaio1()+9848<-kcflbi()+498<-kcbldio()+1897<-kcblio()+944<-kcblnb()+790
  23. <-kdblnb()+109<-kdblgb()+85<-kdblailb()+1093<-kdblai()+6407<-klclil1r()+910<-qerltRop()+928<-qerflRop()+95<-qerfxFetch()+978<-qerjotFetch()+209<-qerflFetch()+147<-rwsfcd()+88<-qerltFetch()+1008<-insdlexe()+372<-insExecStmtExecIniEngine()+85<-insexe()+867<-opiexe()+9334
  24. <-kpoal8()+2295<-opiodr()+1184<-kpoodrc()+38<-rpiswu2()+409ssd_unwind_bp: unhandled instruction at 0x150d2cf instr=6a
  25. ssd_unwind_bp: unhandled instruction at 0x150ba23 instr=68
  26. <-kpoodr()+554<-upirtrc()+2101<-kpurcsc()+125<-kpuexecv8()+1705<-kpuexec()+2643<-OCIStmtExecute()+41<-kewrose_oci_stmt_exec()+62<-kewrgwxf1_gwrsql_exft_1()+284<-kewrgwxf_gwrsql_exft()+451<-kewrews_execute_wr_sql()+52<-kewa_flusher()+2592<-kewa_flush_cb()+21
  27. <-kewrft_flush_table()+319<-kewrftec_flush_table_ehdlcx()+805<-kewrfat_flush_all_tables()+537<-kewrfos_flush_onesnap()+170<-kewrfsc_flush_snapshot_c()+650<-kewrafs_auto_flush_slave()+827<-kebm_slave_main()+221<-ksvrdp()+1159<-opirip()+748<-opidrv()+583<-sou2o()+114
  28. <-opimai_real()+317<-main()+116<-__libc_start_main()+244<-_start()+41
  29. *** 2012-03-21 19:56:27.678
  30. Received ORADEBUG command 'unlimit' from process Unix process pid: 30254, *** 2012-03-21 19:56:55.123
  31. Received ORADEBUG command 'dump systemstate 10' from process Unix process pid: 30254, image:
复制代码


有人说是Bug 9949948http://blog.chinaunix.net/uid-7628732-id-2980799.html

请帮忙看下哦

未命名.jpg (161.92 KB, 下载次数: 418)

未命名.jpg

2#
发表于 2012-3-22 16:06:11
WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65504
WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0


AIO-MAX-NR=65536 << 推荐值


# Oracle-Validated setting for fs.aio-max-nr is 3145728
fs.aio-max-nr = 3145728

回复 只看该作者 道具 举报

3#
发表于 2012-3-22 16:07:02
ODM FINDING:

Bug 9949948 - Linux: Process spin under ksfdrwat0 if OS Async IO not configured high enough

Bug 9949948  Linux: Process spin under ksfdrwat0 if OS Async IO not configured high enough
This note gives a brief overview of bug 9949948.
The content was last updated on: 28-OCT-2011
Click here for details of each of the sections below.
Affects:

    Product (Component)        Oracle Server (Rdbms)
    Range of versions believed to be affected         Versions >= 10.2.0.4 but BELOW 11.1
    Versions confirmed as being affected        

        10.2.0.5

    Platforms affected       

        Linux X86-64bit
        Linux 32bit


     It is believed to be a regression in default behaviour thus:
       Regression introduced in 10.2.0.5

Fixed:

    This issue is fixed in       

        11.1.0.6 (Base Release)
        10.2.0.5.2 Patch Set Update
        10.2.0.5 Patch 5 on Windows Platforms

Symptoms:
       
Related To:

    Hang (Process Spins)
    Waits for "i/o slave wait"

       

    DISK_ASYNCH_IO

Description

    This problem is introduced in 10.2.0.5
     
    It only affects platforms where Oracle has to reserve async IO slots,
    such as Linux platforms.
     
    If the OS async IO layer is underconfigured and an Oracle process
    cannot get sufficient AIO slots then rather than reverting to
    using non AIO call the process may go into an infinite spin
    under ksfdrwat0.
     
    Rediscovery notes:
      The spin will be preceded by messages in the trace
      file of the form:
        WARNING:io_submit failed due to kernel limitations MAXAIO
                    for process=0 pending aio=0
        WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65518
        WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0
     
      Notice specifically that the value for aiolimit is reported as "0"
      for this bug.
     
      The process then spins in ksfdrwat0 typically with a stack showing
             skgfqio ()
             ksfdgo ()
             ksfdwtio ()
             ksfdwat1 ()
             ksfdrwat0 ()   <<< Spin point
             ksfdblock ()
             kcflwi ()
             kcflci ()
             kcblci ()
             kcblcio ()
             kcblgt ()
             kcbldrget ()
     
      It will show repeated waits for "i/o slave wait", which can be
      misleading as that is normally considered an idle wait event.
     
    Workaround
      Raise the OS AIO limits such that the number of concurrent slot
       requirements never exceeds the OS limit.
       ie: Increase AIO-MAX-NR
     OR
      Disable async IO (Set DISK_ASYNCH_IO=FALSE)

回复 只看该作者 道具 举报

4#
发表于 2012-3-22 16:16:15
Advise:

1. 就STACK CALL 看和Bug 9949948差别较大。

更像另一个BUG NOTE:

Bug 8249407: WARNING:IO_SUBMIT FAILED DUE TO KERNEL LIMITATIONS      

Hdr: 8249407 10.2.0.4 RDBMS 10.2.0.4 ASM PRODID-5 PORTID-226 8412426
Abstract: WARNING:IO_SUBMIT FAILED DUE TO KERNEL LIMITATIONS

PROBLEM:
--------
Trace files with the following messages are getting generated in the
asm/udump directory for all 3 nodes.

WARNING:io_submit failed due to kernel limitations MAXAIO for process=128
pending aio=118

WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=3145728 AIO-NR=74963

WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=128


ksfdgo()+1791<-ksfdaio1()+4593<-kfkUfsIO()+587<-kfkDoIO()+631<-kfkIOPriv()+608
<-kfdIOPriv()+95<-kfdParallelIO()+1776<-kfdDiscoverDeep()+145<-kfgDiscoverDeep
()+310<-kfgDiscoverGroup()+616<-kfgTableCb()+2488<-kfgGrpTableCbInternal()+280
<-kfgGrpTableCb()+56<-qerfxFetch()+2944
<-opifch2()+3189<-kpoal8()+2994<-opiodr()+984<-ttcpip()+1012<-opitsk()+1322<-o
piino()+1026<-opiodr()+984<-opidrv()+547<-sou2o()+114<-opimai_real()+163<-main
()+116<-__libc_start_main()+219<-_start()+42
*** ERROR: PGA memory leak detected 79792 > 57768 ********
******************************************************

DIAGNOSTIC ANALYSIS:
--------------------
Issue looked similar to BUG 6908655,Bug 7439681 and Bug 7523755.
CT applied Patch for BUG 7523755.
But still the issue persists.

WORKAROUND:
-----------

RELATED BUGS:
-------------
BUG 6908655,Bug 7439681 and Bug 7523755

REPRODUCIBILITY:
----------------

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------
Will upload ASM alert.log file of all the instances.
Trace file with the messages.
RDA




2. 建议你首先尝试修改  fs.aio-max-nr 内核参数 ,观察是否能绕过该问题。

回复 只看该作者 道具 举报

5#
发表于 2012-3-23 20:26:39
1、忘记说环境了
RHEL5.5 X64 RAC 10.2.0.5.0


2、尝试修改aio-max
[root@disk-db1 ~]# sysctl -a|grep aio-max
fs.aio-max-nr = 3145728


select * from dba_hist_snapshot where 1=1 order by begin_interval_time desc

无标题.png (9.72 KB, 下载次数: 415)

无标题.png

回复 只看该作者 道具 举报

6#
发表于 2012-3-23 20:27:09
但是没有见到效果:

$tail -f orcl11_mmon_19970.trc *** 2012-03-23 20:15:00.821
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2012-03-23 20:16:00.874
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2012-03-23 20:17:00.899
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2012-03-23 20:18:00.959
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout

回复 只看该作者 道具 举报

7#
发表于 2012-3-23 21:23:25
需求更多诊断信息, 请上传AWR  、ALERT.LOG和DIAG、CKPT、LGWR、DBWR进程的日志。

回复 只看该作者 道具 举报

8#
发表于 2012-3-27 09:37:55
AWR做不了,21号开始就没有快照了

无标题.png (14.79 KB, 下载次数: 412)

无标题.png

racg.tar.gz

9.99 KB, 下载次数: 1200

alert_orcl11.rar

62.68 KB, 下载次数: 1184

回复 只看该作者 道具 举报

您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-12-24 03:15 , Processed in 0.061932 second(s), 24 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569