Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖

2135

积分

502

好友

184

主题
1#
发表于 2013-11-10 23:53:11 | 查看: 3354| 回复: 0
1.执行自动统计信息收集的自动作业进程J001报ORA-04020: deadlock detected while trying to lock object orcl.TAB1错误

2.  陆续出现7次ORA-07445 [_ptrgl] [SIGSEGV] [Address not mapped to object]错误,触发该ORA-07445错误的均是CTXSYS全文索引相关的前台进程

3. 出现多次 PMON failed to acquire latch, see PMON dump错误

4. 因 ORA-00494: enqueue [CF]  CF队列锁超时导致ARC0归档进程终止实例


发现主机SWAP交换空间使用率较高,登陆系统异常缓慢,且有一远程登陆的服务进程占用大量内存。



Name            PID CPU% PgSp Class             ServerV3       0   "h" for help
rpc.stat     192714  5.6  2.8 Default            lientV3       0   "q" to quit
oracle      1135074  3.14585.7 Default
srcmstr      151992  2.2  0.7 System
diagd        209114  1.4  0.5 System
portmap      143792  0.9  0.9 System
shlap64      131476  0.7  0.4 System
rpc.lock     221264  0.2  0.3 System
dtlogin      225310  0.2  0.3 System
tnslsnr      381304  0.2 22.7 Default
lrud          16392  0.1  0.6 System
cimssys      160240  0.1  0.3 System
topas       1257636  0.1  8.5 Default
oracle      1233376  0.1 20.7 Default
Signal 2 received

$ ps -ef|grep 1135074
   kfxta 1257640 1392968   0 22:26:43  pts/1  0:00 grep 1135074
  oracle 1135074       1   0   Oct 15      - 145:08 oracleCCISKB (LOCAL=NO)

SQL> select sid,sql_id,event from v$session where paddr=(select addr from v$process where spid=1135074);

       SID SQL_ID        EVENT                                                           
---------- ------------- ----------------------------------------------------------------
      1562 fdz4bhkngd54x latch: row cache objects

SQL> select sql_text from v$sql where sql_id='fdz4bhkngd54x';

SQL_TEXT
--------------------------------------------------------------------------------
select KNID,question,ANSWER,operid,FUNCTYPE,modefydate,areacode,BUSITYPESTR,VERI
FYSTATE,dealdesc,clickdegree,CHANNEL              from (select a.KNID,b.question
,a.ANSWER,a.operid,a.FUNCTYPE, to_char(a.LOGTIME,'YYYYMMDD HH24:MI:SS') MODEFYDA
TE,a.areacode,a.BUSITYPESTR,a.VERIFYSTATE,s.typeiddescription dealdesc,a.clickde
gree,              row_number() over(order by a.clickdegree desc,a.LOGTIME desc)
rn,a.CHANNEL from VA_KNOWLEDGE a, TAB1 b, s_mapping_detail s where b.KNI
D=a.KNID and a.VERIFYSTATE=s.typeidcode(+) AND s.typecode(+)='VAVRY' and b.VERIF
YSTATE= 1  and contains(b.question,'ê??ú and ?ü?? and ????')>0 and a.FUNCTYPE =
0  ) tmp where rn>(:in_sPage-1)*:in_sPageSize AND rn<=:in_sPage*:in_sPageSize





2.        本次服务详细过程

        问题分析


ORA-04020错误是DDL和解析锁死锁(DDL DEAD LOCK)错误,与常见的ORA-00060死锁错误不同。Oracle DDL锁是为了串行化数据库对象源数据的使用和修改而存在,例如修改一张表的定义ALERT TABLE、修改某个存储过程ALTER PROCEDURE都需要持有该对象上的DDL锁。

"orcl"."IDX_TAB1_QUESTION"对象是基于"orcl"."TAB1"表的全文索引(TRANSACTIONAL TEXT INDEX),这2个对象的DDL如下:


CREATE TABLE "orcl"."TAB1"
   (    "QUESID" VARCHAR2(36) NOT NULL ENABLE,
        "QUESTION" VARCHAR2(500) NOT NULL ENABLE,
        "KNID" VARCHAR2(36) NOT NULL ENABLE,
        "ISMAIN" NUMBER(*,0) DEFAULT 1,
        "VERIFYSTATE" NUMBER(*,0),
        "INDEXSTATE" NUMBER(*,0) NOT NULL ENABLE,
        "CREATEDATE" DATE,
        "LASTMODIFYDATE" DATE,
        "LOGTIME" DATE DEFAULT sysdate,
        "SENDFLAG" NUMBER(*,0) NOT NULL ENABLE,
        "FUNCTYPE" NUMBER(*,0),
        "AUTONKID" VARCHAR2(36),
        "TEMPLATEFLAG" VARCHAR2(1),
        "CHANNEL" VARCHAR2(50),
        "RELTEMPATE" VARCHAR2(300),
        "BUSITYPESTR" VARCHAR2(1000),
         CONSTRAINT "PK_TAB1" PRIMARY KEY ("QUESID", "KNID")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SYSTEM"  ENABLE
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "CCLOG2"

  
CREATE INDEX "orcl"."IDX_TAB1_QUESTION" ON "orcl"."TAB1" (
   INDEXTYPE IS "CTXSYS"."CONTEXT" PARAMETERS ('DATASTORE CTXSYS.DIRECT_DATASTOR
FILTER CTXSYS.NULL_FILTER
LEXER CCISLEX
WORDLIST CCISWORDLIST
STORAGE CCISSTORAGE
SYNC (ON COMMIT)
TRANSACTIONAL')












10g中自动收集统计信息作业DBMS_STATS GATHER_STATS_JOB会在工作日夜间22:00启动并收集相关对象的统计信息,由于统计信息的更新涉及到数据库对象源数据的变更,所以同样会在短时间内要求对象的DDL锁。


J001 (PID 43) 自动作业进程在22:00启动执行GATHER_STATS_JOB,当收集到表orcl.TAB1时触发了ORA-04020错误,该进程生成的ORA-04020 TRACE GRAPH如下:



A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object orcl.TAB1
--------------------------------------------------------
  object   waiting  waiting       blocking blocking
  handle   session     lock mode   session     lock mode
--------  -------- -------- ----  -------- -------- ----
7000000b059e8e0  7000000e9cd94a0 7000000cdb8b3f0    S  7000000e9c74f28 7000000af065d48    X
7000000ddbf8b08  7000000e9c74f28 7000000c8e9e9a8    X  7000000e9cd94a0 7000000b2014508    S



以上第一列为对象句柄object handle,引起死锁的2个对象分别对应为"orcl"."TAB1"及其全文索引"orcl"."IDX_TAB1_QUESTION":


SO: 7000000cdb8b3f0, type: 53, owner: 7000000e9e21b58, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000cdb8b3f0 handle=7000000b059e8e0 request=S
call pin=0 session pin=0 hpc=4489 hlc=0000
htl=7000000cdb8b470[7000000af04bb00,7000000af04bb00] htb=7000000af04bb00 ssga=7000000af04b838
user=7000000e9cd94a0 session=7000000e9cd94a0 count=0 flags=RES/[0010] savepoint=0xca069a
LIBRARY OBJECT HANDLE: handle=7000000b059e8e0 mtx=7000000b059ea10(0) cdp=0
name=orcl.IDX_TAB1_QUESTION DR$STATS$OBJ
hash=0bbe5a41fd1dd8109fa5b7bed44e371b timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUP/SML/[0a000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=6 hpc=448a hlc=4488
lwt=7000000b059e988[7000000cdb8b420,7000000cdb8b420] ltm=7000000b059e998[7000000b059e998,7000000b059e998]
pwt=7000000b059e950[7000000b059e950,7000000b059e950] ptm=7000000b059e960[7000000b059e960,7000000b059e960]
ref=7000000b059e9b8[7000000b059e9b8,7000000b059e9b8] lnd=7000000b059e9d0[7000000b059e9d0,7000000b059e9d0]
  LIBRARY OBJECT: object=7000000b5964510
  type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
  DATA BLOCKS:
  data#     heap  pointer    status pins change whr
  ----- -------- -------- --------- ---- ------ ---
      0 7000000cdd7c318 7000000b5964628 I/P/A/-/-    0 NONE   00


SO: 7000000c8e9e9a8, type: 53, owner: 7000000e5ebdc10, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000c8e9e9a8 handle=7000000ddbf8b08 request=X
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=7000000c8e9ea28[7000000b2058858,7000000b2058858] htb=7000000b2058858 ssga=7000000b2057db0
user=7000000e9c74f28 session=7000000e9c74f28 count=0 flags=[0000] savepoint=0x42e5d
LIBRARY OBJECT HANDLE: handle=7000000ddbf8b08 mtx=7000000ddbf8c38(0) cdp=0
name=orcl.TAB1
hash=76a450813b68cc6891f5d91483aa8e99 timestamp=07-15-2012 03:55:45
namespace=TABL flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0749-0749 lock=S pin=S latch#=4 hpc=ba7e hlc=ba7e
lwt=7000000ddbf8bb0[7000000c8e9e9d8,7000000cfcb0ca0] ltm=7000000ddbf8bc0[7000000ddbf8bc0,7000000ddbf8bc0]
pwt=7000000ddbf8b78[7000000ddbf8b78,7000000ddbf8b78] ptm=7000000ddbf8b88[7000000ddbf8b88,7000000ddbf8b88]
ref=7000000ddbf8be0[7000000ddbf8be0,7000000ddbf8be0] lnd=7000000ddbf8bf8[7000000ddb69938,7000000de3f7238]
  LIBRARY OBJECT: object=7000000b23cadd0
  type=TABL flags=EXS/LOC[0005] pflags=[0000] status=VALD load=0
  DATA BLOCKS:
  data#     heap  pointer    status pins change whr
  ----- -------- -------- --------- ---- ------ ---
      0 7000000ca14ceb0 7000000b23caee8 I/P/A/-/-    0 NONE   00
      3 7000000c24ce138 7000000b6a531f8 I/-/A/-/-    0 NONE   00
      8 7000000b23cb0c8 7000000d2a96338 I/P/A/-/-    1 NONE   00
      9 7000000c24cdfe8 7000000cdd3b718 I/P/A/-/-    1 NONE   00
     10 7000000c24ce070 7000000bc90fb38 I/P/A/-/-    1 NONE   00



session  address= 7000000e9c74f28 的会话以排他模式锁住索引 orcl.IDX_TAB1_QUESTION DR$STATS$OBJ , 该会话又以排他模式Request 请求orcl.TAB1上的锁。

session  address=7000000e9cd94a0 的会话以共享模式锁住表 orcl.TAB1,该会话又以共享模式Request 请求索引 orcl.IDX_TAB1_QUESTION的锁。





后续有较多前台服务进程触发了ORA-07445: [_ptrgl] [SIGSEGV]错误,分析这些TRACE发现其均在执行全文索引相关的PL/SQL, 且均在等待索引 orcl.IDX_TAB1_QUESTION上的library cache load lock:


/db05/app/oracle/admin/CCISKB/udump/cciskb_ora_1040472.trc



SO: 7000000e9d24360, type: 4, owner: 7000000e91d2008, flag: INIT/-/-/0x00
    (session) sid: 1617 trans: 0, creator: 7000000e91d2008, flag: (8100041) USR/- BSY/-/-/-/-/-
              DID: 0001-0024-000855AB, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 47, prv: 0, sql: 7000000ddaefd38, psql: 7000000ddaefd38, user: 34/CTXSYS
    service name: SYS$USERS
    O/S info: user: wasup, term: unknown, ospid: 1234, machine: pdccsccisapp44
              program: JDBC Thin Client
    application name: JDBC Thin Client, hash value=2546894660
    last wait for 'library cache load lock' blocking sess=0x0 seq=2781 wait_time=496678 seconds since wait started=319
                object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
    Dumping Session Wait History
     for 'library cache load lock' count=1 wait_time=496678
                object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
     for 'library cache load lock' count=1 wait_time=2929712
                object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
     for 'library cache load lock' count=1 wait_time=2929710
                object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
     for 'library cache load lock' count=1 wait_time=2929722





SO: 7000000b202eec0, type: 53, owner: 7000000dc10a5c8, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000b202eec0 handle=7000000b059e8e0 request=S
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=7000000b202ef40[7000000ae027f90,7000000ae027f90] htb=7000000ae027f90 ssga=7000000ae027cc8
user=7000000e9d24360 session=7000000e9d24360 count=0 flags=[0000] savepoint=0x13d97db
LIBRARY OBJECT HANDLE: handle=7000000b059e8e0 mtx=7000000b059ea10(0) cdp=0
name=orcl.IDX_TAB1_QUESTION DR$STATS$OBJ
hash=0bbe5a41fd1dd8109fa5b7bed44e371b timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUP/SML/[0a000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=6 hpc=448c hlc=448a
lwt=7000000b059e988[7000000b202eef0,7000000c3631080] ltm=7000000b059e998[7000000b059e998,7000000b059e998]
pwt=7000000b059e950[7000000b059e950,7000000b059e950] ptm=7000000b059e960[7000000b059e960,7000000b059e960]
ref=7000000b059e9b8[7000000b059e9b8,7000000b059e9b8] lnd=7000000b059e9d0[7000000b059e9d0,7000000b059e9d0]
  LIBRARY OBJECT: object=7000000b5964510
  type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
  DATA BLOCKS:
  data#     heap  pointer    status pins change whr
  ----- -------- -------- --------- ---- ------ ---
      0 7000000cdd7c318 7000000b5964628 I/P/A/-/-    0 NONE   00



其所运行的PL/SQL如下:

      SO: 7000000b07ba3d8, type: 53, owner: 7000000e9d24360, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=7000000b07ba3d8 handle=7000000ddaefd38 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=7000000b07ba458[7000000cf806b20,7000000b6031168] htb=7000000ae028390 ssga=7000000ae027cc8
      user=7000000e9d24360 session=7000000e9d24360 count=1 flags=[0000] savepoint=0x509bbb42
      LIBRARY OBJECT HANDLE: handle=7000000ddaefd38 mtx=7000000ddaefe68(1) cdp=1
      name=
  declare
     cost sys.ODCICost := sys.ODCICost(NULL, NULL, NULL, NULL);
     arg0 VARCHAR2(1) := null;

    begin
      :1 := "CTXSYS"."TEXTOPTSTATS".ODCIStatsFunctionCost(
                     sys.ODCIFuncInfo('CTXSYS',
                            'CTX_CONTAINS',
                            'TEXTCONTAINS',
                            2),
                     cost,
                     sys.ODCIARGDESCLIST(sys.ODCIARGDESC(2, 'TAB1', 'orcl', '"QUESTION"', NULL, NULL, NULL), sys.ODCIARGDES
      hash=3c0148b0a0499cb46fab545e4a7e9b5b timestamp=04-22-2012 00:50:12
      namespace=CRSR flags=RON/KGHP/TIM/PN0/KST/DBN/MTX/[100100d0]
      kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=5 hpc=6126 hlc=6126
      lwt=7000000ddaefde0[7000000ddaefde0,7000000ddaefde0] ltm=7000000ddaefdf0[7000000ddaefdf0,7000000ddaefdf0]
      pwt=7000000ddaefda8[7000000ddaefda8,7000000ddaefda8] ptm=7000000ddaefdb8[7000000ddaefdb8,7000000ddaefdb8]
      ref=7000000ddaefe10[7000000ddaefe10,7000000ddaefe10] lnd=7000000ddaefe28[7000000ddaefe28,7000000ddaefe28]
        LIBRARY OBJECT: object=7000000ddaef868
        type=CRSR flags=EXS[0001] pflags=[0000] status=VALD load=0
        CHILDREN: size=16
        child#    table reference   handle
        ------ -------- --------- --------
             0 7000000ddaef768 7000000ddaef2d0 7000000ddaeee08
             1 7000000ddaef768 7000000d0ef9aa0 7000000d0ef96a8
             2 7000000ddaef768 7000000dafffe50 7000000b494f690
        DATA BLOCKS:
        data#     heap  pointer    status pins change whr
        ----- -------- -------- --------- ---- ------ ---
            0 7000000ddaefc78 7000000ddaef980 I/P/A/-/-    0 NONE   00






综合以上ORA-04020和ORA-07445: [_ptrgl]等现象,该问题与已知的BUG  5160122、8250557 - ORA-4020 WHEN CONCURRENTLYEX ECUTE DBMS_STATS GATHER_STATS_JOB, REBUILD TEXT IN十分相似,该BUG已确认影响当前版本 10.2.0.4 ,触发BUG的场景为10.2.0.4 上全文索引查询和GATHER_STATS_JOB作业同时进行引起DDL死锁, 现象为出现ORA-4020错误和ORA-07445错误:


Oracle Text - Version: 10.2.0.4 to 10.2.0.4 - Release: 10.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 19-Jan-2012***
1.1        Symptoms
While running Text query sessions during statistic maintenance windows, using gather_stats_jobr or  gather_[Table|Schema|Database]_stats procedures, a deadlock with one of the Text query sessions occurs and eventually query session core dump with ORA-7445 [drexumcx], visible in alert logfile.

*** 2009-03-20 22:06:12.821
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object TEXT.XML_IDX
--------------------------------------------------------
  object   waiting  waiting       blocking blocking
  handle   session     lock mode   session     lock mode
--------  -------- -------- ----  -------- -------- ----
0xa87376d8  0xa71a8868 0x98789f54    X  0xa719d5d0 0x9834ace4    S
0xa0caa1f4  0xa719d5d0 0x9834e894    S  0xa71a8868 0x9823e3d0    X
--------------------------------------------------------
---------- DUMP OF WAITING AND BLOCKING LOCKS ----------
--------------------------------------------------------
------------- WAITING LOCK -------------
----------------------------------------
SO: 0x98789f54, type: 53, owner: 0xa6f2576c, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=98789f54 handle=a87376d8 request=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x98789fa0[0xa3d44088,0xa3e7ffac] htb=0xa3e7ffac ssga=0xa3e7fd20
user=a71a8868 session=a71a8868 count=0 flags=[0000] savepoint=0xc99f1
LIBRARY OBJECT HANDLE: handle=a87376d8 mtx=0xa873778c(0) cdp=0
name=TEXT.ITEM
hash=6cbb78e269f641642b2dca531feed73a timestamp=01-10-2009 07:05:50
namespace=TABL flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0749-074d lock=S pin=S latch#=2 hpc=31ae hlc=31ae
lwt=0xa8737734[0x98789f70,0x98789f70] ltm=0xa873773c[0xa873773c,0xa873773c]
pwt=0xa8737718[0xa8737718,0xa8737718] ptm=0xa8737720[0xa8737720,0xa8737720]
ref=0xa8737754[0xa8737754,0xa8737754] lnd=0xa8737760[0xa8947b30,0xa86f4e4c]
  LIBRARY OBJECT: object=a23b8328
  type=TABL flags=EXS/LOC[0005] pflags=[0000] status=VALD load=0
...
------------- BLOCKING LOCK ------------
----------------------------------------
SO: 0x9823e3d0, type: 53, owner: 0xa6f261c4, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=9823e3d0 handle=a0caa1f4 mode=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x9823e41c[0xa3e7ffdc,0xa3e7ffdc] htb=0xa3e7ffdc ssga=0xa3e7fd20
user=a71a8868 session=a71a8868 count=1 flags=[0000] savepoint=0xc962e
LIBRARY OBJECT HANDLE: handle=a0caa1f4 mtx=0xa0caa2a8(0) cdp=0
name=TEXT.XML_IDX DR$STATS$OBJ
hash=c59fefefa06ced5e681f46e4cfbe5a40 timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUL/FUP/SML/[0e000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=1 hpc=21f8 hlc=21f6
lwt=0xa0caa250[0x9834e8b0,0x9834e8b0] ltm=0xa0caa258[0xa0caa258,0xa0caa258]
pwt=0xa0caa234[0xa0caa234,0xa0caa234] ptm=0xa0caa23c[0xa0caa23c,0xa0caa23c]
ref=0xa0caa270[0xa0caa270,0xa0caa270] lnd=0xa0caa27c[0xa0caa27c,0xa0caa27c]
  LIBRARY OBJECT: object=8f873988
  type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
  DATA BLOCKS:
  data#     heap  pointer    status pins change whr
  ----- -------- -------- --------- ---- ------ ---
      0 a0bff808 8f873a20 I/P/A/-/-    0 NONE   00
--------------------------------------------------------
This lock request was aborted.
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object TEXT.XML_IDX
--------------------------------------------------------
  object   waiting  waiting       blocking blocking
  handle   session     lock mode   session     lock mode
--------  -------- -------- ----  -------- -------- ----
0xa87376d8  0xa71a8868 0x98518bac    X  0xa719d5d0 0x9834ace4    S
0xa0caa1f4  0xa719d5d0 0x983583a0    S  0xa71a8868 0x9823e3d0    X
--------------------------------------------------------
*** 2009-03-20 22:06:25.021
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [drexumcx()+64] [SIGSEGV] [Address
not mapped to object] [0x4] [] []
Current SQL statement for this session:
  declare
     cost sys.ODCICost := sys.ODCICost(NULL, NULL, NULL, NULL);
     arg0 CLOB := null;

    begin
      :1 := "CTXSYS"."TEXTOPTSTATS".ODCIStatsIndexCost(
                     sys.ODCIINDEXINFO('TEXT','XML_IDX', ...
...

or

----- PL/SQL Call Stack -----
  object                 line    object
  handle         number    name
0xa059345c           106    CTXSYS.TEXTOPTSTATS
0xa8900cd8              6     anonymous block
----- Call Stack Trace -----
... <- drexumcx <- drexoicost <- spefcpfa < ...

or

... <- drexumcx <- drexofcost <- spefcpfa < ...

or

... <- drumcmf <- drurnew <- draccbi <- drexumcx <- drexostatscoll <- spefcpfa < ...



1.2        Cause
The cause of this problem has been identified and verified in an unpublished Bug 5160122. The deadlock problem is caused by the fix for unpublished Bug 4439469.

1.3        Solution
When available, download and apply Patch 5160122 to resolve the problem.
This problem is fixed in 10.2.0.5 release version, Patch 8202632.





虽然出现了ORA-04020和ORA-07445: [_ptrgl],但这2个错误仍不足以导致实例终止Instance Crash。 实际引发实例终止的是ORA-00494: enqueue [CF] ,控制文件队列锁900s超时,ARC0归档进程由于长期得不到CF锁,进而终止实例:

Thu Nov  8 22:36:54 2012
Errors in file /db05/app/oracle/admin/CCISKB/bdump/cciskb_arc0_364906.trc:
ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 266518'
Thu Nov  8 22:36:59 2012
System State dumped to trace file /db05/app/oracle/admin/CCISKB/bdump/cciskb_arc0_364906.trc
Thu Nov  8 22:40:27 2012
Killing enqueue blocker (pid=266518) on resource CF-00000000-00000000
by killing session 1650.1
Killing enqueue blocker (pid=266518) on resource CF-00000000-00000000


由于缺少 ARC0 在Terminate Instance前生成的SYSTEM STATE DUMP cciskb_arc0_364906.trc 文件(没有真实生成或丢失),我们无法找出当时是哪一个进程长期HOLD住了 CF-00000000-00000000资源。

发生ORA-00494错误的时间点为22:36:54,CF Enqueue Lock超过900秒才会超时,说明该CF锁至少从 22:20左右被某个进程长期持有。

在22:18时发现主机SWAP交换空间使用率较高,登陆系统异常缓慢,且有一远程登陆的服务进程占用大量内存。该现象可能与文本索引的查询存在关联。 目前已知存在Bug 6624968 : A QUERY AGAINST A TRANSACTIONAL TEXT INDEX CONSUMES HUGE PGA MEMORY,对于TRANSACTIONAL文本索引的查询可能导致大量PGA的使用。



Name            PID CPU% PgSp Class             ServerV3       0   "h" for help
rpc.stat     192714  5.6  2.8 Default            lientV3       0   "q" to quit
oracle      1135074  3.14585.7 Default
srcmstr      151992  2.2  0.7 System
diagd        209114  1.4  0.5 System
portmap      143792  0.9  0.9 System
shlap64      131476  0.7  0.4 System
rpc.lock     221264  0.2  0.3 System
dtlogin      225310  0.2  0.3 System
tnslsnr      381304  0.2 22.7 Default
lrud          16392  0.1  0.6 System
cimssys      160240  0.1  0.3 System
topas       1257636  0.1  8.5 Default
oracle      1233376  0.1 20.7 Default
Signal 2 received

$ ps -ef|grep 1135074
   kfxta 1257640 1392968   0 22:26:43  pts/1  0:00 grep 1135074
  oracle 1135074       1   0   Oct 15      - 145:08 oracleCCISKB (LOCAL=NO)

SQL> select sid,sql_id,event from v$session where paddr=(select addr from v$process where spid=1135074);

       SID SQL_ID        EVENT                                                           
---------- ------------- ----------------------------------------------------------------
      1562 fdz4bhkngd54x latch: row cache objects

SQL> select sql_text from v$sql where sql_id='fdz4bhkngd54x';

SQL_TEXT
--------------------------------------------------------------------------------
select KNID,question,ANSWER,operid,FUNCTYPE,modefydate,areacode,BUSITYPESTR,VERI
FYSTATE,dealdesc,clickdegree,CHANNEL              from (select a.KNID,b.question
,a.ANSWER,a.operid,a.FUNCTYPE, to_char(a.LOGTIME,'YYYYMMDD HH24:MI:SS') MODEFYDA
TE,a.areacode,a.BUSITYPESTR,a.VERIFYSTATE,s.typeiddescription dealdesc,a.clickde
gree,              row_number() over(order by a.clickdegree desc,a.LOGTIME desc)
rn,a.CHANNEL from VA_KNOWLEDGE a, TAB1 b, s_mapping_detail s where b.KNI
D=a.KNID and a.VERIFYSTATE=s.typeidcode(+) AND s.typecode(+)='VAVRY' and b.VERIF
YSTATE= 1  and contains(b.question,'ê??ú and ?ü?? and ????')>0 and a.FUNCTYPE =
0  ) tmp where rn>(:in_sPage-1)*:in_sPageSize AND rn<=:in_sPage*:in_sPageSize



以上可以看到消耗大量内存的1135074 进程正在执行对TAB1表的contains查询,这将引发对orcl.IDX_TAB1_QUESTION全文索引的查询。 但是由于缺少1135074进程当时的PGA HEAPDUMP信息,无法得知其具体的内存使用分配情况。


大量SWAP交换空间的使用可能导致Oracle进程反应缓慢,无法及时释放关键的锁资源,从而导致CF控制文件队列锁超时,引起后台进程无法获得CF锁而终止实例。 但是由于缺少当时的SYSTEM STATE DUMP,无法获得当时长期持有CF队列锁资源的进程状态,从而进一步确认该SWAP导致的CF队列锁超时。



















        建议



1. BUG 5160122在10.2.0.4 AIX POWER平台上有对应的补丁Patch 5160122: TEXT QUERY DEADLOCK WITH GATHER_STATS_JOB, THEN COREDUMP ORA-7445 [DREXUMCX],考虑通过安装补丁的方式解决ORA-04020、ORA-7445错误

2. 隐藏参数_controlfile_enqueue_timeout控制CF锁超时引发ORA-00494的时长,默认为900秒;注意增加该参数仅推迟ORA-00494错误的发生,有助于减少后台进程终止实例的可能,但对解决问题本身没有帮助。



3. 若系统自动生成的SYSTEM STATE不可用,可以在问题发生时手动收集SYSTEM STATE DUMP,以便于今后诊断,手动收集的方法如下:


Conn / as sysdba

Oradebug setmypid;
Oradebug unlimit;
Oradebug dump systemstate 266;


Oradebug tracefile_name;

下载专业ORACLE数据库恢复工具PRM-DUL  For Oracle http://www.parnassusdata.com/zh-hans/emergency-services

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线 : 13764045638  QQ: 47079569   
您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-6-1 19:24 , Processed in 0.048271 second(s), 21 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569