- 最后登录
- 2023-8-16
- 在线时间
- 1686 小时
- 威望
- 2135
- 金钱
- 50532
- 注册时间
- 2011-10-12
- 阅读权限
- 200
- 帖子
- 5207
- 精华
- 39
- 积分
- 2135
- UID
- 2
|
1#
发表于 2013-11-10 23:53:11
|
查看: 3354 |
回复: 0
1.执行自动统计信息收集的自动作业进程J001报ORA-04020: deadlock detected while trying to lock object orcl.TAB1错误
2. 陆续出现7次ORA-07445 [_ptrgl] [SIGSEGV] [Address not mapped to object]错误,触发该ORA-07445错误的均是CTXSYS全文索引相关的前台进程
3. 出现多次 PMON failed to acquire latch, see PMON dump错误
4. 因 ORA-00494: enqueue [CF] CF队列锁超时导致ARC0归档进程终止实例
发现主机SWAP交换空间使用率较高,登陆系统异常缓慢,且有一远程登陆的服务进程占用大量内存。
Name PID CPU% PgSp Class ServerV3 0 "h" for help
rpc.stat 192714 5.6 2.8 Default lientV3 0 "q" to quit
oracle 1135074 3.14585.7 Default
srcmstr 151992 2.2 0.7 System
diagd 209114 1.4 0.5 System
portmap 143792 0.9 0.9 System
shlap64 131476 0.7 0.4 System
rpc.lock 221264 0.2 0.3 System
dtlogin 225310 0.2 0.3 System
tnslsnr 381304 0.2 22.7 Default
lrud 16392 0.1 0.6 System
cimssys 160240 0.1 0.3 System
topas 1257636 0.1 8.5 Default
oracle 1233376 0.1 20.7 Default
Signal 2 received
$ ps -ef|grep 1135074
kfxta 1257640 1392968 0 22:26:43 pts/1 0:00 grep 1135074
oracle 1135074 1 0 Oct 15 - 145:08 oracleCCISKB (LOCAL=NO)
SQL> select sid,sql_id,event from v$session where paddr=(select addr from v$process where spid=1135074);
SID SQL_ID EVENT
---------- ------------- ----------------------------------------------------------------
1562 fdz4bhkngd54x latch: row cache objects
SQL> select sql_text from v$sql where sql_id='fdz4bhkngd54x';
SQL_TEXT
--------------------------------------------------------------------------------
select KNID,question,ANSWER,operid,FUNCTYPE,modefydate,areacode,BUSITYPESTR,VERI
FYSTATE,dealdesc,clickdegree,CHANNEL from (select a.KNID,b.question
,a.ANSWER,a.operid,a.FUNCTYPE, to_char(a.LOGTIME,'YYYYMMDD HH24:MI:SS') MODEFYDA
TE,a.areacode,a.BUSITYPESTR,a.VERIFYSTATE,s.typeiddescription dealdesc,a.clickde
gree, row_number() over(order by a.clickdegree desc,a.LOGTIME desc)
rn,a.CHANNEL from VA_KNOWLEDGE a, TAB1 b, s_mapping_detail s where b.KNI
D=a.KNID and a.VERIFYSTATE=s.typeidcode(+) AND s.typecode(+)='VAVRY' and b.VERIF
YSTATE= 1 and contains(b.question,'ê??ú and ?ü?? and ????')>0 and a.FUNCTYPE =
0 ) tmp where rn>(:in_sPage-1)*:in_sPageSize AND rn<=:in_sPage*:in_sPageSize
2. 本次服务详细过程
问题分析
ORA-04020错误是DDL和解析锁死锁(DDL DEAD LOCK)错误,与常见的ORA-00060死锁错误不同。Oracle DDL锁是为了串行化数据库对象源数据的使用和修改而存在,例如修改一张表的定义ALERT TABLE、修改某个存储过程ALTER PROCEDURE都需要持有该对象上的DDL锁。
"orcl"."IDX_TAB1_QUESTION"对象是基于"orcl"."TAB1"表的全文索引(TRANSACTIONAL TEXT INDEX),这2个对象的DDL如下:
CREATE TABLE "orcl"."TAB1"
( "QUESID" VARCHAR2(36) NOT NULL ENABLE,
"QUESTION" VARCHAR2(500) NOT NULL ENABLE,
"KNID" VARCHAR2(36) NOT NULL ENABLE,
"ISMAIN" NUMBER(*,0) DEFAULT 1,
"VERIFYSTATE" NUMBER(*,0),
"INDEXSTATE" NUMBER(*,0) NOT NULL ENABLE,
"CREATEDATE" DATE,
"LASTMODIFYDATE" DATE,
"LOGTIME" DATE DEFAULT sysdate,
"SENDFLAG" NUMBER(*,0) NOT NULL ENABLE,
"FUNCTYPE" NUMBER(*,0),
"AUTONKID" VARCHAR2(36),
"TEMPLATEFLAG" VARCHAR2(1),
"CHANNEL" VARCHAR2(50),
"RELTEMPATE" VARCHAR2(300),
"BUSITYPESTR" VARCHAR2(1000),
CONSTRAINT "PK_TAB1" PRIMARY KEY ("QUESID", "KNID")
USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "SYSTEM" ENABLE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CCLOG2"
CREATE INDEX "orcl"."IDX_TAB1_QUESTION" ON "orcl"."TAB1" (
INDEXTYPE IS "CTXSYS"."CONTEXT" PARAMETERS ('DATASTORE CTXSYS.DIRECT_DATASTOR
FILTER CTXSYS.NULL_FILTER
LEXER CCISLEX
WORDLIST CCISWORDLIST
STORAGE CCISSTORAGE
SYNC (ON COMMIT)
TRANSACTIONAL')
10g中自动收集统计信息作业DBMS_STATS GATHER_STATS_JOB会在工作日夜间22:00启动并收集相关对象的统计信息,由于统计信息的更新涉及到数据库对象源数据的变更,所以同样会在短时间内要求对象的DDL锁。
J001 (PID 43) 自动作业进程在22:00启动执行GATHER_STATS_JOB,当收集到表orcl.TAB1时触发了ORA-04020错误,该进程生成的ORA-04020 TRACE GRAPH如下:
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object orcl.TAB1
--------------------------------------------------------
object waiting waiting blocking blocking
handle session lock mode session lock mode
-------- -------- -------- ---- -------- -------- ----
7000000b059e8e0 7000000e9cd94a0 7000000cdb8b3f0 S 7000000e9c74f28 7000000af065d48 X
7000000ddbf8b08 7000000e9c74f28 7000000c8e9e9a8 X 7000000e9cd94a0 7000000b2014508 S
以上第一列为对象句柄object handle,引起死锁的2个对象分别对应为"orcl"."TAB1"及其全文索引"orcl"."IDX_TAB1_QUESTION":
SO: 7000000cdb8b3f0, type: 53, owner: 7000000e9e21b58, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000cdb8b3f0 handle=7000000b059e8e0 request=S
call pin=0 session pin=0 hpc=4489 hlc=0000
htl=7000000cdb8b470[7000000af04bb00,7000000af04bb00] htb=7000000af04bb00 ssga=7000000af04b838
user=7000000e9cd94a0 session=7000000e9cd94a0 count=0 flags=RES/[0010] savepoint=0xca069a
LIBRARY OBJECT HANDLE: handle=7000000b059e8e0 mtx=7000000b059ea10(0) cdp=0
name=orcl.IDX_TAB1_QUESTION DR$STATS$OBJ
hash=0bbe5a41fd1dd8109fa5b7bed44e371b timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUP/SML/[0a000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=6 hpc=448a hlc=4488
lwt=7000000b059e988[7000000cdb8b420,7000000cdb8b420] ltm=7000000b059e998[7000000b059e998,7000000b059e998]
pwt=7000000b059e950[7000000b059e950,7000000b059e950] ptm=7000000b059e960[7000000b059e960,7000000b059e960]
ref=7000000b059e9b8[7000000b059e9b8,7000000b059e9b8] lnd=7000000b059e9d0[7000000b059e9d0,7000000b059e9d0]
LIBRARY OBJECT: object=7000000b5964510
type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change whr
----- -------- -------- --------- ---- ------ ---
0 7000000cdd7c318 7000000b5964628 I/P/A/-/- 0 NONE 00
SO: 7000000c8e9e9a8, type: 53, owner: 7000000e5ebdc10, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000c8e9e9a8 handle=7000000ddbf8b08 request=X
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=7000000c8e9ea28[7000000b2058858,7000000b2058858] htb=7000000b2058858 ssga=7000000b2057db0
user=7000000e9c74f28 session=7000000e9c74f28 count=0 flags=[0000] savepoint=0x42e5d
LIBRARY OBJECT HANDLE: handle=7000000ddbf8b08 mtx=7000000ddbf8c38(0) cdp=0
name=orcl.TAB1
hash=76a450813b68cc6891f5d91483aa8e99 timestamp=07-15-2012 03:55:45
namespace=TABL flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0749-0749 lock=S pin=S latch#=4 hpc=ba7e hlc=ba7e
lwt=7000000ddbf8bb0[7000000c8e9e9d8,7000000cfcb0ca0] ltm=7000000ddbf8bc0[7000000ddbf8bc0,7000000ddbf8bc0]
pwt=7000000ddbf8b78[7000000ddbf8b78,7000000ddbf8b78] ptm=7000000ddbf8b88[7000000ddbf8b88,7000000ddbf8b88]
ref=7000000ddbf8be0[7000000ddbf8be0,7000000ddbf8be0] lnd=7000000ddbf8bf8[7000000ddb69938,7000000de3f7238]
LIBRARY OBJECT: object=7000000b23cadd0
type=TABL flags=EXS/LOC[0005] pflags=[0000] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change whr
----- -------- -------- --------- ---- ------ ---
0 7000000ca14ceb0 7000000b23caee8 I/P/A/-/- 0 NONE 00
3 7000000c24ce138 7000000b6a531f8 I/-/A/-/- 0 NONE 00
8 7000000b23cb0c8 7000000d2a96338 I/P/A/-/- 1 NONE 00
9 7000000c24cdfe8 7000000cdd3b718 I/P/A/-/- 1 NONE 00
10 7000000c24ce070 7000000bc90fb38 I/P/A/-/- 1 NONE 00
session address= 7000000e9c74f28 的会话以排他模式锁住索引 orcl.IDX_TAB1_QUESTION DR$STATS$OBJ , 该会话又以排他模式Request 请求orcl.TAB1上的锁。
session address=7000000e9cd94a0 的会话以共享模式锁住表 orcl.TAB1,该会话又以共享模式Request 请求索引 orcl.IDX_TAB1_QUESTION的锁。
后续有较多前台服务进程触发了ORA-07445: [_ptrgl] [SIGSEGV]错误,分析这些TRACE发现其均在执行全文索引相关的PL/SQL, 且均在等待索引 orcl.IDX_TAB1_QUESTION上的library cache load lock:
/db05/app/oracle/admin/CCISKB/udump/cciskb_ora_1040472.trc
SO: 7000000e9d24360, type: 4, owner: 7000000e91d2008, flag: INIT/-/-/0x00
(session) sid: 1617 trans: 0, creator: 7000000e91d2008, flag: (8100041) USR/- BSY/-/-/-/-/-
DID: 0001-0024-000855AB, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 47, prv: 0, sql: 7000000ddaefd38, psql: 7000000ddaefd38, user: 34/CTXSYS
service name: SYS$USERS
O/S info: user: wasup, term: unknown, ospid: 1234, machine: pdccsccisapp44
program: JDBC Thin Client
application name: JDBC Thin Client, hash value=2546894660
last wait for 'library cache load lock' blocking sess=0x0 seq=2781 wait_time=496678 seconds since wait started=319
object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
Dumping Session Wait History
for 'library cache load lock' count=1 wait_time=496678
object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
for 'library cache load lock' count=1 wait_time=2929712
object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
for 'library cache load lock' count=1 wait_time=2929710
object address=7000000b5964510, lock address=7000000df196e88, 100*mask+namespace=75
for 'library cache load lock' count=1 wait_time=2929722
SO: 7000000b202eec0, type: 53, owner: 7000000dc10a5c8, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000b202eec0 handle=7000000b059e8e0 request=S
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=7000000b202ef40[7000000ae027f90,7000000ae027f90] htb=7000000ae027f90 ssga=7000000ae027cc8
user=7000000e9d24360 session=7000000e9d24360 count=0 flags=[0000] savepoint=0x13d97db
LIBRARY OBJECT HANDLE: handle=7000000b059e8e0 mtx=7000000b059ea10(0) cdp=0
name=orcl.IDX_TAB1_QUESTION DR$STATS$OBJ
hash=0bbe5a41fd1dd8109fa5b7bed44e371b timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUP/SML/[0a000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=6 hpc=448c hlc=448a
lwt=7000000b059e988[7000000b202eef0,7000000c3631080] ltm=7000000b059e998[7000000b059e998,7000000b059e998]
pwt=7000000b059e950[7000000b059e950,7000000b059e950] ptm=7000000b059e960[7000000b059e960,7000000b059e960]
ref=7000000b059e9b8[7000000b059e9b8,7000000b059e9b8] lnd=7000000b059e9d0[7000000b059e9d0,7000000b059e9d0]
LIBRARY OBJECT: object=7000000b5964510
type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change whr
----- -------- -------- --------- ---- ------ ---
0 7000000cdd7c318 7000000b5964628 I/P/A/-/- 0 NONE 00
其所运行的PL/SQL如下:
SO: 7000000b07ba3d8, type: 53, owner: 7000000e9d24360, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=7000000b07ba3d8 handle=7000000ddaefd38 mode=N
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=7000000b07ba458[7000000cf806b20,7000000b6031168] htb=7000000ae028390 ssga=7000000ae027cc8
user=7000000e9d24360 session=7000000e9d24360 count=1 flags=[0000] savepoint=0x509bbb42
LIBRARY OBJECT HANDLE: handle=7000000ddaefd38 mtx=7000000ddaefe68(1) cdp=1
name=
declare
cost sys.ODCICost := sys.ODCICost(NULL, NULL, NULL, NULL);
arg0 VARCHAR2(1) := null;
begin
:1 := "CTXSYS"."TEXTOPTSTATS".ODCIStatsFunctionCost(
sys.ODCIFuncInfo('CTXSYS',
'CTX_CONTAINS',
'TEXTCONTAINS',
2),
cost,
sys.ODCIARGDESCLIST(sys.ODCIARGDESC(2, 'TAB1', 'orcl', '"QUESTION"', NULL, NULL, NULL), sys.ODCIARGDES
hash=3c0148b0a0499cb46fab545e4a7e9b5b timestamp=04-22-2012 00:50:12
namespace=CRSR flags=RON/KGHP/TIM/PN0/KST/DBN/MTX/[100100d0]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=5 hpc=6126 hlc=6126
lwt=7000000ddaefde0[7000000ddaefde0,7000000ddaefde0] ltm=7000000ddaefdf0[7000000ddaefdf0,7000000ddaefdf0]
pwt=7000000ddaefda8[7000000ddaefda8,7000000ddaefda8] ptm=7000000ddaefdb8[7000000ddaefdb8,7000000ddaefdb8]
ref=7000000ddaefe10[7000000ddaefe10,7000000ddaefe10] lnd=7000000ddaefe28[7000000ddaefe28,7000000ddaefe28]
LIBRARY OBJECT: object=7000000ddaef868
type=CRSR flags=EXS[0001] pflags=[0000] status=VALD load=0
CHILDREN: size=16
child# table reference handle
------ -------- --------- --------
0 7000000ddaef768 7000000ddaef2d0 7000000ddaeee08
1 7000000ddaef768 7000000d0ef9aa0 7000000d0ef96a8
2 7000000ddaef768 7000000dafffe50 7000000b494f690
DATA BLOCKS:
data# heap pointer status pins change whr
----- -------- -------- --------- ---- ------ ---
0 7000000ddaefc78 7000000ddaef980 I/P/A/-/- 0 NONE 00
综合以上ORA-04020和ORA-07445: [_ptrgl]等现象,该问题与已知的BUG 5160122、8250557 - ORA-4020 WHEN CONCURRENTLYEX ECUTE DBMS_STATS GATHER_STATS_JOB, REBUILD TEXT IN十分相似,该BUG已确认影响当前版本 10.2.0.4 ,触发BUG的场景为10.2.0.4 上全文索引查询和GATHER_STATS_JOB作业同时进行引起DDL死锁, 现象为出现ORA-4020错误和ORA-07445错误:
Oracle Text - Version: 10.2.0.4 to 10.2.0.4 - Release: 10.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 19-Jan-2012***
1.1 Symptoms
While running Text query sessions during statistic maintenance windows, using gather_stats_jobr or gather_[Table|Schema|Database]_stats procedures, a deadlock with one of the Text query sessions occurs and eventually query session core dump with ORA-7445 [drexumcx], visible in alert logfile.
*** 2009-03-20 22:06:12.821
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object TEXT.XML_IDX
--------------------------------------------------------
object waiting waiting blocking blocking
handle session lock mode session lock mode
-------- -------- -------- ---- -------- -------- ----
0xa87376d8 0xa71a8868 0x98789f54 X 0xa719d5d0 0x9834ace4 S
0xa0caa1f4 0xa719d5d0 0x9834e894 S 0xa71a8868 0x9823e3d0 X
--------------------------------------------------------
---------- DUMP OF WAITING AND BLOCKING LOCKS ----------
--------------------------------------------------------
------------- WAITING LOCK -------------
----------------------------------------
SO: 0x98789f54, type: 53, owner: 0xa6f2576c, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=98789f54 handle=a87376d8 request=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x98789fa0[0xa3d44088,0xa3e7ffac] htb=0xa3e7ffac ssga=0xa3e7fd20
user=a71a8868 session=a71a8868 count=0 flags=[0000] savepoint=0xc99f1
LIBRARY OBJECT HANDLE: handle=a87376d8 mtx=0xa873778c(0) cdp=0
name=TEXT.ITEM
hash=6cbb78e269f641642b2dca531feed73a timestamp=01-10-2009 07:05:50
namespace=TABL flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0749-074d lock=S pin=S latch#=2 hpc=31ae hlc=31ae
lwt=0xa8737734[0x98789f70,0x98789f70] ltm=0xa873773c[0xa873773c,0xa873773c]
pwt=0xa8737718[0xa8737718,0xa8737718] ptm=0xa8737720[0xa8737720,0xa8737720]
ref=0xa8737754[0xa8737754,0xa8737754] lnd=0xa8737760[0xa8947b30,0xa86f4e4c]
LIBRARY OBJECT: object=a23b8328
type=TABL flags=EXS/LOC[0005] pflags=[0000] status=VALD load=0
...
------------- BLOCKING LOCK ------------
----------------------------------------
SO: 0x9823e3d0, type: 53, owner: 0xa6f261c4, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=9823e3d0 handle=a0caa1f4 mode=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x9823e41c[0xa3e7ffdc,0xa3e7ffdc] htb=0xa3e7ffdc ssga=0xa3e7fd20
user=a71a8868 session=a71a8868 count=1 flags=[0000] savepoint=0xc962e
LIBRARY OBJECT HANDLE: handle=a0caa1f4 mtx=0xa0caa2a8(0) cdp=0
name=TEXT.XML_IDX DR$STATS$OBJ
hash=c59fefefa06ced5e681f46e4cfbe5a40 timestamp=NULL
namespace=CPOB flags=KGHP/TIM/FUL/FUP/SML/[0e000000]
kkkk-dddd-llll=0000-0001-0001 lock=X pin=X latch#=1 hpc=21f8 hlc=21f6
lwt=0xa0caa250[0x9834e8b0,0x9834e8b0] ltm=0xa0caa258[0xa0caa258,0xa0caa258]
pwt=0xa0caa234[0xa0caa234,0xa0caa234] ptm=0xa0caa23c[0xa0caa23c,0xa0caa23c]
ref=0xa0caa270[0xa0caa270,0xa0caa270] lnd=0xa0caa27c[0xa0caa27c,0xa0caa27c]
LIBRARY OBJECT: object=8f873988
type=CPOB flags=EXS/LOC/PRG[0085] pflags=[0000] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change whr
----- -------- -------- --------- ---- ------ ---
0 a0bff808 8f873a20 I/P/A/-/- 0 NONE 00
--------------------------------------------------------
This lock request was aborted.
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object TEXT.XML_IDX
--------------------------------------------------------
object waiting waiting blocking blocking
handle session lock mode session lock mode
-------- -------- -------- ---- -------- -------- ----
0xa87376d8 0xa71a8868 0x98518bac X 0xa719d5d0 0x9834ace4 S
0xa0caa1f4 0xa719d5d0 0x983583a0 S 0xa71a8868 0x9823e3d0 X
--------------------------------------------------------
*** 2009-03-20 22:06:25.021
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [drexumcx()+64] [SIGSEGV] [Address
not mapped to object] [0x4] [] []
Current SQL statement for this session:
declare
cost sys.ODCICost := sys.ODCICost(NULL, NULL, NULL, NULL);
arg0 CLOB := null;
begin
:1 := "CTXSYS"."TEXTOPTSTATS".ODCIStatsIndexCost(
sys.ODCIINDEXINFO('TEXT','XML_IDX', ...
...
or
----- PL/SQL Call Stack -----
object line object
handle number name
0xa059345c 106 CTXSYS.TEXTOPTSTATS
0xa8900cd8 6 anonymous block
----- Call Stack Trace -----
... <- drexumcx <- drexoicost <- spefcpfa < ...
or
... <- drexumcx <- drexofcost <- spefcpfa < ...
or
... <- drumcmf <- drurnew <- draccbi <- drexumcx <- drexostatscoll <- spefcpfa < ...
1.2 Cause
The cause of this problem has been identified and verified in an unpublished Bug 5160122. The deadlock problem is caused by the fix for unpublished Bug 4439469.
1.3 Solution
When available, download and apply Patch 5160122 to resolve the problem.
This problem is fixed in 10.2.0.5 release version, Patch 8202632.
虽然出现了ORA-04020和ORA-07445: [_ptrgl],但这2个错误仍不足以导致实例终止Instance Crash。 实际引发实例终止的是ORA-00494: enqueue [CF] ,控制文件队列锁900s超时,ARC0归档进程由于长期得不到CF锁,进而终止实例:
Thu Nov 8 22:36:54 2012
Errors in file /db05/app/oracle/admin/CCISKB/bdump/cciskb_arc0_364906.trc:
ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 266518'
Thu Nov 8 22:36:59 2012
System State dumped to trace file /db05/app/oracle/admin/CCISKB/bdump/cciskb_arc0_364906.trc
Thu Nov 8 22:40:27 2012
Killing enqueue blocker (pid=266518) on resource CF-00000000-00000000
by killing session 1650.1
Killing enqueue blocker (pid=266518) on resource CF-00000000-00000000
由于缺少 ARC0 在Terminate Instance前生成的SYSTEM STATE DUMP cciskb_arc0_364906.trc 文件(没有真实生成或丢失),我们无法找出当时是哪一个进程长期HOLD住了 CF-00000000-00000000资源。
发生ORA-00494错误的时间点为22:36:54,CF Enqueue Lock超过900秒才会超时,说明该CF锁至少从 22:20左右被某个进程长期持有。
在22:18时发现主机SWAP交换空间使用率较高,登陆系统异常缓慢,且有一远程登陆的服务进程占用大量内存。该现象可能与文本索引的查询存在关联。 目前已知存在Bug 6624968 : A QUERY AGAINST A TRANSACTIONAL TEXT INDEX CONSUMES HUGE PGA MEMORY,对于TRANSACTIONAL文本索引的查询可能导致大量PGA的使用。
Name PID CPU% PgSp Class ServerV3 0 "h" for help
rpc.stat 192714 5.6 2.8 Default lientV3 0 "q" to quit
oracle 1135074 3.14585.7 Default
srcmstr 151992 2.2 0.7 System
diagd 209114 1.4 0.5 System
portmap 143792 0.9 0.9 System
shlap64 131476 0.7 0.4 System
rpc.lock 221264 0.2 0.3 System
dtlogin 225310 0.2 0.3 System
tnslsnr 381304 0.2 22.7 Default
lrud 16392 0.1 0.6 System
cimssys 160240 0.1 0.3 System
topas 1257636 0.1 8.5 Default
oracle 1233376 0.1 20.7 Default
Signal 2 received
$ ps -ef|grep 1135074
kfxta 1257640 1392968 0 22:26:43 pts/1 0:00 grep 1135074
oracle 1135074 1 0 Oct 15 - 145:08 oracleCCISKB (LOCAL=NO)
SQL> select sid,sql_id,event from v$session where paddr=(select addr from v$process where spid=1135074);
SID SQL_ID EVENT
---------- ------------- ----------------------------------------------------------------
1562 fdz4bhkngd54x latch: row cache objects
SQL> select sql_text from v$sql where sql_id='fdz4bhkngd54x';
SQL_TEXT
--------------------------------------------------------------------------------
select KNID,question,ANSWER,operid,FUNCTYPE,modefydate,areacode,BUSITYPESTR,VERI
FYSTATE,dealdesc,clickdegree,CHANNEL from (select a.KNID,b.question
,a.ANSWER,a.operid,a.FUNCTYPE, to_char(a.LOGTIME,'YYYYMMDD HH24:MI:SS') MODEFYDA
TE,a.areacode,a.BUSITYPESTR,a.VERIFYSTATE,s.typeiddescription dealdesc,a.clickde
gree, row_number() over(order by a.clickdegree desc,a.LOGTIME desc)
rn,a.CHANNEL from VA_KNOWLEDGE a, TAB1 b, s_mapping_detail s where b.KNI
D=a.KNID and a.VERIFYSTATE=s.typeidcode(+) AND s.typecode(+)='VAVRY' and b.VERIF
YSTATE= 1 and contains(b.question,'ê??ú and ?ü?? and ????')>0 and a.FUNCTYPE =
0 ) tmp where rn>(:in_sPage-1)*:in_sPageSize AND rn<=:in_sPage*:in_sPageSize
以上可以看到消耗大量内存的1135074 进程正在执行对TAB1表的contains查询,这将引发对orcl.IDX_TAB1_QUESTION全文索引的查询。 但是由于缺少1135074进程当时的PGA HEAPDUMP信息,无法得知其具体的内存使用分配情况。
大量SWAP交换空间的使用可能导致Oracle进程反应缓慢,无法及时释放关键的锁资源,从而导致CF控制文件队列锁超时,引起后台进程无法获得CF锁而终止实例。 但是由于缺少当时的SYSTEM STATE DUMP,无法获得当时长期持有CF队列锁资源的进程状态,从而进一步确认该SWAP导致的CF队列锁超时。
建议
1. BUG 5160122在10.2.0.4 AIX POWER平台上有对应的补丁Patch 5160122: TEXT QUERY DEADLOCK WITH GATHER_STATS_JOB, THEN COREDUMP ORA-7445 [DREXUMCX],考虑通过安装补丁的方式解决ORA-04020、ORA-7445错误
2. 隐藏参数_controlfile_enqueue_timeout控制CF锁超时引发ORA-00494的时长,默认为900秒;注意增加该参数仅推迟ORA-00494错误的发生,有助于减少后台进程终止实例的可能,但对解决问题本身没有帮助。
3. 若系统自动生成的SYSTEM STATE不可用,可以在问题发生时手动收集SYSTEM STATE DUMP,以便于今后诊断,手动收集的方法如下:
Conn / as sysdba
Oradebug setmypid;
Oradebug unlimit;
Oradebug dump systemstate 266;
Oradebug tracefile_name;
|
|