遭遇cursor: pin S wait on X数据库无法访问
一套生产系统 HPUX 11.31 10.2.0.5 RAC 最近一周遇到两次数据库在一段时间内hang住 登录服务器使用sqlplus 也无法连接,监控日志报出大量的"WARNING: inbound connection timed out (ORA-3136)" ,所有的session都不能连接。附件是两个故障时间点的AWR和ASH 里面最严重的等待事件是cursor: pin S wait on X 请大家帮忙看一下 检察是否符合以下bug情形Processes Hang Waiting on 'cursor: pin S wait on X' (and other Wait Events) on HP-UX 11.23 and 11.31 Itanium Systems (Doc ID 580273.1) To BottomTo Bottom
Modified:22-Jul-2013Type:PROBLEM
Rate this document Email link to this document Open document in new window Printable Page
In this Document
Symptoms
Changes
Cause
Solution
APPLIES TO:
Oracle Database - Enterprise Edition - Version 9.2.0.1 and later
HP-UX Itanium
***Checked for relevance on 23-JUL-2013***
SYMPTOMS
Process hangs indefinitely waiting on one of the following wait events:
cursor: pin S wait on X (10g wait)
kksfbc child completion
SGA: allocation forcing component growth
There is no blocker process.
The variety in wait events is explained by the fact that the underlying issue is with the pw_wait() HP-UX system call.
Therefore, the above list is not exhaustive, use the pstack output below to match the problem in all cases.
The problem is commonly seen on Parallel Execution Slave processes but can affect any Oracle process.
A command to generate a systemstate, processstate, or errorstack dump e.g. via ALTER SESSION or oradebug command frees the hanging process.
An example of the top of the call stack of an affected process, obtained using pstack:
0: c0000000003e2ff0 : pw_wait() + 0x30 (/usr/lib/hpux64/libc.so.1)
1: 4000000002e98a20 : pw_wait() + 0x80
(/oracle/DwhProd/orabin/oradb/product/10.2.0/bin/oracle)
2: 4000000002e492f0 : sskgpwwait() + 0x170 (/oracle/DwhProd/orabin/oradb/pr
oduct/10.2.0/bin/oracle)
3: 4000000002e3af80 : skgpwwait() + 0xa0 (/oracle/DwhP
rod/orabin/oradb/product/10.2.0/bin/oracle)
4: 4000000002d6ffd0 : ksliwat() + 0x670
(/oracle/DwhProd/orabin/oradb/product/10.2.0/bin/oracle)
5: 4000000002d4a1b0 : kslwaitns_timed() + 0x70
(/oracle/DwhProd/orabin/oradb/product/10.2.0/bin/oracle)
6: 4000000002d4b9b0 : kskthbwt() + 0x190 (/oracle/DwhProd/orabin/oradb/p
roduct/10.2.0/bin/oracle)
7: 4000000002e35220 : kslwait() + 0xe0 (/oracle/DwhPr
od/orabin/oradb/product/10.2.0/bin/oracle)
...
First few calls of the call stacks will differ depending on the particular wait event.
Output of the tusc system call tracer tool will be similar to the following:
( Attached to process 6144 ("ora_j032_SATAVA02") )
15:30:27 {631499} #1 pw_wait(0x9ffffffffffe1700) ...
User time: 0.00
Interrupt time: 0.00
Elapsed time: 66.3
CHANGES
The following HP Patch Bundle for 11.31 may have been recently applied
Mar 2008: FEATURE11i,B.11.31.0803.318b.
Note: it has been brought to our attention that the problem may also occur on HP-UX 11.23 systems.
CAUSE
The problem is a defect in the pw_wait() O/S system call.
On HP-UX 11.31, this is caused by a problem with the HP-UX scheduler patch PHKL_37456:
PHKL_37456 scheduler cumulative patch
Additional info may be sought in HP Change Request QXCR1000811756.
Please note that the wait events shown above 'cursor: pin S wait on X' and 'kksfbc child completion' can occur for a number of other legitimate reasons, so presence of these waits alone does not indicate that the problem described in this article has been encountered. You will need to verify in detail the other symptoms described here when investigating the issue you are facing.
The problem is also reported to occur on HP-UX 11.23. We have no information at this time on specific PHKL patch levels are susceptible to it, only that it is fixed starting with PHKL_37809.
SOLUTION
A number of options are available to work around or resolve this problem:
Workaround: oradebug setospid+unlimit to free the hanging process:
identify the Unix process id of the hanging process
use oradebug on it as follows (the unlimit command is sufficient to stop the hang)
SQL> oradebug setospid <unix process id>
SQL> oradebug unlimit
alternatively, if there are many processes hanging on 'cursor: pin S wait on X', or time is short, a systemstate dump (at minimum level 1 to minimize the amount of trace that will be written) will be the quickest way to go over all hanging processes and wake them up:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 1
Solution: on 11.31 install PHKL_38397 or later equivalent.
Solution: on 11.23 install PHKL_37809 or later equivalent.
Note: the patches whose names begin with PHKL_ are HP-UX patches. Please check with HP-UX support on the latest applicable patch numbers as such patches are regularly updated and superceded. Hi,
From your AWR report, your DB workload was extremely high.
And your Oracle version was stable, I don't think that was bug to cause that.
Execute to Parse %: 19.84, ---- too much Parse existed in your database, is that OTPL database?
It was RAC Concurrency contention to cause that .
Most time, users session was below table from AWR (gments by Row Lock Waits)
SA_TASK_LOCKS
Any application changed? Any change in database in the past two weeks?
Beside, Could you get below sql plan?
c56sbwns406ym
d9dkp2jmgfvr3
bn7dr3cnvpqgt
Benson2014 发表于 2014-3-25 17:36 static/image/common/back.gif
Hi,
From your AWR report, your DB workload was extremely high.
两周内数据库没有变化,业务需要明天确认一下,这三个SQL明天也会打包上传 谢谢! Besides that, could you provide some AWR reports before the case occurred?
Compared them, some finding may help you investigate it.
Benson2014 发表于 2014-3-25 17:36 static/image/common/back.gif
Hi,
From your AWR report, your DB workload was extremely high.
上面三个语句的SQLAWR
页:
[1]