

cursor: pin S wait on X latch: row cache objects hang RAC 11.1.0.7.0

1#

Posted 2012-5-18 10:44:17 | Views: 11171 | Replies: 10

cursor: pin S wait on X reliable message latch: row cache objects hang RAC 11.1.0.7.0
The system hung for a short time and front-end users could not log in. Please help take a look.

Attachment: awrrpt_1_3375_3376.rar (58.76 KB, downloads: 1073)


songdeyouxiang

2#

Posted 2012-5-18 11:07:54

Adding the sgastat output.

Attachment: sga.rar (9.95 KB, downloads: 1029)


Maclean Liu (刘相兵)

3#

Posted 2012-5-18 11:54:43

AWR analysis

11.1.0.7 RAC on Linux x86 64-bit

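The very high DB Time points to a very heavy load on the instance.

Shared Pool Size dropped from 5,184M to 5,056M, a shrink of 128M, and DB Cache Size grew by a corresponding amount of just over 100M. This shows that automatic SGA management (ASMM) or AMM is in use:

The Top 5 events are dominated by the parse-related waits cursor: pin S wait on X and latch: row cache objects, yet the actual hard parse rate is low, only 0.1 per second.

As expected, parse time takes the largest share of the Time Model: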
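The Time Model figures can also be checked directly on the instance. The query below is a minimal sketch, assuming access to V$SYS_TIME_MODEL; note that it reports cumulative values since instance startup, not the AWR snapshot interval, so the percentages will differ from the report.

```sql
-- Share of DB time spent parsing, cumulative since instance startup.
-- V$SYS_TIME_MODEL values are in microseconds.
SELECT s.stat_name,
       ROUND(s.value / 1000000, 1)                  AS seconds,
       ROUND(100 * s.value / NULLIF(d.value, 0), 2) AS pct_of_db_time
FROM   v$sys_time_model s
       CROSS JOIN (SELECT value
                   FROM   v$sys_time_model
                   WHERE  stat_name = 'DB time') d
WHERE  s.stat_name IN ('DB time',
                       'parse time elapsed',
                       'hard parse elapsed time',
                       'failed parse elapsed time')
ORDER  BY s.value DESC;
```

On RAC, the same query against GV$SYS_TIME_MODEL (with INST_ID added to the join) would show each instance separately.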

The latch: row cache objects misses come mainly from the functions kqrbip and kqrpre: find obj.

KQRBIP is one of the row/dictionary cache management functions:

kqr dict/rowcache row cache management. The row cache consists of a set of facilities to provide fast access to table definitions and locking capabilities.
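The latch miss sources that AWR reports can be cross-checked live. The two queries below are a sketch under the assumption that the instance is still running with the same workload: the first lists the code locations sleeping on the row cache objects latch, the second shows which dictionary cache areas have the most misses.

```sql
-- Code locations that slept on the "row cache objects" latch
SELECT parent_name,
       location,
       nwfail_count,
       sleep_count,
       wtr_slp_count
FROM   v$latch_misses
WHERE  parent_name = 'row cache objects'
AND    sleep_count > 0
ORDER  BY sleep_count DESC;

-- Dictionary cache areas ordered by misses
SELECT parameter,
       gets,
       getmisses,
       modifications
FROM   v$rowcache
WHERE  gets > 0
ORDER  BY getmisses DESC;
```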

The cursor: pin S wait on X waits originate from kkslce [KKSCHLPIN2] and kksfbc [KKSCHLFSP2]:

Memory Resize Ops shows that within the snapshot interval the shared pool was shrunk twice (128M) and then grown once (64M).
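The same resize history is available outside AWR. As a minimal sketch, assuming AMM is enabled (so V$MEMORY_RESIZE_OPS is populated), the following lists recent shared pool and buffer cache resize operations in time order:

```sql
-- Recent automatic resize operations for the shared pool and buffer cache
SELECT component,
       oper_type,
       oper_mode,
       ROUND(initial_size / 1024 / 1024) AS initial_mb,
       ROUND(target_size  / 1024 / 1024) AS target_mb,
       ROUND(final_size   / 1024 / 1024) AS final_mb,
       status,
       TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') AS started
FROM   v$memory_resize_ops
WHERE  component IN ('shared pool', 'DEFAULT buffer cache')
ORDER  BY start_time;
```

With ASMM only (sga_target set, memory_target unset), V$SGA_RESIZE_OPS exposes the same columns.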

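The SGA breakdown difference section shows which shared pool components were affected by the shrink:

[Image: KQR L PO_shrink.png]

AWR did not capture the End MB for shared pool components such as KQR L PO and KQR M SO, but it is still clear that these components were heavily affected.
Querying V$SGASTAT afterwards returns the following: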

shared pool	KQR L PO	93461088
shared pool	KQR L SO	953376
shared pool	KQR M SO	4812320
shared pool	KQR S SO	155904
shared pool	KQR X PO	49401536

KQR L PO shrank from 631.24M to about 89M (93,461,088 bytes).
KQR M SO shrank from 114.07M to about 4.6M (4,812,320 bytes).
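The V$SGASTAT figures above can be reproduced, with the byte counts already converted to MB, by a query along these lines (a sketch; the LIKE filter simply assumes the row cache components all start with KQR, as they do in the output above):

```sql
-- Current size of the row cache (KQR) components in the shared pool
SELECT pool,
       name,
       bytes,
       ROUND(bytes / 1024 / 1024, 2) AS mb
FROM   v$sgastat
WHERE  pool = 'shared pool'
AND    name LIKE 'KQR%'
ORDER  BY bytes DESC;
```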

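KQR is the row cache, also called the dictionary cache. The two shared pool shrinks forced the KQR components to shrink, which caused contention on latch: row cache objects. SQL parsing slowed down behind that latch, which produced more cursor: pin S wait on X contention, a very high instance load, and finally hung logins.

memory_target 17179869184 (16G): this system has the 11g AMM (Automatic Memory Management) feature enabled.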


vincent

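4#

Posted 2012-5-18 12:27:18

Liu, would this approach work? Set a minimum value for db_cache_size and shared_pool_size, then let automatic memory management tune above those floors, so that the range of memory thrashing is reduced?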


songdeyouxiang

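5#

Posted 2012-5-18 12:43:52

cursor: pin S wait on X latch: row cache

Thanks. SQL parsing now accounts for 77.64% of DB time. Could we consider switching memory to manual management? Auditing had been enabled automatically on the database; it has now been turned off to cut parsing as much as possible. Are there any other ways to reduce the number of parses?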
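Before trying to reduce parsing, it may help to measure how much of it is soft versus hard and how often the session cursor cache is hit. This is only a sketch using standard V$SYSSTAT statistics; the values are cumulative since startup, so two samples taken some minutes apart are needed to get a rate.

```sql
-- Parse activity counters, cumulative since instance startup
SELECT name,
       value
FROM   v$sysstat
WHERE  name IN ('parse count (total)',
                'parse count (hard)',
                'parse count (failures)',
                'execute count',
                'session cursor cache hits')
ORDER  BY name;
```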


laobu

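6#

Posted 2012-5-18 15:39:11

"The actual hard parse rate is low, only 0.1 per second"

As an emergency measure, switch the SGA to manual management first, and upgrade the version when you get the chance.
-- Running an X.1 release of Oracle is asking for trouble
:)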


myownstars

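7#

Posted 2012-5-18 17:47:37

You can set shared_pool_size explicitly, and make it fairly large (your OS still has plenty of free memory), for example
alter system set shared_pool_size=6G. This guarantees the shared pool at least 6G: it will only be resized upward when it runs short and will never be resized below 6G, which reduces the number of shared pool resize operations.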
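A sketch of the floor-setting approach on a RAC system using an spfile might look like the following. The 6G value is the one suggested above; the db_cache_size value is a hypothetical placeholder, and both would need to fit within memory_target or the ALTER SYSTEM may fail.

```sql
-- Under AMM/ASMM a non-zero value acts as a minimum, not a fixed size.
ALTER SYSTEM SET shared_pool_size = 6G SCOPE = BOTH SID = '*';

-- Hypothetical floor for the buffer cache; choose a value for your system.
ALTER SYSTEM SET db_cache_size = 4G SCOPE = BOTH SID = '*';

-- Confirm the floors were picked up and watch the resize counter
SELECT component,
       ROUND(current_size        / 1024 / 1024) AS current_mb,
       ROUND(min_size            / 1024 / 1024) AS min_mb,
       ROUND(user_specified_size / 1024 / 1024) AS user_specified_mb,
       oper_count,
       last_oper_type
FROM   v$memory_dynamic_components
WHERE  component IN ('shared pool', 'DEFAULT buffer cache');
```

If user_specified_mb shows the new floor and oper_count stops climbing over the following days, the resize activity has been contained.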


Maclean Liu (刘相兵)

8#

Posted 2012-5-18 20:18:09

Set _enable_shared_pool_durations = false to avoid the situation where a single duration (a memory area in the shared pool reserved for a specific usage) has to supply all the space required for that usage. For example, if the duration containing the dictionary cache needs to free memory, that duration comes under extra stress because memory from other durations cannot be used. Setting the parameter to false means that any type of memory in the subpool can be used to free space. As a consequence, the number of subpools is reduced by a factor equal to the number of durations (4 in 10gR2). Hence tuning _kghdsidx_count is advisable, e.g. increasing it to keep subpool sizes manageable (see note:396940.1).

I recommend setting "_enable_shared_pool_durations"=false, which will prevent SHRINK operations on the shared pool:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0    Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL>  alter system set "_enable_shared_pool_durations"=false scope=spfile;

System altered.

SQL> startup force;

ORACLE instance started.

Total System Global Area 1570009088 bytes
Fixed Size                2228704 bytes
Variable Size          1023413792 bytes
Database Buffers       536870912 bytes
Redo Buffers             7495680 bytes
Database mounted.
Database opened.

SQL> show parameter durations

NAME                               TYPE
------------------------------------ --------------------------------
VALUE
------------------------------
_enable_shared_pool_durations       boolean
FALSE
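Before changing a hidden parameter it is worth recording its current value and whether it is still at its default. The query below is a sketch that has to be run as SYS, since it reads the X$ fixed tables directly; it also shows _kghdsidx_count, which the note above suggests tuning alongside.

```sql
-- Current value and default flag for the two hidden parameters (run as SYS)
SELECT i.ksppinm  AS name,
       v.ksppstvl AS value,
       v.ksppstdf AS is_default,
       i.ksppdesc AS description
FROM   x$ksppi  i
       JOIN x$ksppcv v ON v.indx = i.indx
WHERE  i.ksppinm IN ('_enable_shared_pool_durations',
                     '_kghdsidx_count');
```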


songdeyouxiang

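9#

Posted 2012-5-19 12:40:52

Last night's AWR report is attached. library cache load lock has become the wait event consuming the most DB time. After analysis, the database bottleneck is still in parsing; please help find the cause. I found an article on Metalink about a fix for frequent automatic resizing of the shared pool and buffer cache, and this morning I applied Patch:9267837 to the database. I will analyze the AWR report tonight to see whether performance has improved.

Thanks to Maclean Liu for the suggestion. If the patch does not solve the problem, I will try changing this parameter.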
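While waiting for the next AWR report, a live snapshot of what user sessions are waiting on can show whether library cache load lock is still dominant. This is a minimal sketch against GV$SESSION, so it covers every RAC instance; it is a point-in-time view and would need to be sampled several times during a slowdown.

```sql
-- Non-idle waits of user sessions right now, per instance
SELECT inst_id,
       event,
       COUNT(*) AS session_count
FROM   gv$session
WHERE  type = 'USER'
AND    state = 'WAITING'
AND    wait_class <> 'Idle'
GROUP  BY inst_id, event
ORDER  BY session_count DESC;
```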

Attachment: awrrpt_1_3395_3396.rar (52.53 KB, downloads: 941)

Attachment: ID_742599.1.rar (15.65 KB, downloads: 913)
