Oracle database data recovery and performance optimization

1#
Posted on 2012-2-29 09:50:54 | Views: 17480 | Replies: 18
Environment: AIX 5.3 + RAC + 10.2.0.4.0
Symptom: In the two-node RAC, node B could not be logged into, while node A could still be logged into normally.
Actions taken:
sqlplus -prelim "sys/p@sptdi1 as sysdba"
SQL> oradebug setmypid
SQL> oradebug setinst all
SQL> oradebug -g def hanganalyze 12

SQL> oradebug -g all dump systemstate 267

SQL> oradebug -g all dump systemstate 267

SQL> oradebug -g all dump systemstate 267
By the time "oradebug -g def hanganalyze 12" had been executed, node B could be logged into again.
Node B's alert log reported the following errors:
Mon Feb 27 15:30:34 2012
MMNL absent for 1233 secs; Foregrounds taking over
Mon Feb 27 16:01:57 2012
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=90
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_q000_372994.trc
Mon Feb 27 16:02:51 2012
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=115
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_j000_325396.trc
Mon Feb 27 16:02:56 2012
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=141
Mon Feb 27 16:03:14 2012
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=150
Mon Feb 27 16:18:27 2012
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_diag_410206.trc
Mon Feb 27 16:18:29 2012
MMNL absent for 4049 secs; Foregrounds taking over
Mon Feb 27 16:19:27 2012
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_diag_410206.trc
Mon Feb 27 16:19:56 2012
Thread 2 advanced to log sequence 151885 (LGWR switch)
  Current log# 10 seq# 151885 mem# 0: +DATA_SYS/sptdi/onlinelog/group_10.273.717593685
Mon Feb 27 16:24:31 2012
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_diag_410206.trc
Mon Feb 27 16:28:48 2012
Thread 2 advanced to log sequence 151886 (LGWR switch)
  Current log# 3 seq# 151886 mem# 0: +DATA_SYS/sptdi/onlinelog/group_3.259.717593999
Mon Feb 27 16:29:55 2012
System State dumped to trace file /oracle/admin/sptdi/bdump/sptdi2_diag_410206.trc
The attachments include sptdi2_q000_372994.trc, sptdi2_j000_325396.trc and their ass109-processed output, plus the AWR and ASH reports.
sptdi2_diag_410206.trc is 4 GB, which is too large to attach, so only the ass109-processed output of sptdi2_diag_410206.trc is attached.
PS: the part shown in red in the original post had not been executed at first; because the holder in the ass109 output always showed up as a question mark, those statements were added later, and it was those statements that caused sptdi2_diag_410206.trc to grow to 4 GB.


The attachments (zip.zip) are in the QQ group.

2#
Posted on 2012-2-29 10:10:21
老白 has published a classic analysis of a similar case -- go search for it.


3#
Posted on 2012-2-29 10:47:11

Reply to post #2

I have read 老白's case. His case still offered a clear line of reasoning; with this one I have no idea where to start.


This thread was posted for me by 酱油.


4#
Posted on 2012-2-29 20:31:02
SO: 7000007842484d0, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=29, calls cur/top: 70000078539fef0/70000078539fef0, flag: (6) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 21
              last post received-location: ksbria
              last process to post me: 7000007842484d0 1 6
              last post sent: 0 0 21
              last post sent-location: ksbria
              last process posted by me: 7000007842484d0 1 6
    (latch info) wait_event=0 bits=0
        Location from where call was made: kqrbip:
      waiting for 70000076e8f8630 Child row cache objects level=4 child#=3
        Location from where latch is held: kghfrunp: clatch: nowait:
        Context saved from call: 0
        state=busy, wlstate=free
          waiters [orapid (seconds since: put on list, posted, alive check)]:
           24 (3, 1329632366, 3)
           82 (3, 1329632366, 3)
           72 (3, 1329632366, 3)
           84 (3, 1329632366, 3)
           71 (3, 1329632366, 3)
           77 (3, 1329632366, 3)
           80 (3, 1329632366, 3)
           29 (3, 1329632366, 3)
           25 (3, 1329632366, 3)
           30 (3, 1329632366, 3)
           26 (3, 1329632366, 3)
           66 (3, 1329632366, 3)
           130 (3, 1329632366, 3)
           waiter count=13
          gotten 222053180 times wait, failed first 2917551 sleeps 3498594
          gotten 1364450 times nowait, failed: 2409551
        possible holder pid = 86 ospid=196866
      on wait list for 70000076e8f8630
    Process Group: DEFAULT, pseudo proc: 7000007852bdfc8
    O/S info: user: oracle, term: UNKNOWN, ospid: 328256
    OSD pid info: Unix process pid: 328256, image: oracle@orasrvb (LCK0)
    Short stack dump:
ksdxfstk+002c<-ksdxcb+04e4<-sspuser+0074<-000044C0<-skgpwwait+00bc<-kslges+054c<-kslgetl+033c<-kqrbip+03c8<-kqrbfr+01ec<-kqrbtm+00c8<-ksbcti+03c8<-ksbabs+03fc<-kclabs+00d8<-ksbrdp+04b4<-opirip+03fc<-opidrv+0458<-sou2o+0090<-opimai_real+0150<-main+0098<-__start+0098


    SO: 7000007823585f8, type: 4, owner: 7000007842484d0, flag: INIT/-/-/0x00
    (session) sid: 1072 trans: 0, creator: 7000007842484d0, flag: (51) USR/- BSY/-/-/-/-/-
              DID: 0000-001D-00000003, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
    service name: SYS$BACKGROUND
    waiting for 'latch: row cache objects' blocking sess=0x0 seq=17953 wait_time=0 seconds since wait started=3
                address=70000076e8f8630, number=c8, tries=1
    Dumping Session Wait History
     for 'latch: row cache objects' count=1 wait_time=154226
                address=70000076e8f8630, number=c8, tries=0
     for 'latch: row cache objects' count=1 wait_time=14
                address=70000076e8f8630, number=c8, tries=6
     for 'latch: row cache objects' count=1 wait_time=21
                address=70000076e8f8630, number=c8, tries=5
     for 'latch: row cache objects' count=1 wait_time=1366
                address=70000076e8f8630, number=c8, tries=4
     for 'latch: row cache objects' count=1 wait_time=8913
                address=70000076e8f8630, number=c8, tries=3
     for 'latch: row cache objects' count=1 wait_time=10957
                address=70000076e8f8630, number=c8, tries=2
     for 'latch: row cache objects' count=1 wait_time=59
                address=70000076e8f8630, number=c8, tries=1
     for 'latch: row cache objects' count=1 wait_time=197184
                address=70000076e8f8630, number=c8, tries=0
     for 'latch: row cache objects' count=1 wait_time=33
                address=70000076e8f8630, number=c8, tries=5
     for 'latch: row cache objects' count=1 wait_time=1703
                address=70000076e8f8630, number=c8, tries=4
    temporary object counter: 0
      ---------------------------------------


Many sessions are all waiting for 'latch: row cache objects' 70000076e8f8630,
including LCK0, which is one of the critical background processes in a RAC environment.

LCK0
Instance Enqueue Process
Manages
instance resource requests
cross-instance call operations



                    Resource Holder State
       Latch 70000076e8f8630    ??? Blocker

Unfortunately, the holder of 'latch: row cache objects' 70000076e8f8630 cannot be found in the systemstate dump.

That does not matter.
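As an aside (my addition, not part of the original analysis): if a normal connection can still be made to the hung instance while the problem is in progress, V$LATCHHOLDER can sometimes name the latch holder directly, which the post-mortem systemstate here could not. A minimal sketch, assuming the standard view columns:

-- Hedged sketch: list current holders of "row cache objects" child latches.
-- The contended child in this thread was at address 70000076e8f8630, so the
-- result could also be narrowed on LADDR instead of NAME.
select pid, sid, laddr, name
  from v$latchholder
 where name like '%row cache objects%';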


5#
Posted on 2012-2-29 20:34:09
Note the LCK process's stack call:

ksdxfstk+002c<-ksdxcb+04e4<-sspuser+0074<-000044C0<-skgpwwait+00bc<-kslges+054c<-kslgetl+033c<-kqrbip+03c8<-kqrbfr+01ec<-kqrbtm+00c8<-ksbcti+03c8<-ksbabs+03fc<-kclabs+0


kqrbip =>  kslgetl=> kslges   是kqrbip 函数 发起了对  'latch: row cache objects' 70000076e8f8630的 kslgetl acuqire要求

这个信息很有价值

KQRBIP   是row /dictionary cache 管理的相关函数

kqr         dict/rowcache         row cache management. The row cache consists of a set of facilities to provide fast access to table definitions and locking capabilities.


6#
Posted on 2012-2-29 20:38:04
We set aside any further analysis of the systemstate dump -- we have in fact already gathered plenty of information from it -- and turn to the AWR report instead.

DB time is very high: 2,104.04 (mins) = 126,240 s, yet the per-second parse and hard parse rates are not high at all.



db_time_hard_parse.png


However, in the Top 5 Timed Events, parse-related waits -- latch: row cache objects, cursor: pin S wait on X, library cache load lock, row cache lock, latch: library cache -- sit at the top of the list.


top_5_timed_events_latch_row_cache_objects.png


With so little hard parsing, why are there so many latch: row cache objects and cursor: pin S wait on X waits, and why is the average wait as long as 108 ms each?
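A hedged cross-check (my addition, not in the original post): the parse figures from the AWR load profile can be confirmed straight from V$SYSSTAT on the affected instance; the statistic names below are the standard ones.

-- Hedged sketch: cumulative parse statistics since instance startup,
-- to cross-check the low parse rates seen in the AWR load profile.
select name, value
  from v$sysstat
 where name in ('parse count (total)',
                'parse count (hard)',
                'parse time cpu',
                'parse time elapsed');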


7#
Posted on 2012-2-29 21:05:31
Let us analyze the latch: row cache objects latch in more detail.

The miss rate of latch: row cache objects is very high.


latch_row_cache_objects.png


Now let us see which functions are the sources of the latch misses:

latch_miss_row_cache_objects.png


Here we can see that function kqrbip -- the very function we identified above -- is the source of the latch: row cache objects misses.
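A hedged sketch (my addition, not from the post): the same "Latch Miss Sources" data shown in the AWR screenshot can be pulled live from V$LATCH_MISSES, and the hottest child latch identified from V$LATCH_CHILDREN.

-- Hedged sketch: code locations that miss most on "row cache objects" ...
select location, sleep_count, wtr_slp_count
  from v$latch_misses
 where parent_name = 'row cache objects'
 order by sleep_count desc;

-- ... and the child latches of that latch with the most sleeps.
select child#, gets, misses, sleeps
  from v$latch_children
 where name = 'row cache objects'
 order by sleeps desc;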


For this kind of problem in the shared pool's library cache and row/dictionary cache, we also need to look at the shared pool components, the amount of free memory, and whether Auto SGA (ASMM) resizing has been thrashing:

We can see that within 78 minutes the shared pool was shrunk.

asmm_shared_pool_size.png

SGA breakdown difference
shared_pool_kqr_L_PO.png


We can see that KQR L PO in the shared pool underwent a large change, with a % Diff of 75%.

But the shared pool's free memory is still plentiful, at 1,699 MB.
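A hedged addition (not from the post): outside of AWR, the current shared pool free memory and the size of the KQR L PO dictionary-cache component can be read from V$SGASTAT.

-- Hedged sketch: current shared pool breakdown for the components of interest.
select pool, name, round(bytes/1024/1024) mb
  from v$sgastat
 where pool = 'shared pool'
   and name in ('free memory', 'KQR L PO')
 order by bytes desc;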


8#
Posted on 2012-2-29 21:24:12
OK, let us summarize the information gathered so far:

10.2.0.4 + RAC + 'latch: row cache objects' + kqrbip + KQR L PO -- with this set of keywords we can go to MOS and try to locate an existing case:

ODM Finding:
LCK temporarily stuck waiting for latch 'Child row cache objects' [ID 843638.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.1.0.7 - Release: 10.2 to 11.1
Information in this document applies to any platform.
Symptoms

One RAC instance hangs from time to time. The hang resolves itself after a while until it hangs again after some time.

1. alert.log info

=> alert log of the hanging instance show on one node that the LCK process is
blocked on different enqueues and hanging messages are reported afterwards
(e.g. ORA-3136 and WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK)

LCK0 started with pid=10, OS id=7217
Wed Feb 11 12:25:36 2009
...
Mon Feb 16 19:01:24 2009
GES: Potential blocker (pid=7217) on resource IV-0051D934-1013071D;
enqueue info in file /u01/oracle/admin/SNM/bdump/snm2_lmd0_7006.trc and
DIAG trace file
Mon Feb 16 19:15:59 2009
GES: Potential blocker (pid=7217) on resource CI-00000046-00000002;
enqueue info in file /u01/oracle/admin/SNM/bdump/snm2_lmd0_7006.trc and
DIAG trace file
...
Mon Feb 16 19:26:16 2009
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=25
System State dumped to trace file
/u01/oracle/admin/SNM/bdump/snm2_mmon_7032.trc
...
Tue Feb 16 19:35:22 2009
WARNING: inbound connection timed out (ORA-3136)

=> alert.log on the other nodes show CI crossinstance messages locks

Mon Feb 16 19:27:53 2009
GES: Potential blocker (pid=13607) on resource CI-00000046-00000005;
enqueue info in file /u01/oracle/admin/SNM/udump/snm1_ora_8418.trc and
DIAG trace file

2. Systemstate dumps show the LCK process is blocked in the 'Child row cache objects' latch
and plenty of processes are waiting on it (waiter count high). The LCK stack contains
terms like 'sskgpwwait kslges kqrbip kqrbfr'.

     (latch info) wait_event=0 bits=0
         Location from where call was made: kqrbip:
       waiting for 5795df1a0 Child row cache objects level=4 child#=3
         Location from where latch is held: kghfrunp: clatch: wait:
         Context saved from call: 0
         state=busy, wlstate=free
           waiters [orapid (seconds since: put on list, posted, alive check)]:
            249 (1, 1234808779, 1)
...
            161 (1, 1234808779, 1)
            waiter count=34
           gotten 108711869 times wait, failed first 40623351 sleeps 1869257
           gotten 14091082 times nowait, failed: 43699893
         possible holder pid = 255 ospid=3663
       on wait list for 5795df1a0

3. awrrpt info

=> After the hang, the awrrpt show plenty of 'KQR L PO' memory is freed during the hang

Pool   Name                                 Begin MB         End MB  % Diff
------ ------------------------------ -------------- -------------- -------
shared KQR L PO                              2,118.6        1,005.9  -52.52
shared free memory                             932.9        2,054.4  120.21

=> The rowcache (dictionary cache, the dc_*) usage decrease, too

dc_histogram_defs                 332,935
dc_object_ids                      57,894
dc_objects                        160,639
dc_segments                       382,909

=> big cursors are entering the shared pool regularly
Cause
The shared pool is stressed and memory need to be freed for the new cursors. As a consequence, the dictionary cache is reduced in size by the LCK process causing a temporal hang of the instance since the LCK can't do other activity during that time. Since the dictionary cache is a memory area protected clusterwide in RAC, the LCK is responsible to free it in collaboration with the dictionary cache users (the sessions using cursors referenced in the dictionary cache). This process can be time consuming when the dictionary cache is big.
Solution

a. reduce the stress on the shared pool

=> by increasing it above the automatically reached value with dynamic sga, e.g.
when sga_target is set to 16G and the shared_pool_size was 6G during the hang time, set it to e.g. 8G.

=> by reducing the number of big cursors entering the shared pool, e.g. cursors using more than 1M sharable_mem e.g. via binding
select sql_text from v$sqlarea where sharable_mem > 1M;

b. reduce the dictionary cache usage in order to reduce the size of the dictionary cache, e.g.

=> when dc_histogram_defs is too high, it can point towards histograms calculations on all columns
of the tables. histograms should only be calculated on indexed columns
=> when dc_segments is high compared to dc_object_ids, it can point towards excessive partitioning usage. Reducing the partitions/subpartition usage will help reduce the dictionary cache usage to manage it.

c. set _enable_shared_pool_durations = false to avoid that one duration (a memory area in the shared pool used for a specific usage) need to give all space required for that usage, i.e. in case the duration containing the dictionary cache need to free memory, then that duration is extra stressed since no other type of memory from other durations can be used. Setting it to false make that any type of memory can be used to free space (i.e. any type of memory in the subpool). As a consequence, the number of subpools will be reduced by the factor of the number of durations (4 in 10gR2). Hence tuning the _kghdsidx_count is advisable, e.g. increasing it to have manageable subpool sizes (see note:396940.1).
d. check patch:8666117 has been applied. This patch speedup the processing to free memory.


Hdr: 9014785 10.2.0.3 RDBMS 10.2.0.3 ROW CACHE PRODID-5 PORTID-23
Abstract: LCK PROCESS STUCK, PREVENTS USER CONNECTIONS


PROBLEM:
--------
1. Clear description of the problem encountered:

- 2-node RAC on Solaris.

- One of the ways the database health is monitored is a script which attempts
to connect to the database.
- The script was not able to connect to the database for about an hour.
- This issue affected instance SSUSA1

- alert.log during this for the period :

Fri Sep 18 01:04:42 2009
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=132
System State dumped to trace file
/u01/app/oracle/admin/SSUS/udump/ssusa1_ora_13744.trc
...
Fri Sep 18 01:14:49 2009
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=381
System State dumped to trace file
/u01/app/oracle/admin/SSUS/udump/ssusa1_ora_240.trc
...
Fri Sep 18 01:20:48 2009
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=511
System State dumped to trace file
/u01/app/oracle/admin/SSUS/udump/ssusa1_ora_20042.trc
...
Fri Sep 18 01:21:22 2009
WARNING: inbound connection timed out (ORA-3136)

3. Indication of the frequency and predictability of the problem

- At least once, possibly more but don't have the full traces.

4. Sequence of events leading to the problem

- No particular sequence.

5. Technical impact on the customer. Include persistent after effects.

- Unable to make user connections.

DIAGNOSTIC ANALYSIS:
--------------------
1) Systemstate from 01.04 -  ssusa1_ora_13744.trc:

- LCK waiting for 'latch: row cache objects'
- blocker is Process 444 on CPU doing :

kghfrunp           KGH: Ask client to free unpinned space
kghfnd             KGH: Find a chunk of at least the minimum size
kghalo             KGH: main allocation entry point
kghgex             KGH: Get a new extent
kghalf             KGH: Non-recoverably allocate a freeable chunk of memory


2) 01.14 -  ssusa1_ora_240.trc :

- LCK waiting for 'latch: row cache objects'
- blocker is again P444 on same stack as above.

3) 01.20 - ssusa1_ora_20042.trc

- LCK seems fine here.
- Problems with dc_sequences .
- Asked ct. to increase the cache size for sequences as several are
under-sized.

WORKAROUND:
-----------
- Incident resolved itself and eventually connections could be made.

RELATED BUGS:
-------------
- bug:8666117 "LCK0 PROCESS STUCK AT WAITING FOR "LATCH: ROW CACHE OBJECTS"
looks very similar.

- Particularly in system states 'ssusa1_ora_13744.trc' + 'ssusa1_ora_240.trc'
where LCK is waiting for 'latch: row cache objects' and
the blocker is also on the same stack reported in the bug.

- The bug mentions to compare "...huge decrease in dc_segments Final Usage"
and
"very high GES Releases in dictionary statistics part of AWR " and "and
significant amount of increase for "KQR L PO"

- Have AWRs from 00:00->01.21 + 01.21->02.00 but not sure what constitutes
'huge' and 'very high'.

REPRODUCIBILITY:
----------------

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------
- alert.log
- ssusa1_ora_240.trc   ( systemstate which seems to match 8666117 )
- ssusa1_ora_13744.trc ( systemstate which seems to match 8666117 )
- ssusa1_ora_20042.trc
- awrs_18092009_01_02_Hangs_sr.zip


- LCK State Object ( "ssusa1_ora_13744.trc"):

       Location from where call was made: kqrbip:
      waiting for d32f6c740 Child row cache objects level=4 child#=16
        Location from where latch is held: kghfrunp: clatch: wait:
        Context saved from call: 0
        state=busy, wlstate=free
          waiters [orapid (seconds since: put on list, posted, alive check)]:
           47 (1, 1253235945, 1)
           waiter count=1
          gotten 1402030212 times wait, failed first 39523947 sleeps 362507
          gotten 888811 times nowait, failed: 2288990
        possible holder pid = 444 ospid=12826
....
   waiting for 'latch: row cache objects' blocking sess=0x0 seq=40367
wait_time=0 seconds since wait started=0
                address=d32f6c740, number=c7, tries=1
    Dumping Session Wait History
     for 'latch: row cache objects' count=1 wait_time=84873
                address=d32f6c740, number=c7, tries=0
     for 'latch: shared pool' count=1 wait_time=526
                address=38016eb18, number=d5, tries=0
     for 'latch: row cache objects' count=1 wait_time=145
                address=d32f6c740, number=c7, tries=3
     for 'latch: row cache objects' count=1 wait_time=109356
                address=d32f6c740, number=c7, tries=2
     for 'latch: row cache objects' count=1 wait_time=88
                address=d32f6c740, number=c7, tries=1
     for 'latch: row cache objects' count=1 wait_time=303532
                address=d32f6c740, number=c7, tries=0
     for 'latch: shared pool' count=1 wait_time=493
                address=38016eb18, number=d5, tries=0
     for 'latch: row cache objects' count=1 wait_time=151
                address=d32f6c740, number=c7, tries=3
     for 'latch: row cache objects' count=1 wait_time=106144
                address=d32f6c740, number=c7, tries=2
     for 'latch: row cache objects' count=1 wait_time=271
                address=d32f6c740, number=c7, tries=1
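A hedged note (my addition): the note's check for cursors using more than 1M of sharable memory is not runnable SQL as quoted; a working equivalent, assuming 1 MB = 1048576 bytes, would look like this.

-- Hedged sketch: the big cursors currently in the shared pool (> 1 MB sharable_mem).
select sql_id, sharable_mem, substr(sql_text, 1, 80) sql_text
  from v$sqlarea
 where sharable_mem > 1048576
 order by sharable_mem desc;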


9#
Posted on 2012-2-29 21:26:34
From the systemstate dump:
        131 (2, 1330329779, 2)
           waiter count=21
          gotten 1003473699 times wait, failed first 10013735 sleeps 10852017
          gotten 4774155 times nowait, failed: 7174155
        possible holder pid = 120 ospid=194116

I traced a chain 24 -> 120 -> 133 -> 30 -> 26. Processes 24, 120, 133 and 30 are all waiting for 70000076e8f8630 Child row cache objects, and the chain ends at process 26. Process 26 is the MMNL process and is itself waiting for 'cursor: pin S wait on X', yet the systemstate dump does not show process 26 holding any row cache objects latch. My rough guess is a deadlock: the processes waiting on 70000076e8f8630 are all blocked by process 26, which probably holds this row cache latch while at the same time trying to pin (S) some object (possibly a bug) whose X lock is held by yet another process. A more detailed analysis of the traces is still needed. Also, the systemstate file in the report seems to contain several consecutive state dumps concatenated into one file.


10#
Posted on 2012-2-29 21:38:10
Note [ID 843638.1] describes a system hang caused by the LCK process trying to free the dictionary cache.

The factors matching our case include:

The LCK process stack 'sskgpwwait kslges kqrbip kqrbfr' -- match
The shared pool KQR L PO memory being freed -- match
The wait event 'latch: row cache objects' -- match
Version: 10.2.0.1 to 11.1.0.7 - Release: 10.2 to 11.1 -- version match

The note also gives several recommendations (a runnable sketch of the parameter changes follows after the patch screenshot below), including:

1. Set shared_pool_size, which under Auto SGA / ASMM (sga_target > 0) acts as the minimum size of the shared pool.
2. Reduce the use of the dictionary cache.
3. Set _enable_shared_pool_durations = false; setting this parameter prevents the shared pool from shrinking.

_enable_shared_pool_durations
False if sga_target not set
False if cursor_space_for_time is true

4. Apply patch 8666117. This patch speeds up the processing to free memory.
References

A version of this patch is currently available for AIX 10.2.0.4.



patch_8666117.png
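A hedged sketch of the note's two parameter-level recommendations as runnable commands (my addition; the shared_pool_size value is only the example discussed later in this thread, and the hidden parameter needs an instance restart to take effect):

-- Hedged sketch: set a floor for the shared pool under ASMM (example value),
-- and disable shared pool durations (hidden parameter, cluster-wide, after restart).
alter system set shared_pool_size=3500M scope=spfile sid='*';
alter system set "_enable_shared_pool_durations"=false scope=spfile sid='*';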


11#
Posted on 2012-2-29 21:42:57
What does the KQR L PO component of the shared pool do? Can anyone recommend a reference that explains the various shared pool components? Thanks.


12#
Posted on 2012-2-29 21:56:25

Reply to post #11

> KQR L PO stands for *K*ernel *Q*uery layer *R*ow cache *L*arge *P*arent *O
> *bject. These are dictionary cache objects.
>
> So, you either have a row cache object leak (less likely) or just a lot of
> objects to keep in row cache (row cache = dictionary cache).
>
> For example, if you have a lots of tables with lots of columns and all of
> these columns have histograms, then you'll have lots of histogram data
> cached in dictionary cache's *dc_histogram_defs* section.


KQR L PO
*K*ernel *Q*uery layer *R*ow cache *L*arge *P*arent *O*bject.


SQL> select 'shared pool', ksmssnam, sum(ksmsslen)
  2    from x$ksmss
  3   where ksmsslen > 1
  4     and ksmssnam like 'KQR%'
  5   group by inst_id, 'shared pool', ksmssnam
  6  ;

'SHAREDPOOL KSMSSNAM                   SUM(KSMSSLEN)
----------- -------------------------- -------------
shared pool KQR L SO                          809984
shared pool KQR S SO                           16912
shared pool KQR M SO                         3508656
shared pool KQR X PO                         3422248
shared pool KQR L PO                         7159864
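A hedged addition (not in the original reply): V$ROWCACHE shows how the dictionary cache breaks down into its dc_* sections, which helps decide whether it is dc_histogram_defs, dc_segments, or something else inflating KQR L PO.

-- Hedged sketch: dictionary cache sections ordered by number of cached entries.
select parameter, count, usage, gets, getmisses
  from v$rowcache
 order by count desc;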


13#
Posted on 2012-2-29 22:30:54
Maclean's analysis is brilliant. Very instructive.


14#
Posted on 2012-2-29 23:19:32
An excellent case, well worth studying. Impressive.


15#
Posted on 2012-3-1 08:59:10
> For example, if you have a lots of tables with lots of columns and all of
> these columns have histograms, then you'll have lots of histogram data
> cached in dictionary cache's *dc_histogram_defs* section.

Good to know -- so histogram data is cached in the dictionary cache.
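A hedged illustration (my addition, tying this back to solution b of note 843638.1): gathering statistics with histograms only on indexed columns keeps dc_histogram_defs smaller. The table name below is hypothetical.

-- Hedged sketch: collect histograms only on indexed columns of one table.
begin
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'SOME_TABLE',   -- hypothetical table name
    method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO');
end;
/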


16#
Posted on 2012-3-1 14:31:58

Change made:

I ran alter system set shared_pool_size=3500m scope=spfile; and will watch the effect first.
This database is always like this -- it hangs and the holder can never be found. Frustrating.


17#
Posted on 2012-3-1 15:16:50

Reply to post #16

Actually, what I would recommend even more is setting _enable_shared_pool_durations=false;
setting this parameter prevents the shared pool from shrinking.


18#
Posted on 2012-7-11 11:04:34
If the SGA is not automatically managed,
will the shared pool still shrink?


19#
Posted on 2013-1-17 18:12:47
chen28 wrote on 2012-7-11 11:04:
If the SGA is not automatically managed,
will the shared pool still shrink?

99% of the time it will not; the remaining 1% comes down to certain black-box hidden parameters in 11g.

