lhpapa

A question for Liu about cache buffers chains

1#

Posted 2012-5-2 08:53:00 | Views: 13432 | Replies: 12

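We do not have the application source code, so the SQL statements cannot be changed.
1. 4_28_awr.html shows the initial state, with severe cache buffers chains contention. I then moved the two hot indexes, INFO_KEYWORD and PKKEY_ARTICLE, into separate non-standard tablespaces with 16k and 4k block sizes, hoping that the extra buffer pools would reduce contention.
2. In 4_28_2awr.html there are still quite a few cache buffers chains waits, but the wait time is low and front-end use was not affected.
3. 428_3_awr.html is an AWR report taken just now, after CPU utilization climbed again. Cache buffers chains waits are still numerous and the wait time has clearly gone up.

My question: since the application's SQL cannot be modified, I can only work at the database level and try to reduce hot blocks as far as possible. Could I try raising PCTFREE on the two indexes to 30% or higher? Or put both indexes in 4k non-standard blocks, so that fewer rows land in each block? I suspect the 16k block size I chose may be a problem. Any advice would be appreciated.

[ Last edited by lhpapa on 2012-5-2 08:54 ]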

Attachments:
4_28_2awr.html (217.44 KB, downloads: 769)
4_28_awr.html (260.18 KB, downloads: 764)
428_3_awr.html (215.35 KB, downloads: 770)


jxztj2010

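13#

Posted 2012-7-4 15:51:03

One of my customers also had this event in their Top 5 today, although I suspect in that case it was caused by inefficient SQL. After reading this thread I am still a little fuzzy on it, so I will keep studying.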


ricky

12#

Posted 2012-7-4 15:27:43

Reply to post 8#

Well written. I now understand CBC waits a little better.


orafans

11#

Posted 2012-5-2 14:08:57

Excellent explanation.


dai_xuej

10#

Posted 2012-5-2 11:22:58

I had the same misunderstanding

I also wrongly assumed that read-read access would not cause contention.


不了峰

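9#

Posted 2012-5-2 11:02:00

That makes it clear, thx

I had been stuck on the assumption that concurrent reads of the same block would never produce waits. Now I understand.

[ Last edited by 不了峰 on 2012-5-2 11:04 ]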


Maclean Liu (刘相兵)

8#

Posted 2012-5-2 10:54:54

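"Cache Buffers Chains Latch waits are caused by contention where multiple sessions waiting to read the same block.
So how does that differ from the read by other session wait event?"

Put simply (without going deep into the technical details):
CBC latch contention arises when many sessions need to do a logical read of the same block. A consistent read goes through the kcbgtcr function, which can get the cache buffers chains latch in one of two modes: kcbgtcr: kslbegin excl and kcbgtcr: kslbegin shared. In the AWR reports above it is mainly kslbegin excl.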

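“We generally think of a latch as a mostly-exclusive-access structure, meaning there is very little opportunity for shared access to a latch. Oracle, however, generally states publicly that server processes take the cache buffers chains latch in shared mode when reading data, which has led many people to believe that read-read access cannot cause latch: cache buffers chains contention.
In practice, a query still needs to hold this kind of child latch in exclusive mode most of the time. (It is sometimes held in SHARED mode; that depends on whether the read uses kcbgtcr: kslbegin shared or kcbgtcr: kslbegin excl. kcbgtcr is an important function in the Oracle RDBMS for obtaining consistent reads; the name stands for Kernel Cache Buffer GeT Consistent Read, and it evidently has two ways of getting the cache buffers chains latch, kslbegin shared and excl. Its counterpart is kcbgcur: kslbegin. kcbgcur stands for Kernel Cache Buffer Get Current and is used to get the current block in order to modify it, that is, to "write", so kcbgcur: kslbegin only ever needs to hold the child cache buffers chains latch in excl mode.) The reason is that even a query has to modify the buffer header structure, for example the tch touch count, the holder list hash variables us_nxt and us_prv, and the waiter list hash variables wa_prv and wa_nxt. In other words, read-read access does cause latch free: cache buffers chains waits; it is not only read-write and write-write access that lead to contention on the cache chain latch.”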

FROM http://www.oracledatabase12g.com ... buffers-chains.html

Latch Name            Where                   NoWait Misses  Sleeps      Waiter Sleeps
cache buffers chains  kcbgtcr: kslbegin excl  0              28,633,140  27,628,172
cache buffers chains  kcbrls: kslbegin        0              27,162,431  28,455,820
cache buffers chains  kcbgtcr: fast path      0              1,969       2,664

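read by other session means the block is not yet in the buffer cache and several sessions want to access it. Only one session needs to perform the physical read into the buffer cache; while that session is doing the physical read, the others simply wait on this (non-idle) wait event.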
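To make the distinction concrete, here is a deliberately crude toy model in Python. It is not how Oracle is implemented: the class ToyBufferCache, the constant N_CHAINS and the rule that every session after the first waits on the latch are all assumptions made up for illustration. It only shows which wait is charged in which situation: a block missing from the cache produces read by other session for everyone except the session doing the physical read, while a block already in the cache still costs a chain latch get per logical read, because each read also touches the buffer header.

```python
from collections import Counter

N_CHAINS = 8  # hypothetical number of hash chains, each protected by one child latch


class ToyBufferCache:
    """Toy model only; real buffer cache internals are far more involved."""

    def __init__(self):
        self.touch_count = {}        # block_id -> toy stand-in for tch
        self.latch_gets = Counter()  # chain number -> latch gets
        self.waits = Counter()       # wait event name -> sessions that waited

    def concurrent_read(self, block_id, n_sessions):
        """n_sessions sessions ask for the same block at the same moment."""
        chain = block_id % N_CHAINS
        if block_id not in self.touch_count:
            # Block not cached: one session does the physical read,
            # the rest wait for that read to finish.
            self.waits["read by other session"] += n_sessions - 1
            self.touch_count[block_id] = 0
        for i in range(n_sessions):
            # Every logical read takes the chain latch and updates the header.
            self.latch_gets[chain] += 1
            self.touch_count[block_id] += 1
            if i > 0:
                # Worst-case assumption: exclusive gets, so everyone
                # behind the first session has to wait for the latch.
                self.waits["latch: cache buffers chains"] += 1


cache = ToyBufferCache()
cache.concurrent_read(42, 5)  # first burst: block 42 is not cached yet
cache.concurrent_read(42, 5)  # second burst: block 42 is already cached
print(dict(cache.waits))
```

In this toy run only the first burst adds read by other session waits, while both bursts add latch waits, which is the point made above: keeping a hot block cached removes the physical read wait but not the latch contention.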


不了峰

7#

Posted 2012-5-2 10:45:41

Reply to post 6#

A question, please.
Going by this sentence:
Cache Buffers Chains Latch waits are caused by contention where multiple sessions waiting to read the same block.
how does that differ from the read by other session wait event?


Maclean Liu (刘相兵)

6#

Posted 2012-5-2 10:40:39

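The root cause appears to be that the SQL statements have room for optimization: most of them have very high Buffer Gets, which means a large volume of logical reads.

Logical reads: 952,053.46 (per second) 511,723.68 (per transaction)

In one of the AWR reports, logical reads reach roughly 7 GB per second.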
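For reference, the 7 GB figure can be reproduced with a few lines of Python. The 8 KiB block size is an assumption: the thread never states the standard block size, only that 16k and 4k are the non-standard ones.

```python
# Assumption: the standard block size is 8 KiB (not stated in the thread).
BLOCK_SIZE_BYTES = 8 * 1024

# Logical reads per second, taken from the AWR line quoted above.
logical_reads_per_sec = 952_053.46


def logical_read_throughput_gib(reads_per_sec, block_size_bytes=BLOCK_SIZE_BYTES):
    """Convert logical reads per second into GiB of buffer data visited per second."""
    return reads_per_sec * block_size_bytes / 1024 ** 3


gib_per_sec = logical_read_throughput_gib(logical_reads_per_sec)
print(f"{gib_per_sec:.2f} GiB of buffers visited per second")
```

Under that assumption the result is a little over 7 GiB per second, all of it served from memory under cache buffers chains latches.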

SQL statements with high buffer gets that are candidates for tuning include:

SQL ordered by Gets

Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
Total Buffer Gets: 1,733,719,818
Captured SQL account for 42.9% of Total

Buffer Gets  Executions  Gets per Exec  %Total  CPU Time (s)  Elapsed Time (s)  SQL Id         SQL Module  SQL Text
150,857,840           6  25,142,973.33    8.70       1296.33           1769.90  gmm6ktqrft6n1              select rowid, title, filep...
 57,452,737         234     245,524.52    3.31        676.45            678.49  06r32459s5kzc              select rowid, ArticleID, S...
 50,264,264           2  25,132,132.00    2.90        311.70            329.90  731rq4a5rd9by              select rowid, title, filep...
 28,441,522           2  14,220,761.00    1.64        310.11            570.12  bxa7c0y5p6smg              select rowid, title, filep...
 28,440,949           2  14,220,474.50    1.64        279.11            464.68  68pwkvkgs7taj              select rowid, title, filep...
 28,433,242           2  14,216,621.00    1.64        234.45            391.91  f442cdh9hp6sk              select rowid, title, filep...
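A quick way to see which statements deserve attention first is to recompute gets per execution and share of total from the raw columns. The sketch below uses only the numbers in the table above; the function name summarize and the dictionary layout are illustrative choices, not anything AWR provides.

```python
# (SQL Id, Buffer Gets, Executions), copied from the AWR excerpt above.
rows = [
    ("gmm6ktqrft6n1", 150_857_840, 6),
    ("06r32459s5kzc", 57_452_737, 234),
    ("731rq4a5rd9by", 50_264_264, 2),
    ("bxa7c0y5p6smg", 28_441_522, 2),
    ("68pwkvkgs7taj", 28_440_949, 2),
    ("f442cdh9hp6sk", 28_433_242, 2),
]
TOTAL_BUFFER_GETS = 1_733_719_818


def summarize(rows, total_gets):
    """Return one dict per statement, sorted by gets per execution, highest first."""
    summary = [
        {
            "sql_id": sql_id,
            "gets_per_exec": gets / execs,
            "pct_total": 100 * gets / total_gets,
        }
        for sql_id, gets, execs in rows
    ]
    return sorted(summary, key=lambda r: r["gets_per_exec"], reverse=True)


summary = summarize(rows, TOTAL_BUFFER_GETS)
for r in summary:
    print(f'{r["sql_id"]}  {r["gets_per_exec"]:>16,.2f} gets/exec  {r["pct_total"]:5.2f}% of total')
```

Five of the six statements cost more than 14 million buffer gets per execution, so even a handful of executions is enough to dominate the latch traffic.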

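At a rough glance, bind variables were probably left out on purpose so that the execution plans would be accurate.

Given that the SQL statements cannot be modified, have you considered improving their execution plans by other means (adding hints without changing the statement text)?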

Segments by Logical Reads

Total Logical Reads: 1,733,719,818
Captured Segments account for 98.8% of Total

Owner  Tablespace Name  Object Name    Subobject Name  Obj. Type        Logical Reads  %Total
JSCMS  JSCMS            INFO_KEYWORD                   INDEX            1,159,873,824   66.90
JSCMS  JSCMS            PKKEY_ARTICLE                  INDEX              215,263,104   12.42
JSCMS  JSCMS            ARTICLE        SYS_P76         TABLE PARTITION    198,657,088   11.46

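The segments with the highest logical reads are INFO_KEYWORD and PKKEY_ARTICLE. Moving them to tablespaces with a non-standard block size can indeed relieve, to some extent, the contention caused by interleaved access to these segments' buffers. A 16 KB block size, however, means that rows are packed more densely into each block, which may make contention on buffers within the segment worse.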
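The density argument can be illustrated with rough arithmetic. Everything numeric below is hypothetical (entry size, per-block overhead, entry count); the sketch only shows the direction of the effect: a larger block or a lower PCTFREE puts more index entries behind a single buffer, so the same workload is funnelled onto fewer blocks.

```python
import math


def leaf_block_estimate(n_entries, block_size, pctfree=10,
                        entry_bytes=40, overhead_bytes=200):
    """Rough estimate of index entries per leaf block and leaf blocks needed.

    entry_bytes and overhead_bytes are made-up illustrative values,
    not measurements from the indexes in this thread.
    """
    usable = (block_size - overhead_bytes) * (100 - pctfree) / 100
    entries_per_block = int(usable // entry_bytes)
    return entries_per_block, math.ceil(n_entries / entries_per_block)


N_ENTRIES = 1_000_000  # hypothetical index size
estimates = {}
for block_size in (4096, 8192, 16384):
    estimates[block_size] = leaf_block_estimate(N_ENTRIES, block_size)
    per_block, blocks = estimates[block_size]
    print(f"{block_size:>6} B block: ~{per_block} entries/block, ~{blocks} leaf blocks")

# Raising PCTFREE has the same thinning effect as a smaller block.
print(leaf_block_estimate(N_ENTRIES, 8192, pctfree=30))
```

Note that PCTFREE on an index only takes effect when the index is created or rebuilt, so raising it would require a rebuild to thin out existing leaf blocks.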

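Suggestions:

1.  Consider tuning the SQL without changing its text, using techniques such as SQL Profiles.
2.  Consider using a global hash-partitioned index, although a hash index will not necessarily relieve CBC contention.
3.  If memory allows, keep one of the indexes in the DB cache KEEP pool (rather than the 16k pool) and put the other in a 4k pool.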
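The idea behind suggestion 2 is that hash partitioning scatters neighbouring key values across several index partitions, so they no longer sit side by side in the same leaf block. The sketch below is only an analogy: zlib.crc32 stands in for Oracle's internal partitioning hash, and the key range and the 8 partitions are hypothetical.

```python
import zlib
from collections import Counter

N_PARTITIONS = 8  # hypothetical number of hash partitions


def partition_for(key, n_partitions=N_PARTITIONS):
    """Map a key to a partition number; crc32 is a stand-in, not Oracle's hash."""
    return zlib.crc32(str(key).encode()) % n_partitions


# 200 adjacent key values that would be neighbours in an ordinary B-tree index.
keys = range(1_000_000, 1_000_200)
spread = Counter(partition_for(k) for k in keys)

for part in sorted(spread):
    print(f"partition {part}: {spread[part]} keys")
```

This also shows why the suggestion is hedged: if the contention comes from many sessions reading the very same key, that key still hashes to a single partition and a single block, and partitioning does not help.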

Some master notes related to cache buffers chains:

ODM FINDING:

Cache Buffers Chains Latch waits are caused by contention where multiple sessions waiting to read the same block.

Typical solutions are:-
o Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary.
o Check for suboptimal SQL (this is the most common cause of the events) - look at the execution plan for the
SQL being run and try to reduce the gets per executions which will minimise the number of blocks being accessed
and therefore reduce the chances of multiple sessions contending for the same block

Note 34405.1 WAITEVENT: "buffer busy waits" Reference Note
@Note 42152.1 LATCH: CACHE BUFFERS CHAINS
Note 155971.1 Ext/Pub Resolving Intense and "Random" Buffer Busy Wait Performance Problems:
Note 163424.1 Ext/Pub How To Identify a Hot Block Within The Database Buffer Cache.:

These queries would benefit from tuning. They either do too much buffer gets (logical reads) per execution or just do a lot of buffer gets. Tuning these queries would lower the load on the CPU and reduce the CPU wait time. Check if all objects in these queries have representative and up to date stats present. Also check if all the indexes are present.
If a query does not do an excessive amount of gets for 1 run but when the query runs often, then lowering the amount of buffer gets per run with for example 10% will have a big impact overall.
To see the full SQL open the html AWR report and select SQL Statistics in the Main Report section, then select SQL ordered by Gets clicking on the SQL id then gives the complete statement.
