12 积分	0 好友	2 主题

发消息

hash 分区表的partition数是否有特殊要求？

1^#

发表于 2012-8-7 14:14:52 | 查看: 5376| 回复: 2

刘大，求教一个问题，两张大表 A（100W数据）与 B（5000W数据）关联查询执行计划是HASH JOIN ，而且查询时间要200多秒，我想把表A与B根据关联id分别建成HASH分区表，但在网上看到一篇文章说 “hash分区表的分区个数较多,那每个分区段上的事务都会在v$lock上有所反映；如果hash partition的分区个数不是2的幂,则可能会造成每个hash分区段中记录数不够均衡 ”，是否只能建成 2、4、8、16 这种类型的？还有我们表空间是使用 assm uniform size 2M，在这个表空间上建hash分区表是否会浪费大量空间？最后一个问题，刘大以您以往经验，hash分区表在系统中用的比较多，以前都是用range 和 list分区，第一次使用hash分区比较没底

分享0

收藏0 回复只看该作者道具举报

Maclean Liu(刘相兵

2^#

发表于 2012-8-7 23:24:27

ODM FINDING:

A hash-partitioned table has 5 hash partitions, which are partitioned on a numeric column.  When 850 rows are inserted, the counts per partition appear skewed.

The rows are distributed as follows:

SQL> select count(*) from IDORRCC.PRUEBA_NACIMIENTOS partition(P1);

COUNT(*)
----------
      0

SQL> select count(*) from IDORRCC.PRUEBA_NACIMIENTOS partition(P2);

COUNT(*)
----------
      0

SQL> select count(*) from IDORRCC.PRUEBA_NACIMIENTOS partition(P3);

COUNT(*)
----------
   500

SQL> select count(*) from IDORRCC.PRUEBA_NACIMIENTOS partition(P4);

COUNT(*)
----------
   250

SQL> select count(*) from IDORRCC.PRUEBA_NACIMIENTOS partition(P5);

COUNT(*)
----------
   100
Cause
Bug 9348894 - ROWS PER PARTITION SKEWED IN HASH PARTITIONED TABLE was filed for this issue, which was closed as not a bug.
Solution
In order for rows inserted into a hash-partitioned table to be evenly spread among all partitions, the following conditions must be met:

The number of partitions needs to be a power of 2 (ex: 2, 4, 8, 16, 32, etc.).
The partitioning keys need to be sequential (or near sequential).

Note: A partitioning key will always map to the same partition; that is, it always gets the same hash value that is used to map the row to a partition. You could, for example, see partitions that do not get rows if you load a hash-partitioned table from a partitioned table that has had partitions that were dropped or had data exchanged out to an exchange table (i.e. the partition is empty).

For keys that are of data type varchar2, the hash will be generated from the ascii value of the key and then mapped to a partition, based on the number of partitions.

If you need to know the hash value for a particular key, please refer to Document 67631.1 FUNCTION DBMS_UTILITY.GET_HASH_VALUE.

为了避免分区内的行数倾斜，如上面的例子使用5个hash partition 导致分区内的行数有多又少不均匀，  所以oracle推荐使用 2, 4, 8, 16, 32, 这样的分区数目

uniform size 2M，在这个表空间上建hash分区表是否会浪费大量空间？

如果你的单个segment的大小较大例如超过 1个G 那么使用2MB的extent size 谈不上什么浪费

hash partition 的存在主要是为了缓解热块和插入并行冲突，  和 range 、list 的使用目的不一样，利用分区partitioning特性是为了实现一些目的，如果hash partiton可以满足你的目的那么就ok

回复只看该作者道具举报

fufuh2o

3^#

发表于 2012-12-6 13:23:27

不错,学习了

回复只看该作者道具举报

返回列表

		自动登录	找回密码
密码			注册