龙盟编程博客 | 无障碍搜索 | 云盘搜索神器
快速搜索
主页 > 数据库类 > Oracle 技术 >

利用维对象来优化数据仓库的高级技巧

时间:2011-04-12 23:18来源:未知 作者:admin 点击:
分享到:
在Oracle的数据仓库(OLAP)中,实体化视图(MVIEW),查询重写(Query Rewrite)和维(Dimension)是非常重要的优化手段,对于前两者我不想在这儿重复讲了,主要来体验一下维的作用。要发挥维的作用

在Oracle的数据仓库(OLAP)中,实体化视图(MVIEW),查询重写(Query Rewrite)和维(Dimension)是非常重要的优化手段,对于前两者我不想在这儿重复讲了,主要来体验一下维的作用。要发挥维的作用,还是需要用到前面两者,下面是我设计的只有一个维表的最简单的例子。数据库用户除了connect, resource外, 还要给予Query Rewrite,Create Materialized View,Create Dimension权限。

1、创建一个维护表

CREATE TABLE TIME_DIM
AS
SELECT TO_CHAR(SYSDATE+ROWNUM,'YYYY') F_YEAR,
TO_CHAR(SYSDATE+ROWNUM,'YYYY-Q') F_QUATER,
TO_CHAR(SYSDATE+ROWNUM,'YYYY-MM') F_MONTH,
TRUNC(SYSDATE+ROWNUM,'DD') F_DAY
FROM DBA_OBJECTS
WHERE ROWNUM < 1000;
ALTER TABLE TIME_DIM MODIFY F_YEAR NOT NULL;
ALTER TABLE TIME_DIM MODIFY F_QUATER NOT NULL;
ALTER TABLE TIME_DIM MODIFY F_MONTH NOT NULL;
ALTER TABLE TIME_DIM MODIFY F_DAY NOT NULL;
ALTER TABLE TIME_DIM ADD PRIMARY KEY (F_DAY);

2、创建一个事实表

CREATE TABLE FACT_SALES
AS
SELECT TRUNC(SYSDATE+ROWNUM,'DD') F_DAY,
TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT1,
TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT2
FROM DBA_OBJECTS
WHERE ROWNUM < 1000;

-- Please execute the following insert multiple times
INSERT INTO FACT_SALES
SELECT F_DAY,
TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT1,
TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT2
FROM FACT_SALES

ALTER TABLE FACT_SALES MODIFY F_DAY NOT NULL;
ALTER TABLE FACT_SALES ADD FOREIGN KEY (F_DAY) REFERENCES TIME_DIM(F_DAY);

3、将数据按月份进行汇总生成中间表

CREATE MATERIALIZED VIEW MV_FACT_SALES
ENABLE QUERY REWRITE
AS
SELECT D.F_MONTH,
SUM(F.M_AMOUNT1) M_AMOUNT1,
SUM(F.M_AMOUNT2) M_AMOUNT2
FROM TIME_DIM D, FACT_SALES F
WHERE D.F_DAY = F.F_DAY
GROUP BY D.F_MONTH
/

4、分析表, 并在会话级启用查询重写

ANALYZE TABLE TIME_DIM COMPUTE STATISTICS;
ANALYZE TABLE FACT_SALES  COMPUTE STATISTICS;
ANALYZE TABLE MV_FACT_SALES COMPUTE STATISTICS;
ALTER SESSION SET QUERY_REWRITE_ENABLED=TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY=TRUSTED;

5、先来看一下按季汇总的SQL语句的执行计划, 理论上, 可以从按月的汇总中进一步汇总生成, 但这时去执行的话, Oracle并不能这样执行

ASQL> SELECT D.F_QUATER,
2        SUM(F.M_AMOUNT1) M_AMMOUNT1,
3        SUM(F.M_AMOUNT2) M_AMMOUNT2
4 FROM TIME_DIM D, FACT_SALES F
5 WHERE D.F_DAY = F.F_DAY
6 GROUP BY D.F_QUATER
7 /

Execute Plan
SQLPLAN                                                  COST   CARD KBYTE PS PE
-------------------------------------------------------- ---- ------ ----- -- --
0     SELECT STATEMENT Optimizer=ALL_ROWS               626     12     1     
1   0   HASH (GROUP BY)                                 626     12     1     
2   1     HASH JOIN                                     609 287712 16858     
3   2       TABLE ACCESS (FULL) OF TIME_DIM (TABLE)       4    999    13     
4   2       TABLE ACCESS (FULL) OF FACT_SALES (TABLE)   602 287712 13206 

6、接下来来创建一个维对象, 用来告诉Oracle在TIME_DIM表的四个字段上存在的树状关系, 如果没有这个声明, Oracle会认为数据是不附合这个树状关系的

CREATE DIMENSION TIME_DIM
LEVEL YEAR IS (TIME_DIM.F_YEAR)
LEVEL QUATER IS (TIME_DIM.F_QUATER)
LEVEL MONTH IS (TIME_DIM.F_MONTH)
LEVEL DAY IS (TIME_DIM.F_DAY)
HIERARCHY Y_Q_M_D
(
DAY CHILD OF
MONTH CHILD OF
QUATER CHILD OF  YEAR
)
HIERARCHY Y_M_D
(
DAY CHILD OF
MONTH CHILD OF YEAR
)

7、建好维后, 重新跑按季度汇总或按年汇总的SQL, 看他们的执行计划有什么不同?

ASQL> SELECT D.F_QUATER,
2        SUM(F.M_AMOUNT1) M_AMMOUNT1,
3        SUM(F.M_AMOUNT2) M_AMMOUNT2
4 FROM TIME_DIM D, FACT_SALES F
5 WHERE D.F_DAY = F.F_DAY
6 GROUP BY D.F_QUATER
7 /

Execute Plan
SQLPLAN                                                     COST CARD KBYTE PS PE
----------------------------------------------------------- ---- ---- ----- -- --
0     SELECT STATEMENT Optimizer=ALL_ROWS                   10   12     1     
1   0   HASH (GROUP BY)                                     10   12     1     
2   1     HASH JOIN                                          9  289    17     
3   2       MAT_VIEW REWRITE ACCESS (FULL) OF MV_FACT_...    3   34     2     
4   2       VIEW OF                                          5  289     4     
5   4         HASH (UNIQUE)                                  5  289     4     
6   5           TABLE ACCESS (FULL) OF TIME_DIM (TABLE)      4  999    13     

ASQL> SELECT /*+ all_rows */ D.F_YEAR,
2        SUM(F.M_AMOUNT1) M_AMMOUNT1,
3        SUM(F.M_AMOUNT2) M_AMMOUNT2
4 FROM TIME_DIM D, FACT_SALES F
5 WHERE D.F_DAY = F.F_DAY
6 GROUP BY D.F_YEAR
7 /

Execute Plan
SQLPLAN                                                      COST CARD KBYTE PS PE
------------------------------------------------------------ ---- ---- ----- -- --
0     SELECT STATEMENT Optimizer=HINT: ALL_ROWS              10    4     0     
1   0   HASH (GROUP BY)                                      10    4     0     
2   1     HASH JOIN                                           9   97     6     
3   2       MAT_VIEW REWRITE ACCESS (FULL) OF MV_FACT_S...    3   34     2     
4   2       VIEW OF                                           5   97     1     
5   4         HASH (UNIQUE)                                   5   97     1     
6   5           TABLE ACCESS (FULL) OF TIME_DIM (TABLE)       4  999    11      

我不太会用语言表解说, 因此设计了这个实例来说明, 如果看不懂, 请多看几次吧!

(责任编辑 火凤凰 sunsj@51cto.com  QQ:34067741  TEL:(010)68476636-8007)


精彩图集

赞助商链接