Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖

89

积分

0

好友

0

主题
1#
发表于 2012-7-10 21:24:35 | 查看: 6368| 回复: 6
最近被老板要求写一个监控的脚本去检测DB hang的case:就是用户进程还是可以连接到oracle,但DB很忙,用户执行一个sql语句几分钟内都没反应。
要写这个监控脚本首先要重现这个情景,  请问我应该如何做才能重现?
2#
发表于 2012-7-16 17:29:37
顶上来                                    

回复 只看该作者 道具 举报

3#
发表于 2012-7-16 21:12:35
找个性能差的PC,往几张无索引的表里面插入500W+的记录应该可以达到效果

回复 只看该作者 道具 举报

4#
发表于 2012-7-16 22:01:33
ODM TEST:




select to_number(addr,'xxxxxxxxxxxxxxxx') from v$latch_parent where name='process allocation';
select name from v$latch_parent where name like '%library%';
select name from v$latch_children where name like '%library%';
select name from v$latch_parent where name like '%process%';



session A:


SQL> select to_number(addr,'xxxxxxxxxxxxxxxx') from v$latch_parent where name='process allocation';

TO_NUMBER(ADDR,'XXXXXXXXXXXXXXXX')
----------------------------------
                        1610642928

                                               
                                               
SQL> select pid,program from v$process;

       PID PROGRAM
---------- ------------------------------------------------
         1 PSEUDO
         2 oracle@vrh8.oracle.com (PMON)
         3 oracle@vrh8.oracle.com (PSP0)
         4 oracle@vrh8.oracle.com (MMAN)
         5 oracle@vrh8.oracle.com (DBW0)
         6 oracle@vrh8.oracle.com (LGWR)
         7 oracle@vrh8.oracle.com (CKPT)
         8 oracle@vrh8.oracle.com (SMON)
         9 oracle@vrh8.oracle.com (RECO)
        10 oracle@vrh8.oracle.com (CJQ0)
        11 oracle@vrh8.oracle.com (MMON)

SQL> oradebug setorapid 2;
Unix process pid: 6587, image: oracle@vrh8.oracle.com (PMON)


SQL> oradebug suspend;
Statement processed.               
               

此时仍可以登录成功


[oracle@vrh8 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:17:06 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>



session A:

SQL> oradebug call kslgetl 1610642928 1;
Function returned 1               





此时SYSDBA仍能登录,但是非SYSDBA已经不能登录

[oracle@vrh8 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:17:06 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@vrh8 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:18:54 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected.
SQL> Disconnected
[oracle@vrh8 ~]$ sqlplus system/oracle

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:19:02 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

ERROR:
ORA-03113: end-of-file on communication channel



释放的方法:

退出原session A, 并conn / as sysdba后

SQL> oradebug setorapid 2;
Unix process pid: 6587, image: oracle@vrh8.oracle.com (PMON)


SQL> oradebug resume;
Statement processed






以上实现了模拟 用户无法正常登录的情况

SQL> oradebug setorapid 2;
Unix process pid: 6587, image: oracle@vrh8.oracle.com (PMON)
SQL>
SQL>
SQL> oradebug suspend;     
Statement processed.


我们把所有的shared pool latch全部 poke住


select 'oradebug poke 0x'||addr||' 1 0x01' from v$latch_children where name='shared pool';


'ORADEBUGPOKE0X'||ADDR||'10X01'
--------------------------------------------------------------------------------
oradebug poke 0x00000000600E91A0 1 0x01
oradebug poke 0x00000000600E9240 1 0x01
oradebug poke 0x00000000600E92E0 1 0x01
oradebug poke 0x00000000600E9380 1 0x01
oradebug poke 0x00000000600E9420 1 0x01
oradebug poke 0x00000000600E94C0 1 0x01
oradebug poke 0x00000000600E9560 1 0x01

7 rows selected.

SQL>
SQL> oradebug poke 0x00000000600E91A0 1 0x01
BEFORE: [0600E91A0, 0600E91A4) = 00000000
AFTER:  [0600E91A0, 0600E91A4) = 00000001
SQL> oradebug poke 0x00000000600E9240 1 0x01
BEFORE: [0600E9240, 0600E9244) = 00000000
AFTER:  [0600E9240, 0600E9244) = 00000001
SQL> oradebug poke 0x00000000600E92E0 1 0x01
BEFORE: [0600E92E0, 0600E92E4) = 00000000
AFTER:  [0600E92E0, 0600E92E4) = 00000001
SQL> oradebug poke 0x00000000600E9380 1 0x01
BEFORE: [0600E9380, 0600E9384) = 00000000
AFTER:  [0600E9380, 0600E9384) = 00000001
SQL> oradebug poke 0x00000000600E9420 1 0x01
BEFORE: [0600E9420, 0600E9424) = 00000000
AFTER:  [0600E9420, 0600E9424) = 00000001
SQL> oradebug poke 0x00000000600E94C0 1 0x01
BEFORE: [0600E94C0, 0600E94C4) = 00000000
AFTER:  [0600E94C0, 0600E94C4) = 00000001
SQL> oradebug poke 0x00000000600E9560 1 0x01
BEFORE: [0600E9560, 0600E9564) = 00000000
AFTER:  [0600E9560, 0600E9564) = 00000001






这将导致系统全部hang, 已经连接的会话无法执行SQL, 而新的登录将无法成功


[oracle@vrh8 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:58:37 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

hang................................



10g新特性的prelim可以登录成功:


[oracle@vrh8 ~]$ sqlplus -prelim / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:58:49 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


[oracle@vrh8 ~]$ sqlplus -prelim / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 16 09:58:49 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

SQL> oradebug setmypid;
Statement processed.
SQL>
SQL> oradebug dump systemstate 266;

SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_12839.trc







ASS AWK的分析结果


[oracle@vrh8 ~]$ awk -f ass109.awk.1  /s01/admin/G10R25/udump/g10r25_ora_12839.trc

Starting Systemstate 1
................................
Ass.Awk Version 1.0.9 - Processing /s01/admin/G10R25/udump/g10r25_ora_12839.trc

System State 1
~~~~~~~~~~~~~~~~
1:                                      
2:  waiting for 'pmon timer'            
3:  waiting for 'rdbms ipc message'     
4:  waiting for 'rdbms ipc message'     
5:  waiting for 'rdbms ipc message'     
6:  waiting for 'rdbms ipc message'     
7:  last wait for 'control file sequential read'
8:  waiting for 'latch: library cache' [Latch 9c1479a0]
9:  waiting for 'latch: shared pool'   [Latch 600e91a0]
10: waiting for 'latch: shared pool'   [Latch 600e91a0]
11: waiting for 'latch: shared pool'   [Latch 600e91a0]
12: waiting for 'rdbms ipc message'     
13:                                    
14:                                    
15: waiting for 'SQL*Net message from client'
16: waiting for 'rdbms ipc message'     
17:                                    
18:                                    
19: waiting for 'rdbms ipc message'     
20: waiting for 'rdbms ipc message'     
21: waiting for 'Streams AQ: qmn coordinator idle wait'
22: waiting for 'latch: shared pool'   [Latch 600e91a0]
23: waiting for 'latch: shared pool'   [Latch 600e91a0]
24: waiting for 'SQL*Net message from client'
25: waiting for 'latch: library cache' [Latch 9c1479a0]
26: waiting for 'latch: library cache' [Latch 9c147860]
     Cmd: PL/SQL Execute
27: for 'Streams AQ: waiting for messages in the queue'
28: for 'Streams AQ: waiting for time management or cleanup tasks'[Latch 9c147860]
29: waiting for 'Streams AQ: qmn slave idle wait'
30: waiting for 'latch: shared pool'   [Latch 600e91a0]
31: waiting for 'latch: library cache' [Latch 9c147860]
     Cmd: PL/SQL Execute
32:                                    [Latch 600e91a0]
Blockers
~~~~~~~~

        Above is a list of all the processes. If they are waiting for a resource
        then it will be given in square brackets. Below is a summary of the
        waited upon resources, together with the holder of that resource.
        Notes:
        ~~~~~
         o A process id of '???' implies that the holder was not found in the
           systemstate.

                    Resource Holder State
              Latch 9c1479a0    ??? Blocker
              Latch 600e91a0    ??? Blocker
              Latch 9c147860    ??? Blocker

Object Names
~~~~~~~~~~~~
Latch 9c1479a0  Child library cache           
Latch 600e91a0  Child shared pool            
Latch 9c147860  Child library cache           


28454 Lines Processed.

回复 只看该作者 道具 举报

5#
发表于 2012-7-17 14:09:51
虽然看的有点模糊,但我还要继续努力研究下,谢谢

回复 只看该作者 道具 举报

6#
发表于 2012-7-24 17:26:32
原帖由 maclean 于 2012-7-16 22:01 发表
ODM TEST:
我们把所有的shared pool latch全部 poke住
select 'oradebug poke 0x'||addr||' 1 0x01' from v$latch_children where name='shared pool';


谢谢maclean的详细回复! 我的oracle版本是11.2.0.2, 把一个shared pool latch poke住之后 用另一个session虽然可以sqlplus -prelim '/ as sysdba' 连进DB, 但oradebug dump systemstate出错:



  1. oracle@dvm10:/home/oracle>sqlplus -prelim '/ as sysdba'

  2. SQL*Plus: Release 10.2.0.1.0 - Production on Tue Jul 24 17:21:49 2012

  3. Copyright (c) 1982, 2005, Oracle.  All rights reserved.

  4. SQL> oradebug setmypid;
  5. Statement processed.
  6. SQL> oradebug dump systemstate 266;
  7. ORA-03113: end-of-file on communication channel
  8. ORA-24323: value not allowed
  9. SQL> oradebug tracefile_name
  10. ORA-24324: service handle not initialized
  11. ORA-01041: internal error. hostdef extension doesn't exist
  12. ORA-24323: value not allowed
复制代码

回复 只看该作者 道具 举报

7#
发表于 2012-7-24 22:45:41

回复 4# 的帖子

模拟不了
SQL> oradebug call kslgetl 536886644 1
ORA-00074: no process has been specified
SQL> oradebug call kslgetl 536886644 2
ORA-00074: no process has been specified
SQL> oradebug call kslgetl 536886644 3
ORA-00074: no process has been specified

回复 只看该作者 道具 举报

您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-11-16 00:18 , Processed in 0.054415 second(s), 21 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569