Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖

75

积分

1

好友

8

主题
1#
发表于 2015-1-11 13:58:12 | 查看: 3901| 回复: 7
本帖最后由 yehc@epsoft.com 于 2015-1-11 13:59 编辑

1、环境描述:
  1. 3节点RAC
  2. 数据库版本:10.2.0.5.0
  3. OS : HP-UX 11i.V3
  4. 存储结构:Storage Foundation
复制代码
2、故障现象描述
---- Node 3 alert日志截取:
  1. Mon Dec 29 08:59:16 EAT 2014
  2. Thread 3 advanced to log sequence 26432 (LGWR switch)
  3.   Current log# 26 seq# 26432 mem# 0: /oradata/oradata/szyb/szyb_redo3_26-500m.log
  4. Mon Dec 29 10:11:48 EAT 2014
  5. Errors in file /oradata/admin/szyb/udump/szyb3_ora_7686.trc:
  6. ORA-27300: 操作系统系统相关操作: invalid_process_id 失败, 状态为: 0
  7. ORA-27301: 操作系统故障消息: Error 0
  8. ORA-27302: 错误发生在: skgpalive1
  9. Mon Dec 29 10:14:14 EAT 2014
  10. Errors in file /oradata/admin/szyb/udump/szyb3_ora_7686.trc:
  11. ORA-27300: 操作系统系统相关操作: invalid_process_id 失败, 状态为: 0
  12. ORA-27301: 操作系统故障消息: Error 0
  13. ORA-27302: 错误发生在: skgpalive1
  14. Mon Dec 29 10:16:53 EAT 2014
  15. >>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=16
  16. System State dumped to trace file /oradata/admin/szyb/bdump/szyb3_mmon_2733.trc
复制代码
----Node2 alert 日志截取:
  1. Mon Dec 29 09:32:26 EAT 2014
  2. Thread 2 cannot allocate new log, sequence 45567
  3. Checkpoint not complete
  4.   Current log# 9 seq# 45566 mem# 0: /oradata/oradata/szyb/szyb_redo2_9-500m.log
  5. Mon Dec 29 09:54:31 EAT 2014
  6. MMNL absent for 1201 secs; Foregrounds taking over
  7. Mon Dec 29 10:22:06 EAT 2014
  8. Process startup failed, error stack:
  9. Mon Dec 29 10:22:06 EAT 2014
  10. Errors in file /oradata/admin/szyb/bdump/szyb2_psp0_5034.trc:
  11. ORA-27300: OS system dependent operation:fork failed with status: 12
  12. ORA-27301: OS failure message: Not enough space
  13. ORA-27302: failure occurred at: skgpspawn3
  14. Mon Dec 29 10:22:07 EAT 2014
  15. Process J008 died, see its trace file
  16. Mon Dec 29 10:22:07 EAT 2014
  17. kkjcre1p: unable to spawn jobq slave process
  18. Mon Dec 29 10:22:07 EAT 2014
  19. Errors in file /oradata/admin/szyb/bdump/szyb2_cjq0_5056.trc:

  20. Mon Dec 29 10:22:07 EAT 2014
  21. Process startup failed, error stack:
  22. Mon Dec 29 10:22:07 EAT 2014
  23. Errors in file /oradata/admin/szyb/bdump/szyb2_psp0_5034.trc:
  24. ORA-27300: OS system dependent operation:fork failed with status: 12
  25. ORA-27301: OS failure message: Not enough space
  26. ORA-27302: failure occurred at: skgpspawn3
  27. Mon Dec 29 10:22:08 EAT 2014
  28. Process J008 died, see its trace file
  29. Mon Dec 29 10:22:08 EAT 2014
  30. kkjcre1p: unable to spawn jobq slave process
  31. Mon Dec 29 10:22:08 EAT 2014
  32. Errors in file /oradata/admin/szyb/bdump/szyb2_cjq0_5056.trc:
复制代码
---Node 1 alert 日志截取:
  1. Mon Dec 29 08:44:49 EAT 2014
  2. Thread 1 advanced to log sequence 35397 (LGWR switch)
  3.   Current log# 14 seq# 35397 mem# 0: /oradata/oradata/szyb/szyb_redo1_14-500m.log
  4. Mon Dec 29 09:15:05 EAT 2014
  5. Thread 1 advanced to log sequence 35398 (LGWR switch)
  6.   Current log# 15 seq# 35398 mem# 0: /oradata/oradata/szyb/szyb_redo1_15-500m.log
  7. Mon Dec 29 10:17:34 EAT 2014
  8. Shutting down instance (immediate)
  9. Mon Dec 29 10:17:34 EAT 2014
  10. Shutting down instance: further logons disabled
  11. Mon Dec 29 10:18:22 EAT 2014
  12. kkjcre1p: unable to spawn jobq slave process, error 1089
  13. Mon Dec 29 10:18:27 EAT 2014
  14. kkjcre1p: unable to spawn jobq slave process, error 1089
  15. Mon Dec 29 10:18:32 EAT 2014
  16. kkjcre1p: unable to spawn jobq slave process, error 1089
  17. Mon Dec 29 10:18:37 EAT 2014
  18. kkjcre1p: unable to spawn jobq slave process, error 1089
  19. Mon Dec 29 10:18:42 EAT 2014
  20. kkjcre1p: unable to spawn jobq slave process, error 1089
  21. Mon Dec 29 10:18:47 EAT 2014
  22. kkjcre1p: unable to spawn jobq slave process, error 1089
  23. Mon Dec 29 10:18:52 EAT 2014
  24. kkjcre1p: unable to spawn jobq slave process, error 1089
复制代码
类似情况出现了2次,到目前也没用头绪,希望各位大神帮定位定位!!!

szyb3_mmon_2733.rar

910.48 KB, 下载次数: 1555

2#
发表于 2015-1-11 15:03:55
odm finding:

[oracle@vrh8 ~]$ awk -f ass1033.awk szyb3_mmon_2733.trc

Starting Systemstate 1
..............................................................................
...............................................................................
..............
Ass.Awk Version 1.0.33
~~~~~~~~~~~~~~~~~~~~~~
Source file : szyb3_mmon_2733.trc

System State 1  (2014-12-29 10:16:53.844)
~~~~~~~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~~~~
WARNING: The following processes had a corrupted / in-flux state object tree :
Process 156: at line 176333

1:                                      [DEAD]
2:  waiting for 'pmon timer'            
3:  waiting for 'DIAG idle wait'        
4:  waiting for 'rdbms ipc message'     
5:  waiting for 'rdbms ipc message'     
6:  waiting for 'ges remote message'   
7:  last wait for 'gcs remote message'  
8:  last wait for 'gcs remote message'  
9:  waiting for 'rdbms ipc message'     
10: waiting for 'rdbms ipc message'     
11: waiting for 'rdbms ipc message'     
12: waiting for 'rdbms ipc message'     
13: waiting for 'enq: RO - fast object reuse'[Enq RO-0003000D-00000001]
14: waiting for 'rdbms ipc message'     
15: waiting for 'rdbms ipc message'     
16: last wait for 'ksdxexeotherwait'   [Rcache object=c0000004fdd21218,]
17: last wait for 'rdbms ipc message'   
18:                                    
19:                                    
20: waiting for 'rdbms ipc message'     
21: waiting for 'DFS lock handle'      
     Cmd: Insert
22: waiting for 'DFS lock handle'      
     Cmd: Insert
23: waiting for 'rdbms ipc message'     
24: waiting for 'rdbms ipc message'     
25: waiting for 'rdbms ipc message'     
26: waiting for 'Streams AQ: qmn coordinator idle wait'
27: waiting for 'SQL*Net message from client'
28: waiting for 'SQL*Net message from client'
29: waiting for 'SQL*Net message from client'
30: waiting for 'Streams AQ: waiting for time management or cleanup tasks'
31:                                    
32: waiting for 'SQL*Net message from client'
33: waiting for 'SQL*Net message from client'
34: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
35: waiting for 'enq: TX - row lock contention'[Enq TX-00070017-008921CA]
     Cmd: Call Method
36: waiting for 'SQL*Net message from client'
37: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
38: waiting for 'SQL*Net message from client'
39: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
40: waiting for 'DFS lock handle'      
     Cmd: Insert
41: waiting for 'DFS lock handle'      
     Cmd: Insert
42: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
43:                                    
44: waiting for 'enq: TX - row lock contention'[Enq TX-00070017-008921CA]
     Cmd: Call Method
45:                                    
46: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
47: waiting for 'DFS lock handle'      
     Cmd: Insert
48: waiting for 'DFS lock handle'      
     Cmd: Insert
49: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
50: waiting for 'gc current request'   
     Cmd: Call Method
51: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
52: waiting for 'enq: TX - row lock contention'[Enq TX-00070017-008921CA]
     Cmd: Call Method
53: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
54: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
55: waiting for 'DFS lock handle'      
     Cmd: Select
56: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
57: waiting for 'gc buffer busy'        
     Cmd: Call Method
58: waiting for 'SQL*Net message from client'
59: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
60: waiting for 'SQL*Net message from client'
61: waiting for 'enq: TX - row lock contention'[Enq TX-00130003-008C762F]
     Cmd: Call Method
62:                                    
63: waiting for 'enq: TX - row lock contention'[Enq TX-00070017-008921CA]
     Cmd: Call Method
64: waiting for 'DFS lock handle'      
     Cmd: Insert
65:                                    
66: waiting for 'DFS lock handle'      
     Cmd: Select
67: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
68: waiting for 'gc current request'   
     Cmd: Call Method
69: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
70: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
71: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
72: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
73: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
74: waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
75: waiting for 'enq: TX - row lock contention'[Enq TX-00130003-008C762F]
     Cmd: Call Method
76:                                    
77: waiting for 'Streams AQ: qmn slave idle wait'
78: waiting for 'DFS lock handle'      
     Cmd: Insert
79: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
80: waiting for 'SQL*Net message from client'
81: waiting for 'SQL*Net message from client'
82:                                    
83: waiting for 'SQL*Net message from client'
84: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
85: waiting for 'SQL*Net message from client'
86: waiting for 'SQL*Net message from client'
87: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
88: waiting for 'SQL*Net message from client'
     Cmd: Select
89: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
90: waiting for 'enq: RO - fast object reuse'[Enq RO-0003005A-00000001]
91: waiting for 'DFS lock handle'      
     Cmd: Select
92:                                    
93: waiting for 'SQL*Net message from client'
94: waiting for 'SQL*Net message from client'
95: waiting for 'enq: SQ - contention'  
     Cmd: Call Method
96: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
97: waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
98: waiting for 'SQL*Net message from client'
     Cmd: Select
99: waiting for 'DFS lock handle'      
     Cmd: Insert
100:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
101:waiting for 'SQL*Net message from client'
102:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
103:waiting for 'SQL*Net message from client'
     Cmd: Select
104:waiting for 'enq: SQ - contention'  
     Cmd: Call Method
105:waiting for 'enq: SQ - contention'  
     Cmd: Call Method
106:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
107:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
108:waiting for 'enq: SQ - contention'  
     Cmd: Call Method
109:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
110:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
111:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
112:waiting for 'SQL*Net message from client'
113:waiting for 'SQL*Net message from client'
114:waiting for 'SQL*Net message from client'
115:waiting for 'SQL*Net message from client'
116:waiting for 'enq: SQ - contention'  
     Cmd: Call Method
117:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
118:waiting for 'enq: SQ - contention'  
     Cmd: Call Method
119:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
120:waiting for 'SQL*Net message from client'
121:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
122:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
123:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
124:waiting for 'SQL*Net message from client'
     Cmd: Select
125:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
126:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
127:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
128:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
129:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
130:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
131:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
132:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
133:waiting for 'gc current request'   
     Cmd: PL/SQL Execute
134:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
135:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
136:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
137:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
138:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
139:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
140:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
141:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
142:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
143:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
144:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
145:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
146:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
147:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
148:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
149:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
150:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
151:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
152:waiting for 'gc buffer busy'       [Buffer 0x07c26d24]
     Cmd: Call Method
153:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
154:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
155:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
156:waiting for 'gc buffer busy'        
     Cmd: Call Method
161:waiting for 'row cache lock'       [Rcache object=c0000004fdd21218,]
     Cmd: Call Method
256:waiting for 'DFS lock handle'      
     Cmd: Insert
307:waiting for 'SQL*Net message from client'
321:                                    
441:waiting for 'SQL*Net message from client'
446:                                    
558:waiting for 'DFS lock handle'       [DEAD]
     Cmd: Select
572:                                    
584:waiting for 'enq: SQ - contention'  [DEAD]
     Cmd: Call Method
623:waiting for 'enq: SQ - contention'  [DEAD]
     Cmd: Call Method
628:waiting for 'enq: SQ - contention'  [DEAD]
     Cmd: Call Method
629:waiting for 'enq: SQ - contention'  [DEAD]
     Cmd: Call Method
637:waiting for 'enq: SQ - contention'  [DEAD]
     Cmd: Call Method
656:waiting for 'DFS lock handle'       [DEAD]
     Cmd: Select
699:waiting for 'SQL*Net message from client'

Blockers
~~~~~~~~

        Above is a list of all the processes. If they are waiting for a resource
        then it will be given in square brackets. Below is a summary of the
        waited upon resources, together with the holder of that resource.
        Notes:
        ~~~~~
         o A process id of '???' implies that the holder was not found in the
           systemstate. (The holder may have released the resource before we
           dumped the state object tree of the blocking process).
         o Lines with 'Enqueue conversion' below can be ignored *unless*
           other sessions are waiting on that resource too. For more, see
           http://dlsunuk11.uk.oracle.com/Public/TOOLS/Ass.html#enqcnv)

                    Resource Holder State
    Enq RO-0003000D-00000001    12: waiting for 'rdbms ipc message'
    Enq RO-0003000D-00000001    13: 13: is waiting for 12: 13:
Rcache object=c0000004fdd21218,    68: waiting for 'gc current request'
           Buffer 0x07c26d24    50: waiting for 'gc current request'
    Enq TX-00070017-008921CA    ??? Blocker
    Enq TX-00130003-008C762F    ??? Blocker
    Enq RO-0003005A-00000001    12: waiting for 'rdbms ipc message'
    Enq RO-0003005A-00000001    90: 90: is waiting for 12: 90:

Blockers According to Tracefile Wait Info:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. This may not work for 64bit platforms. See bug 2902997 for details.
2. If the blocking process is shown as 0 then that session may no longer be
   present.
3. If resources are held across code layers then sometimes the tracefile wait
   info will not recognise the problem.

No blockers seen.

Object Names
~~~~~~~~~~~~
Enq RO-0003000D-00000001                                      
Rcache object=c0000004fdd21218, cid=3(dc_rollback_segments)   
Buffer 0x07c26d24                                             
Enq TX-00070017-008921CA                                      
Enq TX-00130003-008C762F                                      
Enq RO-0003005A-00000001                                    

回复 只看该作者 道具 举报

3#
发表于 2015-1-11 15:04:24
2节点:

ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3

Not enough space的具体原因是否已定位?

回复 只看该作者 道具 举报

4#
发表于 2015-1-11 15:39:20
本帖最后由 yehc@epsoft.com 于 2015-1-11 15:41 编辑
Maclean Liu(刘相兵 发表于 2015-1-11 15:04
2节点:

ORA-27300: OS system dependent operation:fork failed with status: 12


1号节点最先抛出Not enough space, Dec 29 00:30左右应该发起的NBU备份没有正常发起,前后差距显示对比如下:
--正常情况下:
  1. Sat Dec 27 22:20:14 EAT 2014
  2. Thread 1 advanced to log sequence 35378 (LGWR switch)
  3.   Current log# 15 seq# 35378 mem# 0: /oradata/oradata/szyb/szyb_redo1_15-500m.log
  4. Sun Dec 28 00:30:04 EAT 2014
  5. Thread 1 advanced to log sequence 35379 (LGWR switch)
  6.   Current log# 16 seq# 35379 mem# 0: /oradata/oradata/szyb/szyb_redo1_16-500m.log
  7. Sun Dec 28 03:11:06 EAT 2014
  8. Starting control autobackup
  9. Sun Dec 28 03:12:06 EAT 2014
  10. Control autobackup written to SBT_TAPE device
  11.         comment 'API Version 2.0,MMS Version 5.0.0.0',
  12.         media 'AA02L2'
  13.         handle 'c-2952192234-20141228-00'
  14. Sun Dec 28 03:12:13 EAT 2014
  15. ALTER SYSTEM ARCHIVE LOG
  16. Sun Dec 28 03:12:17 EAT 2014
  17. Thread 1 advanced to log sequence 35380 (LGWR switch)
  18.   Current log# 17 seq# 35380 mem# 0: /oradata/oradata/szyb/szyb_redo1_17-500m.log
  19. Sun Dec 28 03:12:22 EAT 2014
  20. ALTER SYSTEM ARCHIVE LOG
  21. Sun Dec 28 03:12:22 EAT 2014
  22. Thread 1 advanced to log sequence 35381 (LGWR switch)
  23.   Current log# 18 seq# 35381 mem# 0: /oradata/oradata/szyb/szyb_redo1_18-500m.log
复制代码
--29日未发起备份操作:
  1. Sun Dec 28 22:55:42 EAT 2014
  2. Thread 1 advanced to log sequence 35390 (LGWR switch)
  3.   Current log# 17 seq# 35390 mem# 0: /oradata/oradata/szyb/szyb_redo1_17-500m.log
  4. Mon Dec 29 00:22:10 EAT 2014
  5. Thread 1 advanced to log sequence 35391 (LGWR switch)
  6.   Current log# 18 seq# 35391 mem# 0: /oradata/oradata/szyb/szyb_redo1_18-500m.log
  7. Mon Dec 29 01:00:30 EAT 2014
  8. Thread 1 advanced to log sequence 35392 (LGWR switch)
  9.   Current log# 19 seq# 35392 mem# 0: /oradata/oradata/szyb/szyb_redo1_19-500m.log
  10. Mon Dec 29 03:29:46 EAT 2014
  11. Thread 1 advanced to log sequence 35393 (LGWR switch)
  12.   Current log# 20 seq# 35393 mem# 0: /oradata/oradata/szyb/szyb_redo1_20-500m.log
  13. Mon Dec 29 04:42:15 EAT 2014
  14. Thread 1 advanced to log sequence 35394 (LGWR switch)
  15.   Current log# 11 seq# 35394 mem# 0: /oradata/oradata/szyb/szyb_redo1_11-500m.log
  16. Mon Dec 29 07:37:56 EAT 2014
  17. ……
  18. Shutting down instance (immediate)
复制代码
随后数据库出现hang现象
--2号节点出现redo无法切换:
  1. Current log# 1 seq# 45548 mem# 0: /oradata/oradata/szyb/szyb_redo2_1-500m.log
  2. Sun Dec 28 03:12:14 EAT 2014
  3. Thread 2 advanced to log sequence 45549 (LGWR switch)
  4.   Current log# 2 seq# 45549 mem# 0: /oradata/oradata/szyb/szyb_redo2_2-500m.log
  5. Sun Dec 28 03:12:23 EAT 2014
  6. Thread 2 advanced to log sequence 45550 (LGWR switch)
  7.   Current log# 3 seq# 45550 mem# 0: /oradata/oradata/szyb/szyb_redo2_3-500m.log
  8. Sun Dec 28 08:35:50 EAT 2014
  9. Thread 2 advanced to log sequence 45551 (LGWR switch)
  10.   Current log# 4 seq# 45551 mem# 0: /oradata/oradata/szyb/szyb_redo2_4-500m.log
  11. Sun Dec 28 09:30:54 EAT 2014
  12. Thread 2 advanced to log sequence 45552 (LGWR switch)
  13.   Current log# 5 seq# 45552 mem# 0: /oradata/oradata/szyb/szyb_redo2_5-500m.log
  14. Sun Dec 28 10:28:00 EAT 2014
  15. Thread 2 advanced to log sequence 45553 (LGWR switch)
  16.   Current log# 6 seq# 45553 mem# 0: /oradata/oradata/szyb/szyb_redo2_6-500m.log
  17. Sun Dec 28 11:15:37 EAT 2014
  18. Thread 2 advanced to log sequence 45554 (LGWR switch)
  19.   Current log# 7 seq# 45554 mem# 0: /oradata/oradata/szyb/szyb_redo2_7-500m.log
  20. Sun Dec 28 11:49:32 EAT 2014
  21. Thread 2 advanced to log sequence 45555 (LGWR switch)
  22.   Current log# 8 seq# 45555 mem# 0: /oradata/oradata/szyb/szyb_redo2_8-500m.log
  23. Sun Dec 28 14:40:21 EAT 2014
  24. Thread 2 advanced to log sequence 45556 (LGWR switch)
  25.   Current log# 9 seq# 45556 mem# 0: /oradata/oradata/szyb/szyb_redo2_9-500m.log
  26. Sun Dec 28 18:23:13 EAT 2014
  27. Thread 2 advanced to log sequence 45557 (LGWR switch)
  28.   Current log# 10 seq# 45557 mem# 0: /oradata/oradata/szyb/szyb_redo2_10-500m.log
  29. Mon Dec 29 00:03:29 EAT 2014
  30. Thread 2 advanced to log sequence 45558 (LGWR switch)
  31.   Current log# 1 seq# 45558 mem# 0: /oradata/oradata/szyb/szyb_redo2_1-500m.log
  32. Mon Dec 29 03:29:47 EAT 2014
  33. Thread 2 advanced to log sequence 45559 (LGWR switch)
  34.   Current log# 2 seq# 45559 mem# 0: /oradata/oradata/szyb/szyb_redo2_2-500m.log
  35. Mon Dec 29 06:37:10 EAT 2014
  36. Thread 2 advanced to log sequence 45560 (LGWR switch)
  37.   Current log# 3 seq# 45560 mem# 0: /oradata/oradata/szyb/szyb_redo2_3-500m.log
  38. Mon Dec 29 07:17:32 EAT 2014
  39. Thread 2 advanced to log sequence 45561 (LGWR switch)
  40.   Current log# 4 seq# 45561 mem# 0: /oradata/oradata/szyb/szyb_redo2_4-500m.log
  41. Mon Dec 29 07:37:54 EAT 2014
  42. Thread 2 advanced to log sequence 45562 (LGWR switch)
  43.   Current log# 5 seq# 45562 mem# 0: /oradata/oradata/szyb/szyb_redo2_5-500m.log
  44. Mon Dec 29 07:58:25 EAT 2014
  45. Thread 2 advanced to log sequence 45563 (LGWR switch)
  46.   Current log# 6 seq# 45563 mem# 0: /oradata/oradata/szyb/szyb_redo2_6-500m.log
  47. Mon Dec 29 08:39:49 EAT 2014
  48. Thread 2 advanced to log sequence 45564 (LGWR switch)
  49.   Current log# 7 seq# 45564 mem# 0: /oradata/oradata/szyb/szyb_redo2_7-500m.log
  50. Mon Dec 29 08:58:55 EAT 2014
  51. Thread 2 advanced to log sequence 45565 (LGWR switch)
  52.   Current log# 8 seq# 45565 mem# 0: /oradata/oradata/szyb/szyb_redo2_8-500m.log
  53. Mon Dec 29 09:14:05 EAT 2014
  54. Thread 2 advanced to log sequence 45566 (LGWR switch)
  55.   Current log# 9 seq# 45566 mem# 0: /oradata/oradata/szyb/szyb_redo2_9-500m.log
  56. Mon Dec 29 09:32:26 EAT 2014
  57. Thread 2 cannot allocate new log, sequence 45567
  58. Checkpoint not complete
  59.   Current log# 9 seq# 45566 mem# 0: /oradata/oradata/szyb/szyb_redo2_9-500m.log
  60. Mon Dec 29 09:54:31 EAT 2014
  61. MMNL absent for 1201 secs; Foregrounds taking over
  62. Mon Dec 29 10:22:06 EAT 2014
  63. Process startup failed, error stack:
复制代码
从SF日志显示,时间段段内swap使用率居高不下,100%
  1. 2014/12/29 10:09:21 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 13% and Swap usage = 99%.
  2. 2014/12/29 10:09:51 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 17% and Swap usage = 100%.
  3. 2014/12/29 10:10:20 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 13% and Swap usage = 100%.
  4. 2014/12/29 10:10:51 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 13% and Swap usage = 100%.
  5. 2014/12/29 10:11:21 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 15% and Swap usage = 100%.
  6. 2014/12/29 10:11:50 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 4% and Swap usage = 100%.
  7. 2014/12/29 10:12:21 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 6% and Swap usage = 100%.
  8. 2014/12/29 10:12:51 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 7% and Swap usage = 100%.
  9. 2014/12/29 10:13:20 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 2% and Swap usage = 100%.
  10. 2014/12/29 10:13:51 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 2% and Swap usage = 100%.
  11. 2014/12/29 10:14:21 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 1% and Swap usage = 100%.
  12. 2014/12/29 10:14:50 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 3% and Swap usage = 100%.
  13. 2014/12/29 10:15:21 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 2% and Swap usage = 100%.
  14. 2014/12/29 10:15:51 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 2% and Swap usage = 100%.
复制代码
主机的基本配置信息及Oracle Memory配置结构信息:
--OS
  1. Physical memory:32G
  2. Swap:8G
复制代码
--Oracle SGA
  1. SQL> show parameter sga

  2. NAME                                 TYPE        VALUE
  3. ------------------------------------ ----------- ------------------------------
  4. lock_sga                             boolean     FALSE
  5. pre_page_sga                         boolean     FALSE
  6. sga_max_size                         big integer 16G
  7. sga_target                           big integer 16G

  8. SQL> show parameter pga

  9. NAME                                 TYPE        VALUE
  10. ------------------------------------ ----------- ------------------------------
  11. pga_aggregate_target                 big integer 4G
复制代码

alert.rar

1.33 MB, 下载次数: 1516

回复 只看该作者 道具 举报

5#
发表于 2015-1-11 15:46:39
需要其他日志 分析  swap 100%的原因,否则这类案例 最后很难说清楚 是谁把swap用了

回复 只看该作者 道具 举报

6#
发表于 2015-1-11 15:55:07
Maclean Liu(刘相兵 发表于 2015-1-11 15:46
需要其他日志 分析  swap 100%的原因,否则这类案例 最后很难说清楚 是谁把swap用了 ...

是否可以给个你的建议,这套系统日常swap使用率基本维持在70%~80%之间。
目前,我们给出的建议:
1)加大现有的swap大小,从目前8G上升至32*0.75=24G
2)暂时关闭统计信息收集功能,改为手动收集(原因:2015年1月7号晚上23点左右,出现
  1. call dbms_stats.gather_database_stats_job_proc ( )
  2. call dbms_space.auto_space_advisor_job_proc ( )
复制代码
长时间无响应,并造成大量的I/O等待事件

回复 只看该作者 道具 举报

7#
发表于 2015-1-11 18:46:33
这套系统日常swap使用率基本维持在70%~80%之间。  ==>这并不算正常

回复 只看该作者 道具 举报

8#
发表于 2015-1-13 21:58:26
三节点 还没玩过呢

回复 只看该作者 道具 举报

您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-6-13 16:49 , Processed in 0.053683 second(s), 23 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569