Oracle Database Data Recovery and Performance Tuning

1#
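Posted 2013-4-21 10:02:33 | Views: 5818 | Replies: 8
Environment: 2-node RAC
OS: Oracle Linux 6.3
Database: 11.2.0.3.0

The 2-node RAC is protected by Data Guard, with the standby on a separate host.


On RAC node 2, OGG (Oracle GoldenGate) applies data from SQL Server to the RAC.
On RAC node 1, OGG extracts data and delivers it to several other servers.

For maintenance, I planned to restart the database.
I stopped OGG.
On node 1 I ran
alter system checkpoint;
and shut down the database.
On node 2, while I was running
alter system checkpoint;
the system crashed and rebooted.


Below is part of the log from the crashed system.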

Apr 21 08:25:12 chnap-itd65 kernel: Code: 66 90 e8 cb 84 b3 ff 66 90 c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 53 66 66 66 66 90 48 89 f3 e8 fe 84 b3 ff 66 90 48 89 df 57 9d
Apr 21 08:25:12 chnap-itd65 kernel: Call Trace:
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81151d77>] isolate_freepages+0x3c7/0x470
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81151e6c>] compaction_alloc+0x4c/0x60
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff8115beb7>] unmap_and_move+0x47/0x340
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81151e20>] ? isolate_freepages+0x470/0x470
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff8115c253>] migrate_pages+0xa3/0x150
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81152759>] compact_zone+0x1d9/0x2e0
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff8150e66e>] ? apic_timer_interrupt+0xe/0x20
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81152ae1>] compact_zone_order+0xa1/0xe0
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81110cef>] ? zone_watermark_ok+0x1f/0x30
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81152be5>] try_to_compact_pages+0xc5/0x100
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81115259>] __alloc_pages_direct_compact+0xc9/0x190
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81115729>] __alloc_pages_slowpath+0x409/0x6d0
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81110cef>] ? zone_watermark_ok+0x1f/0x30
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81115b94>] __alloc_pages_nodemask+0x1a4/0x1f0
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff811508fa>] alloc_pages_vma+0x9a/0x150
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81160c23>] do_huge_pmd_anonymous_page+0x143/0x210
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff8113534d>] handle_mm_fault+0x15d/0x350
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81509440>] do_page_fault+0x140/0x470
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff8113a8c4>] ? do_mmap_pgoff+0x354/0x3a0
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81506115>] page_fault+0x25/0x30
Apr 21 08:25:12 chnap-itd65 abrt-dump-oops: Reported 3 kernel oopses to Abrt
Apr 21 08:25:12 chnap-itd65 kernel: BUG: soft lockup - CPU#15 stuck for 22s! [oracle:53167]

Apr 21 08:25:12 chnap-itd65 kernel: Modules linked in: bridge nls_utf8 autofs4 sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) uinput microcode dcdbas serio_raw pcspkr ghes hed sg ses enclosure iTCO_wdt iTCO_vendor_support i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Apr 21 08:25:12 chnap-itd65 kernel: CPU 15
Apr 21 08:25:12 chnap-itd65 kernel: Modules linked in: bridge nls_utf8 autofs4 sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) uinput microcode dcdbas serio_raw pcspkr ghes hed sg ses enclosure iTCO_wdt iTCO_vendor_support i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Apr 21 08:25:12 chnap-itd65 kernel:

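I don't know whether some parameter is set incorrectly or whether this is an operating system bug.

Attached are the alert log, CSS log, CRS log, and messages file from both nodes.

If anyone has time, please help me analyze this. Thanks!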
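A minimal sketch for summarizing a kernel call trace like the one above. It pulls out the function name from each frame and flags frames printed with a leading "?", which the kernel uses for addresses found on the stack that it could not confirm as part of the active call chain. The regular expression and the helper name `frames` are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import re

# One frame looks like: [<ffffffff81151d77>] isolate_freepages+0x3c7/0x470
# An optional "? " before the symbol marks an unverified frame.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(log_text):
    """Return (function, reliable) pairs in the order they were logged."""
    result = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match:
            result.append((match.group(2), match.group(1) is None))
    return result


sample = """\
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81151d77>] isolate_freepages+0x3c7/0x470
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81151e20>] ? isolate_freepages+0x470/0x470
Apr 21 08:25:12 chnap-itd65 kernel: [<ffffffff81160c23>] do_huge_pmd_anonymous_page+0x143/0x210
"""

for name, reliable in frames(sample):
    print(name, "" if reliable else "(unverified)")
```

Read from the bottom up, the reliable frames in the posted trace run from page_fault through do_huge_pmd_anonymous_page and __alloc_pages_direct_compact down to isolate_freepages, which suggests the oracle process was faulting in a huge page and had entered memory compaction when the lockup was reported. That is only a reading of the function names, not a confirmed diagnosis.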


oracle log.zip

3.33 MB, downloads: 871

9#
Posted 2013-5-24 09:07:48
am196 posted on 2013-4-25 22:23
vm.min_free_kbytes=1024000
vm.vfs_cache_pressure=200
vm.swappiness=60

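The system parameters were probably not tuned properly after installation; they had been left at the Oracle Linux defaults.
Since I changed them, the system has been running normally for more than 20 days.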


[root@chnap-itd63 ~]# uptime
09:06:01 up 23 days, 13:54,  1 user,  load average: 0.56, 0.40, 0.99

vm.min_free_kbytes=10485760
vm.vfs_cache_pressure=200
vm.swappiness=60

# oracle-rdbms-server-11gR2-preinstall setting for fs.file-max is 6815744
fs.file-max = 6815744

# oracle-rdbms-server-11gR2-preinstall setting for kernel.sem is '250 32000 100 128'
#kernel.sem = 250 32000 100 128
kernel.sem = 10010 1281280 10010 128

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmni is 4096
kernel.shmmni = 4096

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 1073741824 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 2097152 on i386
#kernel.shmall = 1073741824
kernel.shmall = 65536000

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4398046511104 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4294967295 on i386
#kernel.shmmax = 4398046511104
kernel.shmmax = 268435456000
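The three memory settings above use different units: vm.min_free_kbytes is in kilobytes, kernel.shmall is in pages, and kernel.shmmax is in bytes. The sketch below parses the posted values and converts them to a common unit. The page size of 4096 bytes is an assumption (the usual x86_64 value; `getconf PAGE_SIZE` reports the real one), and `parse_sysctl` is a hypothetical helper written for this example.

```python
PAGE_SIZE = 4096  # assumption: typical x86_64 page size, check with `getconf PAGE_SIZE`
GIB = 1024 ** 3


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


conf = parse_sysctl("""
vm.min_free_kbytes=10485760
#kernel.shmall = 1073741824
kernel.shmall = 65536000
kernel.shmmax = 268435456000
""")

shmall_bytes = int(conf["kernel.shmall"]) * PAGE_SIZE   # pages -> bytes
shmmax_bytes = int(conf["kernel.shmmax"])               # already bytes
min_free_gib = int(conf["vm.min_free_kbytes"]) * 1024 / GIB

print(shmall_bytes == shmmax_bytes)   # both limits describe the same amount
print(shmmax_bytes / GIB)             # size in GiB
print(min_free_gib)                   # reserved free memory in GiB
```

With a 4096-byte page, kernel.shmall and kernel.shmmax in this post both work out to 250 GiB, and vm.min_free_kbytes to 10 GiB. The value quoted from the earlier post, 1024000, is a little under 1 GiB, so the two posts differ by roughly a factor of ten on that setting.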


8#
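Posted 2013-4-25 22:23:27
am196 posted on 2013-4-25 22:15
I created two single-instance databases on one of my hosts, and this problem occurs frequently.
The system runs OEL 6.3; for now all I can do is try upgrading to the OEL 6.4 ...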

vm.min_free_kbytes=1024000
vm.vfs_cache_pressure=200
vm.swappiness=60

# oracle-rdbms-server-11gR2-preinstall setting for kernel.sem is '250 32000 100 128'
#kernel.sem = 250 32000 100 128
kernel.sem = 10010 1281280 10010 128

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmni is 4096
kernel.shmmni = 4096

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 1073741824 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 2097152 on i386
#kernel.shmall = 1073741824
kernel.shmall = 131072000

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4398046511104 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4294967295 on i386
#kernel.shmmax = 4398046511104
kernel.shmmax = 536870912000

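I will adjust the system parameters and see how the system runs.
The virtual memory (vm.*) settings above are newly added.
The commented-out lines are the old values.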
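Since the old values are kept as commented-out lines next to the new ones, the change can be tabulated mechanically. This is a rough sketch under that convention only; `old_and_new` is a hypothetical helper, and a key with no commented-out counterpart is reported with an old value of None.

```python
def old_and_new(text):
    """Map each active key to (old value from a '#key = ...' line, new value)."""
    old, new = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        target = new
        if line.startswith("#"):
            line = line.lstrip("#").strip()
            target = old
        key, sep, value = line.partition("=")
        key = key.strip()
        # Descriptive comments have no '=' or contain spaces before it.
        if sep and key and " " not in key:
            target[key] = value.strip()
    return {key: (old.get(key), value) for key, value in new.items()}


sample = """
vm.min_free_kbytes=1024000
# oracle-rdbms-server-11gR2-preinstall setting for kernel.sem is '250 32000 100 128'
#kernel.sem = 250 32000 100 128
kernel.sem = 10010 1281280 10010 128
#kernel.shmall = 1073741824
kernel.shmall = 131072000
#kernel.shmmax = 4398046511104
kernel.shmmax = 536870912000
"""

for key, (before, after) in old_and_new(sample).items():
    print(f"{key}: {before} -> {after}")
```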


7#
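Posted 2013-4-25 22:15:07
Maclean Liu(刘相兵 posted on 2013-4-23 21:09
Judging from the logs, the heartbeat timeout came before the BUG: soft lockup

I created two single-instance databases on one of my hosts, and this problem occurs frequently.
The system runs OEL 6.3; for now all I can do is try upgrading to the OEL 6.4 kernel.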


6#
Posted 2013-4-23 21:09:41
Judging from the logs, the heartbeat timeout came before the BUG: soft lockup.


5#
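Posted 2013-4-23 20:25:11
Maclean Liu(刘相兵 posted on 2013-4-21 20:35
2013-04-21 08:24:39.433

A network heartbeat timeout was already detected at this point, so BUG: soft lockup - CPU# may be only a result, not the ...

I checked the system messages file
and found similar entries: BUG: soft lockup - CPU#65 stuck for 22s!
The CPU was locked up, so the two nodes could not communicate, and that is why the node went down.
One of my single-instance hosts also reports BUG: soft lockup - CPU#65 stuck for 22s!
and does so fairly often. I don't know whether this is an Oracle bug.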


4#
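Posted 2013-4-21 20:35:56
2013-04-21 08:24:39.433

A network heartbeat timeout was already detected at this point, so BUG: soft lockup - CPU# may be only a result, not the cause.


Why the network heartbeat timed out is what we need to investigate.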
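To compare the timings, the two log formats can be put on one clock. The sketch below uses only values that appear in this thread: the CSSD warning at 2013-04-21 08:24:39.433 with misstime 15580, and the kernel message at Apr 21 08:24:41 reporting "stuck for 23s". It assumes the year 2013 for the syslog line (syslog omits it), that both logs share a synchronized clock, that misstime is the number of milliseconds since the last network heartbeat was received, and that "stuck for 23s" means the stall began about 23 seconds before the message was logged.

```python
from datetime import datetime, timedelta

YEAR = 2013  # assumption: syslog timestamps carry no year


def syslog_time(stamp, year=YEAR):
    """Parse a syslog timestamp such as 'Apr 21 08:24:41'."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def cssd_time(stamp):
    """Parse a CSSD timestamp such as '2013-04-21 08:24:39.433'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


lockup_logged = syslog_time("Apr 21 08:24:41")
stall_began = lockup_logged - timedelta(seconds=23)        # "stuck for 23s"

hb_warning = cssd_time("2013-04-21 08:24:39.433")
last_heartbeat = hb_warning - timedelta(milliseconds=15580)  # "misstime 15580"

print("stall began    ", stall_began.time())
print("last heartbeat ", last_heartbeat.time())
print("HB warning     ", hb_warning.time())
print("lockup logged  ", lockup_logged.time())
```

Under these assumptions the heartbeat warning does precede the lockup message, but the stall that the message reports would have begun around 08:24:18, a few seconds before the last heartbeat implied by misstime. Syslog has only one-second resolution and the clock assumption is unverified, so this ordering is suggestive rather than conclusive.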


3#
Posted 2013-4-21 20:34:41
Apr 21 08:24:41 chnap-itd65 kernel: BUG: soft lockup - CPU#28 stuck for 23s! [oracle:53163]


cat /proc/cpuinfo
2013-04-21 08:24:39.433: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) at 50% heartbeat fatal, removal in 14.420 seconds
2013-04-21 08:24:39.433: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) is impending reconfig, flag 2491406, misstime 15580
2013-04-21 08:24:39.433: [    CSSD][4272297728]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
2013-04-21 08:24:39.433: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245977, LATS 1950994514, lastSeqNo 2301041, uniqueness 1363183529, timestamp 1366503879/3318982324
2013-04-21 08:24:40.434: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245978, LATS 1950995514, lastSeqNo 4245977, uniqueness 1363183529, timestamp 1366503880/3318983334
2013-04-21 08:24:41.435: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245979, LATS 1950996514, lastSeqNo 4245978, uniqueness 1363183529, timestamp 1366503881/3318984334
2013-04-21 08:24:42.435: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245980, LATS 1950997514, lastSeqNo 4245979, uniqueness 1363183529, timestamp 1366503882/3318985334
2013-04-21 08:24:43.436: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245981, LATS 1950998514, lastSeqNo 4245980, uniqueness 1363183529, timestamp 1366503883/3318986334
2013-04-21 08:24:44.437: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245982, LATS 1950999514, lastSeqNo 4245981, uniqueness 1363183529, timestamp 1366503884/3318987334
2013-04-21 08:24:44.978: [    CSSD][4270720768]clssnmSendingThread: sending status msg to all nodes
2013-04-21 08:24:44.978: [    CSSD][4270720768]clssnmSendingThread: sent 4 status msgs to all nodes
2013-04-21 08:24:45.437: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245983, LATS 1951000514, lastSeqNo 4245982, uniqueness 1363183529, timestamp 1366503885/3318988334
2013-04-21 08:24:46.438: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245984, LATS 1951001514, lastSeqNo 4245983, uniqueness 1363183529, timestamp 1366503886/3318989334
2013-04-21 08:24:47.438: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4245985, LATS 1951002514, lastSeqNo 4245984, uniqueness 1363183529, timestamp 1366503887/3318990334
2013-04-21 08:24:47.893: [    CSSD][4267566848]clssscMonitorThreads clssnmPollingThread not scheduled for 8460 msecs
2013-04-21 08:24:48.244: [    CSSD][4272297728]clssnmPollingThread: local diskTimeout set to 200000 ms, remote disk timeout set to 200000, impending reconfig status(0)
2013-04-21 08:24:56.136: [    CSSD][4289566464]clssscMonitorThreads clssnmvDiskPingMonitorThread not scheduled for 8320 msecs
2013-04-21 08:24:56.136: [    CSSD][4289566464]clssscMonitorThreads clssnmSendingThread not scheduled for 8400 msecs
2013-04-21 08:25:03.248: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) is impending reconfig, flag 2491406, misstime 15350
2013-04-21 08:25:03.248: [    CSSD][4272297728]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
2013-04-21 08:25:03.249: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246001, LATS 1951018324, lastSeqNo 4245985, uniqueness 1363183529, timestamp 1366503903/3319006344
2013-04-21 08:25:04.136: [    CSSD][4289566464]clssscMonitorThreads clssnmvDiskPingMonitorThread not scheduled for 16320 msecs
2013-04-21 08:25:04.136: [    CSSD][4289566464]clssscMonitorThreads clssnmSendingThread not scheduled for 16400 msecs
2013-04-21 08:25:04.249: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246003, LATS 1951019324, lastSeqNo 4246001, uniqueness 1363183529, timestamp 1366503904/3319007344
2013-04-21 08:25:05.250: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246005, LATS 1951020324, lastSeqNo 4246003, uniqueness 1363183529, timestamp 1366503905/3319008344
2013-04-21 08:25:06.250: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246007, LATS 1951021324, lastSeqNo 4246005, uniqueness 1363183529, timestamp 1366503906/3319009344
2013-04-21 08:25:07.580: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246009, LATS 1951022654, lastSeqNo 4246007, uniqueness 1363183529, timestamp 1366503907/3319010344
2013-04-21 08:25:08.250: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246011, LATS 1951023324, lastSeqNo 4246009, uniqueness 1363183529, timestamp 1366503908/3319011344
2013-04-21 08:25:09.250: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246013, LATS 1951024324, lastSeqNo 4246011, uniqueness 1363183529, timestamp 1366503909/3319012344
2013-04-21 08:25:10.251: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246014, LATS 1951025324, lastSeqNo 4246013, uniqueness 1363183529, timestamp 1366503909/3319012854
2013-04-21 08:25:11.251: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) at 75% heartbeat fatal, removal in 6.650 seconds
2013-04-21 08:25:11.251: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt, 4246015, LATS 1951026324, lastSeqNo 4246014, uniqueness 1363183529, timestamp 1366503910/3319013854
2013-04-21 08:25:12.134: [    CSSD][4289566464]clssscMonitorThreads clssnmvDiskPingMonitorThread not scheduled for 24320 msecs
2013-04-21 08:25:12.252: [    CSSD][4278605568]clssnmvDHBValidateNCopy: node 1, chnap-itd64, has a disk HB, but no network HB, DHB has rcfg 257726831, wrtcnt,
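Two things in the CSSD log above can be checked with simple arithmetic: the missed-heartbeat time plus the remaining time before removal, and how long CSSD's own threads went without being scheduled. The sketch below uses four lines copied from the log. The regular expressions are illustrative assumptions, not an official log format definition.

```python
import re

REMOVAL = re.compile(r"at (\d+)% heartbeat fatal, removal in ([\d.]+) seconds")
MISSTIME = re.compile(r"misstime (\d+)")
STALL = re.compile(r"clssscMonitorThreads (\w+) not scheduled for (\d+) msecs")

log = """\
2013-04-21 08:24:39.433: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) at 50% heartbeat fatal, removal in 14.420 seconds
2013-04-21 08:24:39.433: [    CSSD][4272297728]clssnmPollingThread: node chnap-itd64 (1) is impending reconfig, flag 2491406, misstime 15580
2013-04-21 08:24:47.893: [    CSSD][4267566848]clssscMonitorThreads clssnmPollingThread not scheduled for 8460 msecs
2013-04-21 08:25:12.134: [    CSSD][4289566464]clssscMonitorThreads clssnmvDiskPingMonitorThread not scheduled for 24320 msecs
"""

# Time already missed plus time left before the node is removed.
removal_ms = round(float(REMOVAL.search(log).group(2)) * 1000)
misstime_ms = int(MISSTIME.search(log).group(1))
budget_ms = misstime_ms + removal_ms

# Longest reported scheduling gap per CSSD thread.
stalls = {}
for thread, msecs in STALL.findall(log):
    stalls[thread] = max(stalls.get(thread, 0), int(msecs))

print(budget_ms)
print(stalls)
```

The two figures from 08:24:39.433 add up to 30000 ms, which would be consistent with a 30-second CSS misscount; the thread does not state the configured value, and `crsctl get css misscount` would report it. The "not scheduled" lines show the local CSSD threads themselves going unscheduled for many seconds, which hints that this node was stalled locally and not only cut off from the network.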


2#
Posted 2013-4-21 20:32:20
Apr 21 08:24:41 chnap-itd65 kernel: BUG: soft lockup - CPU#28 stuck for 23s! [oracle:53163]
Apr 21 08:24:41 chnap-itd65 kernel: Modules linked in: nls_utf8 autofs4 sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) uinput microcode dcdbas serio_raw pcspkr ghes hed sg ses enclosure iTCO_wdt iTCO_vendor_support i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Apr 21 08:24:41 chnap-itd65 kernel: CPU 28
Apr 21 08:24:41 chnap-itd65 kernel: Modules linked in: nls_utf8 autofs4 sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) uinput microcode dcdbas serio_raw pcspkr ghes hed sg ses enclosure iTCO_wdt iTCO_vendor_support i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Apr 21 08:24:41 chnap-itd65 kernel:
Apr 21 08:24:41 chnap-itd65 kernel: Pid: 53163, comm: oracle Tainted: P            2.6.39-200.24.1.el6uek.x86_64 #1 Dell Inc. PowerEdge R910/0NCWG9
Apr 21 08:24:41 chnap-itd65 kernel: RIP: 0010:[<ffffffff815059c9>]  [<ffffffff815059c9>] _raw_spin_unlock_irqrestore+0x19/0x30
Apr 21 08:24:41 chnap-itd65 kernel: RSP: 0000:ffff8851d2b9d810  EFLAGS: 00000282
Apr 21 08:24:41 chnap-itd65 kernel: RAX: 0000000000000000 RBX: ffffffff8150e66e RCX: ffffea00c3254000
Apr 21 08:24:41 chnap-itd65 kernel: RDX: 0000000000000200 RSI: 0000000000000282 RDI: 0000000000000282
Apr 21 08:24:41 chnap-itd65 kernel: RBP: ffff8851d2b9d818 R08: ffff88807ffd9e00 R09: ffffea00c325400c
Apr 21 08:24:41 chnap-itd65 kernel: R10: 000000000457fe00 R11: 000000000000007d R12: ffffffff8150e66e
Apr 21 08:24:41 chnap-itd65 kernel: R13: ffff8851d2b9d818 R14: ffffffff8150e66e R15: ffff8851d2b9d838
Apr 21 08:24:41 chnap-itd65 kernel: FS:  00007fd20d49c700(0000) GS:ffff88807ef80000(0000) knlGS:0000000000000000
Apr 21 08:24:41 chnap-itd65 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
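When a messages file contains many of these reports, it helps to pull out which CPU, which process, and how long each one was stuck. Below is a small sketch that does this for the two soft lockup lines quoted in this thread. The pattern is an assumption based on the message text shown here, and `lockups` is a hypothetical helper name.

```python
import re

LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def lockups(text):
    """Return (cpu, seconds, command, pid) for every soft lockup report."""
    return [
        (int(m["cpu"]), int(m["secs"]), m["comm"], int(m["pid"]))
        for m in LOCKUP.finditer(text)
    ]


sample = """\
Apr 21 08:24:41 chnap-itd65 kernel: BUG: soft lockup - CPU#28 stuck for 23s! [oracle:53163]
Apr 21 08:25:12 chnap-itd65 kernel: BUG: soft lockup - CPU#15 stuck for 22s! [oracle:53167]
"""

for cpu, secs, comm, pid in lockups(sample):
    print(f"CPU {cpu}: {comm} (pid {pid}) stuck for {secs}s")
```

Both reports name oracle processes with nearby PIDs on different CPUs, so more than one database process was affected within about half a minute.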
