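lsz_qh posted on 2014-4-18 12:29:08

DB node system disk failure

On an X2-2 full-rack Exadata, the db02 node has four disks configured as RAID 5. The disk in slot 2 has failed and its amber LED is lit. The system is currently in read-only mode, that is, at the "repair filesystem #" prompt.
System files have been lost, and I spent all of yesterday running fsck to repair the filesystem.
After a reboot the node still cannot come up in read-write mode.
I ordered two replacement disks and tried both, but the amber LED stays on. Does replacing the disk require some special procedure?
I read the MOS note on replacing a DB node disk and it does not mention anything special. Or is the persistent amber LED related to the filesystem loss?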

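XuLiQiang posted on 2014-4-18 12:47:12

Looking forward to an answer.

Maclean Liu (刘相兵) posted on 2014-4-18 13:30:12

Consider the document below for reference, although I have not tried it in practice myself: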

The following document provides solutions to different issues found by the execution of the Bare Metal
Restore Procedure.
What could cause the filesystem to become READ ONLY on the compute nodes?
• If the node was still on an old ofa (older than 1.5.1-4.0.28), that could have been the cause of this problem.
• If the node did not have its LSI cache in write-back mode (enabled), that could have caused this.
• If the cache was in write-back mode but the battery had discharged or was not providing backup power to the cache, and there was a reboot or similar event, this could have happened.
• Finally, there is a RAID-5 issue in all LSI firmware (12.12.0-0048) prior to what will be released in 11.2.2.3.0, which might have caused issues.
Before starting the restore, one option is to try running fsck. That requires booting the compute node in rescue mode. Most of the time it does not work, because of the large number of corrupted inodes.
How:
• Take the diagnostic.iso image from another compute node in the cluster (/opt/oracle.cellos).
• Transfer the diagnostic.iso image to your desktop.
• From the ILOM GUI, open the remote console. Under the Devices menu, click on the CD-ROM image option and attach the iso image.
• Reboot the server, press F2 to change the boot order, and boot from the iso image.
• When the system reboots, enter rescue mode.
• Log in as root; the password is sos1exadata.
• From a different compute node, run the df command to get the list of LVM logical volumes.
• Run fsck -fy <path to the lvm>
• After running it for every logical volume, disconnect the iso image from the ILOM and boot from disk.
If the filesystem was fixed, the machine may come back up. If not, the only option is a bare metal restore.
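
The df and fsck steps above can be sketched as a small dry-run helper. This is only an illustration, not part of the quoted document: it reads `df -P` output saved from a healthy compute node and prints the matching `fsck -fy` commands without running them. The function names and the sample device names (`VGExaDb-LVDbSys1`, `VGExaDb-LVDbOra1`) are assumptions; use whatever df actually reports on your own node.

```shell
#!/bin/bash
# Sketch only: turn saved `df -P` output from a healthy compute node into
# the fsck commands to type in rescue mode. Nothing is executed here.

# Print the device column of every /dev/mapper/* line (LVM logical volumes).
# NR > 1 skips the df header line.
list_lvm_devices() {
    awk 'NR > 1 && $1 ~ /^\/dev\/mapper\// { print $1 }'
}

# Print one "fsck -fy <device>" line per logical volume read from stdin.
print_fsck_commands() {
    list_lvm_devices | while read -r dev; do
        echo "fsck -fy $dev"
    done
}

# Hypothetical sample of `df -P` output; the device names are assumptions.
sample_df='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/VGExaDb-LVDbSys1 30963708 14203040 15187804 49% /
/dev/sda1 126427 16355 103544 14% /boot
/dev/mapper/VGExaDb-LVDbOra1 103212320 48290664 49678776 50% /u01
tmpfs 84132864 0 84132864 0% /dev/shm'

printf '%s\n' "$sample_df" | print_fsck_commands
```

On a real system you would pipe `df -P` from the healthy node into `print_fsck_commands` instead of the sample text. The `-P` flag keeps each filesystem on one line, so long device names do not wrap and break the parsing. The sketch lists only `/dev/mapper` devices, matching the "list of LVM" wording in the steps; a non-LVM partition such as /boot would need to be checked separately.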
In this document, the goal was to provide clarification to different sections from the main MOS
Document 1084360.1 (bare metal restore). The notes will provide clarification during the execution of
certain commands or will provide workarounds to possible problems that can be reported.

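XuLiQiang posted on 2014-4-18 13:42:09

Could you leave your QQ number? I also have an Exadata here, so it would be handy for keeping in touch.

lsz_qh posted on 2014-4-18 14:43:19

XuLiQiang posted on 2014-4-18 13:42
Could you leave your QQ number? I also have an Exadata here, so it would be handy for keeping in touch.

Leave yours and I'll add you, haha.

XuLiQiang posted on 2014-4-18 14:55:43

My QQ: 562868813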