Oracle数据库数据恢复、性能优化

找回密码
注册
搜索
热搜: 活动 交友 discuz
发新帖

0

积分

1

好友

10

主题
1#
发表于 2014-4-11 16:36:24 | 查看: 2889| 回复: 1
环境描述:
aix 5.3
oracle 10.2.0.5 双节点RAC

客户反应,在10号早上重启了主机后,库起来了,但是部分业务不能办理,部分业务正常(客户描述)
他们的维护策略是:10号重启1节点主机,20号重启节点主机。

具体表现;在一个select查询时,两个节点均遭遇
SQL> select * from scxnb.ac43 where aaa027='510904';
select * from scxnb.ac43 where aaa027='510904'
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 102 - see DBWR trace file
ORA-01110: data file 102: '+DG_DATA1/cp3_94'


检查:
起初怀疑数据文件丢失
然而经过检查,数据文件存在。
检查alert日志后,之前客户比较粗暴的关闭主机,对主机进行重启的(他们目的是重启数据库相关服务,不知道为啥直接要重启主机)

相关alert日志如下:
SUCCESS: diskgroup ARCHDG was mounted
Wed Apr 09 23:46:56 BEIST 2014
SUCCESS: diskgroup ARCHDG was dismounted
Thu Apr 10 07:00:05 BEIST 2014
Shutting down instance (abort)
License high water mark = 344
Instance terminated by USER, pid = 733204

Thu Apr 10 07:13:33 BEIST 2014
Starting ORACLE instance (normal)
sskgpgetexecname failed to get name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 en1 172.16.1.0 configured from OCR for use as a cluster interconnect
Interface type 1 en7 10.160.5.0 configured from OCR for use as  a public interface
Picked latch-free SCN scheme 3
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.5.0.
System parameters with non-default values:
  processes                = 1500
  sessions                 = 2000
  resource_limit           = TRUE
  sga_max_size             = 21474836480
  __shared_pool_size       = 1442840576
  __large_pool_size        = 16777216
  __java_pool_size         = 50331648
  __streams_pool_size      = 0
  spfile                   = +DATADG/scnbdb/spfilescnbdb.ora
  sga_target               = 12884901888
  control_files            = +DATADG/scnbdb/controlfile/current.260.748375667
  db_block_size            = 8192
  __db_cache_size          = 11358175232
  compatible               = 10.2.0.5.0
  log_archive_dest_1       = location=+ARCHDG
  log_archive_format       = %t_%s_%r.dbf
  db_file_multiblock_read_count= 16
  cluster_database         = TRUE
  cluster_database_instances= 2
  db_create_file_dest      = +DATADG
  thread                   = 1
  instance_number          = 1
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS1
  undo_retention           = 86400
  remote_login_passwordfile= EXCLUSIVE
  db_domain                =
  dispatchers              = (PROTOCOL=TCP) (SERVICE=scnbdbXDB)
  local_listener           = (address=(protocol=tcp)(host=10.160.5.1)(port=1521))
  remote_listener          = LISTENERS_SCNBDB
  utl_file_dir             = /u01/app/oracle/logs_utl_file
  job_queue_processes      = 10
  background_dump_dest     = /u01/app/oracle/admin/scnbdb/bdump
  user_dump_dest           = /u01/app/oracle/admin/scnbdb/udump
  core_dump_dest           = /u01/app/oracle/admin/scnbdb/cdump
  audit_file_dest          = /u01/app/oracle/admin/scnbdb/adump
  db_name                  = scnbdb
  open_cursors             = 2000
  pga_aggregate_target     = 8589934592
  _awr_flush_threshold_metrics= TRUE
Cluster communication is configured to use the following interface(s) for this instance
  172.16.1.1
Thu Apr 10 07:13:34 BEIST 2014
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started with pid=2, OS id=229432
DIAG started with pid=3, OS id=233530
PSP0 started with pid=4, OS id=229802
LMON started with pid=5, OS id=237628
LMD0 started with pid=6, OS id=241726
LMS0 started with pid=7, OS id=131618
LMS1 started with pid=8, OS id=168506
LMS2 started with pid=9, OS id=254020
MMAN started with pid=10, OS id=201020
DBW0 started with pid=11, OS id=233808
DBW1 started with pid=12, OS id=221268
LGWR started with pid=13, OS id=262218
CKPT started with pid=14, OS id=209300
SMON started with pid=15, OS id=241996
RECO started with pid=16, OS id=250192
CJQ0 started with pid=17, OS id=266322
MMON started with pid=18, OS id=270420
Thu Apr 10 07:13:34 BEIST 2014
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=19, OS id=274518
Thu Apr 10 07:13:35 BEIST 2014
starting up 1 shared server(s) ...
Thu Apr 10 07:13:38 BEIST 2014
lmon registered with NM - instance id 1 (internal mem no 0)
Thu Apr 10 07:13:38 BEIST 2014
Reconfiguration started (old inc 0, new inc 8)
List of nodes:
0 1
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* domain 0 valid according to instance 1
* domain 0 valid = 1 according to instance 1
Thu Apr 10 07:13:38 BEIST 2014
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Thu Apr 10 07:13:38 BEIST 2014
LMS 0: 0 GCS shadows cancelled, 0 closed
Thu Apr 10 07:13:38 BEIST 2014
LMS 2: 0 GCS shadows cancelled, 0 closed
Thu Apr 10 07:13:38 BEIST 2014
LMS 1: 0 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Thu Apr 10 07:13:39 BEIST 2014
LMS 2: 0 GCS shadows traversed, 0 replayed
Thu Apr 10 07:13:39 BEIST 2014
LMS 0: 0 GCS shadows traversed, 0 replayed
Thu Apr 10 07:13:39 BEIST 2014
LMS 1: 0 GCS shadows traversed, 0 replayed
Thu Apr 10 07:13:39 BEIST 2014
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=22, OS id=258128
Thu Apr 10 07:13:41 BEIST 2014
ALTER DATABASE   MOUNT
Thu Apr 10 07:13:42 BEIST 2014
Starting background process ASMB
ASMB started with pid=24, OS id=282746
Starting background process RBAL
RBAL started with pid=25, OS id=245828
Thu Apr 10 07:13:46 BEIST 2014
SUCCESS: diskgroup DATADG was mounted
Thu Apr 10 07:13:51 BEIST 2014
Setting recovery target incarnation to 2
Thu Apr 10 07:13:51 BEIST 2014
Successful mount of redo thread 1, with mount id 3422982371
Thu Apr 10 07:13:51 BEIST 2014
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Completed: ALTER DATABASE   MOUNT
Thu Apr 10 07:13:52 BEIST 2014
ALTER DATABASE OPEN
Picked broadcast on commit scheme to generate SCNs
Thu Apr 10 07:13:58 BEIST 2014
Errors in file /u01/app/oracle/admin/scnbdb/bdump/scnbdb1_dbw0_233808.trc:
ORA-01157: cannot identify/lock data file 99 - see DBWR trace file
ORA-01110: data file 99: '+DG_DATA1/cp3_91'
ORA-17503: ksfdopn:2 Failed to open file +DG_DATA1/cp3_91
ORA-15001: diskgroup "DG_DATA1" does not exist or is not mounted
ORA-15001: diskgroup "DG_DATA1" does not exist or is not mounted
Thu Apr 10 07:13:58 BEIST 2014
Errors in file /u01/app/oracle/admin/scnbdb/bdump/scnbdb1_dbw0_233808.trc:
ORA-01157: cannot identify/lock data file 100 - see DBWR trace file
ORA-01110: data file 100: '+DG_DATA1/cp3_92'
ORA-17503: ksfdopn:2 Failed to open file +DG_DATA1/cp3_92
ORA-15001: diskgroup "DG_DATA1" does not exist or is not mounted
ORA-15001: diskgroup "DG_DATA1" does not exist or is not mounted


在两个节点
NODE1:

export ORACLE_SID=+ASM1

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ----------------------
ARCHDG                         MOUNTED
DATADG                         MOUNTED
DG_DATA1                      DISMOUNTED

NODE2
export ORACLE_SID=+ASM2

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ----------------------
ARCHDG                         MOUNTED
DATADG                         MOUNTED
DG_DATA1                      DISMOUNTED



export ORACLE_SID=+ASM1

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ----------------------
ARCHDG                         MOUNTED
DATADG                         MOUNTED
DG_DATA1                      DISMOUNTED


--asm asm_diskgroups参数指定了 DATADG 、ARCHDG 、DG_DATA1  但是,DATADG 、ARCHDG mount了,
DG_DATA1 DISMOUNTED;

--比较奇怪,如果是因为shutdown 导致的diskgroup dismount,数据库还能open?有点疑惑。。
且此时数据库正常运行,日志切换均正常。
和客户沟通,9号一切业务正常,推断是因为对主机的shutdown导致数据库的此错误。

分别在两个节点mount DG_DATA1磁盘组
alter diskgroup DG_DATA1 mount;
然后在数据库执行
SQL> alter system check datafiles;

System altered.

检查之前报错的数据文件状态
SQL> select file_id,file_name,status from dba_data_files where file_id=102;

FILE_ID FILE_NAME                      STATUS
------- ------------------------------ --------------------
    102 +DG_DATA1/cp3_94               AVAILABLE

状态正常,
select * from scxnb.ac43 where aaa027='510904';
--返回结果,不再报错
2#
发表于 2014-4-11 16:39:11
补充:检查了9号的alert日志,没有发现对对 DG_DATA1 DISMOUNT的操作

回复 只看该作者 道具 举报

您需要登录后才可以回帖 登录 | 注册

QQ|手机版|Archiver|Oracle数据库数据恢复、性能优化

GMT+8, 2024-12-21 06:22 , Processed in 0.043320 second(s), 20 queries .

Powered by Discuz! X2.5

© 2001-2012 Comsenz Inc.

回顶部
TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569