yzm1987 发表于 2013-7-29 16:42:13

纠结了两天OGG 错误~实在是找不到解决办法~请指教

系统环境:linux CentOS 64bit
数据库环境:oracle 11G 11.2.3

之前按照官方文档,一切都正常配置。但到最后执行

GGSCI (single) 32> start extract ext1

Sending START request to MANAGER ...
EXTRACT EXT1 starting


GGSCI (single) 33> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                          
EXTRACT     STOPPED     EXT1        00:00:00      02:47:36   

始终不能start 状态。查询日志
2013-07-13 08:09:06  ERROR   OGG-00685  Oracle GoldenGate Capture for Oracle, ext1.prm:  begin time Jul 13, 2013 5:21:30 AM prior to oldest log in log history. Last SQL executed <SELECT 1   FROM dual  WHERE TO_DATE(:begin_time,'YYYY-MM-DD HH24:MI:SS') <    (SELECT MIN(first_time)       FROM v$log_history      WHERE thread# = :ora_thread)>.
2013-07-13 08:09:06  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, ext1.prm:  PROCESS ABENDING.
2013-07-17 10:17:13  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): stop *.

在网上查询资料,说什么需要切换日志,因为资源库与目标库 archivelog 不统一。后来执行
SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

完了以后,继续执行
GGSCI (single) 32> start extract ext1
还是起不了,日志中查询一直报 OGG-00685 错误。

Maclean Liu(刘相兵 发表于 2013-7-29 19:19:59

alter system archive log current;
alter system archive log current;
/

执行5~6次再看看

yzm1987 发表于 2013-7-30 11:24:36

今天按照上述方法试了一下,结果还是不行。

源端 ggserr.log
2013-07-18 04:49:17  WARNING OGG-01423  Oracle GoldenGate Capture for Oracle, ext1.prm:  No valid default archive log destination directory found for thread 1.
2013-07-18 04:49:20  INFO    OGG-01515  Oracle GoldenGate Capture for Oracle, ext1.prm:  Positioning to begin time Jul 13, 2013 5:21:30 AM.
2013-07-18 04:49:20  ERROR   OGG-00685  Oracle GoldenGate Capture for Oracle, ext1.prm:  begin time Jul 13, 2013 5:21:30 AM prior to oldest log in log history. Last SQL executed <SELECT 1   FROM dual  WHERE TO_DATE(:begin_time,'YYYY-MM-DD HH24:MI:SS') <    (SELECT MIN(first_time)       FROM v$log_history      WHERE thread# = :ora_thread)>.
2013-07-18 04:49:20  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, ext1.prm:  PROCESS ABENDING.

OGG-01423 看了一下,说要打开补充日志
SQL> Select                                                         
  2  SUPPLEMENTAL_LOG_DATA_MIN
  3  ,SUPPLEMENTAL_LOG_DATA_PK
  4  ,SUPPLEMENTAL_LOG_DATA_UI
  5  ,SUPPLEMENTAL_LOG_DATA_FK
  6  ,SUPPLEMENTAL_LOG_DATA_ALL from v$database;

SUPPLEMENTAL_LOG_DATA_MI SUPPLEMEN SUPPLEMEN SUPPLEMEN SUPPLEMEN
------------------------ --------- --------- --------- ---------
YES                      NO        NO        NO        NO

源端配置的参数文件
GGSCI (single) 39> edit params ext1

extract ext1
userid ogguser@TARGET, password ogguser
rmthost target, mgrport 7809
rmttrail /oracle/ggate/dirdat/lt
ddl include mapped objname sender.*;
table sender.*;


目标端
GGSCI (target) 23> view report rep1
***********************************************************************
                 Oracle GoldenGate Delivery for Oracle
    Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
   Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:48:07

Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.


                    Starting at 2013-07-31 04:11:37
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Thu Sep 3 03:28:30 EDT 2009, Release 2.6.18-164.el5
Node: target
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited

Process id: 8179

Description:

***********************************************************************
**            Running with the following parameters                  **
***********************************************************************

2013-07-31 04:11:37  INFO    OGG-03035  Operating system character set identified as UTF-8. Locale: en_US, LC_ALL:.
replicat rep1
HANDLECOLLISIONS
ASSUMETARGETDEFS
userid ogguser@SINGLE,password *******
discardfile /u01/app/oracle/ggs/dirdat/rep1_discard.txt, append, megabytes 10
DDL INCLUDE ALL
DDERROR DEFAULT IGNORE RETRYOP

Source Context :
  SourceModule            :
  SourceID                :
  SourceFunction          :
  SourceLine              :
  ThreadBacktrace         : elements
                          : ]
                          : [/u01/app/oracle/ggs/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int,
...)+0x2cc) ]
                          : [/u01/app/oracle/ggs/libgglog.so(_MSG_ERR_STARTUP_PARAM_UNRECOGNIZED(CSourceContext*, char con
st*, CMessageFactory::MessageDisposition)+0x31) ]
                          : ]
                          : [/u01/app/oracle/ggs/replicat(get_infile_params(time_elt_def*, time_elt_def*, char**)+0x1352)
]
                          : ]
                          : ]
                          : ]

2013-07-31 04:11:38  ERROR   OGG-00014  Unrecognized parameter: DDERROR. Parameter could be misspelled or unsupported.

2013-07-31 04:11:38  ERROR   OGG-01668  PROCESS ABENDING.


目标段参数文件
GGSCI (target) 25> edit params rep1

replicat rep1
HANDLECOLLISIONS
ASSUMETARGETDEFS
userid ogguser@SINGLE,password ogguser
discardfile /u01/app/oracle/ggs/dirdat/rep1_discard.txt, append, megabytes 10
DDL INCLUDE ALL
DDERROR DEFAULT IGNORE RETRYOP
map sender.*,target receiver.*;

目标端补充日志
SQL> Select                                                         
  2  SUPPLEMENTAL_LOG_DATA_MIN
  3  ,SUPPLEMENTAL_LOG_DATA_PK
  4  ,SUPPLEMENTAL_LOG_DATA_UI
  5  ,SUPPLEMENTAL_LOG_DATA_FK
  6  ,SUPPLEMENTAL_LOG_DATA_ALL from v$database;

SUPPLEMENTAL_LOG_DATA_MI SUPPLEMEN SUPPLEMEN SUPPLEMEN SUPPLEMEN
------------------------ --------- --------- --------- ---------
YES                      NO        NO        NO        NO


现在有点想再重新配置一边了~~不知道是哪里出了问题。

Maclean Liu(刘相兵 发表于 2013-9-9 20:25:44

2013-07-18 04:49:17  WARNING OGG-01423  Oracle GoldenGate Capture for Oracle, ext1.prm:  No valid default archive log destination directory found for thread 1.
2013-07-18 04:49:20  INFO    OGG-01515  Oracle GoldenGate Capture for Oracle, ext1.prm:  Positioning to begin time Jul 13, 2013 5:21:30 AM.
2013-07-18 04:49:20  ERROR   OGG-00685  Oracle GoldenGate Capture for Oracle, ext1.prm:  begin time Jul 13, 2013 5:21:30 AM prior to oldest log in log history. Last SQL executed <SELECT 1   FROM dual  WHERE TO_DATE(:begin_time,'YYYY-MM-DD HH24:MI:SS') <    (SELECT MIN(first_time)       FROM v$log_history      WHERE thread# = :ora_thread)>.


肯定是archivelog 配置的有问题 , 重建 extract或者可以解决问题

Maclean Liu(刘相兵 发表于 2013-9-9 20:44:24

1) Bug 13840207 & Bug 13628811

The bug occurs when there is a race condition in producer_status that is called from both the consumer and coordinator threads.
Restart of the extract would work fine in this case


The first one is Bug 13628811 which was backported in label OGGCORE_11.2.1.0.1_PLATFORMS_120229.1800
The second is Bug 13840207 which was backported in label OGGCORE_11.2.1.0.2_PLATFORMS_120516.1800


2) Bug 14105014

The bug occurs due to issues reading the CSN information from output recovery file

renaming the <extract process name>.cps file from dirchk and restarting the extract may help


If extract generated a core then gdb (dbx for AIX) output would show a error like below

$gdb extract <core file name>
...
Core was generated by `extract'.
Program terminated with signal 10, Bus error.

(gdb) where
#0 0x40000000006c3ce0:0 in parse_supp_token_buf () at
/scratch/aime1/adestore/views/aime1_staoi06/oggcore/OpenSys/src/gglib/ggapp/checkpt.c:2213
#1 0x40000000006ca060:0 in CP_read_checkpoint_csn () at
/scratch/aime1/adestore/views/aime1_staoi06/oggcore/OpenSys/src/gglib/ggapp/checkpt.c:2785
#2 0x40000000003b7850:0 in START_EXTRACT_CHECKPOINT ()


3) Bug 13894704

Extract could core dump if minimum db level supplemental logging is not enabled

OGGCORE_11.2.1.0.1_PLATFORMS_120403.1800 contains changes from arsepulv_rfi_backport_13894704_11.2.1.0.1
4) Bug 13860160

This bug causes the Extract to core dump when capturing inserts from an IOT table generated by
doing a select from another table:

eg., insert into <iot table> select * from <table>;

5) Bug 14546106

Issue can also happen when the system has insufficient heap size. Here is an example showing the problem

Snippet from the extract report file

Operating System Version:
AIX
Version 6, Release 1
Node: kipe401
Machine: 00F634464C00
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : 268435456 unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited

Increasing the heap size resolved the issue in this case

customer used LDR_CNTRL=MAXDATA=0xC0000000 to do the increase
6) Bugs 14268459, 14371558 . Core shows bus error like below

Core was generated by `/u02/app/oracle/product/11.2.1.0.1/extract PARAMFILE /u02/app/oracle/product/11'.
Program terminated with signal 7, Bus error.
#0 0x000000314787c613 in ?? ()

7) Bug 14477847

Replicat core dumps and stack shows following

Program received signal SIGSEGV, Segmentation fault.
0x000000364be79ba0 in strlen () from /lib64/libc.so.6
(gdb) where
#0 0x000000364be79ba0 in strlen () from /lib64/libc.so.6
#1 0x000000364be798d6 in strdup () from /lib64/libc.so.6
#2 0x00002aaaaab86725 in CDataItem::CDataItem (this=0x7fffffff8f80, value=@0x7fffffff8f58)
at /scratch/mmar/view_storage/mmar_14066483/oggcore/OpenSys/src/gglib/gglog/CDataItem.cpp:109
#3 0x000000000051e3ec in take_rep_err_action (errtype=218, err=0, err_msg=0x0, extr_ptr=0x1e57270, hdr=0x14b7420,
rec_buf=0x12b4580 "", src_file=0x17f52b0, commit_error=false)
at /scratch/mmar/view_storage/mmar_14066483/oggcore/OpenSys/src/app/er/errors.cpp:376
#4 0x0000000000536a80 in process_extract_loop ()
at /scratch/mmar/view_storage/mmar_14066483/oggcore/OpenSys/src/app/er/processloop.cpp:1791
#5 0x0000000000548722 in main (argc=<value optimized out>, argv=<value optimized out>)
at /scratch/mmar/view_storage/mmar_14066483/oggcore/OpenSys/src/app/er/replicat/replicat.cpp:899
(gdb)

8) Replicat core dumps and stack from the core shows following

libclntsh.so.11.1`OCIStmtExecute+0x30(102b58148, 103a26a40, 102b73068, 3e8, 0, 0)
__1cTDBOCI_exec_stmt_ctx6FpnTOCISESS_context_def_pnODBOCI_stmt_def_pnUBATCHSQdDL_db_params_t__i_+0xbc(1009300e0, 1036f46b0,
102abf990, 0, 10092b000, 103e97fb0)
__1cVDBBATCHSQdDL_exec_batch6FpnOBATCHSQdDL_ctx_t_pnUBATCHSQdDL_db_params_t__i_+0x1b4(1018d7600, 102abf990, 101735440, 102abf990,
103a3c750, 1017353d0)
__1cRexec_batch_thread6Fpv_v_+0x1ac(1036fb150, 230a3c, 102abf990, ffffffff, 87a0000, 10090b550)
__1cRcheck_queue_batch6FpnOBATCHSQdDL_ctx_t_pnQBATCHSQdDL_batch_t_i_i_+0x1e8(1, 1, 0, 100a05, 100a05e78, ffffffff)
__1cTexec_queue_threaded6FpnOBATCHSQdDL_ctx_t_pnQBATCHSQdDL_queue_t__i_+0x294(1018d7600, 103449f10, 1017354b0, 10090b550, 0, 0)
__1cTREP_end_transaction6Fh_i_+0x83c(100911000, 100925540, 100756804, 100926b40, 0, 3e8)
__1cUprocess_extract_loop6F_h_+0x60fc(100800, 0, 0, 101463b68, 10074eb00, 100925540)
main+0xc8c(100756e37, 1009268f0, 1007572f4, 100800, 101752350, 100757205)
_start+0x17c(0, 0, 0, 0, 0, 0)

BATCHSQL can be disabled as a work-around. Fix is in bug 10379816
页: [1]
查看完整版本: 纠结了两天OGG 错误~实在是找不到解决办法~请指教