雪影舞剑 发表于 2014-3-27 13:57:46

v$process中存在部分进程不在v$session中

AIX单机11.2.0.3 v$process中存在部分进程不在v$session中,在os上查询均为LOCAL=NO的进程,这些进程的数量不断增加,请问是何原因

相关信息在附件中,请参考,请ML与各位朋友解答,谢谢!

Liu Maclean(刘相兵 发表于 2014-3-27 20:49:48


通过SSD 看几个 没有session的process:

18809158
33295340
65668442


  SO: 0x7000017810c2a38, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x7000017810c2a38, name=process, file=ksu.h LINE:12616 ID:, pg=0
  (process) Oracle pid:32, ser:187, calls cur/top: 0x0/0x7000017312be0f0
            flags : (0x0) -
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: none
              last post sent: 0 0 16
              last post sent-location: ksu.h LINE:13945 ID:ksupsc
              last process posted by me: 700001741076900 1 14
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x7000017610eb6d0
    O/S info: user: oracle, term: UNKNOWN, ospid: 18809158
    OSD pid info: Unix process pid: 18809158, image: oracle@tplreport
    Short stack dump:
ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48bc<-nttfprd()+736<-nsbasic_brc()+1116<-nsbrecv()+128<-nioqrc()+17480<-opikndf2()+2024<-opitsk()+752<-opiino()+940<-opiodr()+720
<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112
    ----------------------------------------
    SO: 0x7000017614992e8, type: 14, owner: 0x7000017810c2a38, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x7000017810c2a38, name=channel handle, file=ksr2.h LINE:368 ID:, pg=0
    (broadcast handle) 7000017614992e8 flag: (2) ACTIVE SUBSCRIBER,
                       owner: 7000017810c2a38 - ospid: 18809158
                       event: 538425, last message event: 538425,
                       last message waited event: 538425,
                       next message: 0(0), messages read: 0
                      channel: (70000174154fa10) system events broadcast channel
                               scope: 2, event: 1165293, last mesage event: 43,
                               publishers/subscribers: 0/1031,
                               messages published: 1
                               heuristic msg queue length: 0
    ----------------------------------------
    SO: 0x7000017615dc910, type: 24, owner: 0x7000017810c2a38, flag: -/-/-/0x00 if: 0x3 c: 0x3
     proc=0x7000017810c2a38, name=ksz parent, file=ksz2.h LINE:432 ID:, pg=0
    ----------------------------------------
    SO: 0x7000017312be0f0, type: 3, owner: 0x7000017810c2a38, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x7000017810c2a38, name=call, file=ksu.h LINE:12620 ID:, pg=0
    (call) sess: cur 0, rec 0, usr 0; flg:0 fl2:1; depth:0
    svpt(xcb:0x0 sptn:0x5c uba: 0x00000000.0000.00)
    ksudlc FALSE at location: 0
    ----------------------------------------
    SO: 0x70000181b60a230, type: 20, owner: 0x7000017810c2a38, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x7000017810c2a38, name=OS proc request holder, file=kso2.h LINE:522 ID:, pg=0
    (osp req holder)




  SO: 0x70000176103de60, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x70000176103de60, name=process, file=ksu.h LINE:12616 ID:, pg=0
  (process) Oracle pid:35, ser:134, calls cur/top: 0x0/0x700001731265f78
            flags : (0x0) -
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: none
              last post sent: 0 0 16
              last post sent-location: ksu.h LINE:13945 ID:ksupsc
              last process posted by me: 700001741076900 1 14
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x7000017610eb6d0
    O/S info: user: oracle, term: UNKNOWN, ospid: 33295340
    OSD pid info: Unix process pid: 33295340, image: oracle@tplreport
    Short stack dump:
ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48bc<-nttfprd()+736<-nsbasic_brc()+1116<-nsbrecv()+128<-nioqrc()+17480<-opikndf2()+2024<-opitsk()+752<-opiino()+940<-opiodr()+720
<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112
    ----------------------------------------
    SO: 0x7000017614a34f8, type: 14, owner: 0x70000176103de60, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x70000176103de60, name=channel handle, file=ksr2.h LINE:368 ID:, pg=0
    (broadcast handle) 7000017614a34f8 flag: (2) ACTIVE SUBSCRIBER,
                       owner: 70000176103de60 - ospid: 33295340
                       event: 1139768, last message event: 1139768,
                       last message waited event: 1139768,
                       next message: 0(0), messages read: 0
                      channel: (70000174154fa10) system events broadcast channel
                               scope: 2, event: 1165293, last mesage event: 43,
                               publishers/subscribers: 0/1031,
                               messages published: 1
                               heuristic msg queue length: 0
    ----------------------------------------
    SO: 0x70000173124f178, type: 24, owner: 0x70000176103de60, flag: -/-/-/0x00 if: 0x3 c: 0x3
     proc=0x70000176103de60, name=ksz parent, file=ksz2.h LINE:432 ID:, pg=0
    ----------------------------------------
    SO: 0x700001731265f78, type: 3, owner: 0x70000176103de60, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x70000176103de60, name=call, file=ksu.h LINE:12620 ID:, pg=0
    (call) sess: cur 0, rec 0, usr 0; flg:0 fl2:1; depth:0
    svpt(xcb:0x0 sptn:0x59 uba: 0x00000000.0000.00)
    ksudlc FALSE at location: 0
    ----------------------------------------
    SO: 0x70000174beb8ca0, type: 20, owner: 0x70000176103de60, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x70000176103de60, name=OS proc request holder, file=kso2.h LINE:522 ID:, pg=0
    (osp req holder)
  PSO child state object changes :



  SO: 0x70000175103b530, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x70000175103b530, name=process, file=ksu.h LINE:12616 ID:, pg=0
  (process) Oracle pid:36, ser:38, calls cur/top: 0x0/0x700001731261b90
            flags : (0x0) -
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: none
              last post sent: 0 0 16
              last post sent-location: ksu.h LINE:13945 ID:ksupsc
              last process posted by me: 700001741076900 1 14
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x7000017610eb6d0
    O/S info: user: oracle, term: UNKNOWN, ospid: 65668442
    OSD pid info: Unix process pid: 65668442, image: oracle@tplreport
    Short stack dump:
ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48bc<-nttfprd()+736<-nsbasic_brc()+1116<-nsbrecv()+128<-nioqrc()+17480<-opikndf2()+2024<-opitsk()+752<-opiino()+940<-opiodr()+720
<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112
    ----------------------------------------
    SO: 0x7000017614a08f0, type: 14, owner: 0x70000175103b530, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x70000175103b530, name=channel handle, file=ksr2.h LINE:368 ID:, pg=0
    (broadcast handle) 7000017614a08f0 flag: (2) ACTIVE SUBSCRIBER,
                       owner: 70000175103b530 - ospid: 65668442
                       event: 592237, last message event: 592237,
                       last message waited event: 592237,
                       next message: 0(0), messages read: 0
                      channel: (70000174154fa10) system events broadcast channel
                               scope: 2, event: 1165293, last mesage event: 43,
                               publishers/subscribers: 0/1031,
                               messages published: 1
                               heuristic msg queue length: 0
    ----------------------------------------
    SO: 0x7000017615dce38, type: 24, owner: 0x70000175103b530, flag: -/-/-/0x00 if: 0x3 c: 0x3
     proc=0x70000175103b530, name=ksz parent, file=ksz2.h LINE:432 ID:, pg=0
    ----------------------------------------
    SO: 0x700001731261b90, type: 3, owner: 0x70000175103b530, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x70000175103b530, name=call, file=ksu.h LINE:12620 ID:, pg=0
    (call) sess: cur 0, rec 0, usr 0; flg:0 fl2:1; depth:0
    svpt(xcb:0x0 sptn:0x50 uba: 0x00000000.0000.00)
    ksudlc FALSE at location: 0
    ----------------------------------------
    SO: 0x70000174be80a00, type: 20, owner: 0x70000175103b530, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x70000175103b530, name=OS proc request holder, file=kso2.h LINE:522 ID:, pg=0
    (osp req holder)

Liu Maclean(刘相兵 发表于 2014-3-27 20:51:38


总结下相似出:

stack call均为:

ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48bc<-nttfprd()+736<-nsbasic_brc()+1116<-nsbrecv()+128<-nioqrc()+17480<-opikndf2()+2024<-opitsk()+752<-opiino()+940<-opiodr()+720
<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112


ksedsts()+360<-ksdxfstk()+44<-ksdxcb()+3384<-sspuser()+116<-48bc<-nttfprd()+736<-nsbasic_brc()+1116<-nsbrecv()+128<-nioqrc()+17480<-opikndf2()+2024<-opitsk()+752<-opiino()+940<-opiodr()+720
<-opidrv()+1132<-sou2o()+136<-opimai_real()+608<-ssthrdmain()+268<-main()+204<-__start()+112


且process的当前call 均为name=call, file=ksu.h LINE:12620

Liu Maclean(刘相兵 发表于 2014-3-27 21:20:00

可能的一种解释;

ODM FINDING:

The reason is that this process (which make some sessions expired due to exceed connect and experation time parameter
A user starts a program/session, then leaves it running and idle for an extended period of time , or may be it is ended abnormally

The recommendation to avoid such behaviour is :


It is strongly recommended that both DCD and Resource Limits with Profiles be implemented in order to clean up resources at both the database and OS level

This combination will not clean up IDLE / ABANDONED / INACTIVE connections (OS processes) as these sessions still have active clients

For this case we will see that :
* PMON has cleaned up the V$SESSION entries .. but both the OS processes and the V$PROCESS entries will still exist
* SQLNET will continue to be able to send the 10 byte packet successfully until the session is logged off

N.B: please follow solution above to avoid such behavior not to kill the process from OS level



WHY? - When an Oracle session is terminated ... whether by the user logging out ... a kill session by a privileged user ... or even the user reaching a resource limit set in their profile
                           Oracle SMON (or PMON) will first clean up the database resources (rollback transactions ... release locks etc)
                           Oracle then will remove the entry in V$SESSION as the session is now 'cleaned up'
                           Oracle will then request that the operating system terminate any OS processes associated with the former Oracle Session

The Oracle Kernel cannot force the OS to terminate the processes .. this occurs at the OS level only ... all it can do is request that they be terminated.

These are previously valid connections with the database but the connection between the client and server processes has terminated abnormally.
Examples of a dead connection:
- A user reboots/turns-off their machine without logging off or disconnecting from the database.
- A network problem prevents communication between the client and the server.

In these cases, the shadow process running on the server and the session in the database may not terminate.PMON has cleaned up the V$SESSION entries .. but both the OS processes and the V$PROCESS entries will still exist

The solutions to this scenario can are to cleanup the unwanted OS processes ... after which the $PROCESS entries should be removed automatically

Liu Maclean(刘相兵 发表于 2014-3-27 21:20:39

high increase in Oracle processes and huge gap between v$process and v$session
一个类似的SR ,也是发生在11.2.0.3 上,似乎这个版本有一些特殊的处理被引入了

Liu Maclean(刘相兵 发表于 2014-3-27 21:24:09

另一种可能性:

lient is on different machine (not db server) and is using jdbc5 : (PROGRAM=JDBC Thin Client)(HOST=__jdbc__)

We've done some tracing and we think that we can explain the big difference between v$process and v$session. It looks like the listener is spawning a process for a incoming connection but it's waiting to get through to the database (so no complete login is done). So the process is present in the database but the login is not complete, it's waiting on resources, it's blocked....
We say this because the tracing shows that this process does not have a datafile open. Normally every oracle user process at leasts opens 1 datafile (system datafile for login purposes : authentication, grants, ...).
Question now is : why does it not get further ? There are some wait events on the database.
After a while, server runs out of memory (100% swap and mem usage) and nothing can be done any more.

Liu Maclean(刘相兵 发表于 2014-3-27 21:29:50


NOTE:395505.1 - How to Check if Dead Connection Detection (DCD) is Enabled in 9i ,10g and 11g
How To Track Dead Connection Detection(DCD) Mechanism Without Enabling Any Client/Server Network Tracing (Doc ID 438923.1)
A discussion of Dead Connection Detection, Resource Limits, V$SESSION, V$PROCESS and OS processes (Doc ID 601605.1)
NOTE:151972.1 - Dead Connection Detection (DCD) Explained

雪影舞剑 发表于 2014-3-28 14:21:44

感谢ML回复,是否我可以考虑先启用DCD
Implemented by
      * adding SQLNET.EXPIRE_TIME = <MINUTES> to the sqlnet.ora file

雪影舞剑 发表于 2014-3-31 10:00:37

在一个有同样问题的测试库尝试了一下,三天过去了,未发现LOCAL=NO的不关联进程
  1* select spid,pname,username,program from v$process where addr not in (select paddr from v$session)
SQL> /

SPID                         PNAME USERNAME        PROGRAM
------------------------ ----- --------------- ------------------------------------------------
                                               PSEUDO
12915                         D000  oracle               oracle@MIS_UAT (D000)
12919                         S000  oracle               oracle@MIS_UAT (S000)
页: [1]
查看完整版本: v$process中存在部分进程不在v$session中