[转载]TNS-12537 TNSconnection closed ORA-609错误处理
(2013-04-25 00:18:19)
标签:
转载 |
分类: 知识储备 |
起因:
最近研发反应JDBC和PL/SQL工具连接过慢,昨天事情多懒得去管http://www/uc/myshow/blog/misc/gif/E___6724EN00SIGG.gifTNSconnection
今早检查alter.log后发现报错
Fatal NI connect error 12637, connecting to:
TNS-12637: Packet receive failed
opiodr aborting process unknown ospid (24234) as a result of ORA-609
Tue Apr 23 11:05:15 2013
故障分析
检查 listener 日志确认没有异常,在部分机器 TNSPING XXX 60 等进行尝试结果正常,无链接问题
明显感觉,只要第一次 tnsping 链接正常,之后的 jdbc 链接就正常;如果第一次不正常,之后的 jdbc 链接就容易返回超时的错误。
尝试重启 lsnrctl stop/start
和重新加载配置文件
tnsping xxxx N 的时候明显观察到:
OK (56730 msec)
OK (80 msec)
OK (80 msec)
OK (70 msec)
OK (70 msec)
OK (60 msec)
OK (47820 msec)
OK (70 msec)
同时 PING IP
的结果并没有出现延时,机器之前并未设置防火墙等,可以排除网络和防火墙问题。
在连续
TNSPING
的情况下,出现明显链接延时。
解决方案
方案一:(有效,但是由于是测试环境,所以影响层面未评估)
注释掉 /etc/resolve.conf 配置文件中对 DNS 的解析解决
[oracle@vcdog log]$ cat /etc/resolv.conf
#nameserver 202.106.0.20
方案二:官方文档(效果不明显)
ORA-609 TNS-12537 and TNS-12547 in 11g Alert.log [ID
1116960.1]
Cause
The ORA-609 error is thrown when a client connection
of any kind failed to complete or aborted the connection process
before the server process was completely spawned.
Very often, this connection abort is due to a
timeout.
We have also discovered that the ORA-609 occurs frequently in
installations where the database is monitored by DB Console and the
Enterprise Manager agent
(emagent).
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
It can be somewhat
challenging
For that reason, we often recommend increasing the
values for INBOUND_CONNECT_TIMEOUT at both listener and server side
sqlnet.ora file as a preventive
measure.
e.g.
Sqlnet.ora:
SQLNET.INBOUND_CONNECT_TIMEOUT=180
Listener.ora:
INBOUND_CONNECT_TIMEOUT_listener_name=120
These settings are in seconds.
If the issue persists and inbound connect does not
have any effect, the following steps are intended to help
locate
1)
DIAG_ADR_ENABLED_
This will cause the TNS errors to be posted to the
ORACLE_HOME/network/log/sqlnet.log file that is local to the
database and may yield useful information about the client's
address.
For example, here's a snippet from a server side sqlnet.log where
client address info was posted:
Production Time: 15-FEB-2010
07:15:01
Fatal NI connect error 12537, connecting
to:
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=yourhost)(Port=1521))(CONNECT_DATA=(SID=PROD1DR)(CID=(PROGRAM=sqlplus)(HOST=client_host)(USER=client))))
Observe the PROGRAM and HOST fields on the last
line.
Be sure to match timestamps in the sqlnet.log with the timestamps
of the alert.log errors.
TRACE_LEVEL_CLIENT=16
TRACE_DIRECTORY_CLIENT=
TRACE_TIMESTAMP_CLIENT=TRUE
DIAG_ADR_ENABLED=off
If you need assistance with client or server tracing, please open an SR with Global Customer Support.
2)
Here's an example snippet of an incoming client connection that was
posted to the listener.log:
20-JAN-2009 17:08:45 (CONNECT_DATA=(SID=orcl)(CID=(PROGRAM=D:oracleproduct10.1.0Db_1perl5.6.1binMSWin32-x86perl.exe)(HOST=myclient)
Note that the exact timestamp, program name and client host will
often be recorded.
3)
Match the PID that accompanies the ORA-609 to the server trace
label.
ORA-609 : opiodr aborting process unknown ospid
(4799_1)
This PID would correspond to server trace
labeled:
See below for instuctions on enabling Oracle Net server
tracing.
The following details the discovery of the source of an ORA-609 for a real case:
The alert.log reports the following messages intermittently but frequently:
Mon Nov 16 22:39:22 2009
ORA-609 : opiodr aborting process unknown ospid
(nnnn)
Enabled Oracle Net server tracing:
TRACE_LEVEL_SERVER=16
TRACE_DIRECTORY_SERVER=
TRACE_TIMESTAMP_SERVER=TRUE
DIAG_ADR_ENABLED=off
Reloaded listener and wait for error to appear again.:
ORA-609 : opiodr aborting process unknown
ospid
Note that the server trace file set that corresponded to this event
was named
Of course the timestamps of the alert.log event and the server
trace creation matched as well.
A review of the server trace showed only an EOF failure and
the
Read unexpected EOF ERROR
nserror: nsres: id=0, op=68, ns=12537
In this particular case, there was no information about the client
in the trace. This is atypical for a server trace.
Here's an excerpt from a listener.log entry where an emagent
establishes a connection:
PROGRAM=D:oracleproduct10.1.0Db_1binemagent.exe)
Checked the EM Agent traces and logs and discovered the following
entry:
Fatal NI connect error 12547, connecting
to:
(LOCAL=NO)
VERSION INFORMATION:
TNS for Solaris: Version 11.1.0.7.0 -
Production
Oracle Bequeath NT Protocol Adapter for Solaris:
Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Solaris: Version
11.1.0.7.0 - Production
Time: 16-NOV-2009 22:39:22
****Tracing to file:
/backup/sid_traces/sqlnetlog/svr_5233.trc
Tns error struct:
ns main err code: 12547
TNS-12547: TNS:lost contact
ns secondary err code: 12560
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
****Note the name of the server trace which contains
the PID:
Also, the timestamp of the agent event matches the
timestamp of the alert.log error.
Check the following locations for EM Agent traces. If
working with support on this issue and the EM Agent is suspected,
upload ALL files under:
$ORACLE_HOME/sysman/log/emagent.trc < Single node agent trace
location
$ORACLE_HOME/host/sysman/log/emagent.trc < RAC agent trace
location
It was determined that in this case, the emagent was aborting the
connection before it was complete and then simply reconnecting and
succeeding on the subsequent
try.
Please review the following documents for more information about
timeouts and tracing:
Note
119706.1
Errors
Note
345197.1
Intermittently Fail with ORA-3113,ORA-3106 or ORA-3136 from
10.2 Onwards
Note
405755.1
if an RDA is not Available
Note
395525.1
Kerberos and External procedure Tracing from Net Manager
Note
454927.1
(ADR) with Oracle Net for 11g