Comments (10)
- Describe what the program's doing.
- Try adding the line
disable_oob=on
to a sqlnet.ora file on the machine where your app is running. - Try remove the DIAG_* lines from the sqlnet.ora file.
- Try with a recent version of the DB.
from odpi.
- Our program is calling a procedure which has an insert command inside. We are using a connection string like this:
host:port/service?CONNECT_TIMEOUT=300ms&RECV_TIMEOUT=200ms&SEND_TIMEOUT=60000ms
We used pooled connection where during startup of the application, it would create some connections with the following pool creation params:
poolCreateParams.maxSessions = maxSessions;
poolCreateParams.minSessions = maxSessions;
poolCreateParams.sessionIncrement = 0;
poolCreateParams.pingTimeout = 1000; //millisec
poolCreateParams.pingInterval = 120;//seconds
poolCreateParams.timeout = 0;
We needed constant amount of connection available at all times. Then we added the pool wait time to operate in a NON-BLOCKING mode . If there is a timeout then it would result in a failure which is okay sometimes. It rarely happens though in our scenario. We have seen we were getting more than 300k request per hour where ~70 reqs were not getting any connection. This is okay at out current deployment.
dpiPool_setGetMode(*pool, DPI_MODE_POOL_GET_TIMEDWAIT);
dpiPool_setWaitTimeout(*pool, timeout);
But when there is a space issue in DB, then the procedure execution would fail after 2s per connection and give the error unable to extend table
. We needed the procedure to finish either successfully or fail within a shorter time period. Usually for successful cases it takes 5ms avg so we tried to add a RECV_TIMEOUT of 200ms for the fail cases to expedite any return from procedure execution. Application behaves mostly the way it should but every time during the issue it generates core dump from a segmentation fault deep inside libclntsh
which disrupts the production env and congestion occurs in our network.
disable_oob=on
will be added- We added
sqlnet.ora
to remove huge files created for diagnostic purposes and added those DIAG_* lines and I was confused that the timeouts in the connection string may be overridden by thesqlnet.ora
and added those too to the file. Are diagnostics needed for debugging? We have asqlnet.log
file created where the application is running. It creates mainly the following lines,
***********************************************************************
Fatal NI connect error 12520, connecting to:
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=)(CID=(PROGRAM=)(HOST=)(USER=root)))(CONNECT_TIMEOUT=300ms)(RECV_TIMEOUT=200ms)(SEND_TIMEOUT=60000ms)(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=))(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=))(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=)))
VERSION INFORMATION:
TNS for Linux: Version 19.0.0.0.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Version 19.11.0.0.0
Time: 15-OCT-2021 09:28:29
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: Message 12564 not found; No message file for product=network, facility=TNS
ns secondary err code: 12560
nt main err code: 524
TNS-00524: Message 524 not found; No message file for product=network, facility=TNS
nt secondary err code: 115
nt OS err code: 0
***********************************************************************
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for Linux: Version 19.0.0.0.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Version 19.11.0.0.0
Time: 15-OCT-2021 14:13:35
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: Message 12535 not found; No message file for product=network, facility=TNS
ns secondary err code: 12609
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=))
- Our production is running with 11.2 and we don't have any other options right now.
from odpi.
now we are getting this,
#0 0x00007f496269b4fb in raise () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-324.el7_9.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 libgcc-4.8.5-44.el7.x86_64 numactl-libs-2.0.12-5.el7.x86_64 systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0 0x00007f496269b4fb in raise () from /lib64/libpthread.so.0
#1 0x00007f495fe7e1bf in skgesigOSCrash () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#2 0x00007f4960572e3d in kpeDbgSignalHandler () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#3 0x00007f495fe7e492 in skgesig_sigactionHandler () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#4 <signal handler called>
#5 0x00007f495ed62849 in kpucopyuserhndl () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#6 0x00007f495ee02142 in kpuspextend () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#7 0x00007f495ee045ac in kpuspgetfreesession () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#8 0x00007f495ee00f6f in kpuspgetpooledsession () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#9 0x00007f495edfd04d in kpuspsessionget () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#10 0x00007f495eb3dd01 in OCISessionGet () from /usr/lib/oracle/19.11/client64/lib/libclntsh.so
#11 0x00007f496246c3a4 in dpiOci__sessionGet (envHandle=0x1685d00, handle=handle@entry=0x7f48c400d680, authInfo=authInfo@entry=0x7f4865fd7618,
connectString=connectString@entry=0x16baa68 "OCI:SP:zpuyRFdHlCPgU6hJZQr8DC", connectStringLength=connectStringLength@entry=29, tag=<optimized out>,
tagLength=0, outTag=outTag@entry=0x7f493f7e58c8, outTagLength=outTagLength@entry=0x7f493f7e58d0, found=found@entry=0x7f493f7e58d4, mode=mode@entry=1,
error=error@entry=0x7f493f7e5850) at src/dpiOci.c:3136
#12 0x00007f4962450f1c in dpiConn__getSession (error=0x7f493f7e5850, authInfo=0x7f4865fd7618, params=0x7f493f7e5870, connectStringLength=29,
connectString=0x16baa68 "OCI:SP:zpuyRFdHlCPgU6hJZQr8DC", mode=1, conn=0x7f48c400d660) at src/dpiConn.c:861
#13 dpiConn__get (error=0x7f493f7e5850, pool=0x1685b60, createParams=0x7f493f7e5870, connectStringLength=29,
connectString=0x16baa68 "OCI:SP:zpuyRFdHlCPgU6hJZQr8DC", passwordLength=<optimized out>, password=<optimized out>, userNameLength=<optimized out>,
userName=<optimized out>, conn=0x7f48c400d660) at src/dpiConn.c:636
#14 dpiConn__create (conn=0x7f48c400d660, context=<optimized out>, userName=userName@entry=0x0, userNameLength=userNameLength@entry=0,
password=password@entry=0x0, passwordLength=passwordLength@entry=0, connectString=0x16baa68 "OCI:SP:zpuyRFdHlCPgU6hJZQr8DC", connectStringLength=29,
pool=pool@entry=0x1685b60, commonParams=commonParams@entry=0x0, createParams=createParams@entry=0x7f493f7e5870, error=error@entry=0x7f493f7e5850)
at src/dpiConn.c:414
#15 0x00007f496245ad71 in dpiPool__acquireConnection (pool=pool@entry=0x1685b60, userName=userName@entry=0x0, userNameLength=userNameLength@entry=0,
password=password@entry=0x0, passwordLength=0, params=params@entry=0x7f493f7e5870, conn=conn@entry=0x7f493f7e5958, error=error@entry=0x7f493f7e5850)
at src/dpiPool.c:36
#16 0x00007f496245af1d in dpiPool_acquireConnection (pool=pool@entry=0x1685b60, userName=userName@entry=0x0, userNameLength=userNameLength@entry=0,
password=password@entry=0x0, passwordLength=passwordLength@entry=0, params=0x7f493f7e5870, params@entry=0x0, conn=conn@entry=0x7f493f7e5958)
at src/dpiPool.c:341
#17 0x0000000000406a5e in dpi__getConnFromPool (pool=0x1685b60) at db_util.c:197
sqlnet.log
SQLNET.RECV_TIMEOUT=200ms
SQLNET.OUTBOUND_CONNECT_TIMEOUT=300ms
DISABLE_OOB=on
ADR_BASE=/data01/oracle_diag
We had exactly this issue before adding DIAG_* params in sqlnet.ora file. Here is a stackoverflow thread which suggested DIAG_*'s addition.
from odpi.
Maybe
poolCreateParams.sessionIncrement = 1;
poolCreateParams.timeout = 100;
Would be more real-world (what libclntsh writers awaited for)?
from odpi.
from doc here
dpiPoolCreateParams.timeout
Specifies the length of time (in seconds) after which idle sessions in the pool are terminated. Note that termination only occurs when the pool is accessed. The default value is 0 which means that no idle sessions are terminated. This value can be set after the pool has been created using the function dpiPool_setTimeout() and acquired using the function dpiPool_getTimeout().
dpiPoolCreateParams.sessionIncrement
Specifies the number of sessions that will be created by the session pool when more sessions are required and the number of sessions is less than the maximum allowed. This value is ignored if the dpiPoolCreateParams.homogeneous member has a value of 0. This value added to the dpiPoolCreateParams.minSessions member value must not exceed the dpiPoolCreateParams.maxSessions member value. The default value is 0.
I need a constant number of connections available at all time. so I added,
poolCreateParams.maxSessions = maxSessions;
poolCreateParams.minSessions = maxSessions;
Both have the same value assigned. I don't want to terminate idle connections ,so there will be no cost of creating one when needed. It can fail if there is none available within the pool timeout time. So I decided to assign session increment to zero. As there is no termination, max and min has same value assigned to the maxSessions param.
from odpi.
@cjbj, Should we consider updating instant client to 19.12? Where should we go from here? Changing DB version is not an option right now.
I am getting the feeling that these tools work better with relatively new versions of oracle DB. But to be honest, this timeout feature is too basic to fail this way. So, I am not sure why it is not working as it should. Is there any other info needed for you guys to understand it better?
from odpi.
I cannot unsee the similarity with godror/godror#192 (comment) - we se SIGSEGVs when the pool is full (timeouts set), and acquiring new connections concurrently and fast enough.
Limiting the number of sessions on the call side (Go) evades this.
from odpi.
The Go issue could be unrelated, so let's treat the issues separately for now.
I just finished a discussion with the OCI developers. They want to know what's going but, but as expected, a testcase would help diagnose the problem.
We could look at trace. Overall it would be best if you could create an Oracle support request for the problem so "sensitive" files like traces can be shared; I don't know whether you can do this since you have 11.2 DB? It would be channelled through the OCI support team.
from odpi.
@quidstone if you're unable to createa support request (I wonder what your support status of 11.2 is?), I am happy to have a call with you to gather information. Send me email (see my profile).
from odpi.
hi @cjbj , I have already created a support request through our db-admin at out company. I admit that things are moving slow. This is more of our problem than anyone else's. We have to provide oracle support team traces from both servers(DB, client) when there is an issue. This is taking more time than it should. Unless I share those traces, I can't really tell what's next. We can talk about it a bit though. I will send you an email. thanks.
from odpi.
Related Issues (20)
- XMLTYPE as "real" XML HOT 3
- Ability to set OCI_ATTR_LOBPREFETCH_SIZE HOT 7
- "ORA-03106: fatal two-task communication protocol error" when CLOB is fetched as strings and then lob locators HOT 3
- Any plan to support Apple's M1 plat? HOT 1
- Is there a reference example dequeue message with message ID dequeue option HOT 9
- When ODPI tries to translate a variable to a temporary LOB, and there is inadequate TEMP tablespace, ODPI segfaults when the cursor is closed HOT 6
- M1 Mac issue ( incompatible architecture) HOT 2
- How to use ODPI-C with Oracle Times Ten? HOT 9
- Column Alias Error HOT 7
- Add an option to convert numbers between double and decimal in the same way as node.js, python3, ruby, rust or so HOT 5
- Oracle Rust driver HOT 2
- Option to get json float as float. It is got as double now. HOT 8
- Bytes after nul in members of dpiXid are zeroed. HOT 2
- Is DPI_TPC_END_SUSPEND an undocumented flag in the OCI manual? HOT 3
- dpiEnqOptions_setVisibility and dpiLob_getType are listed not in round_trips.rst HOT 2
- Get column name for position HOT 2
- `dpiStmt_execute` with `DPI_MODE_EXEC_PARSE_ONLY` will create a table. HOT 4
- Retrieving CLOB from an OUT parameter of an Store Procedure HOT 1
- ERROR: DPI-1047: Cannot locate a 32-bit Oracle Client library HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from odpi.