Comments (12)

ctrimble avatar ctrimble commented on September 18, 2024

@jurmous sorry for reporting this issue without a test. I will see what I can do to reproduce the issue in the test suite. For reference, I pushed a commit with what I had to do to stabilize access to the uris array. I am not happy with this solution, but I thought it might be informative.

from etcd4j.

lburgazzoli avatar lburgazzoli commented on September 18, 2024

A question: do you share the same EtcdClient across threads?

ctrimble avatar ctrimble commented on September 18, 2024

Yes. My application has one client running.

lburgazzoli avatar lburgazzoli commented on September 18, 2024

I'm unsure about the fix. Even if it seems to work, we have one ConnectionState per request, so if two threads are manipulating it, it smells like a potential bug.

Any chance of getting a test case?
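
To illustrate the concern about two threads touching the same state, here is a minimal sketch of a per-request state object whose URI index cannot leave the bounds of the array, even under concurrent updates. `SafeConnectionState`, `current()` and `next()` are hypothetical names chosen for this example; this is not etcd4j's actual `ConnectionState`, only one possible way to keep the index consistent.

```java
import java.net.URI;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a per-request connection state.
// This is NOT etcd4j's ConnectionState; it only illustrates the idea.
final class SafeConnectionState {
    private final URI[] uris;
    // The index is only ever changed through an atomic update.
    private final AtomicInteger uriIndex = new AtomicInteger(0);

    SafeConnectionState(URI[] uris) {
        if (uris == null || uris.length == 0) {
            throw new IllegalArgumentException("at least one URI is required");
        }
        // Defensive copy so callers cannot change the array afterwards.
        this.uris = uris.clone();
    }

    // Returns the URI the request is currently pointed at.
    URI current() {
        return uris[uriIndex.get()];
    }

    // Advances to the next URI, wrapping around at the end of the array.
    // The modulo keeps the stored index in [0, uris.length) at all times.
    URI next() {
        int idx = uriIndex.updateAndGet(i -> (i + 1) % uris.length);
        return uris[idx];
    }
}
```

Because the increment and the wrap-around happen in a single atomic step, a retry thread and an I/O thread calling `next()` at the same time cannot push the index past the last element.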

lburgazzoli avatar lburgazzoli commented on September 18, 2024

@jurmous do you know what this code in EtcdResponseHandler is supposed to do? In particular, why is a new connection created?

        // If connection was accepted maybe response has to be waited for
        if (status.equals(HttpResponseStatus.OK)
          || status.equals(HttpResponseStatus.ACCEPTED)
          || status.equals(HttpResponseStatus.CREATED)) {
          this.client.connect(this.request);
        } else {
          this.promise.setFailure(new IOException(
            "Content was not readable. HTTP Status: " + status));
        }

jurmous avatar jurmous commented on September 18, 2024

I remember that there were issues with waits, and the client needed to reconnect if no content was sent back. The old connection was already closed, so I needed to create a new one. Maybe this is not needed anymore, so you would have to recheck with waits. Although this issue seems to indicate that this code path is indeed hit.

lburgazzoli avatar lburgazzoli commented on September 18, 2024

If I run testWaitTimeout without the explicit connect, I see:

/opt/sfw/lang/java/1.8.0/bin/java -ea -Didea.launcher.port=7533 -Didea.launcher.bin.path=/opt/sfw/tools/jetbrains/idea-u-15/bin -Didea.junit.sm_runner -Dfile.encoding=UTF-8 -classpath /opt/sfw/tools/jetbrains/idea-u-15/lib/idea_rt.jar:/opt/sfw/tools/jetbrains/idea-u-15/plugins/junit/lib/junit-rt.jar:/opt/sfw/lang/java/1.8.0/jre/lib/charsets.jar:/opt/sfw/lang/java/1.8.0/jre/lib/deploy.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/cldrdata.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/dnsns.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/jaccess.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/jfxrt.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/localedata.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/nashorn.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/sunec.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/sunjce_provider.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/sunpkcs11.jar:/opt/sfw/lang/java/1.8.0/jre/lib/ext/zipfs.jar:/opt/sfw/lang/java/1.8.0/jre/lib/javaws.jar:/opt/sfw/lang/java/1.8.0/jre/lib/jce.jar:/opt/sfw/lang/java/1.8.0/jre/lib/jfr.jar:/opt/sfw/lang/java/1.8.0/jre/lib/jfxswt.jar:/opt/sfw/lang/java/1.8.0/jre/lib/jsse.jar:/opt/sfw/lang/java/1.8.0/jre/lib/management-agent.jar:/opt/sfw/lang/java/1.8.0/jre/lib/plugin.jar:/opt/sfw/lang/java/1.8.0/jre/lib/resources.jar:/opt/sfw/lang/java/1.8.0/jre/lib/rt.jar:/home/lburgazz/work/lb/dev-fork/etcd4j/build/classes/test:/home/lburgazz/work/lb/dev-fork/etcd4j/build/classes/main:/home/lburgazz/work/lb/dev-fork/etcd4j/build/resources/test:/home/lburgazz/work/lb/dev-fork/etcd4j/build/resources/main:/opt/sfw/tools/apache/maven-repo/org/slf4j/slf4j-api/1.7.13/slf4j-api-1.7.13.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-codec-http/4.1.0.Beta8/6c93630272bd68097dbff54a8e5ce9bb3cf2c464/netty-codec-http-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-handler/4.1.0.Beta8/7a1de4c299112bd5d036028490256ac5a6297574/netty-handler-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-codec/4.1.0.Beta
8/35b06944ca2231d6b9d23a2bbc98e61941f78a3a/netty-codec-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-buffer/4.1.0.Beta8/abe1c55b89adc50c1561e5e98b192b142596f3bd/netty-buffer-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-transport/4.1.0.Beta8/55694c48c7b296e7f2c2b4f5da7753c1f92e3641/netty-transport-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-common/4.1.0.Beta8/4f0a26b56df9ce71f8a8aab9db21ac4bea3a952c/netty-common-4.1.0.Beta8.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/io.netty/netty-resolver/4.1.0.Beta8/df72d88b512f04c744d170cae76d8b23cf00c677/netty-resolver-4.1.0.Beta8.jar:/opt/sfw/tools/apache/maven-repo/junit/junit/4.12/junit-4.12.jar:/opt/sfw/tools/apache/maven-repo/org/slf4j/slf4j-simple/1.7.13/slf4j-simple-1.7.13.jar:/opt/sfw/tools/apache/maven-repo/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:/opt/sfw/tools/apache/maven-repo/com/fasterxml/jackson/core/jackson-core/2.7.0/jackson-core-2.7.0.jar:/opt/sfw/tools/apache/maven-repo/com/fasterxml/jackson/core/jackson-databind/2.7.0/jackson-databind-2.7.0.jar:/opt/sfw/tools/apache/maven-repo/com/fasterxml/jackson/core/jackson-annotations/2.7.0/jackson-annotations-2.7.0.jar:/home/lburgazz/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.module/jackson-module-afterburner/2.7.0/6f9a59e4fa14baf04d6857a6f4619aa78746741a/jackson-module-afterburner-2.7.0.jar com.intellij.rt.execution.application.AppMain com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 mousio.etcd4j.TestFunctionality,testWaitTimeout
[main] [DEBUG] io.netty.util.internal.logging.InternalLoggerFactory - Using SLF4J as the default logging framework
[main] [DEBUG] io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.level: simple
[main] [DEBUG] io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.maxRecords: 4
[main] [DEBUG] io.netty.util.internal.PlatformDependent0 - java.nio.Buffer.address: available
[main] [DEBUG] io.netty.util.internal.PlatformDependent0 - sun.misc.Unsafe.theUnsafe: available
[main] [DEBUG] io.netty.util.internal.PlatformDependent0 - sun.misc.Unsafe.copyMemory: available
[main] [DEBUG] io.netty.util.internal.PlatformDependent0 - java.nio.Bits.unaligned: true
[main] [DEBUG] io.netty.util.internal.PlatformDependent - Java version: 8
[main] [DEBUG] io.netty.util.internal.PlatformDependent - -Dio.netty.noUnsafe: false
[main] [DEBUG] io.netty.util.internal.PlatformDependent - sun.misc.Unsafe: available
[main] [DEBUG] io.netty.util.internal.PlatformDependent - -Dio.netty.noJavassist: false
[main] [DEBUG] io.netty.util.internal.PlatformDependent - Javassist: unavailable
[main] [DEBUG] io.netty.util.internal.PlatformDependent - You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes.  Please check the configuration for better performance.
[main] [DEBUG] io.netty.util.internal.PlatformDependent - -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
[main] [DEBUG] io.netty.util.internal.PlatformDependent - -Dio.netty.bitMode: 64 (sun.arch.data.model)
[main] [DEBUG] io.netty.util.internal.PlatformDependent - -Dio.netty.noPreferDirect: false
[main] [DEBUG] io.netty.channel.MultithreadEventLoopGroup - -Dio.netty.eventLoopThreads: 8
[main] [DEBUG] io.netty.channel.nio.NioEventLoop - -Dio.netty.noKeySetOptimization: false
[main] [DEBUG] io.netty.channel.nio.NioEventLoop - -Dio.netty.selectorAutoRebuildThreshold: 512
[main] [INFO] mousio.etcd4j.transport.EtcdNettyClient - Setting up Etcd4j Netty client
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numHeapArenas: 8
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numDirectArenas: 8
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.pageSize: 8192
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxOrder: 11
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.chunkSize: 16777216
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.tinyCacheSize: 512
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.smallCacheSize: 256
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.normalCacheSize: 64
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
[main] [DEBUG] io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.cacheTrimInterval: 8192
[main] [DEBUG] io.netty.channel.DefaultChannelId - -Dio.netty.processId: 13894 (auto-detected)
[main] [DEBUG] io.netty.channel.DefaultChannelId - -Dio.netty.machineId: dc:53:60:ff:fe:0b:c8:45 (auto-detected)
[main] [DEBUG] io.netty.util.internal.ThreadLocalRandom - -Dio.netty.initialSeedUniquifier: 0xe84faad38200f53b (took 33 ms)
[main] [DEBUG] io.netty.buffer.ByteBufUtil - -Dio.netty.allocator.type: pooled
[main] [DEBUG] io.netty.buffer.ByteBufUtil - -Dio.netty.threadLocalDirectBufferSize: 65536
[main] [DEBUG] io.netty.buffer.ByteBufUtil - -Dio.netty.maxThreadLocalCharBufferSize: 16384
[nioEventLoopGroup-2-1] [DEBUG] io.netty.buffer.AbstractByteBuf - -Dio.netty.buffer.bytebuf.checkAccessible: true
[nioEventLoopGroup-2-1] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-1] [DEBUG] io.netty.util.Recycler - -Dio.netty.recycler.maxCapacity.maxCapacity: 262144
[nioEventLoopGroup-2-1] [DEBUG] mousio.client.retry.RetryPolicy - Retry 0 to send command (1)
[nioEventLoopGroup-2-1] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-2] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-2] [DEBUG] mousio.client.retry.RetryPolicy - Retry 1 to send command (1)
[nioEventLoopGroup-2-2] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-2] [DEBUG] mousio.etcd4j.transport.EtcdResponseHandler - Received 200 for GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-3] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-3] [DEBUG] mousio.client.retry.RetryPolicy - Retry 2 to send command (1)
[nioEventLoopGroup-2-3] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-4] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-4] [DEBUG] mousio.client.retry.RetryPolicy - Retry 3 to send command (1)
[nioEventLoopGroup-2-4] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-4] [DEBUG] mousio.etcd4j.transport.EtcdResponseHandler - Received 200 for GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-5] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-5] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-5] [DEBUG] mousio.etcd4j.transport.EtcdResponseHandler - Received 200 for GET /v2/keys/etcd4j_test/test
[nioEventLoopGroup-2-6] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connected to /127.0.0.1:4001 (0)
[nioEventLoopGroup-2-6] [DEBUG] mousio.etcd4j.transport.EtcdResponseHandler - Received 404 for DELETE /v2/keys/etcd4j_test
[main] [INFO] mousio.etcd4j.transport.EtcdNettyClient - Shutting down Etcd4j Netty client
[nioEventLoopGroup-2-6] [DEBUG] mousio.etcd4j.transport.EtcdNettyClient - Connection closed for request DELETE /v2/keys/etcd4j_test

Process finished with exit code 0

So it looks like the reconnect is performed by the RetryPolicy. Does that make sense, or should the retry be performed only in case of errors/disconnections and not when a timeout occurs? Like:

  @Override
  public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception  {
    if (cause instanceof ReadTimeoutException) {
      this.promise.setFailure(cause);
    }
  }
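
The question of "retry on errors but not on timeouts" can be sketched as a small classification helper. This is only an illustration under stated assumptions: `RetryDecision` and `shouldRetry` are hypothetical names, and `java.net.SocketTimeoutException` stands in for Netty's `ReadTimeoutException` so that the example needs nothing beyond the JDK.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical helper; not part of etcd4j or Netty.
final class RetryDecision {
    private RetryDecision() {
    }

    // Decides whether a failure should be handed to the retry policy.
    // Assumption: SocketTimeoutException plays the role that
    // ReadTimeoutException would play in a Netty handler.
    static boolean shouldRetry(Throwable cause) {
        // The timeout check must come first, because
        // SocketTimeoutException is itself an IOException.
        if (cause instanceof SocketTimeoutException) {
            return false; // a wait that timed out: fail the promise instead
        }
        // Connection refused, connection reset and similar I/O failures.
        return cause instanceof IOException;
    }
}
```

With a helper like this, a handler could fail the promise directly when `shouldRetry` returns false and only schedule a reconnect otherwise.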

ctrimble avatar ctrimble commented on September 18, 2024

@lburgazzoli I agree that the commit I pointed to is not the way to solve this. I only pointed to it so you could see the lines involved.

Note: the logs here are against 2.7.0, but I am also seeing this behavior in 2.9.0.

Let me see what I can do about working up a test case. I am currently seeing the issue in this test, but I should be able to strip this down into something that specifically deals with etcd4j. From my logs it looks like file handles are leaking somewhere, preventing new sockets from being created. Not sure if this is related to etcd4j or not.

io.netty.channel.ChannelException: Unable to create Channel from class class io.netty.channel.socket.nio.NioSocketChannel
    at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:40) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:315) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:157) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:120) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at mousio.etcd4j.transport.EtcdNettyClient.connect(EtcdNettyClient.java:160) ~[etcd4j-2.7.0.jar:na]
    at mousio.etcd4j.transport.EtcdNettyClient$2.doRetry(EtcdNettyClient.java:114) ~[etcd4j-2.7.0.jar:na]
    at mousio.client.retry.RetryPolicy$1.run(RetryPolicy.java:55) ~[etcd4j-2.7.0.jar:na]
    at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581) ~[netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655) [netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367) [netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
Caused by: io.netty.channel.ChannelException: Failed to open a socket.
    at io.netty.channel.socket.nio.NioSocketChannel.newSocket(NioSocketChannel.java:62) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.channel.socket.nio.NioSocketChannel.<init>(NioSocketChannel.java:79) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.channel.socket.nio.NioSocketChannel.<init>(NioSocketChannel.java:72) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    at sun.reflect.GeneratedConstructorAccessor4.newInstance(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_66]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422) ~[na:1.8.0_66]
    at java.lang.Class.newInstance(Class.java:442) ~[na:1.8.0_66]
    at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:38) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    ... 11 common frames omitted
Caused by: java.net.SocketException: Too many open files in system
    at sun.nio.ch.Net.socket0(Native Method) ~[na:1.8.0_66]
    at sun.nio.ch.Net.socket(Net.java:411) ~[na:1.8.0_66]
    at sun.nio.ch.Net.socket(Net.java:404) ~[na:1.8.0_66]
    at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:105) ~[na:1.8.0_66]
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60) ~[na:1.8.0_66]
    at io.netty.channel.socket.nio.NioSocketChannel.newSocket(NioSocketChannel.java:60) ~[netty-transport-4.1.0.Beta5.jar:4.1.0.Beta5]
    ... 18 common frames omitted

After this happens, ConnectionState is corrupted.

java.lang.ArrayIndexOutOfBoundsException: 1
    at mousio.etcd4j.transport.EtcdNettyClient.connect(EtcdNettyClient.java:156) ~[etcd4j-2.7.0.jar:na]
    at mousio.etcd4j.transport.EtcdNettyClient$2.doRetry(EtcdNettyClient.java:114) ~[etcd4j-2.7.0.jar:na]
    at mousio.client.retry.RetryPolicy$1.run(RetryPolicy.java:55) ~[etcd4j-2.7.0.jar:na]
    at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581) ~[netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655) [netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367) [netty-common-4.1.0.Beta5.jar:4.1.0.Beta5]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
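
For the "Too many open files" error in the first trace above, one way to check whether descriptors are leaking is to sample the process's open file descriptor count before and after a batch of requests. The sketch below is a diagnostic aid only; `OpenFileProbe` is a hypothetical name, and it assumes a HotSpot/OpenJDK runtime on a Unix-like system, where the platform MXBean implements `com.sun.management.UnixOperatingSystemMXBean`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical diagnostic helper; not part of etcd4j.
final class OpenFileProbe {
    private OpenFileProbe() {
    }

    // Returns the number of file descriptors currently open in this JVM,
    // or -1 if the runtime does not expose that information
    // (for example on non-Unix platforms).
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1L;
    }
}
```

Logging this value periodically while the retry loop runs would show whether the count keeps growing, which would point at sockets that are opened on reconnect but never closed.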

lburgazzoli avatar lburgazzoli commented on September 18, 2024

Yes, I've observed that too. I'm working on a fix; it should be ready in the coming days.

lburgazzoli avatar lburgazzoli commented on September 18, 2024

Could you please try building etcd4j from my fork and test it?

ctrimble avatar ctrimble commented on September 18, 2024

Sure. I have some production deployments to do this morning and then I will check it out.

ctrimble avatar ctrimble commented on September 18, 2024

@lburgazzoli sorry for the delay. I have run your fork with dropwizard-etcd's integration suite and it looks like your changes will resolve the issue I was seeing.
