Code Monkey home page Code Monkey logo

Comments (7)

jchambers avatar jchambers commented on September 24, 2024

This does sound like a bug, and I will investigate the issue.


(via Google Translate):

这听起来确实像是一个错误,我会调查这个问题。

from pushy.

jchambers avatar jchambers commented on September 24, 2024

@learnmore8 do you have a full stack trace from a SslHandshakeTimeoutException that you can share?

您是否有可以分享的“SslHandshakeTimeoutException”的完整堆栈跟踪?

from pushy.

learnmore8 avatar learnmore8 commented on September 24, 2024

非常感谢您的回复和关注,以下是我的初始化代码和报错信息。

  1. 初始化设置了连接超时
ApnsClientBuilder apnsClientBuilder = new ApnsClientBuilder();
... 
apnsClientBuilder.setConcurrentConnections(8)
	.setEventLoopGroup(eventLoopGroup)
	.setConnectionTimeout(Duration.ofSeconds(10)) // 设置 10 秒超时
	.setIdlePingInterval(Duration.ofSeconds(5))
	.setFrameLogger(new Http2FrameLogger(LogLevel.DEBUG, "PUSHY"))
	.setMetricsListener(new MyApnsClientMetricsListener());
ApnsClient apnsClient = apnsClientBuilder.build();
  1. 请求消息推送
      PushNotificationFuture<SimpleApnsPushNotification, PushNotificationResponse<SimpleApnsPushNotification>>
	    sendNotificationFuture = apnsClient.sendNotification(pushNotification);
      try {
    		final PushNotificationResponse<SimpleApnsPushNotification> pushNotificationResponse = sendNotificationFuture.get(3, TimeUnit.SECONDS);

    		if (pushNotificationResponse.isAccepted()) {
    			log.info("push success. Push notification accepted by APNs gateway. token="+pushNotification.getToken());
    		} else {
    			log.error("push failed. Notification rejected by the APNs gateway: " + pushNotificationResponse.getRejectionReason() + ",token="+pushNotification.getToken());
    			pushNotificationResponse.getTokenInvalidationTimestamp().ifPresent(timestamp -> {
    				log.error("\t…and the token is invalid as of " + timestamp + ",token="+pushNotification.getToken());
    			});
    		}
    	} catch (final ExecutionException e) {
    		log.error("Failed to send push notification.",e);
    	} catch (InterruptedException | TimeoutException e) {
    		log.error("sendNotificationFuture.get() 中断或超时异常",e);
    	}
  1. 域名解析服务使用 59.51.78.211 , 解析超时的概率非常大

  2. 程序 3 秒超时报错堆栈

sendNotificationFuture.get() 中断或超时异常
java.util.concurrent.TimeoutException: null
	at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
  1. nginx 超时日志信息
api.push.apple.com could not be resolved (110: Operation timed out)
  1. 程序 10 秒超时的堆栈日志
io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
	at io.netty.handler.ssl.SslHandler$7.run(SslHandler.java:2125)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:745)
  1. 然后我是这样规避这个问题的,可能不完全正确,请指正
    com.eatthepath.pushy.apns.ApnsClientHandler#userEventTriggered
    public void userEventTriggered(final ChannelHandlerContext context, final Object event) throws Exception {
        // 这里是调整的代码
        if(event instanceof SslHandshakeCompletionEvent){
            Throwable cause = ((SslHandshakeCompletionEvent) event).cause();
            if(cause != null && cause instanceof SslHandshakeTimeoutException){
                log.error("SslHandshakeTimeoutException,channel:{}会被关闭:{}",context.channel(),cause.getMessage());
                context.voidPromise().tryFailure(cause);
            }
        }
       ....
    }

from pushy.

jchambers avatar jchambers commented on September 24, 2024

@learnmore8 Thank you for the additional details. This is helpful.

I'm confused, though. We assemble the pipeline here:

final ChannelPipeline pipeline = channel.pipeline();
clientConfiguration.getProxyHandlerFactory().ifPresent(proxyHandlerFactory ->
pipeline.addFirst(proxyHandlerFactory.createProxyHandler()));
pipeline.addLast(sslHandler);
pipeline.addLast(new FlushConsolidationHandler(FlushConsolidationHandler.DEFAULT_EXPLICIT_FLUSH_AFTER_FLUSHES, true));
pipeline.addLast(new IdleStateHandler(clientConfiguration.getIdlePingInterval().toMillis(), 0, 0, TimeUnit.MILLISECONDS));
pipeline.addLast(apnsClientHandler);

ApnsChannelHandler is the last handler in the pipeline, and it's already listening for exceptions. It will try to fail the "channel ready" promise if it catches any exception—including an SslHandshakeTimeoutException:

public void exceptionCaught(final ChannelHandlerContext context, final Throwable cause) {
// Always try to fail the "channel ready" promise if we catch an exception; in some cases, these may happen
// after a connection has already become ready, in which case the failure attempt will have no effect.
getChannelReadyPromise(context.channel()).tryFailure(cause);
}

…and regardless of how that promise resolves, ApnsChannelPool will remove the Promise from pendingCreateChannelFutures as soon as it's done:

createChannelFuture.addListener((GenericFutureListener<Future<Channel>>) future -> {
ApnsChannelPool.this.pendingCreateChannelFutures.remove(createChannelFuture);

So, considering all of that, it's not clear to me how we could be using all of the capacity in pendingCreateChannelFutures. Returning to your original question:

如果 Pushy 收到 SslHandshakeTimeoutException 是否应该把这个已经关闭的 channel 从 pendingCreateChannelFutures 移除呢
(translation: If Pushy receives a SslHandshakeTimeoutException, should the closed channel be removed from pendingCreateChannelFutures?)

…I think that should already be happening. Can you please say more about how you know that pendingCreateChannelFutures is filling up?

Also, what version of Pushy are you using?


感谢您提供更多详细信息。 这很有帮助。

不过我很困惑。 我们在这里组装管道:

final ChannelPipeline pipeline = channel.pipeline();
clientConfiguration.getProxyHandlerFactory().ifPresent(proxyHandlerFactory ->
pipeline.addFirst(proxyHandlerFactory.createProxyHandler()));
pipeline.addLast(sslHandler);
pipeline.addLast(new FlushConsolidationHandler(FlushConsolidationHandler.DEFAULT_EXPLICIT_FLUSH_AFTER_FLUSHES, true));
pipeline.addLast(new IdleStateHandler(clientConfiguration.getIdlePingInterval().toMillis(), 0, 0, TimeUnit.MILLISECONDS));
pipeline.addLast(apnsClientHandler);

ApnsChannelHandler 是管道中的最后一个处理程序,它已经在侦听异常。 如果捕获到任何异常(包括“SslHandshakeTimeoutException”),它将尝试使“通道就绪”承诺失败:

public void exceptionCaught(final ChannelHandlerContext context, final Throwable cause) {
// Always try to fail the "channel ready" promise if we catch an exception; in some cases, these may happen
// after a connection has already become ready, in which case the failure attempt will have no effect.
getChannelReadyPromise(context.channel()).tryFailure(cause);
}

...并且无论该承诺如何解决,“ApnsChannelPool”一旦完成都会从“pendingCreateChannelFutures”中删除“Promise”:

createChannelFuture.addListener((GenericFutureListener<Future<Channel>>) future -> {
ApnsChannelPool.this.pendingCreateChannelFutures.remove(createChannelFuture);

因此,考虑到所有这些,我不清楚我们如何使用“pendingCreateChannelFutures”中的所有容量。 回到你原来的问题:

如果 Pushy 收到 SslHandshakeTimeoutException 是否应该把这个已经关闭的 channel 从 pendingCreateChannelFutures 移除呢?

…我认为这应该已经发生了。 您能否详细说明一下您如何知道“pendingCreateChannelFutures”已填满?

另外,您使用的是哪个版本的 Pushy?

from pushy.

jchambers avatar jchambers commented on September 24, 2024

Re-reading your example:

public void userEventTriggered(final ChannelHandlerContext context, final Object event) throws Exception {
    // 这里是调整的代码
    if(event instanceof SslHandshakeCompletionEvent){
        Throwable cause = ((SslHandshakeCompletionEvent) event).cause();
        if(cause != null && cause instanceof SslHandshakeTimeoutException){
            log.error("SslHandshakeTimeoutException,channel:{}会被关闭:{}",context.channel(),cause.getMessage());
            context.voidPromise().tryFailure(cause);
        }
    }
   ....
}

…I think your point may be that this specific handshake failure is being reported as a user event rather than thrown as an exception. That strikes me as strange, but I will review Netty's SslHandler code to see if I can better understand what's happening in this case.


重新阅读你的例子:

public void userEventTriggered(final ChannelHandlerContext context, final Object event) throws Exception {
    // 这里是调整的代码
    if(event instanceof SslHandshakeCompletionEvent){
        Throwable cause = ((SslHandshakeCompletionEvent) event).cause();
        if(cause != null && cause instanceof SslHandshakeTimeoutException){
            log.error("SslHandshakeTimeoutException,channel:{}会被关闭:{}",context.channel(),cause.getMessage());
            context.voidPromise().tryFailure(cause);
        }
    }
   ....
}

…我认为您的观点可能是,这种特定的握手失败被报告为用户事件,而不是作为异常抛出。 这让我觉得很奇怪,但我将回顾 Netty 的“SslHandler”代码,看看我是否可以更好地理解这种情况下发生的情况。

from pushy.

learnmore8 avatar learnmore8 commented on September 24, 2024

触发 SslHandshakeTimeoutException 的代码如下 io.netty.handler.ssl.SslHandler#applyHandshakeTimeout

private void applyHandshakeTimeout() {
        final Promise<Channel> localHandshakePromise = this.handshakePromise;

        // Set timeout if necessary.
        final long handshakeTimeoutMillis = this.handshakeTimeoutMillis;
        if (handshakeTimeoutMillis <= 0 || localHandshakePromise.isDone()) {
            return;
        }

        final Future<?> timeoutFuture = ctx.executor().schedule(new Runnable() {
            @Override
            public void run() {
                if (localHandshakePromise.isDone()) {
                    return;
                }
                SSLException exception =
                        new SslHandshakeTimeoutException("handshake timed out after " + handshakeTimeoutMillis + "ms");
                try {
                    if (localHandshakePromise.tryFailure(exception)) {
                        SslUtils.handleHandshakeFailure(ctx, exception, true);
                    }
                } finally {
                    releaseAndFailAll(ctx, exception);
                }
            }
        }, handshakeTimeoutMillis, TimeUnit.MILLISECONDS);

        // Cancel the handshake timeout when handshake is finished.
        localHandshakePromise.addListener(new FutureListener<Channel>() {
            @Override
            public void operationComplete(Future<Channel> f) throws Exception {
                timeoutFuture.cancel(false);
            }
        });
    }

但上面这个异常并不会触发 com.eatthepath.pushy.apns.ApnsClientHandler#exceptionCaught

 public void exceptionCaught(final ChannelHandlerContext context, final Throwable cause) {
        // Always try to fail the "channel ready" promise if we catch an exception; in some cases, these may happen
        // after a connection has already become ready, in which case the failure attempt will have no effect.
        getChannelReadyPromise(context.channel()).tryFailure(cause);
    }

我使用的版本是:pushy 0.14.1 和 netty 4.1.92.Final

谢谢作者的回复,经过您的指导,我调整了修复方式,可以直接使用如下代码

getChannelReadyPromise(context.channel()).tryFailure(cause);

from pushy.

jchambers avatar jchambers commented on September 24, 2024

@learnmore8 I apologize for the delay in getting back to this.

I've written a test to try to demonstrate the behavior you've described, and I'm still not able to reproduce the specific result where connection attempts get "stuck" in pendingCreateChannelFutures. Please find a test case in https://github.com/jchambers/pushy/compare/tls_handshake_timeout.

What I'm observing is:

  • You are correct that handshake timeouts do not reach com.eatthepath.pushy.apns.ApnsClientHandler#exceptionCaught
  • You are also correct that an appropriate event does reach userEventTriggered
  • …but even with both of those things being true, the channel still closes and is removed from pendingCreateChannelFutures

That left me confused at first, but then you pointed out that you're using 0.14.1. I believe the issue you're describing is a specific symptom of #853, which was fixed in #854. I believe this issue is a duplicate, and you can solve the problem by updating to the latest version of Pushy.

With that in mind, I'm going to close this issue for now. If updating to the latest version does not solve your problem, please leave a comment here and I'll happily reopen the issue.


@learnmore8 对于延迟回到这个话题我深表歉意。

我已经编写了一个测试来尝试演示您所描述的行为,但我仍然无法重现连接尝试在“pendingCreateChannelFutures”中“卡住”的具体结果。 请在 https://github.com/jchambers/pushy/compare/tls_handshake_timeout 中找到测试用例。

我观察到的是:

  • 你是正确的,握手超时确实_not_达到 com.eatthepath.pushy.apns.ApnsClientHandler#exceptionCaught
  • 您也正确地认为适当的事件确实到达了“userEventTriggered”
  • ...但即使这两件事都是真的,通道仍然关闭并从“pendingCreateChannelFutures”中删除

一开始这让我很困惑,但后来你指出你正在使用 0.14.1。 我相信您描述的问题是 #853 的特定症状,已在 #854 中修复。 我相信这个问题是重复的,您可以通过更新到最新版本的 Pushy 来解决该问题。

考虑到这一点,我现在将关闭这个问题。 如果更新到最新版本_不能_解决您的问题,请在此处发表评论,我很乐意重新打开该问题。

from pushy.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.