
Comments (7)

pouwelsjochem commented on June 4, 2024

Interesting @clang-clang-clang, is there a reason why you've used coroutines for the example? Wouldn't the ANR trigger otherwise? That would explain why not everyone is seeing the high ANR rate.

I'm not quite sure this is the right solution though. With my limited knowledge of locking in Java, simply removing the wait in surfaceDestroyed() could result in possible crashes in other methods which think everything is correctly destroyed, while the "destroy" is not actually finished yet.
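
To illustrate the concern, one middle ground between waiting indefinitely and not waiting at all is a bounded wait: the caller still gives the render thread a chance to confirm that it released the surface, but gives up after a deadline. The following is only a sketch under assumptions. `BoundedSurfaceWait`, `markSurfaceReleased`, and `awaitSurfaceReleased` are hypothetical names, and this is not the code from GLSurfaceView.java or from either patch discussed here.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for surface state guarded by a shared monitor.
final class BoundedSurfaceWait {
    private final Object lock = new Object();
    // Set by the render thread once it has let go of the surface.
    private boolean surfaceReleased = false;

    // Called by the (simulated) render thread.
    void markSurfaceReleased() {
        synchronized (lock) {
            surfaceReleased = true;
            lock.notifyAll();
        }
    }

    // Waits at most timeoutMillis for the release to be confirmed.
    // Returns true if confirmed, false on timeout or interruption.
    boolean awaitSurfaceReleased(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (!surfaceReleased) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // deadline passed; caller decides what to do next
                }
                try {
                    // Timed wait on the monitor; the loop guards against spurious wakeups.
                    TimeUnit.NANOSECONDS.timedWait(lock, remainingNanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

A `false` return means the release was never confirmed, so any code that runs afterwards would still have to cope with a surface that is not fully torn down. That is the same risk described above, just made explicit.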

I did spend some of last week on the ANR issue as well, and have actually come up with a different solution by looking at the libgdx library, which also uses Android's GLSurfaceView.java. They had similar issues in the onPause state and avoid them by using two different methods.

  1. Killing the app when onPause takes longer than 4000ms (ANR triggers at 5000ms)
  2. Using an optimised updateRuntimeState() where not everything is locked/synchronized all of the time, just the moments when controller is actually used.
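
As a rough illustration of the first method, the sketch below arms a timer before the pause work starts and runs a fallback if the work overruns its budget. `PauseWatchdog`, `runGuarded`, and the `onTimeout` callback are hypothetical names, not taken from libgdx or Solar2D. On Android the callback would presumably kill the process, but that is an assumption here; this plain-Java version just runs whatever Runnable it is given.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper (not from libgdx or Solar2D): runs a fallback when
// pause work overruns its time budget.
final class PauseWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "pause-watchdog");
                t.setDaemon(true); // the watchdog must never keep the process alive
                return t;
            });
    private final long budgetMillis;
    private final Runnable onTimeout;

    PauseWatchdog(long budgetMillis, Runnable onTimeout) {
        this.budgetMillis = budgetMillis;
        this.onTimeout = onTimeout;
    }

    // Runs pauseWork on the calling thread. If it is still running after
    // budgetMillis, onTimeout fires on the watchdog thread.
    // Returns true when the work finished before the watchdog fired.
    boolean runGuarded(Runnable pauseWork) {
        AtomicBoolean fired = new AtomicBoolean(false);
        ScheduledFuture<?> pending = scheduler.schedule(() -> {
            fired.set(true);
            onTimeout.run();
        }, budgetMillis, TimeUnit.MILLISECONDS);
        try {
            pauseWork.run();
        } finally {
            pending.cancel(false); // disarm the watchdog if it has not fired yet
        }
        return !fired.get();
    }
}
```

With the numbers from the list above, this would be constructed with a 4000 ms budget, which leaves about a second before the 5000 ms ANR threshold.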

At the start of last weekend I integrated these solutions into my own fork of Solar2D (https://github.com/pouwelsjochem/solar2d/blob/40f0134d8ee7447373d0900257909f36fd863ba3/platform/android/sdk/src/com/ansca/corona/Controller.java#L222) and slowly rolled that out to my users. The ANR rate of this latest build decreased by roughly 90%, from ~1% to ~0.1% (over 14K sessions in the last 24 hours). I believe Vlad is going to incorporate my changes into the main repo soon.

from corona.

clang-clang-clang commented on June 4, 2024

I have the demo, steps, and screen recording to reproduce this ANR. @Shchvova

Use the Lua code below, or a modified Fishies sample, to reproduce the Controller.stop() ANR with Solar2D 3699:

  --[[
      This is a demonstration for reproducing an ANR issue with
      Android Controller.stop()/start().
      Triggering the Android lifecycle while Lua is processing heavy
      work (as shown below) can replicate the problem.
  ]]

  local sum = 0
  local co = coroutine.create(function ()
      for i = 1, 100000 do
          sum = sum + i

          -- Busy-wait for 0.1 s of CPU time to simulate heavy work.
          local start_time = os.clock()
          while os.clock() - start_time < 0.1 do
          end

          coroutine.yield(sum)
      end
  end)

  -- Every 10 ms, resume the coroutine up to 100 times in a row.
  timer.performWithDelay(10, function()
      for i = 1, 100 do
          if co and coroutine.status(co) ~= "dead" then
              coroutine.resume(co)
          end
      end
  end, 0)

The steps are as follows (timestamps refer to the screen recording):

  1. 00:00:01 Start by testing the app as usual.
  2. 00:00:13 Enter the game scenes. Due to heavy computation, there is only one frame update at 22 and 32 seconds, which is not significant.
  3. 00:00:33 Exit or kill the app.
  4. 00:00:35 Manually restart the app from the desktop. Android Studio's first re-run won't reproduce this ANR.
  5. 00:00:37 Lock the screen by pressing the power button before fully entering the game scenes. Wait for at least 5 seconds.
  6. 00:00:44 Unlock the screen.
  7. 00:00:45 When the ANR pop-up appears, use adb bugreport to retrieve the report from the phone. This reproduces the Controller.stop() ANR.
  8. 00:00:50 Exit or kill the app.
  9. 00:01:00 Manually restart the app from the desktop.
  10. 00:01:01 Exit to the desktop before fully entering the game scenes.
  11. 00:01:04 Re-enter the app.
  12. 00:01:06 Lock the screen by pressing the power button before fully entering the game scenes. Wait for at least 5 seconds.
  13. 00:01:16 Check whether the ANR pop-up appears; it may not, as the app may have been killed by the system. This reproduces the Controller.start() ANR (sometimes together with the Controller.stop() ANR).

The screen recording:

output.mp4


clang-clang-clang commented on June 4, 2024

I tested 3699 with a simple patch (either a draft or proof of concept) and the ANR disappeared. However, I'm not familiar with the entire locking process, so I'm not sure about potential effects.

platform/android/sdk/src/com/ansca/corona/graphics/opengl/GLSurfaceView.java

        public void surfaceDestroyed() {
            synchronized(sGLThreadManager) {
                if (LOG_THREADS) {
                    Log.i("GLThread", "surfaceDestroyed tid=" + getId());
                }
                mHasSurface = false;
                sGLThreadManager.notifyAll();
-                while((!mWaitingForSurface) && (!mExited)) {
-                    try {
-                        sGLThreadManager.wait();
-                    } catch (InterruptedException e) {
-                        Thread.currentThread().interrupt();
-                    }
-                }
+//                while((!mWaitingForSurface) && (!mExited)) {
+//                    try {
+//                        sGLThreadManager.wait();
+//                    } catch (InterruptedException e) {
+//                        Thread.currentThread().interrupt();
+//                    }
+//                }
            }
        }

platform/android/sdk/src/com/ansca/corona/CoronaActivity.java

    private void requestResumeCoronaRuntime() {
...
		// Start/resume the Corona runtime.
-		fController.start();
+		new Thread(fController::start).start();

...
    private void requestSuspendCoronaRuntime() {
		// Suspend the Corona runtime.
		if (fController != null) {
-			fController.stop();
+			new Thread(fController::stop).start();
		}
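
One thing the proof of concept above does not guarantee is ordering: each call gets its own thread, so a stop() and a following start() may run in either order. A minimal sketch of one way to keep such calls off the main thread while preserving their order is a single worker thread. `LifecycleDispatcher`, `post`, and `drain` are hypothetical names used for illustration only; they are not part of CoronaActivity or Controller, and this has not been tested against Solar2D.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs lifecycle callbacks off the calling thread,
// one at a time, in the order they were posted.
final class LifecycleDispatcher {
    private final ExecutorService worker =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "lifecycle-worker");
                t.setDaemon(true); // do not keep the process alive for queued work
                return t;
            });

    // Queues a task; returns immediately so the caller is never blocked.
    void post(Runnable task) {
        worker.execute(task);
    }

    // Stops accepting tasks and waits up to timeoutMillis for queued ones.
    // Returns true if everything finished in time.
    boolean drain(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Under these assumptions, the two patched call sites would become something like `dispatcher.post(fController::stop)` and `dispatcher.post(fController::start)`. This only addresses ordering; it does not answer the locking questions around sGLThreadManager.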

I hope this information is helpful. Thank you.


vegasfun commented on June 4, 2024

Similar issue:
ANR-img-1
ANR-img-2


pouwelsjochem commented on June 4, 2024

Not sure if it's 100% related, but definitely could be: https://discuss.cocos2d-x.org/t/is-there-any-solution-of-anr/43080.


pouwelsjochem commented on June 4, 2024

Another interesting topic: https://issuetracker.google.com/issues/263307511. It's from a native Android app, but a bit further down there is an example project and a video showing what happens when the ANR pops up.


clang-clang-clang commented on June 4, 2024

Great news. Glad to hear there's a solution.

Since I can't reproduce the production situation, I can only approximate the ANR by simulating heavy computational load.
Just keeping the GLThread busy and triggering the lifecycle can cause an ANR.

As for the 'patch' (it's a draft or proof of concept), it isn't production-ready yet. I suspect that the sGLThreadManager also has a locking issue.

