Code Monkey home page Code Monkey logo

Comments (9)

Jc2k avatar Jc2k commented on June 12, 2024 1

So it already is just making one call to subscribe after add_entities, so I don't know what is happening at all 😆

So here is what I was planning for IP:

  • Add IpPairing.async_start and IpPairing.async_stop.
  • In HA, async_setup is called as part of Connection.async_setup. And Connection.async_unload should call async_stop.
  • async_start should create a while True loop running in a new task via async_create_task. We should hold a private reference to it on the pairing.
  • async_stop should cancel and await the loop.
  • The loop needs to be resilient - so it should try: except Exception, log the exception and restart. Possibly with a delay.
  • The loop is in charge of pair-very and re-doing subscriptions and then notifying any waiters via an asyncio.Event() that the connection is up.
  • The loop should check self.description.addresses and self.description.port before starting a new connection in case the ip/port updated. These will be updated automatically by background processes.
  • There will be some mechanism for the background discovery to notify these have changed. That should either use an Event to notify the loop to do something, or just cancel the loop and restart it.
  • _ensure_connected then only ever has to wait on this _is_connected Event, rather than actually trying to reach out to the device.
  • With this model, subscribe should not ever call _ensure_connected. If we are currently connected, it can send subscribe immediately. If not it should not try and connect.
  • This in turn means that we can mostly not do I/O in the foreground at startup
  • The one remaining case is fetching the entity map when the cache is cold. _ensure_connected can be called and time out after 8s, then I believe we catch the exception and fail the config flow. HA will retry it for us later. It won't block startup.

We already do bits of this in IP. There is a retry loop. It does check for IP/Port changes (though not through the new mechanism as its not quite finished). Right now the pairing /does/ have a close. But actually starting up the connection is always on demand. And I think that creates a bit of a weird thing where you expect subscribe to start the connection and hence block.

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

It's a good question. It's been a while since i've looked at the startup path.

From what I remember (and a very quick look):

  • Right now in theory HA only needs to talk to the device if the characteristic cache is stale. Even if its stale, it can load the stale entities and then create any missing ones when the cache is eventually updated.
  • The only other I/O that happens at startup is setting up subscriptions, and that runs on its own asyncio task. It also sets up a poll to happen using async_track_time_interval. That shouldn't do i/o immediately.

So on the "happy" path I wouldn't expect any delay during HA's startup - the connect should happen in the background.

The case where there is no characteristic cache should be quite rare, in which case I think is reasonable.

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

The ip backend has a reconnect loop that basically tries forever in the background. Then I can wait for the connection from other asyncio tasks and the timeout doesn't stop it from trying to reconnect in the background task. It will also consult the zeroconf cache for updated connection details. This mostly works quite well, though there is a bug where the char cache is empty at start up I recently notice.

The implementation is messy, and thats largely driven by a poorly defined lifetime/ownership model. It was also added before the zeroconf component matured to the level its at today. You could probably get away with just letting zeroconf announce when it came back online.

Also right now polling is frequent enough that just letting HA wake the connection back up is fine. Though if we reduce polling for the sake of battery devices that might not be the case.

from aiohomekit.

roysjosh avatar roysjosh commented on June 12, 2024

I added a traceback print & here's what I'm seeing... it looks like subscribe is triggering a connect. The web interface has that message at the bottom "Home Assistant is starting, not everything will be available until it is finished."

File "/home/lambda/code/home-assistant/core/venv/bin/hass", line 33, in <module>
  sys.exit(load_entry_point('homeassistantconsole_scripts', 'hass')())
File "/home/lambda/code/home-assistant/core/homeassistant/__main__.py", line 191, in main
  exit_code = runner.run(runtime_conf)
File "/home/lambda/code/home-assistant/core/homeassistant/runner.py", line 119, in run
  return loop.run_until_complete(setup_and_run_hass(runtime_config))
File "/usr/lib64/python3.9/asyncio/base_events.py", line 629, in run_until_complete
  self.run_forever()
File "/usr/lib64/python3.9/asyncio/base_events.py", line 596, in run_forever
  self._run_once()
File "/usr/lib64/python3.9/asyncio/base_events.py", line 1890, in _run_once
  handle._run()
File "/usr/lib64/python3.9/asyncio/events.py", line 80, in _run
  self._context.run(self._callback, *self._args)
File "/home/lambda/code/home-assistant/core/homeassistant/components/homekit_controller/connection.py", line 380, in async_process_entity_map
  await self.pairing.subscribe(self.watchable_characteristics)
File "/home/lambda/code/home-assistant/core/venv/lib64/python3.9/site-packages/aiohomekit/controller/coap/pairing.py", line 107, in subscribe
  await self._ensure_connected()
File "/home/lambda/code/home-assistant/core/venv/lib64/python3.9/site-packages/aiohomekit/controller/coap/pairing.py", line 49, in _ensure_connected
  await self.connection.connect(self.pairing_data)
File "/home/lambda/code/home-assistant/core/venv/lib64/python3.9/site-packages/aiohomekit/controller/coap/connection.py", line 392, in connect
  logger.debug(traceback.format_stack())

It tries to connect 7 (!!! I thought it was 3) times before giving up & then the web interface finishes loading. If I change the pair-verify timeout to 40s, it takes 7 * 40s for the banner thing to disappear. I haven't figured out why yet.

2022-02-14 00:03:51 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:04:31 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:05:11 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:05:51 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:06:31 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:07:11 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:07:51 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2022-02-14 00:07:51 INFO (MainThread) [homeassistant.bootstrap] Home Assistant initialized in 281.33s

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

Eep. My memory and late night glances were not enough. So it is triggered byasync_process_entity_map here. Could you try editing it like this:


        self.add_entities()

-        if self.watchable_characteristics:
-            await self.pairing.subscribe(self.watchable_characteristics)
-            if not self.pairing.is_connected:
-                return

-        await self.async_update()
+       self.hass.async_create_task(self.async_update())

I think that will stop the I/O if you have an up to date cache.

I also wonder about not having subscribe do I/O unless it is already connected.

We could look to reduce the number of subscriptions. E.g. If the entities asking for a subscription didn't trigger right away. We could do a single batch call at the end of add_entities (if it was connected) otherwise send them as part of the connection process. Thats a bit more work, so something i'd probably put off if we can avoid slow starts for most users with the above.

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

Argh now, that will mean the backend doesn't know about the subscriptions at all though.

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

I think BLE will always be connect on demand, as those things really like to be disconnected. Theres like a 10-20s window where you can reuse the existing session. Subscribe there will be subscribe if connected, otherwise remember for when you connect.

from aiohomekit.

Jc2k avatar Jc2k commented on June 12, 2024

I imagine CoAP is somewhat similar to the existing IP backend, but as there is no "connection" in UDP its mostly s/connection/session/. The "reconnect loop" could be something that does one of those ping packets you talked about and re-pair-verifies if the ping fails?

from aiohomekit.

roysjosh avatar roysjosh commented on June 12, 2024

Found the source of the problem. I added the connection locks to the CoAP path because I was seeing overlapping connection attempts... the culprit is add_watchable_characteristics. It calls .subscribe 5x for each bulb and then async_process_entity_map calls .subscribe once. I think I'm seeing such long timeouts because the first 5x trip the lock, each in turn, and the last one from async_process_entity_map has to wait for them to finish.

from aiohomekit.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.