Self-explanatory. The REST API would provide a simplified PT_SERVERINFO/PT_PLAYERINFO list which would be loaded by some web dashboard for public viewing. It should also show the addon hash for the room, as described in #1.
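A minimal sketch of what the dashboard payload might look like. The field names here are assumptions, not the actual PT_SERVERINFO/PT_PLAYERINFO layout; the point is that the API returns a simplified, public-safe summary rather than the raw packets:

```python
# Hypothetical shape of one room's entry in the REST API response.
# Field names are placeholders; the real packet fields may differ.

def room_summary(name, addon_hash, players, max_players):
    """Build a simplified, public-safe summary of one room."""
    return {
        "name": name,
        "addon_hash": addon_hash,  # as described in #1
        "player_count": len(players),
        "max_players": max_players,
        # Expose names only -- no addresses or other packet internals.
        "players": [p["name"] for p in players],
    }

summary = room_summary(
    "room-1", "deadbeef",
    [{"name": "Player 1"}, {"name": "Player 2"}], 16,
)
```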
The official setup method for the lobby server should be a Docker image, with volumes configured for addons, application configuration, and the Docker socket. This will minimize potential environment-related errors and make writing a setup guide much easier. Accordingly, configuration options for these things shouldn't need to be documented for server operators.
This will also mean that executable releases won't need to be published; the GitHub container registry will be our release method.
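A rough compose sketch of what that setup might look like. The image path, service name, and volume locations are all placeholders, not the final layout:

```yaml
# Hypothetical docker-compose sketch -- names and paths are assumptions.
services:
  kartlobby:
    image: ghcr.io/OWNER/kartlobby:latest  # pulled from the GitHub container registry
    ports:
      - "5029:5029/udp"
    volumes:
      - ./addons:/addons                            # game addons
      - ./config.yml:/etc/kartlobby/config.yml      # application configuration
      - /var/run/docker.sock:/var/run/docker.sock   # lets the lobby manage game containers
```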
There should be some way of starting a room that requires a passcode, so that passcode rooms can be hosted on the same lobby server as non-passcode rooms. It's not clear how this would work with multiple rooms sharing the same passcode, but perhaps rooms with the same passcode can simply be load-balanced among themselves, just like non-passcode rooms.
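A minimal sketch of that routing idea, assuming each room records the passcode it was started with (`None` for public rooms). Rooms sharing a passcode get the same least-full selection as the public pool:

```python
def pick_room(rooms, passcode=None):
    """Return the least-full joinable room matching the passcode.

    Rooms sharing a passcode are load-balanced exactly like the
    non-passcode pool: pick the candidate with the fewest players.
    """
    candidates = [r for r in rooms
                  if r["passcode"] == passcode and r["players"] < r["max"]]
    if not candidates:
        return None  # no joinable room; caller would spin up a new one
    return min(candidates, key=lambda r: r["players"])

rooms = [
    {"name": "pub-1",  "passcode": None,   "players": 3, "max": 8},
    {"name": "priv-1", "passcode": "kart", "players": 7, "max": 8},
    {"name": "priv-2", "passcode": "kart", "players": 2, "max": 8},
]
```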
Since the server info packets don't include waiting player counts, we need to balance players assigned to full rooms when our instances are maxed out, rather than just rejecting waiting players. Ideally we'd have a hook or an installed Lua script to give us this information, but failing that, we can spread waiters evenly across our open containers.
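The fallback policy can be sketched like this: since the game server won't tell us its queue lengths, the lobby tracks how many waiters it has assigned per container itself and always assigns the next waiter to the least-loaded one:

```python
from collections import Counter

def assign_waiter(assigned: Counter, containers: list) -> str:
    """Pick the open container with the fewest waiters assigned so far."""
    target = min(containers, key=lambda c: assigned[c])
    assigned[target] += 1
    return target

assigned = Counter()
containers = ["room-a", "room-b", "room-c"]
# Six waiters arrive while every room is full; they spread evenly.
order = [assign_waiter(assigned, containers) for _ in range(6)]
```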
It would be great if new instances being created took players from the waiting queues of other instances, to more effectively churn players through games.
If we create a proxy UDP server for every single player, and we have a port binding for every single room, the system will run out of ports after a few tens of thousands of players. We should reuse proxy connections between rooms, while ensuring that no proxy is used more than once for the same room.
The ban list should be shared across instances using some sort of distributed store, like TiDB. Some sort of hook will be needed to sync each server's ban list into the database and to read bans back out of it.
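A sketch of that sync hook, with a plain dict standing in for the shared store (TiDB in practice); the function names and "banned anywhere means banned everywhere" policy are assumptions:

```python
def push_local_bans(store: dict, instance: str, local_bans: set):
    """Merge one instance's local ban list into the shared store."""
    for addr in local_bans:
        store.setdefault(addr, set()).add(instance)

def pull_shared_bans(store: dict) -> set:
    """Every address banned on any instance is banned on all of them."""
    return set(store)

store = {}
push_local_bans(store, "room-a", {"1.2.3.4"})
push_local_bans(store, "room-b", {"5.6.7.8"})
banned = pull_shared_bans(store)
```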
In order to monitor containers better, there should be a command-line tool that shows the name of each running container and the players in each of those containers. The planned workflow would look something like:

> kartlobby ps
CONTAINER        PLAYERS
some_container   Player 1  Player 2  Player 3  Player 4
                 Player 5
other_container  Player 6  Player 7  Player 8  Player 9

> docker logs -f some_container
logs...
When a game server doesn't send any data to a proxy for a few seconds, the proxy should be closed to avoid eventual port exhaustion. This is more urgent than #10, since there is a very real possibility that an instance might not close for a very long time, leading to unclosed proxies exhausting the lobby server's ports.
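The reaping logic could look something like this: record the last time each proxy received data from its game server, and close any proxy that has been quiet for too long. The timeout value and names here are guesses:

```python
import time

IDLE_TIMEOUT = 5.0  # seconds; tuning value is an assumption

class IdleReaper:
    def __init__(self):
        self.last_seen = {}  # proxy id -> last time data arrived

    def on_data(self, proxy_id, now=None):
        """Called whenever the game server sends data through a proxy."""
        self.last_seen[proxy_id] = now if now is not None else time.monotonic()

    def reap(self, now=None):
        """Close (here: forget) every proxy idle longer than the timeout."""
        now = now if now is not None else time.monotonic()
        dead = [p for p, t in self.last_seen.items() if now - t > IDLE_TIMEOUT]
        for p in dead:
            del self.last_seen[p]  # real code would also close the socket
        return dead

reaper = IdleReaper()
reaper.on_data("proxy-1", now=0.0)
reaper.on_data("proxy-2", now=4.0)
dead = reaper.reap(now=6.0)  # proxy-1 has been idle 6s, proxy-2 only 2s
```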
There should be a config file or similar that lets operators point the lobby server at Docker images they've built themselves, for example if they're running modified binaries. This should be kept in mind before committing to hooks; Lua is probably the way to go for compatibility.
I should make some diagrams showing how different parts of the system work. It gives me a headache just to work on, so explaining it to other people is bound to be a mess if I don't have some decent documentation to remind myself of what I did. The comments are nice, but they only explain functions, not the whole system.
Since we're creating new application instances for each room, we should be able to change the addons in use and have the change take effect for every new instance, without affecting existing ones. Those instances will eventually die off until only instances with the newer addon list are left. This will need some flag or file-list hash to close off existing instances to new joiners.
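The file-list hash idea can be sketched as follows: each instance records the hash of the addon list it was started with, and an instance whose hash no longer matches the current list stops accepting new joiners and drains naturally. Function names are hypothetical:

```python
import hashlib

def addon_hash(addon_files):
    """Stable hash over the sorted addon file names (order-independent)."""
    h = hashlib.sha256()
    for name in sorted(addon_files):
        h.update(name.encode() + b"\0")  # delimiter avoids concat collisions
    return h.hexdigest()

def accepts_new_joiners(instance_hash, current_files):
    """An instance only takes new players while its addon list is current."""
    return instance_hash == addon_hash(current_files)

old = addon_hash(["map1.pk3", "chars.pk3"])
```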
If the lobby server goes down for some reason, connections should not be dropped when it restarts. The game instances behind the lobby server will still be running, so no game state will be lost. All that's needed to restore connections is to persist the connection map into a database of some kind and reload it on startup.
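A minimal sketch of that persistence, assuming the connection map is just client address to game-container address; JSON stands in here for whatever database ends up being used:

```python
import json

def save_map(conn_map: dict) -> str:
    """Serialize the connection map; in production this goes to a database."""
    return json.dumps(conn_map)

def restore_map(blob: str) -> dict:
    """On startup, rebuild the proxy routes from the persisted map."""
    return json.loads(blob)

conn_map = {"203.0.113.9:41234": "172.17.0.2:5029"}
blob = save_map(conn_map)
restored = restore_map(blob)
```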