Sometimes the load balancer receives the following JSON error response from an execution node.
{ [Error: connect ECONNREFUSED 127.0.0.1:8082]
2016-11-28T12:59:33.875721134Z code: 'ECONNREFUSED',
2016-11-28T12:59:33.875724234Z errno: 'ECONNREFUSED',
2016-11-28T12:59:33.875726634Z syscall: 'connect',
2016-11-28T12:59:33.875729034Z address: '127.0.0.1',
2016-11-28T12:59:33.875731434Z port: 8082 }
Even after receiving the above error, the load balancer keeps the execution node on port 8082 in its list of available execution nodes. This needs to be changed.
Looking closely at the error, we can deduce that the load balancer is making an incorrect socket request to localhost on port 8082, whereas the execution node is configured at 10.0.0.5:8082. Oddly enough, this error only ever occurs for port 8082.
The error also does not depend on the node's position in the list. I tried the following load balancer configuration.
"Nodes": [
{
"hostname": "10.0.0.5",
"port": "8084"
},
{
"hostname": "10.0.0.5",
"port": "8083"
},
{
"hostname": "10.0.0.5",
"port": "8082"
}
]
Note that the port 8082 is the last one in the list. Still the error occurs only with the socket 10.0.0.5:8082; other sockets at ports 8083 and 8084 work fine.
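One possible explanation, offered only as a hypothesis: in Node.js, `net.connect` and `http.request` fall back to `localhost` when no host is supplied, so a node entry whose `hostname` gets lost somewhere between the configuration and the request would produce exactly `connect ECONNREFUSED 127.0.0.1:8082`. Below is a minimal sketch of a guard, assuming node entries shaped like the configuration above. `toRequestOptions` is a hypothetical helper, not a function of the load balancer.

```javascript
// Hypothetical helper: build request options from one "Nodes" entry and
// fail loudly instead of letting Node.js default the host to localhost.
function toRequestOptions(node) {
  if (!node || typeof node.hostname !== 'string' || node.hostname.trim() === '') {
    throw new Error('execution node entry has no hostname: ' + JSON.stringify(node));
  }
  const port = Number(node.port); // the configuration stores ports as strings
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('execution node entry has an invalid port: ' + JSON.stringify(node));
  }
  return { host: node.hostname, port: port };
}
```

Passing the returned object to the request call would make a missing hostname surface as an explicit error rather than as a refused connection on 127.0.0.1.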
If, for any genuine reason, an execution node is down or refusing evaluation requests, the load balancer must recognize this and respond. A preferred response is to move the execution node that is sending connection refused messages to the down list, with its status set to connection refused. This status can also be shown clearly on the appurl:9000/status page.
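The behaviour requested above could look roughly like the following sketch. It assumes the load balancer can call into a small in-memory registry from its request error handler. `NodeRegistry` and its method names are hypothetical and are not taken from the existing code. Only the `ECONNREFUSED` error code comes from the log above.

```javascript
// Hypothetical in-memory registry of execution nodes; all names are illustrative.
class NodeRegistry {
  constructor(nodes) {
    // Key each node by "hostname:port" and start it as available.
    this.nodes = new Map();
    for (const n of nodes) {
      this.nodes.set(n.hostname + ':' + n.port, { state: 'up', status: 'ok' });
    }
  }

  // Call from the request's 'error' handler with the Node.js error object.
  reportError(hostname, port, err) {
    const entry = this.nodes.get(hostname + ':' + port);
    if (!entry) return;
    entry.state = 'down';
    entry.status = err && err.code === 'ECONNREFUSED' ? 'connection refused' : 'error';
  }

  // Call after a successful response so a recovered node becomes available again.
  reportSuccess(hostname, port) {
    const entry = this.nodes.get(hostname + ':' + port);
    if (!entry) return;
    entry.state = 'up';
    entry.status = 'ok';
  }

  // Keys of the nodes that may receive evaluation requests.
  available() {
    return [...this.nodes].filter(([, e]) => e.state === 'up').map(([key]) => key);
  }

  // Plain data that a status page could render.
  statusReport() {
    return [...this.nodes].map(([key, e]) => ({ node: key, state: e.state, status: e.status }));
  }
}
```

With the three-node configuration above, calling `reportError('10.0.0.5', '8082', err)` for an `ECONNREFUSED` error would leave only the nodes on ports 8084 and 8083 in `available()`, and `statusReport()` would list 10.0.0.5:8082 as down with the status connection refused.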