Comments (12)
Our aim is next week.
from nats-server.
@JohnTseng1012 Glad to know that the issue was resolved once you set the advertise option and did a rolling restart. I am closing this issue now.
from nats-server.
@JohnTseng1012 How do you specify the listen specification in the gateway{} block?
from nats-server.
If you don't want to use advertise, you should set the listen config to the public address: listen: "10.xxx...."
from nats-server.
Thank you for your suggestions.
Additionally, will the "no_advertise" feature be provided in the gateway in the future? Another question is why it gets stuck after attempting to reconnect several times, requiring a one-hour wait before attempting to reconnect again.
from nats-server.
The "no advertise" does not make sense in this context. This is normally used to avoid advertising URLs to client connections. Gateways never advertise server URLs from other clusters to clients.
Have you verified that using a proper "listen" specification solves your issue? You should not have non local IPs anyway. The server will detect interfaces if the specification is "any" (0.0.0.0) and should exclude local IPs. We may need to run a test on those machines to see what is being returned by the code at line 3864 in bb9bf95.
If you specify a hostname (which it does not look like you do) and it were to resolve to an internal IP, that could also explain it.
As for the reason it blocked, I am not sure at all. Maybe the pending PR (#5356) will help?
from nats-server.
@JohnTseng1012 I tried even with the older server v2.5.0, and it seems to work fine. Again, my guess is that you are not specifying the "listen" option, and therefore the server finds the interfaces and picks the first one, which may be the 172.xx that you are referring to as internal. If you run the server with the -D debug flag, you can see output such as:
[3180] 2024/04/25 12:28:49.617175 [DBG] Get non local IPs for "0.0.0.0"
[3180] 2024/04/25 12:28:49.617418 [DBG] ip=<some IP>
[3180] 2024/04/25 12:28:49.617422 [DBG] ip=<some IP>
..
[3180] 2024/04/25 12:28:49.617532 [INF] Server is ready
[3180] 2024/04/25 12:28:49.617577 [INF] Cluster name is WEST
If the first on the list is a 172.x address, then yes, it will be used as the listen specification when sending to others. So the simple solution is to use the public address in the "listen" specification.
You can check your logs and see what address is being used. You should see something like:
Address for gateway "<gateway name>" is <IP>
Again, if this IP is 172.x, then that means that it was the first in the list of returned interfaces.
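A minimal sketch of that suggestion, assuming the public address is actually assigned to an interface on the machine. The cluster name is taken from the sample log above; the address and port are placeholders from a documentation range, not values from this thread:

```
gateway {
  name: "WEST"
  # Placeholder address and port. This must be an IP that exists on a
  # local interface, otherwise the server cannot bind to it.
  listen: "203.0.113.10:7222"
}
```

With an explicit IP in "listen", the server has no need to enumerate interfaces, so the address reported in the "Address for gateway" log line should be the one configured here.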
from nats-server.
@kozlovic I have set the listen, but the logs show the following message:
[FTL] Error listening on gateway port: 7522 - listen tcp 10.XXX.XXX.XXX:7522: bind: cannot assign requested address
My settings (the 10.XXX address is of type LoadBalancer):
gateway {
  name: " cluster-A"
  listen: "10.XXX.XXX.XXX:7522"
  gateways: [
    {
      name: " cluster-A"
      urls: ["10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522" ]
    },
    {
      name: " cluster-B"
      urls: ["10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522" ]
    },
    {
      name: " cluster-C"
      urls: ["10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522", "10.XXX.XXX.XXX:7522" ]
    }
  ]
}
Is there something configured incorrectly?
And I think PR (#5356) should be able to solve the issue with the reconnection getting stuck.
from nats-server.
@JohnTseng1012 We usually don't recommend load balancers between NATS Server(s)/client(s). Now that I understand that this address is the one from the load balancer, obviously the "listen" specification with this address won't work. Instead, specify "listen" with the IP address of this machine and use "advertise: 10.xxx" so that this is the address sent, not the actual IP the server is listening to. Do that for all servers in the clusters.
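As a rough sketch of that change, the gateway block on each server might look like the following. The "listen" address is a placeholder standing in for the machine's own IP (an assumption, since the real value is not given in this thread); the "advertise" value is the redacted load balancer address from the configuration above:

```
gateway {
  name: "cluster-A"
  # Placeholder: an IP assigned to an interface on this machine.
  listen: "192.0.2.11:7522"
  # Address sent to peers instead of the listen address
  # (the load balancer address, redacted as in the thread).
  advertise: "10.XXX.XXX.XXX:7522"
  # The gateways: [...] list stays as already configured.
}
```

Each server would carry its own "listen" address, while "advertise" holds whichever load balancer address fronts that server.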
from nats-server.
@kozlovic Is rebuilding the super cluster the only way to remove the internal private IPs from the IP list used by s.getRandomIP? After adding the advertise, I am still seeing internal private IPs. Is there any way to use only the advertise and the gateway URLs that I have configured myself?
from nats-server.
The s.getRandomIP has nothing to do with this if you never specify a host name, just IPs.
So I have tested with both current main and back to v2.5.0 since this is the version you are using (you should upgrade, this is no longer supported). You don't actually have to set the "listen", but if you don't, by default the server will listen to 0.0.0.0 and get all interfaces and select one as the URL to send to its peers so that each server can "augment" the list of URLs this cluster can be reached at.
This is why you see (by printing the list of URLs before a server tries to connect) that there are some IPs that you consider internal (but they are non local from getNonLocalIPsIfHostIsIPAny() perspective).
When you later set the "advertise" config option to a "public" IP:port in, say, cluster1-server1 and restart that server, that server will now advertise this address to its peers, but the other servers still have their "internal" IPs communicated to the others. You need to make this update (adding advertise) to all servers in the first cluster and do a rolling update. Then move to the second cluster and do the same rolling update (that is, update a server and restart it, then move to the next), and finally to the third cluster. It could be enough for them all to clear the "internal" IPs from their lists, but it is possible that you need to do a rolling restart of each cluster to fully clear them. Of course, if you can "afford" it, you could shut down all servers, do the config updates, then restart the whole super cluster.
Let me know if that helps resolve your issue and I will close this ticket. Thanks!
from nats-server.
Thank you, after adding the advertise and rolling restarting all clusters, the 172.XXX addresses are no longer appearing. Additionally, we have tested 2.10.15-RC, and it has resolved the issue of connections getting stuck. When will v2.10.15 be released?
from nats-server.