Comments (14)
It's my understanding we want to take a user-specified ntp.conf file and distribute it to all nodes, but we must avoid distributing the file to the node that is the ntp master. Yes?
Right. If the ntp master happens to be one of the Salt minions, we can't clobber that one.
If this is the case:
- Where should DeepSea get the ntp.conf file? In the
srv/salt/ceph/time/ntp/conf_file
directory, perhaps? Should we populate a default ntp.conf file here, or should we fall back to the existing method of setting the NTP server master to the Salt master?
Salt has been suggesting files subdirectories under their respective part, so /srv/salt/ceph/time/ntp/files/ntp.conf.j2 is fine. I'll suggest using the Jinja2 for the simple inclusion of the ntp server. Rely on the setting of time_server in the pillar. If the admin changes time_server and then reruns this state, that's fine too.
If an admin wants to customize and create their own, they can. So, no need to worry about including multiple sources from the internet.
- Speaking of falling back, what if the user doesn't want to specify an ntp.conf and wants to use the existing method instead? What is the preferred method for the customer to provide this input? I have a thought to do as @rjfd has done in in his branch https://github.com/rjfd/DeepSea/blob/wip-single-config/srv/pillar/ceph/config.sls and create cluster-level config file where the user could specify, for example:
cluster_config:
# If the 'ntp' section is configured, DeepSea will manage the cluster's NTP configuration
# If this section is not configured, DeepSea will not manage the cluster's NTP configuration
# Note that Ceph requires NTP to be configured to operate properly
ntp:
# Distribute an NTP configuration file given at the following path to all nodes
distribute_ntp_conf_file: /path/to/conf/file
# If the path is not given, configure the node given as the NTP master
# Recommended NTP master is the Salt master node
ntp_master: ceph.admin
I went with time_server over ntp_master just because I thought we might support chrony one day. The current setting is living in /srv/pillar/ceph/stack/global.yml but otherwise defaults to the Salt master. At this point, we will still be leaving the configuration of the ntp server as an exercise for the admin on the Salt master. We have too many divergent opinions for that part.
from deepsea.
@jan--f, do you have any thoughts here?
from deepsea.
@BlaineEXE I'm wondering if we should simply use systemd's timesynd service. It has a simple enough config file (/etc/systemd/timesyncd.conf
) and ties in with systemd's networking, i.e. if a network definition or a dhcp provides an ntp server.
I wonder if runing an ntp server is a bit out of scope for deepsea? at least its a different issue I think.
from deepsea.
If timesyncd solves the problem nicely, then dropping the whole .../ceph/time directory might make this easier. With ntp (and chrony), there's inevitably confusion of whether a salt state intends to configure a client or setup a server. With timesyncd, I am presuming it will only ever be a client.
If timesyncd works fine and is easier to configure with Salt, then go for it. Drop ../ceph/time and add .../ceph/timesyncd.
from deepsea.
...or we could just add timesyncd to ceph/time and make it the default. I could see the need for some people wanting to do more complex ntp setups.
from deepsea.
It may be appropriate to rope #71 into this. I haven't looked into timesyncd yet, but I would hope it would have an option similar to NTPD_FORCE_SYNC_ON_STARTUP
.
I had a brief discussion at the workshop with @rjfd (and a couple others?) about whether configuring time sync is feature creep in DeepSea, and Ricardo had the point that since Ceph is so dependent on accurate time, it does make sense for DeepSea to be able to manage a basic time sync feature with no frills for essential operation. I would think that using timesyncd with a simple config option would fit that bill. And if users want to have their own custom NTP/chronyd/whatever setup, they can manage that themselves.
@jan--f, are you proposing that we leave the existing ntp config, add timesyncd, and set timesyncd as default?
There is the use case that a user wants to give DeepSea a custom ntp.conf file to have distributed to the cluster. The difficulty there is that we would (by my research) have to parse the ntp.conf file to figure out what server(s) are NTP masters and then avoid distributing the conf file to those nodes if they are managed by DeepSea/salt. I'm worried that distributing a custom conf is in 'feature creep' territory.
from deepsea.
Also, is there any reason we shouldn't use saltstack's ntp configuration formula? https://github.com/saltstack-formulas/ntp-formula
If it works for us, that's fine. We wouldn't need the server part. (I'm basing that more on the consensus that DeepSea shouldn't be setting up the time server. When something is already packaged and works, I'm hesitant to remove it. Part of me would say include all of it and disable those parts by default. It might be useful to some. Could use other opinions on that.)
from deepsea.
When something is already packaged and works, I'm hesitant to remove it.
The thing is that the ntp part never worked for me. sntp always refused to sync to the server I gave it. Maybe we could have stuff like this in a doc/example tree? I think shipping stuff that isn't used and may or may not work can be confusing.
In other news, I assume that comment came from you @swiftgist? How does github tell me its from @BlaineEXE? I think you did this earlier with me too.
from deepsea.
And more to the point: the saltstack ntp formular could be useful. I don't think we should ship it though. Users can add it to their setup on their own. https://docs.saltstack.com/en/latest/topics/development/conventions/formulas.html#installation
from deepsea.
I would like to reach a resolution here so I don't waste my time working on something we aren't committing to.
As I mentioned, I had a good conversation with @rjfd a couple weeks ago about whether DeepSea doing NTP configuration might be feature creep versus why configuring NTP in DeepSea might be appropriate.
Arguments for:
Ceph is highly dependent on accurate time, meaning time sync across the server is critical to Ceph. If we further consider that a large number of ceph deployments are in air-gapped network environments and won't have access to standard NTP servers, it becomes necessary to sync with NTP servers in the air-gapped environment. It seems to me from reading through a lot of the DeepSea discussions that one of the prime design tents of Deepsea is to produce a minimal useful cluster out of the box; the product of this with the other notes above, for a user who hasn't had the foresight or technical skill to set up an NTP server in the air-gapped environment, it may be appropriate for DeepSea to be able to configure a basic NTP server and to distribute the configuration to the nodes.
Arguments against:
This could be viewed as feature creep; if we continue this trend, we will also be configuring firewalls and who knows what else. I don't know the numbers, but a large percentage of users may already have an NTP server configured or wish to configure it themselves. If this is the case, we are spending development time on a feature that is largely unneeded; furthermore, we are committing to spending testing and maintenance on the feature in the future for little benefit.
My thoughts:
First of all, please let me know if you see pros/cons I haven't mentioned. Right now, I can see both sides of the argument and I don't mind going either way, but I would like to commit strongly in one direction or another. Either we decide that basic NTP configuration is beneficial and support it, or we decide that NTP configuration is feature creep, and we remove the existing NTP configuration from DeepSea completely. If we remove it completely, a good middle ground may be to do as @jan--f has suggested and encourage the user to install the ntp-formula alongside DeepSea; if it's appropriate, I could test this myself and put a short section in the documentation as to how the user might set this up including how to distribute a custom ntp.conf if the user so wishes.
(Edit) More thoughts:
My gut reaction to this was that it feels like feature creep. The primary argument that swayed me away from this was that time sync is a requirement for Ceph and that not having it can lead to strange errors. For the present, I think the middle-ground option I mentioned above (encourage the user to install the ntp-formula alongside DeepSea) is better for the DeepSea project. If, in the future, we have a number of users who are asking for the functionality to be built-in, then we will have a stronger incentive to include it.
In other news, I assume that comment came from you @swiftgist? How does github tell me its from @BlaineEXE? I think you did this earlier with me too.
This is really strange. I'm seeing that too.
from deepsea.
@BlaineEXE Besides the normal warning of "clock skew" by the monitors in a Ceph cluster, OSDs can seem to randomly fail if the clocks are off. These are time consuming to debug.
The original approach was to simply force sync the time any time Stage 0 is run and have the validation act as strong encouragement to the admin that the nodes in a Ceph cluster need to be sync'd. However, Sales Engineers are manually configuring the ntp client on all the nodes of a Salt cluster before proceeding with the installation/configuration of Ceph. Since this is ideal territory for Salt (i.e. a single sls file), the request does not exceed the scope of guaranteeing a Ceph cluster is in sync. I originally suggested configuring the ntp client, but timesyncd is fine as well.
Everything in DeepSea is technically optional and can be overridden/disabled. I am not against including multiple solutions for a specific issue and making the selection between them trivial. I simply want the default to work without hassle if possible. Users are manually setting up ntp and requested this be done since it's a dependency.
from deepsea.
@swiftgist Your input seems inline with what @rjfd and I discussed a couple weeks ago. I propose to move forward with the prototype I've been working on that includes SaltStack's ntp-formula within DeepSea, giving the user an option to configure a node in the cluster as an NTP server (default server is the node on which DeepSea is installed) and all other nodes as clients pointing toward said NTP server. How do the following requirements sound?
Proposed requirements:
- DeepSea should be able to configure basic time sync across the cluster (we will use NTP)
- The user should be able to enable/disable time sync via configuration file (default: enabled)
- (Edit: added) Enabling time sync should mean DeepSea ensures NTP packages are installed and services are running
- DeepSea should be able to set up a single node in the cluster as an NTP server (default: node on which DeepSea is installed)
This one will be a point of contention for some and would need to be disabled by default. It's the closest agreement we've managed to reach. Since many sites will have their own time setup, we can expect that they have one. For those that don't, having the server configuration one salt command away would be nice to have. :)
Also, setting up an ntp server that is isolated from the Internet can be troublesome. If getting Salt to do this nicely becomes an obstacle, then it's okay to separate this into another issue for implementation on another day. If the formula works brilliantly and the server part is disabled by default, then I cannot see a counter argument.
- NTP server should sync to higher stratum servers by default ([0-3].pool.ntp.org)
- NTP server should fall back to its own local clock if higher stratum servers can't be contacted
- (Edit: added) Users should be able to provide a custom list of servers for the NTP server to connect to (e.g., [0-3].mydomain.ntp.net)
- NTP clients in the cluster should connect to the NTP server set up by DeepSea by default
This happens because nothing about DeepSea is interactive in the question/answer sense. It seemed reasonable since relying on a server name called 'time' or 'ntp' seems just as arbitrary. The validation will stop Stage 3 to keep the admin from getting too far with a misconfiguration.
- NTP clients in the cluster should connect to higher stratum servers by default if DeepSea is not managing the NTP server ([0-3].pool.ntp.org)
- Users should be able to provide a custom list of servers for NTP clients to connect to
Probably. Considering I made time_server single valued, that may need adjusting. Also, might need to consider if DeepSea upgrades from 0.6.x to 0.7.x, that the logic handles/corrects the single valued time_server in the pillar
- Users should be able to provide their own custom server and client ntp.conf file's for DeepSea to distribute
This is likely the easier part with regards to Salt. I have been avoiding the word "custom" in sls filenames so that end users can reliably use it. An example sitting next to the ntp.conf.j2 might be appropriate though to help those along that did want to use their own.
from deepsea.
My impression was always that a large portion (if not almost all) of our customers have an ntp setup in their data center.
As far as I know its fairly complex to setup an ntp server, that doesn't have a server to sync with. So I'm worried we end up in a situation like we have no where sntp will refuse to sync with our server.
Maybe this is a topic we should bring to the deepsea-users ml to ask what the requirements in the field actually are.
from deepsea.
Addressed in #260
from deepsea.
Related Issues (20)
- make install fails on archlinux HOT 1
- cephdisks.unused falsely returns mounted disks
- Typo - should be c_v_commands instead of d_v_commands
- Example code for return structures of modules HOT 6
- rebuild runner needs to read error messages from osd.py(runner) HOT 2
- osd.remove fails to zap devices on ceph version 14.2.3-349 HOT 10
- functests.3nodes fails with IndexError HOT 4
- How to deploy /dev/sdb ,/dev/sdc only use /dev/nvme0n1 as db device ? HOT 5
- Can not remove cluster node HOT 2
- [SES5] remove.osd functest fails HOT 9
- No role for rbd-client (e.g. mapping images) HOT 1
- Stage.3 Hang in Disks.Deploy HOT 6
- IGW lrbd support
- Device discovery not working in KVM environment HOT 3
- SES5: "time_init: ntp" does not work as expected HOT 1
- SES5: osd.redeploy fails for filestore -> bluestore
- deepsea hangs in stage 3 @ ceph.mgr.auth on
- deepsea monitor not working with gitfs
- stage.0 Exception: 'getpwuid(): uid not found:
- Deepsea development status after cephadm release HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from deepsea.