Code Monkey home page Code Monkey logo

Comments (14)

BlaineEXE avatar BlaineEXE commented on May 24, 2024

It's my understanding we want to take a user-specified ntp.conf file and distribute it to all nodes, but we must avoid distributing the file to the node that is the ntp master. Yes?

Right. If the ntp master happens to be one of the Salt minions, we can't clobber that one.

If this is the case:

  1. Where should DeepSea get the ntp.conf file? In the srv/salt/ceph/time/ntp/conf_file directory, perhaps? Should we populate a default ntp.conf file here, or should we fall back to the existing method of setting the NTP server master to the Salt master?

Salt has been suggesting files subdirectories under their respective part, so /srv/salt/ceph/time/ntp/files/ntp.conf.j2 is fine. I'll suggest using the Jinja2 for the simple inclusion of the ntp server. Rely on the setting of time_server in the pillar. If the admin changes time_server and then reruns this state, that's fine too.

If an admin wants to customize and create their own, they can. So, no need to worry about including multiple sources from the internet.

  1. Speaking of falling back, what if the user doesn't want to specify an ntp.conf and wants to use the existing method instead? What is the preferred method for the customer to provide this input? I have a thought to do as @rjfd has done in in his branch https://github.com/rjfd/DeepSea/blob/wip-single-config/srv/pillar/ceph/config.sls and create cluster-level config file where the user could specify, for example:
cluster_config:
  # If the 'ntp' section is configured, DeepSea will manage the cluster's NTP configuration
  # If this section is not configured, DeepSea will not manage the cluster's NTP configuration
  # Note that Ceph requires NTP to be configured to operate properly
  ntp:
    # Distribute an NTP configuration file given at the following path to all nodes
    distribute_ntp_conf_file: /path/to/conf/file
    # If the path is not given, configure the node given as the NTP master
    # Recommended NTP master is the Salt master node
    ntp_master: ceph.admin

I went with time_server over ntp_master just because I thought we might support chrony one day. The current setting is living in /srv/pillar/ceph/stack/global.yml but otherwise defaults to the Salt master. At this point, we will still be leaving the configuration of the ntp server as an exercise for the admin on the Salt master. We have too many divergent opinions for that part.

from deepsea.

BlaineEXE avatar BlaineEXE commented on May 24, 2024

@jan--f, do you have any thoughts here?

from deepsea.

jan--f avatar jan--f commented on May 24, 2024

@BlaineEXE I'm wondering if we should simply use systemd's timesynd service. It has a simple enough config file (/etc/systemd/timesyncd.conf) and ties in with systemd's networking, i.e. if a network definition or a dhcp provides an ntp server.

I wonder if runing an ntp server is a bit out of scope for deepsea? at least its a different issue I think.

from deepsea.

swiftgist avatar swiftgist commented on May 24, 2024

If timesyncd solves the problem nicely, then dropping the whole .../ceph/time directory might make this easier. With ntp (and chrony), there's inevitably confusion of whether a salt state intends to configure a client or setup a server. With timesyncd, I am presuming it will only ever be a client.

If timesyncd works fine and is easier to configure with Salt, then go for it. Drop ../ceph/time and add .../ceph/timesyncd.

from deepsea.

jan--f avatar jan--f commented on May 24, 2024

...or we could just add timesyncd to ceph/time and make it the default. I could see the need for some people wanting to do more complex ntp setups.

from deepsea.

BlaineEXE avatar BlaineEXE commented on May 24, 2024

It may be appropriate to rope #71 into this. I haven't looked into timesyncd yet, but I would hope it would have an option similar to NTPD_FORCE_SYNC_ON_STARTUP.

I had a brief discussion at the workshop with @rjfd (and a couple others?) about whether configuring time sync is feature creep in DeepSea, and Ricardo had the point that since Ceph is so dependent on accurate time, it does make sense for DeepSea to be able to manage a basic time sync feature with no frills for essential operation. I would think that using timesyncd with a simple config option would fit that bill. And if users want to have their own custom NTP/chronyd/whatever setup, they can manage that themselves.

@jan--f, are you proposing that we leave the existing ntp config, add timesyncd, and set timesyncd as default?

There is the use case that a user wants to give DeepSea a custom ntp.conf file to have distributed to the cluster. The difficulty there is that we would (by my research) have to parse the ntp.conf file to figure out what server(s) are NTP masters and then avoid distributing the conf file to those nodes if they are managed by DeepSea/salt. I'm worried that distributing a custom conf is in 'feature creep' territory.

from deepsea.

BlaineEXE avatar BlaineEXE commented on May 24, 2024

Also, is there any reason we shouldn't use saltstack's ntp configuration formula? https://github.com/saltstack-formulas/ntp-formula

If it works for us, that's fine. We wouldn't need the server part. (I'm basing that more on the consensus that DeepSea shouldn't be setting up the time server. When something is already packaged and works, I'm hesitant to remove it. Part of me would say include all of it and disable those parts by default. It might be useful to some. Could use other opinions on that.)

from deepsea.

jan--f avatar jan--f commented on May 24, 2024

When something is already packaged and works, I'm hesitant to remove it.

The thing is that the ntp part never worked for me. sntp always refused to sync to the server I gave it. Maybe we could have stuff like this in a doc/example tree? I think shipping stuff that isn't used and may or may not work can be confusing.

In other news, I assume that comment came from you @swiftgist? How does github tell me its from @BlaineEXE? I think you did this earlier with me too.

from deepsea.

jan--f avatar jan--f commented on May 24, 2024

And more to the point: the saltstack ntp formular could be useful. I don't think we should ship it though. Users can add it to their setup on their own. https://docs.saltstack.com/en/latest/topics/development/conventions/formulas.html#installation

from deepsea.

BlaineEXE avatar BlaineEXE commented on May 24, 2024

I would like to reach a resolution here so I don't waste my time working on something we aren't committing to.

As I mentioned, I had a good conversation with @rjfd a couple weeks ago about whether DeepSea doing NTP configuration might be feature creep versus why configuring NTP in DeepSea might be appropriate.

Arguments for:
Ceph is highly dependent on accurate time, meaning time sync across the server is critical to Ceph. If we further consider that a large number of ceph deployments are in air-gapped network environments and won't have access to standard NTP servers, it becomes necessary to sync with NTP servers in the air-gapped environment. It seems to me from reading through a lot of the DeepSea discussions that one of the prime design tents of Deepsea is to produce a minimal useful cluster out of the box; the product of this with the other notes above, for a user who hasn't had the foresight or technical skill to set up an NTP server in the air-gapped environment, it may be appropriate for DeepSea to be able to configure a basic NTP server and to distribute the configuration to the nodes.

Arguments against:
This could be viewed as feature creep; if we continue this trend, we will also be configuring firewalls and who knows what else. I don't know the numbers, but a large percentage of users may already have an NTP server configured or wish to configure it themselves. If this is the case, we are spending development time on a feature that is largely unneeded; furthermore, we are committing to spending testing and maintenance on the feature in the future for little benefit.

My thoughts:
First of all, please let me know if you see pros/cons I haven't mentioned. Right now, I can see both sides of the argument and I don't mind going either way, but I would like to commit strongly in one direction or another. Either we decide that basic NTP configuration is beneficial and support it, or we decide that NTP configuration is feature creep, and we remove the existing NTP configuration from DeepSea completely. If we remove it completely, a good middle ground may be to do as @jan--f has suggested and encourage the user to install the ntp-formula alongside DeepSea; if it's appropriate, I could test this myself and put a short section in the documentation as to how the user might set this up including how to distribute a custom ntp.conf if the user so wishes.

(Edit) More thoughts:
My gut reaction to this was that it feels like feature creep. The primary argument that swayed me away from this was that time sync is a requirement for Ceph and that not having it can lead to strange errors. For the present, I think the middle-ground option I mentioned above (encourage the user to install the ntp-formula alongside DeepSea) is better for the DeepSea project. If, in the future, we have a number of users who are asking for the functionality to be built-in, then we will have a stronger incentive to include it.

In other news, I assume that comment came from you @swiftgist? How does github tell me its from @BlaineEXE? I think you did this earlier with me too.
This is really strange. I'm seeing that too.

from deepsea.

swiftgist avatar swiftgist commented on May 24, 2024

@BlaineEXE Besides the normal warning of "clock skew" by the monitors in a Ceph cluster, OSDs can seem to randomly fail if the clocks are off. These are time consuming to debug.

The original approach was to simply force sync the time any time Stage 0 is run and have the validation act as strong encouragement to the admin that the nodes in a Ceph cluster need to be sync'd. However, Sales Engineers are manually configuring the ntp client on all the nodes of a Salt cluster before proceeding with the installation/configuration of Ceph. Since this is ideal territory for Salt (i.e. a single sls file), the request does not exceed the scope of guaranteeing a Ceph cluster is in sync. I originally suggested configuring the ntp client, but timesyncd is fine as well.

Everything in DeepSea is technically optional and can be overridden/disabled. I am not against including multiple solutions for a specific issue and making the selection between them trivial. I simply want the default to work without hassle if possible. Users are manually setting up ntp and requested this be done since it's a dependency.

from deepsea.

BlaineEXE avatar BlaineEXE commented on May 24, 2024

@swiftgist Your input seems inline with what @rjfd and I discussed a couple weeks ago. I propose to move forward with the prototype I've been working on that includes SaltStack's ntp-formula within DeepSea, giving the user an option to configure a node in the cluster as an NTP server (default server is the node on which DeepSea is installed) and all other nodes as clients pointing toward said NTP server. How do the following requirements sound?

Proposed requirements:

  • DeepSea should be able to configure basic time sync across the cluster (we will use NTP)
  • The user should be able to enable/disable time sync via configuration file (default: enabled)
  • (Edit: added) Enabling time sync should mean DeepSea ensures NTP packages are installed and services are running
  • DeepSea should be able to set up a single node in the cluster as an NTP server (default: node on which DeepSea is installed)

This one will be a point of contention for some and would need to be disabled by default. It's the closest agreement we've managed to reach. Since many sites will have their own time setup, we can expect that they have one. For those that don't, having the server configuration one salt command away would be nice to have. :)

Also, setting up an ntp server that is isolated from the Internet can be troublesome. If getting Salt to do this nicely becomes an obstacle, then it's okay to separate this into another issue for implementation on another day. If the formula works brilliantly and the server part is disabled by default, then I cannot see a counter argument.

  • NTP server should sync to higher stratum servers by default ([0-3].pool.ntp.org)
  • NTP server should fall back to its own local clock if higher stratum servers can't be contacted
  • (Edit: added) Users should be able to provide a custom list of servers for the NTP server to connect to (e.g., [0-3].mydomain.ntp.net)
  • NTP clients in the cluster should connect to the NTP server set up by DeepSea by default

This happens because nothing about DeepSea is interactive in the question/answer sense. It seemed reasonable since relying on a server name called 'time' or 'ntp' seems just as arbitrary. The validation will stop Stage 3 to keep the admin from getting too far with a misconfiguration.

  • NTP clients in the cluster should connect to higher stratum servers by default if DeepSea is not managing the NTP server ([0-3].pool.ntp.org)
  • Users should be able to provide a custom list of servers for NTP clients to connect to

Probably. Considering I made time_server single valued, that may need adjusting. Also, might need to consider if DeepSea upgrades from 0.6.x to 0.7.x, that the logic handles/corrects the single valued time_server in the pillar

  • Users should be able to provide their own custom server and client ntp.conf file's for DeepSea to distribute

This is likely the easier part with regards to Salt. I have been avoiding the word "custom" in sls filenames so that end users can reliably use it. An example sitting next to the ntp.conf.j2 might be appropriate though to help those along that did want to use their own.

from deepsea.

jan--f avatar jan--f commented on May 24, 2024

My impression was always that a large portion (if not almost all) of our customers have an ntp setup in their data center.
As far as I know its fairly complex to setup an ntp server, that doesn't have a server to sync with. So I'm worried we end up in a situation like we have no where sntp will refuse to sync with our server.

Maybe this is a topic we should bring to the deepsea-users ml to ask what the requirements in the field actually are.

from deepsea.

swiftgist avatar swiftgist commented on May 24, 2024

Addressed in #260

from deepsea.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.