
Comments (5)

bmartinn commented on June 7, 2024

In AKS there's a way to deploy Elasticsearch using a Persistent Volume, which is supposed to be connected to Azure Files. I wonder how hard it would be to adjust trains-server for that, because it would mean that Azure (or other cloud providers) would be able to automate the entire setup process (including upgrades).

It should not be complicated to integrate; the Elasticsearch container is a standard ELK container, so an off-the-shelf setup should work. Obviously this is cloud-specific, hence not part of the default setup :)
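As a rough sketch of that idea (all names here are assumptions, not from the thread): on AKS, an `azurefile`-backed PersistentVolumeClaim that an off-the-shelf Elasticsearch deployment could mount as its data volume might look like this.

```shell
# Hypothetical sketch: an Azure Files-backed PVC using the AKS built-in
# "azurefile" storage class. The claim name and size are placeholders;
# the Elasticsearch pod would mount this claim at its data path.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elastic-data
spec:
  accessModes: ["ReadWriteMany"]   # Azure Files supports shared access
  storageClassName: azurefile
  resources:
    requests:
      storage: 50Gi
EOF
```

With the data on a PVC instead of the node's local disk, deleting or rescheduling the pod no longer risks the index data.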

from clearml-server-helm.

bmartinn commented on June 7, 2024

Hi @Shaked

The trains-server setup (either with Docker or on k8s) is configured by default to store all data in an externally mapped folder, usually /opt/trains.

This means that even deleting a deployed trains-server will not affect the data itself, as it is stored outside the containers.

If you take a look at the trains-server upgrade process, section 2 explains how to back up an entire server.

You could create a cron job doing just that, but I would opt for a per-database (MongoDB, Elasticsearch) backup script, together with zipping /opt/trains/config and /opt/trains/data/fileserver.

This way you do not have to spin down the trains-server; the cron job executes independently.
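A minimal sketch of such a backup script, assuming the default /opt/trains layout and default service ports (the function name and all paths are illustrative, not part of trains-server):

```shell
#!/bin/sh
# Minimal per-component backup sketch. Assumptions: default /opt/trains
# layout, MongoDB on 27017, Elasticsearch on 9200 with a snapshot repo
# named "backup" already registered. Adjust for your deployment.
set -eu

backup_trains() {            # usage: backup_trains SRC_ROOT DEST_DIR
  src="$1"; dest="$2"
  mkdir -p "$dest"
  # Per-database dumps -- uncomment on a host that runs the services:
  #   mongodump --host 127.0.0.1 --port 27017 --out "$dest/mongo"
  #   curl -fsS -X PUT \
  #     "http://127.0.0.1:9200/_snapshot/backup/snap?wait_for_completion=true"
  # Config and fileserver data are plain file archives:
  tar czf "$dest/config.tar.gz"     -C "$src" config
  tar czf "$dest/fileserver.tar.gz" -C "$src/data" fileserver
}
```

Run from cron, e.g. `backup_trains /opt/trains /mnt/backups/$(date +%F)`, while the server keeps running.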


Shaked commented on June 7, 2024

Hey @bmartinn

So the only way to lose data would be if the specific k8s node fails for whatever reason, right?

I hope you don't mind me asking: what was the reason for saving the data on a specific node?

You could create a cron job doing just that, but I would opt for a per-database (MongoDB, Elasticsearch) backup script, together with zipping /opt/trains/config and /opt/trains/data/fileserver.

Do you know if I have to manually connect to the labeled node and set up a cron job there, or is there another way to do that? My fear is that it would become an unknown part of the entire setup process, much like the elastic-search setup, which, although it needs to be done only once, is quite hard to automate.

Thank you for your help and patience!
Shaked


bmartinn commented on June 7, 2024

So the only way to lose data would be if the specific k8s node fails for whatever reason, right?

Yes, only if the k8s node's data volume is lost (which by default lives on the node itself).

I hope you don't mind me asking, what was the reason to save the data on a specific node?

If you mean from a setup point of view, the idea was to make it as easy as possible to set up on k8s.
Scaling Elasticsearch is an art of its own, and trains-server is just another Elasticsearch setup; the data volume is only one aspect out of many. The same goes for the MongoDB setup: the idea was to make it as simple as possible to stand up.

Your point about the trains-server "elastic-search setup" is exactly that: our setup is nothing special, but there are a few ingredients (see the ELK cookbook) you have to configure to get a stable Elasticsearch up and running...

Do you know if I have to manually connect to the labeled node and set a cron job there or is there ...

Hmm, I think I would map the same data volume to an additional container and have that container run the cron job and back everything up to external object storage. Notice that this is a regular ELK/MongoDB backup setup on k8s; there is nothing special here. The only addition is /opt/trains/data/fileserver, which is just a regular file backup.
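A minimal sketch of that additional-container approach as a k8s CronJob, assuming a hostPath volume at /opt/trains and a second mounted backup volume (every name here is a placeholder; shipping the archive on to object storage would need your tool of choice appended to the command):

```shell
# Hypothetical CronJob: mounts the same data volume as trains-server
# (read-only) and writes a dated archive to a backup PVC, so no cron
# job needs to be installed on the labeled node itself.
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: trains-backup
spec:
  schedule: "0 3 * * *"            # nightly at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: alpine:3.19
            command: ["/bin/sh", "-c"]
            args:
            - tar czf /backup/fileserver-$(date +%F).tar.gz
              -C /opt/trains/data fileserver
            volumeMounts:
            - name: trains-data
              mountPath: /opt/trains
              readOnly: true
            - name: backup-dest
              mountPath: /backup
          volumes:
          - name: trains-data
            hostPath:
              path: /opt/trains    # same volume the server uses
          - name: backup-dest
            persistentVolumeClaim:
              claimName: trains-backup-pvc   # placeholder claim
EOF
```

If the data volume is a hostPath, the job's pod would also need a nodeSelector matching the labeled node so it lands where the data is.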


Shaked commented on June 7, 2024

I see.

In AKS there's a way to deploy Elasticsearch using a Persistent Volume, which is supposed to be connected to Azure Files. I wonder how hard it would be to adjust trains-server for that, because it would mean that Azure (or other cloud providers) would be able to automate the entire setup process (including upgrades).

Hmm, I think I would map the same data volume to an additional container and have that container run the cron job and back everything up to external object storage. Notice that this is a regular ELK/MongoDB backup setup on k8s; there is nothing special here. The only addition is /opt/trains/data/fileserver, which is just a regular file backup.

That's a great idea, actually. Definitely gonna go with this approach.

