geto

(G)ood (e)nough (t)ask (o)ffloader is a framework for offloading work to hosts with minimal setup and dependencies.

Basically, geto can be used to offload an arbitrary task to any target host and retrieve results.

You might want to use geto if you have one or more machines that need to offload work to other machines. geto is needed only on the machine that offloads the work; it is not required (or in any way useful) on the target hosts.

Offloading a task and gathering its results will likely take on the order of seconds, so geto may not be a good fit for latency-sensitive work.

Here's a trivial example that runs a sleep command on a target host:

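package main

import (
	"fmt"
	"os"

	"github.com/bgmerrell/geto/lib/config"
	"github.com/bgmerrell/geto/lib/remote/ssh"
	"github.com/bgmerrell/geto/lib/task"
)

func main() {
	// Load the host list and other settings from the geto config file.
	conf, err := config.ParseConfig("/etc/geto.ini")
	if err != nil {
		fmt.Fprintf(os.Stderr, "config error: %v\n", err)
		os.Exit(1)
	}
	// Compose a two-line bash script; nil means no concurrency limit.
	script := task.NewScriptWithCommands(
		"sleep", []string{"#!/bin/bash", "sleep 15"}, nil)
	// This task has no file dependencies.
	var depFiles []string
	t, err := task.New(depFiles, script, 0)
	if err != nil {
		fmt.Fprintf(os.Stderr, "task error: %v\n", err)
		os.Exit(1)
	}
	// Run the task on the first configured host and wait for its output.
	ch := make(chan task.RunOutput)
	go task.RunOnHost(ssh.New(), t, conf.Hosts[0], ch)
	taskOutput := <-ch
	fmt.Printf("stdout: %s\n", taskOutput.Stdout)
	fmt.Printf("stderr: %s\n", taskOutput.Stderr)
	if taskOutput.Err != nil {
		fmt.Printf("err: %s\n", taskOutput.Err.Error())
	}
}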

Prerequisites

Any host to which the user wishes to offload must have the following:

  • A Unix-like environment (only tested on Linux)
  • An SSH server that allows public-key authenticated logins from any machine doing the offloading. Password authentication is being worked on but does not work yet (see the TODO section).
  • The timeout command in your PATH. This command is usually installed by default as part of the coreutils package in Linux.
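
A candidate host can be checked for the timeout prerequisite before it is added to the geto configuration. The following is a minimal sketch and not part of geto: hasCommand is a hypothetical helper, and the program is assumed to be built and run on the candidate host itself.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hasCommand reports whether name resolves to an executable in PATH.
// It is a hypothetical helper for illustration, not a geto function.
func hasCommand(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	// geto's prerequisites list the timeout command, so check for it.
	fmt.Printf("timeout found in PATH: %t\n", hasCommand("timeout"))
}
```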

The machine originating the offloading must have the following:

Terms

  • Host: Any machine that receives a task, i.e., any machine set up with the first set of prerequisites above.
  • Task: A unit of work to run on a host. Task IDs are uniquely generated.
  • Script: A geto object that contains a single script of any language. Script objects can be given any name, and multiple script objects can share a common name. The script name is used to limit and load balance the scripts on target hosts.

A task contains a single script, and multiple tasks can contain scripts of the same name.

For example, in the code example above, a simple bash script is used to compose a geto script (using the task.NewScriptWithCommands() function). That geto script is then used to create a new geto task (using the task.New() function). The task is then executed (using the task.RunOnHost() function) on the first host found in the parsed config file (i.e., conf.Hosts[0]).

Script details

The Script object consists of a name, commands, and the maximum number of scripts of the same name that can run concurrently on a given host. Otherwise stated:

// A script that runs on a target host
type Script struct {
	// Name is the name of a script.  It need not be unique.
	name string
	// The commands that make up a shell-style script.
	// Each index represents a line in the script.
	commands []string
	// The number of scripts of the same name that will run on a target host
	// concurrently.  A nil value means there is no limit.
	maxConcurrent *uint32
}

There are multiple ways of creating a script object:

func NewScript(name string, maxConcurrent *uint32) Script

In the above case the user is responsible for adding the commands to the object. Alternatively, the commands can be provided when instantiating the script object (which is the strategy used in the first example of this document) like so:

func NewScriptWithCommands(name string, commands []string, maxConcurrent *uint32) Script

Yet another approach is to provide a path to an existing script file to use to instantiate the geto script:

func NewScriptFromPath(name string, path string, maxConcurrent *uint32) (Script, error)

Scripts are simply executed on the target host; it is up to the script to indicate how it should be executed (e.g., by using a shebang interpreter directive).
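
Because maxConcurrent is a *uint32, a limit has to be passed by pointer, and nil means no limit. The sketch below illustrates this with a local stand-in type that mirrors the fields described above; it does not import geto. The helpers uint32Ptr, text, and limit are hypothetical, and joining the commands with newlines is an assumption based on the comment that each index represents a line in the script.

```go
package main

import (
	"fmt"
	"strings"
)

// script is a local stand-in mirroring the fields described above.
// It is not the geto type.
type script struct {
	name          string
	commands      []string
	maxConcurrent *uint32
}

// uint32Ptr returns a pointer to n, convenient for a *uint32 limit.
func uint32Ptr(n uint32) *uint32 { return &n }

// text joins the commands, one per line (an assumption about how the
// script file is produced).
func (s script) text() string {
	return strings.Join(s.commands, "\n") + "\n"
}

// limit describes the concurrency limit; nil means unlimited.
func (s script) limit() string {
	if s.maxConcurrent == nil {
		return "unlimited"
	}
	return fmt.Sprintf("%d", *s.maxConcurrent)
}

func main() {
	s := script{"sleep", []string{"#!/bin/bash", "sleep 15"}, uint32Ptr(2)}
	fmt.Print(s.text())
	fmt.Println("limit:", s.limit())
}
```

With the real API, the same pointer would be passed as the last argument of NewScript, NewScriptWithCommands, or NewScriptFromPath.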

Task details

A task object looks like this:

// A task that runs on a target host
type Task struct {
	// A unique ID for the task, automatically generated
	Id string
	// A list of files and/or directories that the task requires
	DepFiles []string
	// A script for the task to run
	Script Script
	// The number of seconds before giving up on a task after it has been
	// started
	Timeout uint32
}

Any file dependencies (specified by DepFiles) are copied to the target host and placed in a special "DEPS" directory. The script is also copied to the target host and placed in the same parent directory as the "DEPS" directory. This means that file dependencies can be relatively referenced from the script. For example, a foo.bin file dependency could be referenced in the script by "DEPS/foo.bin". (NOTE: This may or may not be tested at this point).
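
To illustrate relative referencing, the sketch below builds the path a script would use for each dependency. It assumes every dependency lands directly under DEPS with its base name, which the foo.bin example suggests but the text does not state in general. depRef is a hypothetical helper, not a geto function, and the file paths are made up.

```go
package main

import (
	"fmt"
	"path"
)

// depRef returns the path, relative to the script, at which a dependency
// is assumed to be available on the target host.
func depRef(localPath string) string {
	return path.Join("DEPS", path.Base(localPath))
}

func main() {
	// Hypothetical local dependency paths.
	depFiles := []string{"/home/user/build/foo.bin", "data/input.csv"}
	for _, dep := range depFiles {
		fmt.Printf("%s -> %s\n", dep, depRef(dep))
	}
}
```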

There is currently one way to instantiate a task object:

func New(depFiles []string, script Script, timeout uint32) (Task, error)

Once a task has been created, however, there are several fun ways to run it. The user can specify exactly which host the task should run on, like this:

func RunOnHost(conn remote.Remote, task Task, host host.Host, resultChan chan<- RunOutput)
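
Because RunOnHost reports its result on a channel, several tasks can be started as goroutines and their results collected as they finish. The sketch below shows that pattern with local stand-ins: runOutput and fakeRun are hypothetical and only imitate the shape of RunOutput and RunOnHost, and the sketch assumes each run sends exactly one result.

```go
package main

import "fmt"

// runOutput imitates the fields of task.RunOutput used in the first example.
type runOutput struct {
	Stdout string
	Stderr string
	Err    error
}

// fakeRun stands in for task.RunOnHost: it sends exactly one result.
func fakeRun(host string, ch chan<- runOutput) {
	ch <- runOutput{Stdout: "done on " + host}
}

// collect starts one run per host and gathers one result for each.
// Results arrive in completion order, not in host order.
func collect(hosts []string) []runOutput {
	ch := make(chan runOutput)
	for _, h := range hosts {
		go fakeRun(h, ch)
	}
	results := make([]runOutput, 0, len(hosts))
	for range hosts {
		results = append(results, <-ch)
	}
	return results
}

func main() {
	for _, out := range collect([]string{"host-a", "host-b", "host-c"}) {
		if out.Err != nil {
			fmt.Println("err:", out.Err)
			continue
		}
		fmt.Println("stdout:", out.Stdout)
	}
}
```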

Or, the user might wish to just have a random host picked, like this:

func RunOnRandomHost(conn remote.Remote, task Task, ch chan<- RunOutput)

The user can also perform basic load balancing by having geto choose the host that is running the fewest instances of a task's script, like this:

func RunOnHostBalancedByScriptName(conn remote.Remote, task Task, ch chan<- RunOutput)
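
The two selection strategies can be sketched as follows. This is only an illustration of the idea, not geto's implementation: pickRandom and pickLeastLoaded are hypothetical helpers, hosts are plain strings, and the per-host instance counts are assumed to be already known (how geto gathers them is not described here).

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickRandom returns a uniformly chosen host, or false if hosts is empty.
func pickRandom(hosts []string) (string, bool) {
	if len(hosts) == 0 {
		return "", false
	}
	return hosts[rand.Intn(len(hosts))], true
}

// pickLeastLoaded returns the host running the fewest instances of the
// named script. running maps host -> script name -> instance count.
// Ties go to the host listed first.
func pickLeastLoaded(hosts []string, running map[string]map[string]int, script string) (string, bool) {
	if len(hosts) == 0 {
		return "", false
	}
	best := hosts[0]
	for _, h := range hosts[1:] {
		// A missing host or script name counts as zero instances.
		if running[h][script] < running[best][script] {
			best = h
		}
	}
	return best, true
}

func main() {
	hosts := []string{"host-a", "host-b", "host-c"}
	running := map[string]map[string]int{
		"host-a": {"sleep": 2},
		"host-b": {"sleep": 1},
		"host-c": {"sleep": 1, "other": 5},
	}
	if h, ok := pickRandom(hosts); ok {
		fmt.Println("random pick:", h)
	}
	if h, ok := pickLeastLoaded(hosts, running, "sleep"); ok {
		fmt.Println("least loaded for sleep:", h)
	}
}
```

Counting by script name rather than by total load means a host busy with unrelated scripts (host-c above) is still eligible, which matches the description of balancing on instances of a task's script.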

TODO

  • Allow the remote copy operations to be done using password authentication (see issue #1)
  • Implement a Python bridge allowing geto to be wielded from Python. Proof-of-concept code is already checked into the geto repo; it consists of a Go JSON-RPC server and a Python RPC client that calls it.
  • Various TODO-marked code.
