
apomixis's People

Contributors

thraxil


apomixis's Issues

storage caps

should be able to set a maximum disk usage cap for each node. once that cap is reached, the node sets its writeable flag to False.
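A minimal sketch of how the cap check might look, assuming a node's storage lives under a single directory. The names `disk_usage_bytes`, `is_writeable`, and `max_bytes` are hypothetical and not taken from the apomixis code.

```python
import os


def disk_usage_bytes(root):
    """Sum the sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def is_writeable(root, max_bytes):
    """Hypothetical check: writeable only while usage is below the cap."""
    return disk_usage_bytes(root) < max_bytes
```

Walking the whole tree on every write would be slow on a large node, so a real version would probably keep a running total or run this check periodically.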

etags

image retrieval should do smart things wrt etags
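One possible reading of "smart things", sketched under assumptions: since files are stored by their sha1, the hash plus the requested size could serve as the ETag, and a matching If-None-Match header could be answered with 304. The functions `etag_for` and `conditional_status` and the `size_spec` string are made up for illustration.

```python
def etag_for(sha1_hex, size_spec):
    """Build a quoted ETag from the content hash and a size label (both assumed)."""
    return '"%s-%s"' % (sha1_hex, size_spec)


def _strip_weak(tag):
    # If-None-Match uses weak comparison, so W/"x" and "x" are equivalent.
    return tag[2:] if tag.startswith("W/") else tag


def conditional_status(if_none_match, etag):
    """Return 304 when the client's If-None-Match matches etag, else 200."""
    if not if_none_match:
        return 200
    # Naive comma split; fine here because these ETags never contain commas.
    candidates = [_strip_weak(c.strip()) for c in if_none_match.split(",")]
    if "*" in candidates or _strip_weak(etag) in candidates:
        return 304
    return 200
```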

documentation

need a good README, and some install/config documentation.

file status feature

given a hash, a node should be able to report which nodes in the cluster have copies, what the ring looks like, maybe what scaled versions are cached, etc. mainly for debug purposes.

location support

when writing, try to write to the desired number of locations. when reading, always try to read from the current location before falling back to others.
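A rough sketch of both halves, assuming each node is a dict with a `location` key and that the ring is already an ordered list of nodes. The function names and the `replicas` / `locations_wanted` parameters are hypothetical.

```python
def write_targets(ring, replicas, locations_wanted):
    """Pick up to `replicas` nodes, covering distinct locations first."""
    chosen = []
    seen = set()
    # First pass: one node per not-yet-covered location, in ring order.
    for node in ring:
        if len(chosen) >= replicas or len(seen) >= locations_wanted:
            break
        if node["location"] not in seen:
            seen.add(node["location"])
            chosen.append(node)
    # Second pass: fill any remaining replica slots in ring order.
    for node in ring:
        if len(chosen) >= replicas:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


def read_order(ring, current_location):
    """Try nodes in the current location first, then everything else."""
    local = [n for n in ring if n["location"] == current_location]
    remote = [n for n in ring if n["location"] != current_location]
    return local + remote
```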

detect noop resizes and 301 redirect

if a client requests an image thumbnail that's redundant (i.e., asking for a thumbnail bigger than the original full size image), we should issue a 301 permanent redirect to the "canonical" size.
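A sketch of the check, assuming the requested size is a bounding box and that "redundant" means the box is at least as large as the original in both dimensions. `canonical_size`, `resize_response`, and the `url_for_size` callback are hypothetical names.

```python
def canonical_size(orig_w, orig_h, req_w, req_h):
    """Return the original size if the request would not shrink the image."""
    if req_w >= orig_w and req_h >= orig_h:
        return (orig_w, orig_h)
    return (req_w, req_h)


def resize_response(orig_w, orig_h, req_w, req_h, url_for_size):
    """Return (status, headers): 301 to the canonical URL for noop resizes."""
    target = canonical_size(orig_w, orig_h, req_w, req_h)
    if target != (req_w, req_h):
        return (301, {"Location": url_for_size(*target)})
    return (200, {})
```

A request for exactly the original size is already canonical here, so it gets a 200 instead of redirecting to itself.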

importlib can't be installed with 2.7

bootstrap installs importlib because celery/amqplib requires it for 2.6. Unfortunately, python 2.7, instead of just ignoring it, balks and the bootstrap fails. So bootstrapping needs to be conditional depending on python version (yuck!)
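The conditional could be as small as a version check, since importlib is part of the standard library from 2.7 on. This is a sketch only; `bootstrap_requirements` and the way the requirement list is represented are assumptions, not how the actual bootstrap script works.

```python
import sys


def bootstrap_requirements(base, version_info=None):
    """Return the packages to install, adding importlib only before 2.7."""
    if version_info is None:
        version_info = sys.version_info
    reqs = list(base)
    # importlib ships with Python 2.7+, so only 2.6 and older need the backport.
    if tuple(version_info[:2]) < (2, 7):
        reqs.append("importlib")
    return reqs
```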

circuit breaker for failed nodes

apply the circuit breaker pattern to handling of failed nodes to prevent thundering herd problems. Basically, if pinging (or writing to) a node fails, we set its last_failed and wait announce_frequency seconds before pinging again. If it fails a second ping, we wait 2 * announce_frequency seconds before making the next attempt, then 3 * announce_frequency seconds, and so on.
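The linear backoff described above can be sketched as follows. `last_failed` and `announce_frequency` come from the issue text; `consecutive_failures` and both function names are assumptions added for illustration.

```python
def next_ping_delay(consecutive_failures, announce_frequency):
    """Wait n * announce_frequency seconds after the n-th failure in a row."""
    return consecutive_failures * announce_frequency


def breaker_allows_ping(now, last_failed, consecutive_failures,
                        announce_frequency):
    """True if enough time has passed since the last failure to try again."""
    if last_failed is None or consecutive_failures == 0:
        return True  # node has not failed recently; ping as usual
    delay = next_ping_delay(consecutive_failures, announce_frequency)
    return now - last_failed >= delay
```

A successful ping would reset `consecutive_failures` to 0, which closes the breaker again.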

verify periodic task

every once in a while, walk the storage directory, calculate the sha1 of each file, verify that it matches the sha1 that it was stored as and repair/rebalance if necessary.
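A sketch of the walk-and-hash step. How a stored file maps back to its expected sha1 is not stated here, so that lookup is passed in as a hypothetical `expected_hash_for` callable; repair and rebalance are left out.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large images are not read into memory at once."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(root, expected_hash_for):
    """Return sorted paths whose content hash differs from the expected one."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # expected_hash_for is an assumed lookup (filename, database, ...).
            if sha1_of_file(path) != expected_hash_for(path):
                bad.append(path)
    return sorted(bad)
```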

"hard" gossip

the current gossip protocol tries to minimize intra-cluster communication somewhat by not pinging a node if that node has been seen within the last announce_frequency seconds. This is good most of the time but, occasionally, maybe once every 10 * announce_frequency seconds, we should force the pings to happen.
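The decision could be sketched like this, assuming each node tracks when it last saw a peer and when it last forced a ping. `gossip_should_ping`, `last_seen`, `last_forced`, and `force_every` are hypothetical names; only announce_frequency is from the existing protocol.

```python
def gossip_should_ping(now, last_seen, last_forced, announce_frequency,
                       force_every=10):
    """Decide whether to ping a peer under the gossip rules plus a forced ping."""
    # "Hard" gossip: every force_every * announce_frequency seconds, ping anyway.
    if now - last_forced >= force_every * announce_frequency:
        return True
    # Normal gossip: skip peers seen within the last announce_frequency seconds.
    return now - last_seen >= announce_frequency
```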

verify on retrieve

when an image is retrieved, pull it down from the first node that has a copy, but also start a background task that checks that the other N nodes that should have it do have it and attempts to repair if they don't.

single node cluster should work without celery

For the use case of running a "cluster" that is just a single node, it should work without running a celeryd process in addition to the web process.

Currently, if there is no celeryd, the nodes table in the database will never populate with even the single self-entry. Without any entries there, stash() won't write anywhere.

The fix would probably be to populate the nodes table at some point without having to go through celery. Alternatively, stash() could detect the case that there's only a single node (itself) and just write to disk without needing to query the nodes table.

background writes

when an image is uploaded, writes to other nodes should be done as asynchronous background tasks.

This could be an option, i.e., the client could say "return as fast as possible" and everything gets done in the background, or "i'll wait" and get a status report on how many nodes it was successfully written to, etc.

cache read_ring/write_ring

to avoid having to re-calculate the rings each time.

cache is invalidated any time a node enters or leaves (but we can leave the cache alone otherwise)
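A small sketch of the caching idea, assuming the ring can be rebuilt from the node list by some function. `RingCache` and `build_ring` are hypothetical; the real read_ring/write_ring calculation is not shown here.

```python
class RingCache(object):
    """Hold a computed ring until a node joins or leaves."""

    def __init__(self, build_ring):
        self._build_ring = build_ring  # assumed function: nodes -> ring
        self._ring = None

    def get(self, nodes):
        # Recompute only when there is no cached ring.
        if self._ring is None:
            self._ring = self._build_ring(nodes)
        return self._ring

    def invalidate(self):
        """Call whenever a node enters or leaves the cluster."""
        self._ring = None
```

One instance each for the read ring and the write ring would keep the two caches independent.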

prescale suggestions

when uploading an image, accept a list of sizes to create eagerly (probably in a celery task). The idea is that often, the client knows in advance what size thumbnails will be needed later, so future requests can be faster if these thumbs are pre-made. This will probably need to propagate to /stash.

explicitly version API

for forward compatibility, all API calls should include the apomixis version number somehow.
