competitions-v1-compute-worker's Introduction

What is CodaLab?

CodaLab is an open-source web-based platform that enables researchers, developers, and data scientists to collaborate, with the goal of advancing research fields where machine learning and advanced computation are used. CodaLab helps solve many common problems in data-oriented research through its online community, where people can share worksheets and participate in competitions.

To see CodaLab Competitions in action, visit codalab.lisn.fr.

Codabench, the next-gen of CodaLab Competitions, is out. Try it out!

Documentation

Community

The CodaLab community forum is hosted on Google Groups.

Quick installation (for Linux!)

To participate in competitions, or even to organize your own, you don't need to install anything: just sign in to an instance of the platform (e.g. this one). If you wish to set up your own instance of CodaLab Competitions, follow these instructions:

Install docker and add your user to the docker group, if you haven't already

$ wget -qO- https://get.docker.com/ | sh
$ sudo usermod -aG docker $USER

Clone this repo and get the default environment setup

$ git clone https://github.com/codalab/codalab-competitions
$ cd codalab-competitions
$ cp .env_sample .env
$ pip install docker-compose
$ docker-compose up -d

Now you should be able to access http://localhost/

More details on how to configure your own instance:

License

Copyright (c) 2013-2015, The Outercurve Foundation. Copyright (c) 2016-2021, Université Paris-Saclay. This software is released under the Apache License 2.0 (the "License"); you may not use the software except in compliance with the License.

The text of the Apache License 2.0 can be found online at: http://www.opensource.org/licenses/apache2.0.php

Cite CodaLab Competitions in your research

@article{codalab_competitions_JMLR,
  author  = {Adrien Pavao and Isabelle Guyon and Anne-Catherine Letournel and Dinh-Tuan Tran and Xavier Baro and Hugo Jair Escalante and Sergio Escalera and Tyler Thomas and Zhen Xu},
  title   = {CodaLab Competitions: An Open Source Platform to Organize Scientific Challenges},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  number  = {198},
  pages   = {1--6},
  url     = {http://jmlr.org/papers/v24/21-1436.html}
}

competitions-v1-compute-worker's People

Contributors

ckcollab, didayolo, scottyak, tthomas63, zhengying-liu

competitions-v1-compute-worker's Issues

Execution time limit doesn't work as expected

'--stop-timeout={}'.format(execution_time_limit),

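The way we implement execution_time_limit in the docker command doesn't work as expected: the --stop-timeout option does not stop the container once the time limit has elapsed. It only sets how long Docker waits, after a stop signal has been sent, before force-killing the container.
Here is the Docker documentation: https://docs.docker.com/engine/reference/commandline/run/#stop-timeout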

"The --stop-timeout flag sets the number of seconds to wait for the container to stop after sending the pre-defined (see --stop-signal) system call signal. If the container does not exit after the timeout elapses, it's forcibly killed with a SIGKILL signal.

If you set --stop-timeout to -1, no timeout is applied, and the daemon waits indefinitely for the container to exit.

The Daemon determines the default, and is 10 seconds for Linux containers, and 30 seconds for Windows containers."
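
Since --stop-timeout only applies after a stop signal has been sent, the time limit has to be enforced from outside the container. The sketch below shows one possible way to do that from Python. It is not the worker's actual implementation: `run_with_time_limit` and its `kill_cmd` argument are hypothetical names, and the sleeping child process merely stands in for a `docker run` invocation.

```python
import subprocess
import sys


def run_with_time_limit(cmd, time_limit, kill_cmd=None):
    """Run cmd and return (exit_code, timed_out).

    kill_cmd is a hypothetical extra command, for example
    ["docker", "kill", "<container name>"], run when the limit is hit,
    because killing the docker CLI client alone may leave the
    container running.
    """
    proc = subprocess.Popen(cmd)
    try:
        # wait() raises TimeoutExpired if the process outlives time_limit.
        return proc.wait(timeout=time_limit), False
    except subprocess.TimeoutExpired:
        if kill_cmd is not None:
            subprocess.call(kill_cmd)
        proc.kill()
        proc.wait()  # reap the killed process
        return proc.returncode, True


if __name__ == "__main__":
    # Stand-in for a long-running submission: sleeps well past the limit.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_time_limit(sleeper, time_limit=1))
```

In a real worker the container would need a known name (for instance via `docker run --name`) so that the kill command can target it.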

bug in 17-legacy-nvidia-worker-compat worker.py

The docker run command for newer Docker versions seems to be
docker run --gpus all ...
instead of
nvidia-docker run ...

I have tested the former and it worked for me.

With the latter, the following error occurs:
docker: Error response from daemon: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/1f97be47849eff264efa7a02e7fb6767ceb51ac432ddc502820f667035d84db1/log.json: no such file or directory): exec: "nvidia-container-runtime": executable file not found in $PATH: unknown.
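
One way a worker could support both setups is to choose the launcher from the Docker version, since the `--gpus` flag was added in Docker 19.03. The sketch below is only an illustration: `parse_major_minor` and `gpu_run_prefix` are hypothetical helpers, not code from `worker.py`, and it assumes the version string is obtained separately, for example with `docker version --format '{{.Server.Version}}'`.

```python
def parse_major_minor(version):
    """Turn a Docker version string such as '19.03.5' into (19, 3)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def gpu_run_prefix(docker_version):
    """Return the start of a GPU-enabled run command (hypothetical helper)."""
    if parse_major_minor(docker_version) >= (19, 3):
        # Docker 19.03 and later expose GPUs natively through --gpus.
        return ["docker", "run", "--gpus", "all"]
    # Older setups rely on the nvidia-docker wrapper.
    return ["nvidia-docker", "run"]


if __name__ == "__main__":
    print(gpu_run_prefix("20.10.7"))
    print(gpu_run_prefix("18.09.7"))
```

Either variant still requires NVIDIA container support to be installed on the host.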

The branches need to be cleaned

We basically have 3 docker images for the compute workers:

  • codalab/competitions-v1-compute-worker:1.1.5 (CPU)
  • codalab/competitions-v1-nvidia-worker:v1.5-compat (GPU)
  • codalab/competitions-v1-compute-worker:latest

Not sure if latest is even used.

However, we have many branches in the repository:

master
Updated 6 minutes ago by Didayolo

dependabot/pip/celery-5.2.2
Updated 2 months ago by dependabot[bot]

dependabot/pip/pyyaml-5.4
Updated 12 months ago by dependabot[bot]

17-legacy-nvidia-worker-compat-py3
Updated 14 months ago by Tthomas63

python3
Updated 2 years ago by Tthomas63

17-legacy-nvidia-worker-compat
Updated 2 years ago by ckcollab

162-nvidia-worker
Updated 2 years ago by zhengying-liu

dependabot/pip/psutil-5.6.6
Updated 2 years ago by dependabot[bot]

dependabot/pip/requests-2.20.0
Updated 2 years ago by dependabot[bot]

162-nvidia-worker-monitor
Updated 3 years ago by ckcollab

feature/realtime-detailed-results
Updated 3 years ago by ckcollab

feature/legacy-azure-version-fix
Updated 3 years ago by ckcollab

162-nvidia-worker-celery-4-3-0
Updated 3 years ago by Tthomas63

feature/fix-prune
Updated 3 years ago by Tthomas63

feature/suppress-warning
Updated 3 years ago by Tthomas63
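
To decide which branches to delete, it may help to list them by age. The sketch below is one possible approach, not an existing project script: `stale_branches` is a hypothetical helper, the 365-day threshold is arbitrary, and it assumes its input lines come from `git for-each-ref --format='%(refname:short) %(committerdate:unix)' refs/remotes/origin`.

```python
import time

SECONDS_PER_DAY = 86400


def stale_branches(lines, max_age_days=365, now=None):
    """Return (branch, age_in_days) pairs older than max_age_days, oldest first.

    Each line is expected to look like '<branch name> <unix timestamp>'.
    """
    now = time.time() if now is None else now
    stale = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the git output
        name, timestamp = line.rsplit(None, 1)
        age_days = int((now - int(timestamp)) // SECONDS_PER_DAY)
        if age_days > max_age_days:
            stale.append((name, age_days))
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

The result is only a candidate list; branches that still back a published Docker image should be checked by hand before deletion.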

Weird logs on Google Cloud VMs

This issue concerns the Google Cloud version of the compute worker for AutoDL.

If we ssh to some workers (Google Cloud VMs, login required) and check the Docker logs:

gcloud beta compute --project "autodl-221715" ssh --zone "us-west1-a" "gpu-06-25-2019-20-30-04-000"
nvidia-docker logs -f compute_worker

We see a lot of error messages such as:

entr: cannot stat '/tmp/codalab/tmp9CPTvU/run/output/detailed_results.html': No such file or directory

It seems that this issue doesn't completely block the submission handling process but it may be the cause of other issues.
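
The message suggests that `entr` is asked to watch `detailed_results.html` before the submission has created it. One possible workaround, untested against the actual worker, is to create an empty placeholder before the watcher starts. `ensure_placeholder` below is a hypothetical helper, and whether an empty file is acceptable to the rest of the pipeline is an assumption.

```python
import os
import tempfile


def ensure_placeholder(path):
    """Create an empty file at path (and its parent directories) if missing."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Append mode creates the file but never truncates existing content.
    with open(path, "a"):
        pass
    return path


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than /tmp/codalab.
    run_dir = tempfile.mkdtemp()
    target = os.path.join(run_dir, "run", "output", "detailed_results.html")
    print(os.path.isfile(ensure_placeholder(target)))
```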
