comorment / gwas
License: GNU General Public License v3.0
I think the less command is not included in the container:
bash: less: command not found
I suggest adding GNU parallel (https://www.gnu.org/software/parallel/) to the container - it's a fairly standard tool, and it's very useful.
It would be great to have a system for sharing datasets that "go together" with your container, but are too big to include in the container itself. For now I actually don't know what to use; neither UiO nor Sigma2 provides a simple FTP service.
In the current Dockerfile recipes and bash installer files, versions of different tools are (usually) not pinned. Thus a (re)built container will likely differ from day to day, in particular, if packages are installed from sources like conda-forge and similar where updates are frequent.
Ideally, versions should be pinned explicitly in the recipes, for example:
FROM buildpack-deps:focal
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
    cmake=3.16.3-1ubuntu1 \
    python3-dev=3.8.2-0ubuntu2
....
RUN pip install h5py==2.10.0 && \
    pip install git+https://github.com/NeuralEnsemble/parameters@b95bac2bd17f03ce600541e435e270a1e1c5a478#egg=parameters \
...
RUN git clone --depth 1 -b v3.1 https://github.com/nest/nest-simulator /usr/src/nest-simulator && \
    # compile
...
The above is just taken from another project of mine (complete example: https://github.com/LFPy/LFPykernels/blob/main/Dockerfile).
Version pinning is also a best practice suggested by Dockerfile linting tools like Hadolint (https://hadolint.github.io/hadolint/).
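As a rough illustration of what such a pinning check could look like for pip installs, here is a minimal Python sketch. The helper name unpinned_pip_packages is hypothetical, it only understands plain "pip install" commands, and it is not how Hadolint works internally; option values such as the file after -r would be reported as unpinned.

```python
import re
from typing import List


def unpinned_pip_packages(run_line: str) -> List[str]:
    """Return pip requirements in a shell line that lack an exact pin.

    Hypothetical helper for illustration only: it handles plain
    `pip install ...` commands chained with `&&` or `;`.
    """
    unpinned = []
    for command in re.split(r"&&|;", run_line):
        tokens = command.split()
        if tokens[:2] != ["pip", "install"]:
            continue
        for token in tokens[2:]:
            if token.startswith("-") or token == "\\":
                continue  # skip options and line continuations
            if token.startswith("git+"):
                # A VCS requirement counts as pinned only when it names
                # a full 40-character commit hash after the '@'.
                if not re.search(r"@[0-9a-f]{40}(#|$)", token):
                    unpinned.append(token)
            elif "==" not in token:
                unpinned.append(token)
    return unpinned


print(unpinned_pip_packages("pip install h5py==2.10.0 && pip install numpy"))
```

A check like this could run before each build, so that an unpinned package is caught when the recipe changes rather than when a rebuilt container starts behaving differently.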
I suggest we add --no-home to our singularity commands, or at least clarify this in the README file.
I see there is also --pwd, and some interference between these options (apptainer/singularity#4077).
By default singularity mounts the user's home directory, so any software installed in the user's home folder may interfere with software provided by the container.
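To make the suggestion concrete, here is a small Python sketch that assembles a singularity exec command line with --no-home switched on by default. The function name singularity_exec and the image name container.sif are made up for illustration; --no-home and --pwd are the options mentioned above, and how they interact should be checked against the linked issue.

```python
import shlex
from typing import List, Optional


def singularity_exec(image: str, command: List[str],
                     no_home: bool = True,
                     pwd: Optional[str] = None) -> List[str]:
    """Build the argument list for `singularity exec` (hypothetical wrapper)."""
    argv = ["singularity", "exec"]
    if no_home:
        argv.append("--no-home")  # do not mount the user's home directory
    if pwd is not None:
        argv += ["--pwd", pwd]    # initial working directory inside the container
    return argv + [image] + command


# container.sif is a placeholder image name.
print(shlex.join(singularity_exec("container.sif", ["python3", "--version"])))
```

The resulting list can be passed to subprocess.run as is, which avoids shell quoting problems when paths contain spaces.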
Useful commands not yet available on TSD RHEL machines:
https://stackoverflow.com/questions/3569997/how-to-find-out-line-endings-in-a-text-file
(as discussed by e-mail)
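Where such commands are missing, line endings can also be checked with plain Python, which needs nothing beyond the standard library. This is only a sketch: the function names are made up for illustration, and it reads the whole file into memory, so it suits text files of moderate size.

```python
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF (Windows), LF (Unix) and bare CR (old Mac) line endings."""
    counts = Counter()
    crlf = data.count(b"\r\n")
    counts["CRLF"] = crlf
    # Every CRLF contains one '\n' and one '\r', so subtract them
    # to get the standalone LF and CR counts.
    counts["LF"] = data.count(b"\n") - crlf
    counts["CR"] = data.count(b"\r") - crlf
    return counts


def detect_line_endings(path: str) -> Counter:
    """Read a file in binary mode and count its line endings."""
    with open(path, "rb") as handle:
        return count_line_endings(handle.read())
```

Calling detect_line_endings on a file and seeing a non-zero "CRLF" count indicates that it was saved with Windows line endings.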
Currently all tools are placed in the root of the container:
I think it's best to have a separate folder for each tool, perhaps under a common top-level folder such as /tools. This would correspond to a structure like this:
/tools/flashpca-build
/tools/generic-metal
/tools/HDL
/tools/gctb_2.02_Linux
/tools/htslib
/tools/bcftools
/tools/vcftools
/tools/qctool_v2.0.6-Ubuntu16.04-x86_64
/tools/python_convert
/tools/BOLT-LMM_v2.3.4
Further, I suggest removing from the path things like "generic-", "_Linux" and "-Ubuntu16.04-x86_64", to have a clean name of the tool. As for the versions, I think it's good to set up a specific nomenclature, i.e. similar to how it's done on TSD.
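The name cleanup suggested here could be sketched in Python as follows. The function name clean_tool_name is hypothetical, and the prefix and suffix lists contain only the three fragments named above; version strings are left untouched because the versioning nomenclature is still to be decided.

```python
# Fragments to strip, taken from the folder names listed in this issue.
PREFIXES = ("generic-",)
SUFFIXES = ("_Linux", "-Ubuntu16.04-x86_64")


def clean_tool_name(folder: str) -> str:
    """Strip platform-specific prefixes and suffixes from a tool folder name."""
    name = folder
    for prefix in PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            name = name[:-len(suffix)]
    return name


for folder in ("generic-metal", "gctb_2.02_Linux",
               "qctool_v2.0.6-Ubuntu16.04-x86_64", "BOLT-LMM_v2.3.4"):
    print(f"/tools/{folder} -> /tools/{clean_tool_name(folder)}")
```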
matlabruntime.sif and pleiofdr / magicsquare executables, push them to https://github.com/comorment/containers
To avoid breaking docker build, I think it's helpful to set up travis-ci builds. Practically, this is done by (1) adding a .travis.yml file in the root of your repository, describing what needs to be done for every commit; (2) creating an account at https://travis-ci.org/ to manage your builds.
For your repo I think it makes sense to run docker build, just to see if it fails.
Here is some docker-specific info at travis-ci
https://docs.travis-ci.com/user/docker/
I'm fairly sure both singularity and docker can expose a web service, i.e. something like Jupyter Notebook that runs within the container but is accessed from the host via a browser. Can we include instructions on how to use this, for example with Jupyter Notebook?
I've made a quick container with tensorflow and other deep learning packages:
https://github.com/comorment/gwas/blob/main/containers/py3ml/Dockerfile
It builds, but I haven't tested it on TSD. For that we can use p33-appn-norment01, which has a GPU installed (there was a problem with nvidia drivers on that GPU - I'm not sure if this is already resolved).
Here is a useful link about GPU support in Docker containers:
https://towardsdatascience.com/how-to-properly-use-the-gpu-within-a-docker-container-4c699c78c6d1
This is somewhat similar to issue #6 - but with a standard graphical user interface.
For example, can we package RStudio into a docker or singularity container?
Perhaps this is not possible - I just want to double-check.
I might need https://github.com/bulik/ldsc in a separate Docker container.
https://hub.docker.com/r/manninglab/ldsc/dockerfile - could this be a starting point?
I suggest splitting the README.md file into several files, for example:
into.md with general info about docker, singularity, interactive / passive modes, mounting data to containers, etc. Basically, everything users need to know about docker and singularity that is not specific to your containers.
README.md file, serving as a table of contents and containing links to all other README files.
The readme file for each tool may have
Opening this here as MAGMA will not be included in container recipes at https://github.com/comorment/containers, according to comorment/containers#40.
The file install_magma.sh tries downloading from https://ctg.cncr.nl/software/MAGMA/prog/magma_v1.09b_static.zip, but the current URL appears to be https://ctg.cncr.nl/software/MAGMA/prog/archive/magma_v1.09b_static.zip.
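One way to make an installer tolerant of this kind of move is to try the original location first and fall back to the archive folder. The Python sketch below assumes the two URL patterns quoted above are the only candidates; the function names are made up for illustration, and install_magma.sh itself is a bash script, so this only shows the idea.

```python
from typing import Callable, Iterable, List, Optional
from urllib.request import urlopen

# Base location quoted in this issue; "archive/" is where older releases appear to move.
BASE = "https://ctg.cncr.nl/software/MAGMA/prog/"


def candidate_urls(filename: str) -> List[str]:
    """Return the original and the archive URL for a MAGMA zip file."""
    return [BASE + filename, BASE + "archive/" + filename]


def is_reachable(url: str) -> bool:
    """Return True if the server answers the request with HTTP 200."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


def first_reachable(urls: Iterable[str],
                    probe: Callable[[str], bool] = is_reachable) -> Optional[str]:
    """Return the first URL accepted by `probe`, or None if none is."""
    for url in urls:
        if probe(url):
            return url
    return None
```

Calling first_reachable(candidate_urls("magma_v1.09b_static.zip")) would then give the URL to download from, or None if neither location responds.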
I think it's important to attach a license, e.g. GPL v3 or MIT / FreeBSD.
For my stuff I tend to go with GPL v3 (e.g. https://github.com/precimed/mixer/blob/master/LICENSE).
https://opensource.org/licenses gives a good overview of licenses.
I'm able to docker pull bayramalex/all_analysis, and use this container. However, docker build failed:
docker build .
Sending build context to Docker daemon 201.7kB
Step 1/43 : FROM 'ubuntu:18.04'
---> 56def654ec22
Step 2/43 : ENV TZ=Europe
---> Using cache
---> 961ed1ff64bb
Step 3/43 : ENV DEBIAN_FRONTEND noninteractive
---> Using cache
---> 74d871c6e530
Step 4/43 : RUN apt-get update && apt-get install -y --no-install-recommends apt-utils python3 python3-pip tar wget unzip git libgsl0-dev perl && rm -rf /var/lib/apt/lists/*
---> Running in 0b176c189532
Err:1 http://security.ubuntu.com/ubuntu bionic-security InRelease
Temporary failure resolving 'security.ubuntu.com'
Err:2 http://archive.ubuntu.com/ubuntu bionic InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:4 http://archive.ubuntu.com/ubuntu bionic-backports InRelease
Temporary failure resolving 'archive.ubuntu.com'
Reading package lists...
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-updates/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-backports/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/bionic-security/InRelease Temporary failure resolving 'security.ubuntu.com'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
Package perl is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
perl-base
Package apt-utils is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
apt
E: Package 'apt-utils' has no installation candidate
E: Unable to locate package python3
E: Unable to locate package python3-pip
E: Unable to locate package wget
E: Unable to locate package unzip
E: Unable to locate package git
E: Unable to locate package libgsl0-dev
E: Package 'perl' has no installation candidate
The command '/bin/sh -c apt-get update && apt-get install -y --no-install-recommends apt-utils python3 python3-pip tar wget unzip git libgsl0-dev perl && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
There is a regenie3.sif file in our google drive, but there is no corresponding Dockerfile.
If it's from an external Docker image, it's ok to make something like this: https://github.com/comorment/gwas/blob/main/containers/saige/Dockerfile
I don't have a strong opinion about where this should go, feel free to decide.
This could be https://github.com/comorment , to highlight that these tools are developed as part of comorment.
Or this could be in your own github account, i.e. https://github.com/bayramakdeniz/gwas . I think keeping it under your account also helps to promote your package on github, because people trust that you are responsible for maintaining it. This is how it's done for plink ( https://github.com/chrchang/plink-ng/ ).
My only suggestion is to have a short name (i.e. https://github.com/bayramakdeniz/gwas ), avoid underscores, and use only lowercase in the repo name.
@bayramakdeniz I've tried to put MiXeR (https://github.com/precimed/mixer) into our python3 container - here is a quick change to the Docker file: 3888cfc
However, there is a weird issue. The native code of MiXeR is compiled into a shared object (/tools/mixer/src/build/lib/libbgmg.so). This shared object is then used from python via the ctypes library. To test that libbgmg.so is compatible with python, you may start python and try these commands:
import ctypes
ctypes.CDLL('/tools/mixer/src/build/lib/libbgmg.so')
Currently this gives an error.
Could you compile on your machine and test if this works?
It would be good to find out how to package MATLAB software into a docker and/or singularity container. I think this can be achieved with MATLAB Compiler (https://se.mathworks.com/products/compiler.html) and MATLAB Runtime (https://se.mathworks.com/products/compiler/matlab-runtime.html). This would be easier to explore using a simple MATLAB program and a separate docker container, just to test this particular issue of running MATLAB within docker / singularity.
It would be great to develop TSD-specific instructions for running these containers.
For specific projects (i.e. p33 and p697) we can have an official location within each project that hosts an up-to-date version of these singularity containers.
Oskar had a nice trick with a local Docker repo that allows building singularity images without pushing the docker image to dockerhub. Can we document this in the README file?
What is our policy for putting together multiple tools into one container? Obviously, not all potential tools will fit into one container. I see pros and cons of distributing tools into multiple containers:
Pros:
Cons:
I hope it won't be too much work to re-arrange tools into multiple containers once we learn what fits together.
I suggest we pick a few tools and test performance (singularity vs native execution) on TSD.
I think "plink" could be a good example of such a tool.