docker-rdkit's Introduction

Dockerfiles for building RDKit.

These images supersede the existing informaticsmatters/rdkit_* Docker images.

These Dockerfiles and shell scripts are for building various Docker images for RDKit. The aim is to build a number of lightweight images that are suited for running in production cloud environments like Kubernetes and OpenShift. For this purpose the images need to be:

  1. as small as is reasonable, to minimise download time and reduce the potential attack surface
  2. able to run as a non-root user or as an arbitrarily assigned user ID.
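
One way to check the second requirement is to start a container under a user ID the image knows nothing about and see whether it still runs. The sketch below only prints the docker run command rather than executing it. The helper name arbitrary_uid_cmd, the UID 123456 and the use of the id command are assumptions made for illustration; the image name is the one used later in this README.

```shell
# Print (do not run) a docker command that starts the Python image
# under an arbitrary numeric user ID.
# Hypothetical helper for illustration, not part of this repository.
arbitrary_uid_cmd() {
  uid="${1:-123456}"   # assumed example UID
  tag="${2:-latest}"   # image tag, e.g. latest or Release_2023_03_2
  printf 'docker run --rm --user %s informaticsmatters/rdkit-python3-debian:%s id\n' "$uid" "$tag"
}

arbitrary_uid_cmd 123456 latest
```

Piping the printed line to sh would run it. If the image meets the requirement, id should report the requested UID.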

The approach taken to build these images currently follows the builder pattern. See the Smaller containers series of posts on the Informatics Matters blog for more details about how these images are built.

For each RDKit version (image tag) we build a number of images: a build image plus Python, Java, Tomcat and cartridge images (see Build and run below).

Note: we now focus on the Debian-based images. In the past we also built on CentOS and Fedora, but maintaining them became too much of a burden.

Branches

  • master - build from current RDKit master branch. These images are updated at irregular intervals. Images have tag of latest.
  • Release_2017_09_2 - build from RDKit Release_2017_09_2 branch. These are not working correctly yet and may be dropped.
  • Release_2018_03 - build from RDKit Release_2018_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2018_03
  • Release_2018_03_1 - build from RDKit Release_2018_03_1 release tag. These images should never change [1]. Images have tag of Release_2018_03_1 [2]
  • Release_2018_03_2 - build from RDKit Release_2018_03_2 release tag. These images should never change [1]. Images have tag of Release_2018_03_2.
  • Release_2018_09 - build from RDKit Release_2018_09 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2018_09
  • Release_2018_09_1 - build from RDKit Release_2018_09_1 release tag. These images should never change [1]. Images have tag of Release_2018_09_1
  • Release_2018_09_2 - build from RDKit Release_2018_09_2 release tag. These images should never change [1]. Images have tag of Release_2018_09_2
  • Release_2018_09_3 - build from RDKit Release_2018_09_3 release tag. These images should never change [1]. Images have tag of Release_2018_09_3
  • Release_2019_03 - build from RDKit Release_2019_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2019_03
  • Release_2019_03_1 - build from RDKit Release_2019_03_1 release tag. These images should never change [1]. Images have tag of Release_2019_03_1
  • Release_2019_03_2 - build from RDKit Release_2019_03_2 release tag. These images should never change [1]. Images have tag of Release_2019_03_2
  • Release_2019_03_3 - build from RDKit Release_2019_03_3 release tag. These images should never change [1]. Images have tag of Release_2019_03_3
  • Release_2019_03_4 - build from RDKit Release_2019_03_4 release tag. These images should never change [1]. Images have tag of Release_2019_03_4
  • Release_2019_09 - build from RDKit Release_2019_09 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2019_09
  • Release_2019_09_1 - build from RDKit Release_2019_09_1 release tag. These images should never change [1]. Images have tag of Release_2019_09_1
  • Release_2019_09_2 - build from RDKit Release_2019_09_2 release tag. These images should never change [1]. Images have tag of Release_2019_09_2
  • Release_2020_03 - build from RDKit Release_2020_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2020_03
  • Release_2020_03_1 - build from RDKit Release_2020_03_1 release tag. These images should never change [1]. Images have tag of Release_2020_03_1
  • Release_2020_03_2 - build from RDKit Release_2020_03_2 release tag. These images should never change [1]. Images have tag of Release_2020_03_2
  • Release_2020_03_3 - build from RDKit Release_2020_03_3 release tag. These images should never change [1]. Images have tag of Release_2020_03_3
  • Release_2020_03_4 - build from RDKit Release_2020_03_4 release tag. These images should never change [1]. Images have tag of Release_2020_03_4
  • Release_2020_03_5 - build from RDKit Release_2020_03_5 release tag. These images should never change [1]. Images have tag of Release_2020_03_5
  • Release_2020_03_6 - build from RDKit Release_2020_03_6 release tag. These images should never change [1]. Images have tag of Release_2020_03_6
  • Release_2020_09 - build from RDKit Release_2020_09 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2020_09
  • Release_2020_09_1 - build from RDKit Release_2020_09_1 release tag. These images should never change [1]. Images have tag of Release_2020_09_1
  • Release_2020_09_2 - build from RDKit Release_2020_09_2 release tag. These images should never change [1]. Images have tag of Release_2020_09_2
  • Release_2020_09_3 - build from RDKit Release_2020_09_3 release tag. These images should never change [1]. Images have tag of Release_2020_09_3
  • Release_2020_09_4 - build from RDKit Release_2020_09_4 release tag. These images should never change [1]. Images have tag of Release_2020_09_4
  • Release_2020_09_5 - build from RDKit Release_2020_09_5 release tag. These images should never change [1]. Images have tag of Release_2020_09_5
  • Release_2021_03 - build from RDKit Release_2021_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2021_03
  • Release_2021_03_1 - build from RDKit Release_2021_03_1 release tag. These images should never change [1]. Images have tag of Release_2021_03_1
  • Release_2021_03_2 - build from RDKit Release_2021_03_2 release tag. These images should never change [1]. Images have tag of Release_2021_03_2
  • Release_2021_09 - build from RDKit Release_2021_09 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2021_09
  • Release_2022_03 - build from RDKit Release_2022_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2022_03
  • Release_2022_03_5 - build from RDKit Release_2022_03_5 release tag. These images should never change [1]. Images have tag of Release_2022_03_5
  • Release_2022_09 - build from RDKit Release_2022_09 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2022_09
  • Release_2022_09_4 - build from RDKit Release_2022_09_4 release tag. These images should never change [1]. Images have tag of Release_2022_09_4
  • Release_2022_09_5 - build from RDKit Release_2022_09_5 release tag. These images should never change [1]. Images have tag of Release_2022_09_5
  • Release_2023_03 - build from RDKit Release_2023_03 branch and occasionally rebuilt as the code gets updated. Images have tag of Release_2023_03
  • Release_2023_03_1 - build from RDKit Release_2023_03_1 release tag. These images should never change [1]. Images have tag of Release_2023_03_1
  • Release_2023_03_2 - build from RDKit Release_2023_03_2 release tag. These images should never change [1]. Images have tag of Release_2023_03_2

[1] Where we say that the images should never change, what we really mean is that the RDKit content should never change. We may rebuild these images occasionally when we find further improvements, and the underlying Debian packages may be updated, but the RDKit code should be exactly the same.
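
The list above distinguishes tags built from a branch (rebuilt as the code changes) from tags built from a release tag (fixed RDKit content). A minimal sketch of that distinction, based only on the naming pattern visible in the list, is shown below. The helper name tag_kind is hypothetical and the patterns are inferred from the tag names, not taken from the build scripts.

```shell
# Classify an image tag as "rolling" (rebuilt from a branch) or
# "fixed" (built from a release tag), using the naming pattern in this README.
# Hypothetical helper for illustration.
tag_kind() {
  case "$1" in
    latest)
      echo rolling ;;
    Release_[0-9][0-9][0-9][0-9]_[0-9][0-9])
      echo rolling ;;
    Release_[0-9][0-9][0-9][0-9]_[0-9][0-9]_[0-9]*)
      echo fixed ;;
    *)
      echo unknown ;;
  esac
}

tag_kind latest              # rolling
tag_kind Release_2023_03     # rolling
tag_kind Release_2023_03_2   # fixed
```

A deployment script could use a check like this to refuse rolling tags when reproducible RDKit content matters.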

[2] These images were originally tagged as Release_2018_03_01 (2 digits as the final number). For better consistency with the RDKit GitHub tag names we switched to using a single digit format. Tags with two digits are also present for backward compatibility and point to the equivalent single digit image. Please use the single digit format.
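
If older scripts still refer to the two-digit form, they can be mapped to the preferred single-digit form before pulling. The following is a small sketch of that mapping. The helper name normalize_tag is hypothetical, and the pattern is inferred from the example in the note above.

```shell
# Map a legacy two-digit tag (Release_2018_03_01) to the preferred
# single-digit form (Release_2018_03_1). Other tags pass through unchanged.
# Hypothetical helper for illustration.
normalize_tag() {
  printf '%s\n' "$1" | sed -E 's/^(Release_[0-9]{4}_[0-9]{2})_0([0-9])$/\1_\2/'
}

normalize_tag Release_2018_03_01   # Release_2018_03_1
normalize_tag Release_2018_03_1    # unchanged
normalize_tag latest               # unchanged
```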

The GitHub repo for RDKit is https://github.com/rdkit/rdkit. The GitHub repo for this project is https://github.com/InformaticsMatters/docker-rdkit.

To create images for a new version of RDKit you should only need to create a new branch from the corresponding previous version and then edit params.sh.

Build and run

Since October 2023 we have been using a multi-stage build and building images for the amd64 and arm64 architectures. Thanks to @nmunro and @artran for assistance with building on arm64. The arm64 images should be treated as experimental. Please report any issues you find.

You need the Docker buildx extension to build these images. Dockerfile-debian is the multi-stage Dockerfile that builds all the images. It is run by executing build-debian.sh, which is parameterised through the contents of params.sh.

The build stage builds RDKit from the appropriate GitHub branch and creates the deb packages and Java artifacts used by the python, java, tomcat and cartridge stages. Each subsequent stage is run separately and its images are pushed to Docker Hub. Note: the tomcat image is currently built for amd64 only.
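
To give a feel for what a per-stage buildx invocation might look like, here is a rough sketch that prints (but does not run) one such command. It is not the content of build-debian.sh, which remains the supported entry point. The helper name buildx_cmd is hypothetical, and the use of the stage names as --target values is an assumption based on the description above.

```shell
# Print (do not run) a buildx command for one stage of Dockerfile-debian.
# Stage names used as --target values are an assumption based on this README;
# build-debian.sh and params.sh remain the authoritative way to build.
buildx_cmd() {
  stage="$1"   # e.g. python, java, tomcat, cartridge
  image="$2"   # e.g. informaticsmatters/rdkit-python3-debian
  tag="$3"     # e.g. Release_2023_03_2

  # Only amd64 is built for the tomcat image; other stages get both architectures.
  case "$stage" in
    tomcat) platforms="linux/amd64" ;;
    *)      platforms="linux/amd64,linux/arm64" ;;
  esac

  printf 'docker buildx build --platform %s --target %s -f Dockerfile-debian -t %s:%s --push .\n' \
    "$platforms" "$stage" "$image" "$tag"
}

buildx_cmd python informaticsmatters/rdkit-python3-debian latest
```

The --push flag is used because a multi-platform result cannot be loaded into the local image store in the usual way, so it is pushed to the registry directly.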

Run the Python image like this:

docker run -it --rm informaticsmatters/rdkit-python3-debian:<tag_name> python
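
For a quick non-interactive check that RDKit can be imported, a one-line Python program can be passed to the same image. The sketch below prints the command rather than running it. The helper name python_smoke_cmd is hypothetical, and the benzene SMILES simply mirrors the Java example further down.

```shell
# Print (do not run) a non-interactive smoke test for the Python image.
# Hypothetical helper for illustration.
python_smoke_cmd() {
  tag="${1:-latest}"
  # Parse benzene and print its atom count using standard RDKit calls.
  py='from rdkit import Chem; print(Chem.MolFromSmiles("c1ccccc1").GetNumAtoms())'
  printf "docker run --rm informaticsmatters/rdkit-python3-debian:%s python -c '%s'\n" "$tag" "$py"
}

python_smoke_cmd latest
```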

Run the Java image like this:

docker run -it --rm informaticsmatters/rdkit-java-debian:<tag_name> bash

The CLASSPATH environment variable is already defined to include the RDKit library. You'll need to add your own classes and/or libraries to the CLASSPATH. For a simple test of the Java image, use the example Java classes in the java_src directory. First compile the classes like this:

$ docker run -it --rm -v $PWD/java_src:/example:Z informaticsmatters/rdkit-build-debian:<tag_name> sh -c 'cd /example && ./compile.sh'

Then run like this:

$ docker run -it --rm -v $PWD/java_src:/example:Z informaticsmatters/rdkit-java-debian:<tag_name> sh -c 'cd /example && ./run.sh'
RDKit version: 2020.09.3
Read smiles: c1ccccc1 Number of atoms: 6
RDKit version: 2020.09.3
Mol: org.RDKit.RWMol@5b2133b1
RDKit version: 2020.09.3
MorganFP: 4294967295

Javadocs are built into /rdkit/Code/JavaWrappers/gmwrapper/doc. Since the 2019_09 release a javadocs.tgz file is created in the artifacts/debian/<tag>/java/ directory.

RDBASE environment variable

In old versions of the images the RDBASE environment variable was set incorrectly, which affects functions where RDKit needs to read its internal data files. Since the 2020_03, 2019_09 and 2019_09_3 images it should be set correctly. Older images still have the problem; to work around it, define RDBASE when you run the container and set it to /usr/share/RDKit, e.g. docker run -it -e RDBASE=/usr/share/RDKit ...

Python 3

Starting with the Release_2019_03 release RDKit only supports Python 3. We have been building Python 3 versions on the master/latest branch and for the 2019_03 versions onwards.

Java

Most images are built with Java 8. In early 2019 the Debian Buster repositories changed so that Java 11 was present and Java 8 was no longer available (and could not easily be added). Thus Debian images from 2019 onwards are built with Java 11.

RDKit cartridge

We now handle the RDKit PostgreSQL cartridge in a Debian environment as a series of informaticsmatters/rdkit-cartridge-debian images. This started with the Release_2018_09 images.

If you want to use the cartridge in the informaticsmatters/rdkit-cartridge-debian:latest image then try something like this:

# start the container
$ docker run -d --name rdkitcartridge informaticsmatters/rdkit-cartridge-debian:latest

# connect to the container
$ docker exec -it -u postgres rdkitcartridge bash
# run psql and create a database
postgres@db485abc2f02:/rdkit$ psql 
psql (10.4 (Debian 10.4-2))
Type "help" for help.

postgres=# create database rdkit;
CREATE DATABASE
postgres=# \q
# connect again to that database and install the cartridge
postgres@db485abc2f02:/rdkit$ psql -d rdkit
psql (10.4 (Debian 10.4-2))
Type "help" for help.

rdkit=# CREATE EXTENSION rdkit;
CREATE EXTENSION
rdkit=# \q

Notes:

  1. You must initially connect to the database as the postgres user, hence the need for the -u postgres option for the docker exec command.
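
Once the extension is installed, a single SQL statement can also be sent without an interactive session. The sketch below prints such a command rather than running it. The container name rdkitcartridge and database name rdkit follow the walkthrough above; the helper name cartridge_sql_cmd is hypothetical, and the example query assumes the cartridge's mol type and mol_numatoms function are available in the installed version.

```shell
# Print (do not run) a docker exec command that sends one SQL statement to
# the rdkit database in the rdkitcartridge container, as the postgres user.
# Hypothetical helper for illustration.
cartridge_sql_cmd() {
  sql="$1"
  printf 'docker exec -u postgres rdkitcartridge psql -d rdkit -c "%s"\n' "$sql"
}

# Example query: count the atoms in benzene using the cartridge.
cartridge_sql_cmd "SELECT mol_numatoms('c1ccccc1'::mol);"
```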

Hopefully coming soon

  • Tests for built images.

Requests also welcome!

docker-rdkit's Issues

build.sh error

Hello
I am trying to build the project by running
./build-debian.sh

 ---> Running in 49c6e7c563cc
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Catch not found in /rdkit/External/catch/catch
Downloading https://github.com/catchorg/Catch2/archive/v2.12.1.tar.gz...
CATCH: /rdkit/External/catch/catch/single_include/catch2
-- Could NOT find InChI in system locations (missing: INCHI_LIBRARY INCHI_INCLUDE_DIR) 
Downloading http://www.inchi-trust.org/download/105/INCHI-1-SRC.zip...
-- Found PythonInterp: /usr/bin/python3 (found version "3.7.3") 
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.7m.so (found version "3.7.3") 
Python Install directory /usr/lib/python3/dist-packages
PYTHON Py_ENABLE_SHARED: 1
PYTHON USING LINK LINE: -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions  -Wl,-z,relro
-- Found Eigen3: /usr/include/eigen3 (Required is at least version "2.91.0") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Boost version: 1.67.0
-- Found the following Boost libraries:
--   serialization
-- Found PostgreSQL: /usr/lib/x86_64-linux-gnu/libpq.so (found version "11.7 (Debian 11.7-0+deb10u1)") 
== Using strict rotor definition
Downloading http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/AvalonToolkit_1.2.0.source.tar...
-- Boost version: 1.67.0
-- Found the following Boost libraries:
--   system
--   iostreams
--   regex
-- maeparser include dir set as 'maeparser_INCLUDE_DIRS-NOTFOUND'
-- maeparser libraries set as 'maeparser_LIBRARIES-NOTFOUND'
-- Could NOT find maeparser (missing: maeparser_INCLUDE_DIRS maeparser_LIBRARIES) 
Downloading https://github.com/schrodinger/maeparser/archive/v1.2.3.tar.gz...
-- coordgen include dir set as coordgen_INCLUDE_DIRS-NOTFOUND
-- coordgen libraries set as 'coordgen_LIBRARIES-NOTFOUND'
-- Could NOT find coordgen (missing: coordgen_INCLUDE_DIRS coordgen_LIBRARIES) 
Downloading https://github.com/schrodinger/coordgenlibs/archive/v1.4.1.tar.gz...
Downloading https://github.com/rareylab/RingDecomposerLib/archive/v1.1.3_rdkit.tar.gz...
-- Boost version: 1.67.0
-- Found the following Boost libraries:
--   iostreams
--   regex
-- Boost version: 1.67.0
-- Found the following Boost libraries:
--   system
--   iostreams
--   regex
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
== Updating Filters.cpp from pains file
== Done updating pains files
CMake Error at /usr/share/cmake-3.13/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
  Could NOT find Freetype (missing: FREETYPE_LIBRARY FREETYPE_INCLUDE_DIRS)
Call Stack (most recent call first):
  /usr/share/cmake-3.13/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.13/Modules/FindFreetype.cmake:156 (find_package_handle_standard_args)
  Code/GraphMol/MolDraw2D/CMakeLists.txt:66 (find_package)


-- Configuring incomplete, errors occurred!
See also "/rdkit/build/CMakeFiles/CMakeOutput.log".
See also "/rdkit/build/CMakeFiles/CMakeError.log".
The command '/bin/sh -c cmake -Wno-dev  -DPYTHON_EXECUTABLE=/usr/bin/python3  -DRDK_INSTALL_INTREE=OFF  -DRDK_BUILD_INCHI_SUPPORT=ON  -DRDK_BUILD_AVALON_SUPPORT=ON  -DRDK_BUILD_PYTHON_WRAPPERS=ON  -DRDK_BUILD_SWIG_WRAPPERS=ON  -DRDK_BUILD_PGSQL=ON  -DPostgreSQL_ROOT=/usr/lib/postgresql/$POSTGRES_VERSION  -DPostgreSQL_TYPE_INCLUDE_DIR=/usr/include/postgresql/$POSTGRES_VERSION/server  -DCMAKE_INSTALL_PREFIX=/usr  -DCPACK_PACKAGE_RELOCATABLE=OFF  ..' returned a non-zero code: 1

Any idea what is wrong?

Running on Ubuntu 18.04
Docker 19.03.12

Wrong image names in README.md

The Docker image names in the README.md examples are not correct. For example:

informaticsmatters/rdkit_java_debian:latest

should be:

informaticsmatters/rdkit-java-debian:latest

ARM64 Support possible?

Hi,
I've been trying to get this container to build on an ARM64 platform but I get a variety of errors. I know you aren't currently claiming to support ARM64 but do you have any plans for it in the future?
