
edbo's Issues

RDKit installation

I believe the preferred way to install RDKit, according to its author (Greg Landrum), is
conda install -c conda-forge rdkit
which has the added advantage that you can use Python 3.8 and higher as well.

Regression ability of Gaussian process

Hi Ben,

I am using EDBO to study the results of one type of experiment. The parameter space has four dimensions. I find that the Gaussian process in EDBO has very poor regression performance even on the training set (4 dimensions, around 200 data points in total), whereas the same dataset can be learned well by neural networks. I think the Gaussian process fails because my dataset is highly discontinuous, with many delta-function-like jumps between 0 and some finite value, and around 1/3 of the data are zeros. Do you think it is reasonable for a Gaussian process to perform poorly on a four-dimensional dataset like this? Thank you very much!

Decoding back to categorical values?

The code doesn't seem to do any conversion from continuous space back to categorical values, only in the other direction (e.g. SMILES >> DFT). That is, it seems to assume that the continuous space into which the categoricals are encoded is the 'real' space, and does all its work there.
If we were to use this tool to generate a batch of real-world experiments to be carried out, how should we decode the continuous vectors back to actual ligands etc.?
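
One possible way to think about this: if the search space is a finite list of candidates, decoding may not require inverting the encoding at all, only remembering which label produced which encoded row. The sketch below illustrates that idea in plain Python. The ligand names, the descriptor values, and the `decode` helper are all hypothetical and not taken from edbo.

```python
# Hypothetical ligand labels with made-up descriptor values (illustration only).
ligands = {
    "ligand_A": (0.12, 1.50),
    "ligand_B": (0.48, 0.90),
    "ligand_C": (0.95, 2.10),
}

# Keep labels and encoded rows in two parallel lists so that row i of the
# encoded matrix always corresponds to labels[i].
labels = list(ligands)
encoded = [ligands[name] for name in labels]


def decode(row_indices):
    """Map row indices chosen by an optimizer back to the original labels."""
    return [labels[i] for i in row_indices]


# Suppose the optimizer proposes rows 2 and 0 of the encoded space.
proposed = decode([2, 0])
print(proposed)
```

Whether edbo keeps such an index internally is not stated in this thread, so treat the bookkeeping above as something to verify against the library.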

Can we put constraints on the search space?

Hello, I can see in the examples that the chemical space is built as a hypercube. However, in my case I cannot reach every point in this hypercube; I need to impose some constraints (e.g. the sum of the coordinate values over all axes must be less than 1).

Just wondering whether we can do that using edbo. Thanks!
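
A minimal sketch of one way this could be handled outside the library, assuming the optimizer can be given an explicit list of candidate points: enumerate the full grid, then drop every point that violates the constraint. The levels and the `constrained_space` helper are hypothetical. Integer percentages are used instead of fractions so that the "sum less than 1" test (here, less than 100) is not affected by floating-point rounding.

```python
from itertools import product

# Hypothetical grid: three component amounts in percent, each on the same levels.
levels = [0, 20, 40, 60, 80]


def constrained_space(levels, n_axes, limit=100):
    """Enumerate the full grid, then keep points whose coordinates sum below limit."""
    return [p for p in product(levels, repeat=n_axes) if sum(p) < limit]


space = constrained_space(levels, n_axes=3)
print(len(space), "of", len(levels) ** 3, "grid points satisfy the constraint")
```

Whether edbo accepts a pre-filtered candidate list like this is an assumption to check against its documentation.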

Passing empty strings as components

Hello,

I have been using edbo to try to optimize some sol-gel reactions in the lab, with some promising first results. I have encountered one issue, however: it doesn't seem possible to pass empty strings as reaction components.
I'm interested in doing this because I would like to compare the same reaction with and without stabilizing ligands. Thus I would like to pass a list of the form " 'ligands' : [' ', 'ligand_1', 'ligand_2', ... , 'ligand_n'] ".
Would there be any way to achieve a similar result, or would it be better to pass the surface functional group (i.e. OH- in the case of hydroxide nanoparticles) as the ligand when no ligand is added?

I hope this question makes sense; if anything was unclear, please feel free to ask for further clarification.

Best,
Fabien
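
One workaround that might be worth trying is to replace the empty string with an explicit placeholder label, so that "no ligand" becomes an ordinary category. The sketch below shows the idea with a hand-written one-hot encoding. The label `no_ligand` and the `one_hot` helper are hypothetical, and whether edbo accepts such a label is not confirmed in this thread. A descriptor-based encoding would additionally need a descriptor row for the placeholder.

```python
# Hypothetical component list; "no_ligand" stands in for the empty option.
ligands = ["no_ligand", "ligand_1", "ligand_2"]


def one_hot(value, categories):
    """Return a one-hot vector for value; unknown values raise ValueError."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


for name in ligands:
    print(name, one_hot(name, ligands))
```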

convert the maximum value into the minimum value

Thank you very much for your work. I have a question: edbo can help us find a maximum, but I want to find a minimum instead. What do I need to do? I hope to get some specific ideas from you. Thank you very much.
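
A common general-purpose trick, independent of any particular library, is to negate the objective: the point that maximizes -f(x) is the point that minimizes f(x). The sketch below illustrates this with a stand-in `maximize` function. All names here are hypothetical, and it is not a statement about how edbo handles minimization internally.

```python
def maximize(objective, candidates):
    """Stand-in for any optimizer that only returns the maximizing candidate."""
    return max(candidates, key=objective)


def minimize(objective, candidates):
    """Minimize by maximizing the negated objective."""
    return maximize(lambda x: -objective(x), candidates)


# Hypothetical objective with its minimum at x = 3.
def cost(x):
    return (x - 3) ** 2


candidates = range(0, 7)
best = minimize(cost, candidates)
print(best, cost(best))
```

In practice this would mean multiplying the measured values by -1 before handing them to the optimizer, and flipping the sign back when reporting results.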

Input continuous parameter

Hi, thank you for the great tool. This is very useful !!

The example code only covers discrete or categorical parameters (with or without descriptors), and I cannot find how to use continuous parameters that are not restricted to grid points.

Is it possible to use continuous parameters in this tool ?

I also wonder whether the candidate grid points can differ from the conditions used in past experiments. For example, if 1.3, 1.5, and 1.7 equivalents of a reagent were used in past experiments, we may want to set 1.2, 1.4, and 1.6 equivalents as candidate values for the next experiment. (I tried this and it seems to work, but I'd like to confirm that the tool is meant to handle such cases.)
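
As a rough illustration of the grid approach, a continuous range can be approximated by a fine grid of candidates, and the conditions already run do not have to lie on that grid. The sketch below is plain Python. The `make_grid` helper and the 1.0 to 2.0 range are hypothetical, while the past conditions are the ones quoted above.

```python
def make_grid(low, high, n_points, ndigits=3):
    """Discretize the continuous range [low, high] into n_points evenly spaced values."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (high - low) / (n_points - 1)
    return [round(low + i * step, ndigits) for i in range(n_points)]


# Conditions already run (values taken from the question above).
past = [1.3, 1.5, 1.7]

# A finer candidate grid over a hypothetical range of 1.0 to 2.0 equivalents.
candidates = make_grid(1.0, 2.0, 11)

# Candidates that have not been run yet; past points need not lie on the grid.
untested = [c for c in candidates if c not in past]
print(untested)
```

A finer grid gets closer to a truly continuous parameter at the cost of a larger candidate list.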

conda error for GPU integration?

Not sure if this is required for GPU or not (I am still at install phase), but
conda install cudatookit=10.1, torchvision -c pytorch
gives the following error:
CondaValueError: invalid package specification: cudatookit=10.1,

Is this actually necessary, considering that
conda install -c pytorch pytorch=1.3.1
seems to install cudatoolkit already?

This is part of the output for me from that install command:

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
cffi-1.14.4                |   py37hcd4344a_0         243 KB
cudatoolkit-10.1.243       |       h74a9793_0       300.3 MB
ninja-1.10.2               |   py37h6d14046_0         246 KB
pytorch-1.3.1              |py3.7_cuda101_cudnn7_0       479.7 MB  pytorch
------------------------------------------------------------
                                       Total:       780.5 MB

The following NEW packages will be INSTALLED:

cffi pkgs/main/win-64::cffi-1.14.4-py37hcd4344a_0
cudatoolkit pkgs/main/win-64::cudatoolkit-10.1.243-h74a9793_0
ninja pkgs/main/win-64::ninja-1.10.2-py37h6d14046_0
pycparser pkgs/main/noarch::pycparser-2.20-py_2
pytorch pytorch/win-64::pytorch-1.3.1-py3.7_cuda101_cudnn7_0
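
One way to check whether the pytorch install on its own already gives a usable GPU setup might be to ask PyTorch from Python. A minimal sketch follows: `cuda_status` is a hypothetical helper name, while `torch.cuda.is_available()` is the standard PyTorch call. The function also runs without error when torch is not installed.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available (torch {torch.__version__})"
    return f"CPU only (torch {torch.__version__})"


print(cuda_status())
```

If this reports that CUDA is available, the separate cudatoolkit command is presumably not needed for this environment, though that is an inference rather than something confirmed in this thread.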

Importing unindexed external results

(Posted here at Ben's request from private correspondence. Thanks, Ben!)

I have a question about importing external results: I've looked through all the example notebooks and bro.py but I'm still unable to figure out how to import a .csv file containing existing results, either one with the experimental index numbers or ideally one without them. Specifically, here's what I'm trying to do:

  1. Use BO_express module to easily encode some components with Mordred from the SMILES strings (e.g., ligand, base, solvent, while other variables use numeric encoding)
  2. Specify an external initialization (init_method='external') so that I can include pre-existing data from earlier screening (e.g., a ligand screen with all other variables held constant at levels which are included in the search space)
  3. Populate a .csv file with data from (e.g.) external ligand screen in the same format as the "init" or "round0" files, but ideally not requiring the experiment index numbers since the design is created after the ligand screen was run.
  4. Import the existing results .csv file into BO and use this to initialize the first round of screening.
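
The index-matching part of steps 3 and 4 can be sketched independently of edbo: if each existing result is identified by its condition values, the experiment index can be recovered by looking those values up in the enumerated design. Everything below is hypothetical (component names, levels, yields, and the `attach_indices` helper), and the in-memory string stands in for a real .csv file.

```python
import csv
import io
from itertools import product

# Hypothetical search space; the index is each combination's position in the design.
components = {
    "ligand": ["L1", "L2", "L3"],
    "base": ["K2CO3", "Cs2CO3"],
    "temperature": ["60", "80"],
}
names = list(components)
design = list(product(*components.values()))
index_of = {combo: i for i, combo in enumerate(design)}

# Made-up external results without index numbers (stands in for a .csv file).
external_csv = io.StringIO(
    "ligand,base,temperature,yield\n"
    "L1,K2CO3,60,12.5\n"
    "L3,K2CO3,60,48.0\n"
)


def attach_indices(handle):
    """Look up each result row in the design and return (index, yield) pairs."""
    rows = []
    for row in csv.DictReader(handle):
        key = tuple(row[n] for n in names)
        if key not in index_of:
            raise KeyError(f"conditions not in search space: {key}")
        rows.append((index_of[key], float(row["yield"])))
    return rows


indexed = attach_indices(external_csv)
print(indexed)
```

How the indexed rows are then passed to BO_express is not covered here, since the thread does not show that call.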

Where is the DFT descriptor extractor?

Hi. Thank you for such great code and for the paper. I read it and I'm trying to use edbo in my project.

However, I cannot find a way to generate the descriptor csv without autoqchem.
Uploading a molecule's Gaussian log file to autoqchem is indeed useful, but my molecule is confidential, so I could not get permission to upload it to a public site.

I hope someone can tell me how to get descriptors from a Gaussian log file without autoqchem.

Thank you.
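
For a few simple quantities, a local script with regular expressions may be enough, so nothing has to be uploaded. The sketch below pulls the final SCF energy and the HOMO/LUMO energies out of log text. It assumes a closed-shell calculation and assumes the log contains "SCF Done" and "Alpha occ./virt. eigenvalues" lines in the usual layout. The sample text uses made-up numbers, and `parse_gaussian_log` is a hypothetical helper, not part of edbo or autoqchem. A dedicated parser such as cclib is likely to be more robust.

```python
import re

SCF_RE = re.compile(r"SCF Done:\s+E\(\S+\)\s+=\s+(-?\d+\.\d+)")
OCC_RE = re.compile(r"Alpha\s+occ\.\s+eigenvalues\s+--(.*)")
VIRT_RE = re.compile(r"Alpha\s+virt\.\s+eigenvalues\s+--(.*)")
FLOAT_RE = re.compile(r"-?\d+\.\d+")


def parse_gaussian_log(text):
    """Extract the last SCF energy and the HOMO/LUMO energies (all in Hartree)."""
    scf = [float(v) for v in SCF_RE.findall(text)]
    occ, virt = [], []
    for line in text.splitlines():
        occ_match = OCC_RE.search(line)
        virt_match = VIRT_RE.search(line)
        if occ_match:
            if virt:  # a new orbital listing starts, so discard the previous one
                occ, virt = [], []
            occ.extend(float(v) for v in FLOAT_RE.findall(occ_match.group(1)))
        elif virt_match:
            virt.extend(float(v) for v in FLOAT_RE.findall(virt_match.group(1)))
    return {
        "scf_energy": scf[-1] if scf else None,
        "homo": occ[-1] if occ else None,   # highest occupied orbital
        "lumo": virt[0] if virt else None,  # lowest virtual orbital
    }


# Illustrative log fragment with made-up values (not from a real calculation).
sample = """\
 SCF Done:  E(RB3LYP) =  -76.4000000000     A.U. after   10 cycles
 Alpha  occ. eigenvalues --  -19.10000  -1.00000  -0.50000  -0.40000  -0.30000
 Alpha virt. eigenvalues --    0.05000   0.15000   0.75000
"""

print(parse_gaussian_log(sample))
```

The resulting dictionary could be written to a csv with one row per molecule. Which descriptors edbo actually expects is not specified in this thread, so the column set would need to be checked against the example data.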
