Comments (9)
Thanks for the head start @kmaziarz and @sarahnlewis . I'll get started with the task.
from molecule-generation.
Hi @kmaziarz, I'm an undergrad student from India. I have experience working on similar projects and with CLIs. Can I work on this issue?
from molecule-generation.
Yes, sure! Sorry for the slow response: the best way of addressing this issue isn't very clear, so I had to give it a bit of thought. One thing to note is that the underlying model loading utility (load_vae_model_and_dataset) already works with all model types; what's missing is a way to choose the right wrapper class (either VaeWrapper or GeneratorWrapper). Ideally, the outcome would be that apart from being able to do
with VaeWrapper(model_dir, **model_kwargs) as model:
    (...)
which is what we do currently (see e.g. cli/encode.py), we could also do
with load_model_from_directory(model_dir, **model_kwargs) as model:
    (...)
and load_model_from_directory would return the right wrapper class. Then, for scripts that can work with both VAE-style and generator-style models (e.g. cli/sample.py), we'd use this new function to load them.
To return the right wrapper class, we would just need to select one and then pass all the arguments through. We currently have a bunch of filename matching in wrapper.py, and we could go deeper in that direction, but ultimately this feels unreliable. Instead, maybe we could take the following steps:
- Get rid of the _is_moler_model_filename function that differs between the two wrapper classes, so that ModelWrapper._get_model_file would grab all the files that end with *.pkl as potential model files (and then assert there's exactly one, as is currently done).
- Create a helper get_model_class similar to get_model_parameters in model_utils.py.
- Connect the two things above to infer which wrapper to construct: first run ModelWrapper._get_model_file to get the model path, and then get_model_class to get the class (e.g. MoLeRVae); based on this, we'd return either VaeWrapper or GeneratorWrapper.
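The three steps above could be sketched roughly as follows. This is a self-contained illustration, not the library's code: the model and wrapper classes are empty stand-ins, and find_model_file, select_wrapper_class, the "model_class" key, and the exact signature of load_model_from_directory are all assumptions made for the example.

```python
import pickle
from pathlib import Path


# Empty stand-ins so the sketch runs on its own; the real classes live in the library.
class MoLeRVae:
    pass


class MoLeRGenerator:
    pass


class ModelWrapper:
    def __init__(self, model_dir, **model_kwargs):
        self.model_dir = model_dir
        self.model_kwargs = model_kwargs

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions


class VaeWrapper(ModelWrapper):
    pass


class GeneratorWrapper(ModelWrapper):
    pass


def find_model_file(model_dir):
    """Step 1: treat every *.pkl file as a candidate and require exactly one."""
    candidates = sorted(Path(model_dir).glob("*.pkl"))
    if len(candidates) != 1:
        raise ValueError(
            f"Expected exactly one *.pkl file in {model_dir}, found {len(candidates)}."
        )
    return candidates[0]


def get_model_class(model_path):
    """Step 2: read the class stored in the checkpoint (the key name is an assumption)."""
    with open(model_path, "rb") as f:
        return pickle.load(f)["model_class"]


def select_wrapper_class(model_class):
    """Step 3a: map a model class to the wrapper that exposes the right API."""
    if issubclass(model_class, MoLeRVae):
        return VaeWrapper
    if issubclass(model_class, MoLeRGenerator):
        return GeneratorWrapper
    raise ValueError(f"Unsupported model class: {model_class.__name__}")


def load_model_from_directory(model_dir, **model_kwargs):
    """Step 3b: build the selected wrapper, passing all arguments through."""
    wrapper_class = select_wrapper_class(get_model_class(find_model_file(model_dir)))
    return wrapper_class(model_dir, **model_kwargs)
```

Keeping the class-to-wrapper mapping in its own function means the choice can be checked without touching the filesystem, and an unknown model class fails loudly instead of silently picking a wrapper.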
What do you think? If it's all too confusing I'm also happy to take a stab at this myself; arguably addressing this issue requires more fiddling with internals than would initially seem... Also pinging @sarahnlewis in case she has any comments.
from molecule-generation.
@kmaziarz is your suggestion that the model type be found in the contents of the same .pkl file that get_model_parameters reads, or that we save a separate file e.g. model_type.txt with each trained model? I think I would prefer the latter.
from molecule-generation.
> @kmaziarz is your suggestion that the model type be found in the contents of the same .pkl file that get_model_parameters reads, or that we save a separate file e.g. model_type.txt with each trained model?
The model class is already being saved in the *.pkl file (which is a dict, containing not only the model class and weights but also various other hyperparams), and load_vae_model_and_dataset (the lower-level model loading utility used for all model types) uses the class it reads to load the model. So my proposal is to just make use of that (which has the advantage of being compatible with old checkpoints out-of-the-box).
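As a rough illustration of why no extra file is needed, the sketch below writes a small checkpoint-like dict and reads the model class back from it. The key names ("model_class", "model_params", "weights") and the helper name read_model_class are assumptions made for this example and may not match the real checkpoint layout; collections.OrderedDict merely stands in for a model class such as MoLeRVae.

```python
import collections
import os
import pickle
import tempfile


def read_model_class(pkl_path):
    """Return the class object stored in a checkpoint dict (assumed key: "model_class")."""
    with open(pkl_path, "rb") as f:
        checkpoint = pickle.load(f)
    return checkpoint["model_class"]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_best.pkl")

    # A toy checkpoint: the class is pickled by reference alongside the other entries.
    checkpoint = {
        "model_class": collections.OrderedDict,  # placeholder for a real model class
        "model_params": {"latent_dim": 8},       # hypothetical hyperparameter
        "weights": [0.1, 0.2, 0.3],              # placeholder for real weights
    }
    with open(path, "wb") as f:
        pickle.dump(checkpoint, f)

    loaded_class = read_model_class(path)

print(loaded_class.__name__)
```

Because the class travels inside the checkpoint itself, any checkpoint that already stores it can be inspected this way, which is the compatibility advantage mentioned above.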
from molecule-generation.
OK, sounds good.
from molecule-generation.
Hi @kmaziarz, so far I have been able to extract the model class from the .pkl file and have defined a method that returns which wrapper class to use based on the model class, i.e. VaeWrapper for MoLeRVae and GeneratorWrapper for MoLeRGenerator. Now the confusion that I have is that GeneratorWrapper doesn't have an encode method. From the sample function already defined, it looks like GeneratorWrapper doesn't need one. Should I add a method similar to VaeWrapper's or just return the sample_latents?
from molecule-generation.
> Now the confusion that I have is that GeneratorWrapper doesn't have an encode method. From the sample function already defined, it looks like GeneratorWrapper doesn't need one. Should I add a method similar to VaeWrapper's or just return the sample_latents?
GeneratorWrapper should not have encode, as it represents latent-space-less models that can only sample (that's why it needs to be a separate class, because the API is more limited). Anywhere encode is called (e.g. cli/encode.py) we should keep using the VaeWrapper, while e.g. in cli/sample.py we can use the generic load_model_from_directory.
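One way to picture this split in API is a small class hierarchy in which sampling lives in the shared base class and encode exists only on the VAE wrapper. The method bodies and signatures below are placeholders, and run_sample and run_encode are hypothetical helpers standing in for the two CLI scripts; none of this is the library's implementation.

```python
class ModelWrapper:
    """Shared base: every wrapped model can sample."""

    def sample(self, num_samples):
        # Placeholder output; a real wrapper would return generated molecules.
        return [f"molecule_{i}" for i in range(num_samples)]


class GeneratorWrapper(ModelWrapper):
    """No latent space, so sampling is the only operation."""


class VaeWrapper(ModelWrapper):
    """Has a latent space, so it can also encode."""

    def encode(self, smiles_list):
        # Placeholder: one dummy latent vector per input SMILES string.
        return [[0.0, 0.0] for _ in smiles_list]


def run_sample(model, num_samples):
    """Works with either wrapper, as a sampling script would."""
    return model.sample(num_samples)


def run_encode(model, smiles_list):
    """Requires a VaeWrapper, as an encoding script would."""
    if not isinstance(model, VaeWrapper):
        raise TypeError("Encoding needs a model with a latent space (VaeWrapper).")
    return model.encode(smiles_list)
```

With this shape, a script that only samples can accept whatever the generic loader returns, while a script that encodes keeps constructing VaeWrapper directly.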
from molecule-generation.
@anamika-yadav99: So, summing up, for now let's use the generic way of loading the wrapper for sample.py only. Technically some modes of visualization would also work for MoLeRGenerator, but the visualizer is a bit of a work-in-progress at the moment, so I would leave it out for now.
from molecule-generation.