Comments (7)

tuncadogan commented on June 4, 2024

Thank you for your interest in our tool. We are aware of the problem with deepscreen_models_hyperparameters_performance_results.tsv and we will fix it as soon as possible. The missing column is used for binarizing the obtained prediction score (i.e., when the score is above the given threshold, it is accepted as a positive prediction, and vice versa). This threshold is determined during the training of each of the 704 models, by selecting the value that provided the maximum performance. Assigning an arbitrary value to this threshold would result in an unstable predictor with completely unknown performance. If you want to use our pre-trained models, it would be best to wait for us to re-train the system and determine the correct thresholds.
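A minimal sketch of the thresholding described above, for readers who re-train a model and need to pick a threshold themselves. It is not the DEEPScreen code: the function names are hypothetical, and using MCC as the "performance" to maximize is an assumption (the comment above does not say which metric was used).

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def binarize(scores, threshold):
    """A score above the threshold counts as a positive prediction (1)."""
    return [1 if score > threshold else 0 for score in scores]

def select_threshold(scores, labels):
    """Try each observed score as a threshold and keep the one with the best MCC.

    Assumption: MCC is the metric being maximized; swap in another metric
    if your training setup uses a different one.
    """
    best_threshold, best_score = None, -2.0  # MCC is always >= -1
    for threshold in sorted(set(scores)):
        predictions = binarize(scores, threshold)
        tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
        tn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 0)
        fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
        score = mcc(tp, fp, tn, fn)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Toy data, made up for illustration only.
print(select_threshold([0.1, 0.2, 0.7, 0.9], [0, 0, 1, 1]))  # (0.2, 1.0)
```

The threshold should be chosen on held-out data, not on the training set, otherwise the reported performance will be optimistic.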

Another option would be to train the classifier yourself by following the instructions under "How to train a target-based DEEPScreen model" in the readme file. The performance always depends on the hyperparameter selection. If your protein is among the 704 targets of DEEPScreen, the hyperparameter values that we selected should be in the same file: deepscreen_models_hyperparameters_performance_results.tsv

Apart from that, could you please tell us which zip files appear to be damaged, so that we can replace them? Thank you.

from deepscreen.

lesliewalcott commented on June 4, 2024

Hi, I am having an issue with this too. Looking at line 1173 in dataProcessing.py:

`for line in lst_best_fl[1:]:
    log_fl, modelname, target, optimizer, learning_rate, epoch, hidden1, hidden2, dropout, rotate, save_model, test_f1score, test_mcc, test_accuracy, test_precision, test_recall, test_tp, test_fp, test_tn, test_fn, test_threshold, val_auc, val_auprc, test_auc, test_auprc = line.split("\t")
    chembl_target_threshold_dict[target] = float(test_threshold)`

However, as dongdongdoge said, the file deepscreen_models_hyperparameters_performance_results.tsv does not contain 'test_threshold', 'test_tp', 'test_fp', 'test_tn', or 'test_fn'. Therefore, when I run the code, I get the following error:

log_fl, modelname, target, optimizer, learning_rate, epoch, hidden1, hidden2, dropout, rotate, save_model, test_f1score, test_mcc, test_accuracy, test_precision, test_recall, test_tp, test_fp, test_tn, test_fn, test_threshold, val_auc, val_auprc, test_auc, test_auprc = line.split("\t")
ValueError: not enough values to unpack (expected 25, got 20)

The file is missing those 5 columns.
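As a side note, the positional unpacking above breaks as soon as the column count changes. Below is a hedged sketch of reading the same kind of file by column name instead, so a missing column is reported explicitly. It assumes the first line of the TSV is a header whose names match the variable names in the snippet (`target`, `test_threshold`); that is a guess based on the `lst_best_fl[1:]` slice, not something confirmed by the file itself. `load_thresholds` is a hypothetical helper, not part of dataProcessing.py.

```python
import csv
import io

# Assumed header names, taken from the variable names in the snippet above.
REQUIRED_COLUMNS = ("target", "test_threshold")

def load_thresholds(tsv_file):
    """Return {target: threshold}, or raise a clear error if columns are absent."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("results file is missing column(s): " + ", ".join(missing))
    return {row["target"]: float(row["test_threshold"]) for row in reader}

# Made-up rows for illustration; the real file would be opened with open(path, newline="").
sample = io.StringIO("target\ttest_threshold\tval_auc\nCHEMBL286\t0.45\t0.9\n")
print(load_thresholds(sample))
```

This does not recover the lost thresholds; it only turns the unpacking failure into an error message that names the missing columns.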

tuncadogan commented on June 4, 2024

Thank you very much for your interest. Yes, unfortunately the correct version of that file, which contained the 'test_threshold', 'test_tp', 'test_fp', 'test_tn' and 'test_fn' columns, has been lost, and we could not recover it. We can offer you 2 alternative solutions:

  1. We are now updating the system to solve all library/dependency issues and to re-run it with up-to-date data. We hope that it will be ready in a few weeks, if you have time to wait.

  2. If you need to use the tool right now, we suggest re-training the DEEPScreen model for your target of interest, using the instructions in our readme file under the title "How to train a target-based DEEPScreen model".

Please let me know, especially if you choose option 2 and have some questions.

lesliewalcott commented on June 4, 2024

Thank you for your response! I am going to try and re-train the model.

lesliewalcott commented on June 4, 2024

Hi, I wanted to follow up and see whether you have made progress toward updating the system.

ahmetrifaioglu commented on June 4, 2024

Hi,
We are sorry for the delay. We are trying our best to update the system. We had to make some major changes to create a new version. We are planning to release the new implementation by this Friday. I will give an update once we finish the initial development and release the code.
Best

ahmetrifaioglu commented on June 4, 2024

Hi,

We are sorry for the late response again. These are busy and hectic times for us and, as I mentioned before, we had to make some major changes to the implementation of DEEPScreen. The main change is that we decided not to continue with tflearn: the version that we had used became too old (it has been almost 4 years since we started this project), and we ran into other problems and incompatibilities among the newer versions of the libraries whenever we wanted to make changes. Others also reported installation problems.

For these reasons, DEEPScreen has been re-implemented in PyTorch. We created the training, validation and test images for all targets in order to avoid image size, quality and library issues, so you can use the readily available images to train models for the targets. The new version has been tested on macOS and Linux. Unfortunately, we have not yet been able to work on the CNN architectures in detail or to create models for each target, because that requires a separate hyperparameter search for every target. We are planning to work on this next.

Here is a summary of the changes:

  • The implementation uses the latest versions of all libraries (PyTorch, RDKit, etc.).
  • The filtered and preprocessed dataset was updated using ChEMBL version 27.
  • The number of targets increased from 704 to 812 with the updated training datasets.
  • Training, validation and test images were created for each target.

Here are the things that we are planning to do next:

  • Adding other CNN architectures, such as InceptionV3, for training
  • Performing hyperparameter search and generating target-specific models
  • Developing scripts for easy testing using the generated models

I am closing this issue now. Please let us know about any problems that you encounter.

Best
