hello dear tuncadogan, deepscreen_models_hyperparameters_performance

test_threshold problem and zip file problem about deepscreen (closed)

cansyl commented on June 4, 2024

test_threshold problem and zip file problem

from deepscreen.

Comments (7)

tuncadogan commented on June 4, 2024

Thank you for your interest in our tool. We are aware of the problem with deepscreen_models_hyperparameters_performance_results.tsv and will fix it as soon as possible. The missing column holds the threshold used to binarize the prediction score (i.e., a score above the threshold is accepted as a positive prediction, and otherwise as a negative one). It was determined during the training of each of the 704 models by selecting the threshold value that gave the maximum performance. Assigning an arbitrary value to this threshold would yield an unstable predictor with unknown performance. If you want to use our pre-trained models, it would be best to wait for us to re-train the system and determine the correct thresholds.
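To make the role of the threshold concrete, here is a minimal sketch of binarizing scores and picking the threshold that maximizes a performance metric on labelled data. It assumes the metric is MCC and that candidate thresholds are scanned on a 0.01 grid; the thread does not say which metric or search procedure was actually used, and the function names below are hypothetical rather than taken from the DEEPScreen code.

```python
import math


def binarize(scores, threshold):
    """Label a score as positive (1) when it is above the threshold, else 0."""
    return [1 if s > threshold else 0 for s in scores]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def select_threshold(y_true, scores, candidates=None):
    """Return (threshold, metric) for the candidate with the highest MCC.

    ASSUMPTION: MCC as the metric and a 0.01 grid are illustrative choices.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_threshold, best_score = None, -2.0  # MCC is always >= -1
    for threshold in candidates:
        score = mcc(y_true, binarize(scores, threshold))
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score
```

Because the selected value depends on each model's own score distribution, a threshold tuned for one target would not be expected to transfer to another, which is consistent with the warning above against assigning an arbitrary value.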

Another option is to train the classifier yourself by following the instructions under "How to train a target-based DEEPScreen model" in the readme file. Performance always depends on the hyperparameter selection. If your protein is among the 704 targets of DEEPScreen, the hyperparameter values that we selected should be in the same file: deepscreen_models_hyperparameters_performance_results.tsv

Apart from that, could you please tell us which zip files appear damaged, so that we can replace them? Thank you.

lesliewalcott commented on June 4, 2024

Hi, I am having an issue with this too. Looking at line 1173 in dataProcessing.py:

`for line in lst_best_fl[1:]:
    log_fl, modelname, target, optimizer, learning_rate, epoch, hidden1, hidden2, dropout, rotate, save_model, test_f1score, test_mcc, test_accuracy, test_precision, test_recall, test_tp, test_fp, test_tn, test_fn, test_threshold, val_auc, val_auprc, test_auc, test_auprc = line.split("\t")
    chembl_target_threshold_dict[target] = float(test_threshold)`

However, as dongdongdoge said, the file deepscreen_models_hyperparameters_performance_results.tsv does not contain the 'test_threshold', 'test_tp', 'test_fp', 'test_tn', or 'test_fn' columns. Therefore, when I run the code, I get the following error:

log_fl, modelname, target, optimizer, learning_rate, epoch, hidden1, hidden2, dropout, rotate, save_model, test_f1score, test_mcc, test_accuracy, test_precision, test_recall, test_tp, test_fp, test_tn, test_fn, test_threshold, val_auc, val_auprc, test_auc, test_auprc = line.split("\t")
ValueError: not enough values to unpack (expected 25, got 20)

The file is missing those 5 columns.
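As an illustration of how this failure could be reported more clearly, here is a hedged sketch that reads the TSV by column name instead of unpacking a fixed number of fields. It assumes the file has a header row whose column names match the variable names above (`target`, `test_threshold`); that is a guess based on the loop skipping the first line, not something confirmed in this thread. `load_thresholds` and `REQUIRED_COLUMNS` are hypothetical names, not part of dataProcessing.py.

```python
import csv
import io

# ASSUMPTION: header names mirror the variable names in dataProcessing.py.
REQUIRED_COLUMNS = ("target", "test_threshold")


def load_thresholds(tsv_file):
    """Build {target: threshold} from an open TSV file with a header row.

    Raises ValueError naming the absent columns instead of failing on
    tuple unpacking when the file has fewer fields than expected.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return {row["target"]: float(row["test_threshold"]) for row in reader}
```

This does not recover the lost thresholds; it only turns the unpacking error into a message that says which columns the file lacks.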

tuncadogan commented on June 4, 2024

Thank you very much for your interest. Yes, unfortunately the correct version of that file, which contained the 'test_threshold', 'test_tp', 'test_fp', 'test_tn', and 'test_fn' columns, has been lost, and we could not recover it. We can offer two alternative solutions:

1. We are now updating the system to solve all library/dependency issues and to re-run it with up-to-date data. We hope that it will be ready in a few weeks, if you have time to wait.
2. If you need to use the tool right now, we suggest re-training the DEEPScreen model for your target of interest, using the instructions in our readme file under the title "How to train a target-based DEEPScreen model".

Please let me know, especially if you choose option 2 and have some questions.

lesliewalcott commented on June 4, 2024

Thank you for your response! I am going to try and re-train the model.

lesliewalcott commented on June 4, 2024

Hi, I wanted to follow up and see if you have made progress toward updating the system?

ahmetrifaioglu commented on June 4, 2024

Hi,
We are sorry for the delay. We are doing our best to update the system. We had to make some major changes to create a new version. We are planning to release the new implementation by this Friday. I will give an update once we finish the initial development and release the code.
Best

ahmetrifaioglu commented on June 4, 2024

Hi,

We are sorry for the late response again. These are quite busy and hectic times for us, and we had to make some major changes to the implementation of DEEPScreen, as I mentioned before. The main change is that we decided not to continue with tflearn: the version we had used became too old (it has been almost 4 years since we started this project), and we encountered other problems and incompatibilities among the new versions of libraries whenever we tried to make changes. Some others also reported installation problems.

For these reasons, DEEPScreen has been re-implemented using PyTorch. We created the training, validation and test images for all targets in order to avoid image size, quality and library issues, so you can use the readily available images to train models for the targets. The new version has been tested on macOS and Linux. Unfortunately, we have not yet been able to work on the CNN architectures in detail or create models for each target, as this requires a separate hyperparameter search for every target. We are planning to work on it next.

Here is a summary of the changes:
- The implementation was done using the latest versions of all libraries (PyTorch, RDKit, etc.).
- The filtered and preprocessed dataset was updated using ChEMBL version 27.
- The number of targets increased from 704 to 812 with the updated training datasets.
- Training, validation and test images were created for each target.

Here are the things that we are planning to do next:
- Adding other CNN architectures, such as InceptionV3, for training
- Performing hyperparameter search and generating target-specific models
- Developing scripts for easy testing using the generated models

I am closing this issue now. Please let us know about any problems that you encounter.

Best
