prankweb's People

Contributors

andreasoltes, davidhoksza, davidjakubec, jendelel, luk27official, skodapetr

prankweb's Issues

Check for initial status

It is possible that the initial queued status is shown even if the task is started immediately. A solution would be to ignore the first response.
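
One way to read the proposed fix, as a minimal sketch: drop the first polled status and show only the later ones. The function name and the status strings are hypothetical and not taken from the PrankWeb code.

```python
from typing import Iterable, Iterator


def statuses_to_display(responses: Iterable[str]) -> Iterator[str]:
    """Yield every polled status except the first one."""
    iterator = iter(responses)
    # Discard the first response; it may still report the queued state
    # even though the task has already started.
    next(iterator, None)
    yield from iterator
```

A drawback of this approach is that a task that really is queued shows no status until the second response arrives.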

The downloaded results archive contains a PDB file without any ligands

The downloaded results archive actually contains two PDB files, one in the root of the archive and one in the visualization/data folder. Both are named structure. However, the one loaded by the pml script is a stripped version of the full PDB file and does not contain any information about ligands.

When I try to run the predictions with the latest P2rank everything works well.

Failed to predict 7bv2

10/22/2020 15:16:51 [INFO] - Preparing structure ...
10/22/2020 15:16:51 [DEBUG] - Downloading 'https://files.rcsb.org/download/7bv2.pdb' to '/data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/structure-raw.pdb' ...
10/22/2020 15:16:51 [DEBUG] - Starting new HTTPS connection (1): files.rcsb.org:443
10/22/2020 15:16:52 [DEBUG] - https://files.rcsb.org:443 "GET /download/7bv2.pdb HTTP/1.1" 200 167677
10/22/2020 15:16:52 [DEBUG] - Reading chains ...
/usr/local/lib/python3.7/dist-packages/Bio/PDB/StructureBuilder.py:92: PDBConstructionWarning: WARNING: Chain A is discontinuous at line 9455.
PDBConstructionWarning,
/usr/local/lib/python3.7/dist-packages/Bio/PDB/StructureBuilder.py:92: PDBConstructionWarning: WARNING: Chain P is discontinuous at line 9468.
PDBConstructionWarning,
/usr/local/lib/python3.7/dist-packages/Bio/PDB/StructureBuilder.py:92: PDBConstructionWarning: WARNING: Chain A is discontinuous at line 9492.
PDBConstructionWarning,
10/22/2020 15:16:52 [DEBUG] - Reading chains ... done
10/22/2020 15:16:52 [DEBUG] - Executing '/opt/protein-utils/bin/protein-utils -a filter-by-chain --structure /data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/structure-raw.pdb --output /data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/structure.pdb --chains T,C,P,A,B'
15:16:53 [main] INFO cz.siret.protein.utils.ApplicationEntry - Using action: filter-by-chain
15:16:53 [main] WARN o.b.nbio.structure.align.util.UserConfiguration - Could not read dir from system property PDB_DIR or environment variable PDB_DIR, using system's temp directory /tmp
15:16:53 [main] INFO o.b.nbio.structure.align.util.UserConfiguration - Could not read cache dir from system property PDB_CACHE_DIR or environment variable PDB_CACHE_DIR, using PDB directory instead /tmp/
15:16:53 [main] INFO org.biojava.nbio.structure.io.PDBFileParser - Could not parse revision date string ''.
15:16:53 [main] INFO org.biojava.nbio.structure.io.PDBFileParser - Could not parse revision date string ''.
15:16:53 [main] INFO org.biojava.nbio.structure.io.PDBFileParser - Could not parse revision date string ''.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.sun.xml.bind.v2.runtime.reflect.opt.Injector (file:/opt/protein-utils/lib/jaxb-impl-2.3.0.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int)
WARNING: Please consider reporting this to the maintainers of com.sun.xml.bind.v2.runtime.reflect.opt.Injector
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
15:16:55 [main] INFO cz.siret.protein.utils.ApplicationEntry - Finished in 00:00:02
10/22/2020 15:16:55 [INFO] - Preparing structure ... done
10/22/2020 15:16:55 [DEBUG] - Path: /data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/structure.pdb
10/22/2020 15:16:55 [DEBUG] - Chains: T C P A B
10/22/2020 15:16:55 [DEBUG] - Executing '/opt/protein-utils/bin/protein-utils -a extract-chain-sequence --structure /data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/structure.pdb --output /data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/conservation-T/sequence.fasta --chain T'
15:16:55 [main] INFO cz.siret.protein.utils.ApplicationEntry - Using action: extract-chain-sequence
15:16:55 [main] WARN o.b.nbio.structure.align.util.UserConfiguration - Could not read dir from system property PDB_DIR or environment variable PDB_DIR, using system's temp directory /tmp
15:16:55 [main] INFO o.b.nbio.structure.align.util.UserConfiguration - Could not read cache dir from system property PDB_CACHE_DIR or environment variable PDB_CACHE_DIR, using PDB directory instead /tmp/
15:16:55 [main] INFO org.biojava.nbio.structure.io.PDBFileParser - Entity information (COMPOUND record) not found in file. Will assign entities heuristically
15:16:56 [main] ERROR cz.siret.protein.utils.ApplicationEntry - Action execution failed
15:16:56 [main] INFO cz.siret.protein.utils.ApplicationEntry - Reason
cz.siret.protein.utils.action.ActionFailed: Missing chain: T
at cz.siret.protein.utils.action.ChainToSequence.getChainSequence(ChainToSequence.java:19)
at cz.siret.protein.utils.command.StructureToFasta.saveSequence(StructureToFasta.java:78)
at cz.siret.protein.utils.command.StructureToFasta.execute(StructureToFasta.java:64)
at cz.siret.protein.utils.command.StructureToFasta.execute(StructureToFasta.java:20)
at cz.siret.protein.utils.ApplicationEntry.executeAction(ApplicationEntry.java:243)
at cz.siret.protein.utils.ApplicationEntry.executeAction(ApplicationEntry.java:204)
at cz.siret.protein.utils.ApplicationEntry.run(ApplicationEntry.java:55)
at cz.siret.protein.utils.ApplicationEntry.main(ApplicationEntry.java:41)
15:16:56 [main] INFO cz.siret.protein.utils.ApplicationEntry - Finished in 00:00:00
Traceback (most recent call last):
File "/opt/p2rank-runtime/run_p2rank_task.py", line 359, in <module>
main(_read_arguments())
File "/opt/p2rank-runtime/run_p2rank_task.py", line 64, in main
structure_file, chains, configuration, arguments)
File "/opt/p2rank-runtime/run_p2rank_task.py", line 190, in prepare_conservation
return compute_from_structure(structure_file, chains, arguments)
File "/opt/p2rank-runtime/run_p2rank_task.py", line 236, in compute_from_structure
for chain in chains
File "/opt/p2rank-runtime/run_p2rank_task.py", line 236, in <dictcomp>
for chain in chains
File "/opt/p2rank-runtime/run_p2rank_task.py", line 255, in compute_from_structure_for_chain
conservation.compute_conservation(fasta_file, working_dir, target_file)
File "/opt/p2rank-runtime/conservation.py", line 57, in compute_conservation
compute_msa(input_file, working_dir, msa_file)
File "/opt/p2rank-runtime/conservation.py", line 64, in compute_msa
sequences = _read_fasta_file(fasta_file)
File "/opt/p2rank-runtime/conservation.py", line 84, in _read_fasta_file
with open(input_file) as in_stream:
FileNotFoundError: [Errno 2] No such file or directory: '/data/p2rank/task/database/v2-conservation/7bv2_A,B,C,P,T/working/conservation-T/sequence.fasta'

Fix protein-utils warning

From log

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.sun.xml.bind.v2.runtime.reflect.opt.Injector (file:/opt/protein-utils/lib/jaxb-impl-2.3.0.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int)
WARNING: Please consider reporting this to the maintainers of com.sun.xml.bind.v2.runtime.reflect.opt.Injector
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

This is an issue with jaxb-impl 2.3.0, which is used by BioJava.
It should be fixed in jaxb-impl 2.4.0, but BioJava does not use that version.
So we need to monitor BioJava for updates.

No pockets found

Add an explicit message stating that no pockets were found. As a bonus, we can offer to re-run the prediction using more chains.

An example is 7BV2 for chain B. As of now, it may not be clear to the user that there are no pockets.

The log should not be red

Assign different colors to the debug, info, and warning messages which appear when waiting for the results.
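
A minimal sketch of how a log line could be mapped to a color, assuming the level appears in square brackets as in the `[INFO]` and `[DEBUG]` lines of the task logs. The color names and the function name are hypothetical choices. Lines without a recognized level (for example, traceback lines) fall back to a default.

```python
import re

# Matches the bracketed level used by the Python task log, e.g. "[INFO]".
LEVEL_PATTERN = re.compile(r"\[(DEBUG|INFO|WARNING|ERROR)\]")

# Hypothetical color choices; only errors are shown in red.
LEVEL_COLORS = {
    "DEBUG": "gray",
    "INFO": "black",
    "WARNING": "orange",
    "ERROR": "red",
}


def color_for_log_line(line: str, default: str = "black") -> str:
    """Return the display color for a single log line."""
    match = LEVEL_PATTERN.search(line)
    if match is None:
        return default
    return LEVEL_COLORS[match.group(1)]
```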

Re-use conservation scores inside a structure

For some structures, e.g. homomultimers, the same chain occurs multiple times. We need to be able to detect this and not run conservation on the same sequence multiple times.

As the conservation pipeline should be able to cache data for computed sequences in the future, the implemented solution should be removable later.
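
A minimal sketch of the idea, assuming chains are given as a mapping from chain ID to sequence and that `compute` is a stand-in for the actual conservation computation (both are hypothetical, not the PrankWeb API). Identical sequences are computed once and the result is shared.

```python
from typing import Callable, Dict


def conservation_for_chains(
    chain_sequences: Dict[str, str],
    compute: Callable[[str], list],
) -> Dict[str, list]:
    """Compute conservation once per distinct sequence and share the result."""
    by_sequence: Dict[str, list] = {}
    result: Dict[str, list] = {}
    for chain, sequence in chain_sequences.items():
        # Only call the expensive computation for sequences not seen yet.
        if sequence not in by_sequence:
            by_sequence[sequence] = compute(sequence)
        result[chain] = by_sequence[sequence]
    return result
```

The cache lives only for the duration of one call, so the function can be dropped once the pipeline caches computed sequences itself.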

Add deployment specific docker-compose

Right now we have a docker-compose.production file. This is not enough, as we have multiple environments. There should be a file for each environment and also one for running on a local machine.

This should include configuration of environment properties such as the number of workers and GA tags.

Update to P2Rank 2.2

With the new version, there are the following relevant changes:

  • Add time stamps to log -stdout_timestamp "yyyy.MM.dd HHmm:"
  • Update conservation file name to {basename}{chain}.(.*).hom(.gz|.zst|.bz2|{})
    where {basename} is one of
    {basename}.pdb
    {basename}.pdb.gz
    {basename}.ent
    {basename}.ent.gz
    {basename}.cif
    {basename}.cif.gz
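
A minimal sketch of how the file name pattern above could be checked, based on a literal reading of the pattern and not on P2Rank's own implementation. It assumes that the dots in the pattern are literal, that `{}` stands for "no compression suffix", and that `{basename}` is the structure file name without one of the listed extensions. The function name is hypothetical.

```python
import re

# Structure file extensions listed in the issue.
STRUCTURE_EXTENSIONS = (".pdb", ".pdb.gz", ".ent", ".ent.gz", ".cif", ".cif.gz")


def is_conservation_file(file_name: str, structure_file_name: str, chain: str) -> bool:
    """Check file_name against {basename}{chain}.(.*).hom(.gz|.zst|.bz2|{})."""
    basename = structure_file_name
    for extension in STRUCTURE_EXTENSIONS:
        if structure_file_name.endswith(extension):
            basename = structure_file_name[: -len(extension)]
            break
    # Assumption: dots are literal and the compression suffix is optional.
    pattern = re.escape(basename + chain) + r"\.(.*)\.hom(\.gz|\.zst|\.bz2)?"
    return re.fullmatch(pattern, file_name) is not None
```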

Add comment to the "PDB Code"

This can be selected only when the user provides a custom structure file.
Perhaps we can display this field only after the user selects a structure file.

Update help

Move and update the help page to reflect changes in the UI. The help page should be part of the repository Wiki.

No stripping of white space from PDB codes

When the user pastes a PDB code with trailing spaces, it is not recognized. The correct behavior would be to always strip the white space. Currently, the code is recognized and chain selection is offered only when there are no trailing spaces.
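
A minimal sketch of the intended behavior: strip surrounding white space before checking the code. The pattern assumes a classic PDB code of four alphanumeric characters starting with a digit. The function name is hypothetical.

```python
import re
from typing import Optional

# Assumption: a classic PDB code is a digit followed by three alphanumerics.
PDB_CODE_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def normalize_pdb_code(raw: str) -> Optional[str]:
    """Return the stripped PDB code, or None if it is not a valid code."""
    code = raw.strip()
    if PDB_CODE_PATTERN.fullmatch(code) is None:
        return None
    return code
```

With this, pasted input such as `"7bv2  "` is treated the same as `"7bv2"`.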

Update sequence filtering

As of now, we limit the number of sequences given to MUSCLE by just taking the top n. In order to increase diversity, we should choose every m-th sequence instead.
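
A minimal sketch of the every-m-th selection, assuming the input sequences are already ranked and that m is derived from the desired maximum count. The function name and this way of choosing m are assumptions.

```python
from typing import List, Sequence


def select_every_mth(sequences: Sequence[str], limit: int) -> List[str]:
    """Pick at most `limit` sequences spread evenly over the ranked input."""
    if limit <= 0:
        return []
    # Ceiling division, so the stride never yields more than `limit` items.
    step = max(1, -(-len(sequences) // limit))
    return list(sequences[::step])
```

For ten ranked sequences and a limit of three, this keeps the sequences at positions 0, 4, and 8 instead of the top three.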

Submitting an empty protein

When the user deselects all chains at the input, the computation still proceeds (it starts to collect the MSA and so on) and even runs the prediction. Instead, we should probably disable the submit button and, ideally, show a warning or error message.
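
A minimal sketch of such a check, which could back both the disabled submit button and the message. The function name and the message text are hypothetical.

```python
from typing import Iterable, Optional


def chain_selection_error(selected_chains: Iterable[str]) -> Optional[str]:
    """Return an error message when no chain is selected, otherwise None."""
    # Ignore empty or white-space-only entries.
    chains = [chain for chain in selected_chains if chain.strip()]
    if not chains:
        return "Select at least one chain before submitting."
    return None
```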

Performance issues with protein utils

Describe the bug
Running protein-utils on 19HC causes performance issues when clustering ligands.

To Reproduce

  1. Run .\protein-utils\bin\protein-utils.bat -a p2rank-web --structure=./19HC.pdb --prediction=./19hc.pdb.gz_predictions.csv --residues=./19hc.pdb.gz_residues.csv --output-pocket=./prediction.json --output-sequence=./sequence.json

Expected behavior
The execution finishes in a reasonable time.

Provide option to download raw P2Rank output

It should be possible to download the complete output of P2Rank, including the calculated conservation scores. This is useful not only because PrankWeb doesn't present all the information P2Rank can calculate, but also for debugging.

The old version of PrankWeb allows it. The new version only allows downloading PyMOL visualizations.

Exception processing 2SRC

09/09/2020 07:37:55 [DEBUG] - Executing 'cd /opt/conservation/jense_shannon_divergence/ && 
python2 score_conservation.py /data/p2rank/task/database/v2-conservation/2SRC/working/conservation-A/msa > /data/p2rank/task/database/v2-conservation/2SRC/working/conservation-A/structure_A.score'
Traceback (most recent call last):
File "score_conservation.py", line 831, in <module>
seq_weights = load_sequence_weights(align_file.replace('.%s' % align_suffix, '.weights'))
File "score_conservation.py", line 599, in load_sequence_weights
seq_weights.append(float(l[1]))
ValueError: could not convert string to float: A
Traceback (most recent call last):
File "/opt/p2rank-runtime/execute_p2rank.py", line 359, in <module>
main(_read_arguments())
File "/opt/p2rank-runtime/execute_p2rank.py", line 65, in main
structure_file, chains, configuration, arguments)
File "/opt/p2rank-runtime/execute_p2rank.py", line 190, in prepare_conservation
return compute_from_structure(structure_file, chains, arguments)
File "/opt/p2rank-runtime/execute_p2rank.py", line 236, in compute_from_structure
for chain in chains
File "/opt/p2rank-runtime/execute_p2rank.py", line 236, in <dictcomp>
for chain in chains
File "/opt/p2rank-runtime/execute_p2rank.py", line 255, in compute_from_structure_for_chain
conservation.compute_conservation(fasta_file, working_dir, target_file)
File "/opt/p2rank-runtime/conservation.py", line 58, in compute_conservation
compute_jensen_shannon_divergence(msa_file, output_file)
File "/opt/p2rank-runtime/conservation.py", line 257, in compute_jensen_shannon_divergence
execute_command(cmd)
File "/opt/p2rank-runtime/execute_p2rank.py", line 170, in execute_command
result.check_returncode()
File "/usr/lib/python3.7/subprocess.py", line 428, in check_returncode
self.stderr)
subprocess.CalledProcessError: Command 'cd /opt/conservation/jense_shannon_divergence/ && python2 score_conservation.py /data/p2rank/task/database/v2-conservation/2SRC/working/conservation-A/msa > /data/p2rank/task/database/v2-conservation/2SRC/working/conservation-A/structure_A.score' returned non-zero exit status 1.
