The prolint2 from prolint

Fix issue with the interactions in the Dashboard.

The residue IDs in the dashboard are not correct. The issue is probably similar to the one previously fixed in the exporting functions.

HPC support with the Dask scheduler

Optimize the runner class to allow for the setup of remote machines to run the calculation of the contacts.

Fix command-line version.

The command-line version stopped working due to some issues with the dependencies that need to be solved.

Improve API usage.

Improve the way users can call the software as an API from a Python script.

Command-line access to prolint2 is provided by using `typer` and the config_files. This is not working since the latest release.

Missing dependencies and modules after prolint2 installation via pip

Hello,

After installing prolint2 on Ubuntu 22.04 with:

conda create -n prolint2 python=3.8
conda activate prolint2
pip install prolint2

I find the following missing dependencies: seaborn, logomaker (which can be installed afterwards).

The following module is also not getting installed: prolint2.plotting
Error:
from prolint2.plotting.utils import *
ModuleNotFoundError: No module named 'prolint2.plotting'

Would it be possible to fix this?
Thank you.

Implement a visualization frontend (dashboard)

It is not clear yet how we should go about implementing this. The actual features to implement are even less clear.
We can start with a backbone setup and continue from there.

Polish docs.

Complete command-line section.
Complete the server section.
Use a homogeneous format for the docs in the API integration.
Polish current tutorials and add more.
Add Plotters to the tutorials.

Redefine data structure for the results including different metrics.

Define valuable metrics and the best way to define them in the code.

Implement the analysis of membrane curvature.

Use https://github.com/MDAnalysis/membrane-curvature.

Self-interactions.

Work on the analyses of self-interactions (e.g. lipid-lipid and protein-protein interactions).

Load server data from local file

We need to create an API that enables the storage of server data and subsequent loading of the server from the local data file.
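
A minimal sketch of what such an API could look like, assuming the server payload is a JSON-serialisable dictionary. The function names `save_server_data` and `load_server_data` are hypothetical and are not part of prolint2.

```python
import json
from pathlib import Path


def save_server_data(payload: dict, path: str) -> None:
    """Write the (hypothetical) server payload to a local JSON file."""
    Path(path).write_text(json.dumps(payload, indent=2))


def load_server_data(path: str) -> dict:
    """Read back a payload previously written by save_server_data."""
    return json.loads(Path(path).read_text())
```

The server could then be started from the dictionary returned by `load_server_data` instead of recomputing the contacts. Note that JSON turns non-string dictionary keys into strings, so residue IDs used as keys would need converting back.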

We should make some basic system information very easily accessible. All of this data is already available via different MDAnalysis functions and methods, but it would be nice to have it readily accessible.

Improvements to the calculated and displayed metrics

We need to define the metrics we will calculate and use. Currently, we output a metric variable that is set to a float value. We need to update that to an iterable. I'm not so sure about the actual metrics. We can discuss the specifics during our regular meetings.
For now, we can simply use the average contact and the maximum contact.

From a broader view, we may want to group contact types into categories: default contacts, occupancies, and residence types.
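
The switch from a single float to an iterable of metrics could be sketched roughly as follows. This assumes the input is a sequence of per-frame contact values for one residue; the `METRICS` registry and `compute_metrics` are hypothetical names used only for illustration.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical registry: metric name -> function over per-frame contact values.
METRICS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "max": max,
}


def compute_metrics(contacts: Sequence[float]) -> Dict[str, float]:
    """Return every registered metric instead of a single float."""
    if not contacts:
        # No contacts recorded: report zero for each metric.
        return {name: 0.0 for name in METRICS}
    return {name: float(func(contacts)) for name, func in METRICS.items()}
```

Adding a new metric (for example an occupancy) would then only require a new entry in the registry, which keeps the output format stable while the list of metrics is still under discussion.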

Plotting error

Hello,
This is a repeat of bug #85, but no one has been assigned to it and there has been no movement on it. The module (prolint2.plotting) will not load at all, even when copying the notebooks that are part of this GitHub page.

ModuleNotFoundError: No module named 'prolint2.plotting'

Support for logical operations between groups in the interactive selection.

Add support for + and -, which can be mapped into set operations (e.g. + can be union, | can be intersection, - can be A - (A∩B), and others) during the interactive selection of the Database and Query groups for the calculation of the contacts.

MDAnalysis already implements many of these. I checked the code and it should be possible to add atomgroups directly (and do other operations as well): link.

Have a look at this: https://www.codingem.com/python-__add__-method/.
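
As an illustration of the operator mapping, here is a small sketch that overloads Python's special methods on a stand-in class. `Group` is hypothetical and only holds residue IDs; it is not an MDAnalysis or prolint2 class. The sketch uses `&` for intersection, following Python's built-in set operators, which is an assumption and easy to remap.

```python
from typing import FrozenSet, Iterable


class Group:
    """Hypothetical stand-in for a selection group holding residue IDs."""

    def __init__(self, ids: Iterable[int]):
        self.ids: FrozenSet[int] = frozenset(ids)

    def __add__(self, other: "Group") -> "Group":
        # A + B -> union
        return Group(self.ids | other.ids)

    def __sub__(self, other: "Group") -> "Group":
        # A - B -> A without the shared part, i.e. A - (A∩B)
        return Group(self.ids - other.ids)

    def __and__(self, other: "Group") -> "Group":
        # A & B -> intersection
        return Group(self.ids & other.ids)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Group) and self.ids == other.ids

    def __hash__(self) -> int:
        return hash(self.ids)

    def __repr__(self) -> str:
        return f"Group({sorted(self.ids)})"
```

With this in place, `Group([1, 2]) + Group([2, 3])` evaluates to `Group([1, 2, 3])` and `Group([1, 2]) - Group([2, 3])` to `Group([1])`. If the interactive selection works on MDAnalysis AtomGroups directly, their existing operators can probably be reused instead of a wrapper like this.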

Include parser

Argument parser to use the library directly from the command line.

adding config file to future main.

Add a config file for default parameters of the calculations in the future_main branch.
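
A possible starting point, sketched with the standard-library `configparser`. The section name, the parameter names (`cutoff`, `lipid_types`) and their values are placeholders chosen for illustration, not the actual prolint2 defaults.

```python
import configparser

# Hypothetical defaults; the real parameter names and values may differ.
DEFAULT_CONFIG = """
[Parameters]
cutoff = 7.0
lipid_types = POPC, POPE
"""


def load_parameters(text: str = DEFAULT_CONFIG) -> dict:
    """Parse INI-style text into a dictionary of typed parameters."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["Parameters"]
    return {
        "cutoff": section.getfloat("cutoff"),
        # Comma-separated values are split into a clean list of names.
        "lipid_types": [x.strip() for x in section["lipid_types"].split(",")],
    }
```

In the package itself the text would come from a file shipped with the code (read with `parser.read(path)`), so that users can override the defaults without touching the source.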

Add contact metrics to the contact calculation routine

We should go ahead and implement some basic contact metrics. This is essential to move forward with issue #12.
We can maybe start by adding the sum of all contacts, and the average of contacts.
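
For reference, the two starting metrics could be computed per residue along these lines. The input layout (a dictionary mapping a residue ID to its list of per-frame contact counts) and the name `contact_summary` are assumptions made for this sketch; the real contact data structure may look different.

```python
from typing import Dict, List


def contact_summary(per_frame: Dict[int, List[int]]) -> Dict[int, Dict[str, float]]:
    """Map residue ID -> sum and mean of its per-frame contact counts."""
    summary: Dict[int, Dict[str, float]] = {}
    for resid, counts in per_frame.items():
        total = sum(counts)
        summary[resid] = {
            "sum": float(total),
            # Guard against residues with no recorded frames.
            "mean": total / len(counts) if counts else 0.0,
        }
    return summary
```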

Heatmap/contact projection error

Hello,
I was trying to visualise the contact projection (heatmap) and got the error below. Could you help me, please?
pl.show_contact_projection(T, bf=js[metrics[0]], cmap="Spectral_r")

TypeError Traceback (most recent call last)
/tmp/ipykernel_3340230/3395143762.py in
1 # visualize the first metric
----> 2 pl.show_contact_projection(T, bf=js[metrics[0]], cmap="Spectral_r")

/softwares/Anaconda3/2023.07/envs/prolint/lib/python3.7/site-packages/prolintpy/vis/show_contact_projection.py in show_contact_projection(t, bf, protein, residue_list, ngl_repr, cmap)
69 else:
70 if len(df.resSeq.unique()) != len(bf):
---> 71 raise TypeError ('When projecting only a subset of residues provide a list of tuples: [(residue_id, value), ...]')
72 for atom in resseq:
73 atomic_bfactors.append(bf[atom-1])

TypeError: When projecting only a subset of residues provide a list of tuples: [(residue_id, value), ...]

I used the data.json file and P454_BB.pdb (the file containing 2 peptides) from the output.

Add a cli version of the code

To facilitate the usage of the code, we should implement a cli version of the code. Initially, we only need a bare bones version with support for only the essential components.

Adding exporting options.

Add exporting options so that users can use the contacts data for specific analyses.

Adding the contacts class and basic tests.

Publish v1.0.0

Once #78 has been merged into the main branch, we can upgrade the package version to 1.0, which will be a significant achievement. However, there is still a long way to go: we need (1) extensive testing and (2) additional tutorial/example notebooks.

prolint2.plotting error

Hello,

While working with prolint2, I was not able to import prolint2.plotting and got this error message:
"ModuleNotFoundError: No module named 'prolint2.plotting'"

Any idea about this error, please?

Add unittests.

Increase the coverage of the tests.

Right now the coverage of the unit tests is only 45%, which is quite low. To demonstrate that the code is reliable for external users, we need to increase this as much as possible.

Create Getting Started notebook.

Include Getting Started notebook.
Modify the overview in the documentation, including a benchmark figure and at least one snapshot of the new dashboard.

Implement support for Index Library

Rather than making an artificial distinction between what is and what isn't protein, we should add support for a much more extensive and user-friendly set of groups. We can start by taking the GROMACS make_ndx command as motivation.

Upon loading of user data, we retrieve all atom labels, and group everything into residue level. By default we can define the following labels:

1. System     Size
2. Protein    Size
3. Lipids     Size
4. Water      Size
5. Ions       Size
6. Ligands    Size
7. POPC       Size 
8. POPE       Size
...

Next we take all non-protein residues and list them all.

Now we also need a way to work with this Index Library. One suggestion could be to define a make_library() or make_index() function which takes two arguments: selection and action. Selection is a wrapper around the select_atoms MDAnalysis function, but which adds support for the default groups we define above. E.g. UFCC.make_index('select 1 and not 2', action='a')
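
To make the idea concrete, the selection wrapper could translate group numbers into selection strings before they reach select_atoms. The sketch below is only one possible design: the `DEFAULT_GROUPS` table and the `expand_selection` helper are hypothetical, and only a few of the default groups are filled in.

```python
import re
from typing import Dict

# Hypothetical default groups: index number -> MDAnalysis-style selection string.
DEFAULT_GROUPS: Dict[int, str] = {
    1: "all",            # System
    2: "protein",        # Protein
    7: "resname POPC",   # POPC
    8: "resname POPE",   # POPE
}


def expand_selection(selection: str, groups: Dict[int, str] = DEFAULT_GROUPS) -> str:
    """Replace bare group numbers with their parenthesised selection strings."""
    # Drop the leading 'select' keyword used in the make_ndx-like syntax.
    if selection.startswith("select "):
        selection = selection[len("select "):]

    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group())
        if index not in groups:
            raise KeyError(f"unknown group index: {index}")
        return f"({groups[index]})"

    # Limitation: every standalone integer is treated as a group number,
    # so expressions such as 'resid 1' would need extra handling.
    return re.sub(r"\b\d+\b", substitute, selection)
```

With these assumptions, `expand_selection('select 1 and not 2')` gives `(all) and not (protein)`, which is the kind of string the wrapper would hand to select_atoms.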

Define visualization functions and custom metrics.

Define the same visualization functions that were available in the previous version of prolint, and define a function for custom metrics.

Include parallelization with Dask.

Recreate the parallel routine with Dask once the data structure for the contact results has been defined.

Fix order of the groups during the interactive selection.

Make sure that the order of the selection groups in the interactive selection mode is fixed and consistent between different runs of the code, so that the selection can be scripted.

Adding new lipid types with -al option in the command line.

The list type for the add_lipids variable in the command-line parser doesn't seem to be working as expected.
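
One common cause of this kind of problem in `argparse` is passing `type=list`, which splits a single string into characters; whether that is what happens here is an assumption. A sketch of the usual alternative, using `nargs="+"`, is shown below. The long option name `--add_lipids` and the program name are assumed for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser with a list-valued -al option."""
    parser = argparse.ArgumentParser(prog="prolint2")
    # nargs="+" collects one or more space-separated values into a list;
    # type=list would instead turn "CHOL" into ['C', 'H', 'O', 'L'].
    parser.add_argument(
        "-al", "--add_lipids",
        nargs="+",
        default=[],
        metavar="RESNAME",
        help="extra residue names to treat as lipids",
    )
    return parser
```

Calling `build_parser().parse_args(["-al", "CHOL", "DPPC"])` then yields `add_lipids == ["CHOL", "DPPC"]`, and omitting the option gives an empty list.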

Remove demo branch?

I don't think it is needed now?

Show compute statistics

When calculating contacts, add useful output regarding performance e.g. time it took, resources used, etc.

This output should be enabled by default, and we should provide rough estimates using current dependencies (i.e. no need to add a new dependency).

At this point, we also don't need to worry about the formatting of the output or any other similar details. A simple example:

Calculation Report
------------------
Resource used: CPU
Time per search: 0.01 seconds
Iterations: 1000
Total time: 10 seconds
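
A report like the one above can be produced with the standard library alone. The sketch below assumes the contact search can be wrapped in a zero-argument callable; `timed_run` is a hypothetical helper, and the "Resource used" line is a hard-coded placeholder rather than a measured value.

```python
import time
from typing import Callable


def timed_run(search: Callable[[], None], iterations: int) -> str:
    """Run `search` repeatedly and return a plain-text performance report."""
    start = time.perf_counter()
    for _ in range(iterations):
        search()
    total = time.perf_counter() - start
    per_search = total / iterations if iterations else 0.0
    return "\n".join([
        "Calculation Report",
        "------------------",
        "Resource used: CPU",  # placeholder, not detected
        f"Time per search: {per_search:.4f} seconds",
        f"Iterations: {iterations}",
        f"Total time: {total:.2f} seconds",
    ])
```

Printing the returned string at the end of the contact calculation would satisfy the "enabled by default" requirement without adding a dependency.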

export_to_prolintpy method

This method needs to be optimized or completely replaced. A first step would be to change line 21 of the w2plp.py file.

Loading examples and example files

We should have a dedicated way of loading and working with example files. For example:

from UFCC.examples import GIRK
# we can then load the data using the GIRK.<attribute> notation:
print(GIRK.trajectory)
# $INSTALLATION_DIR/data/GIRK/trajectory.xtc
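
Behind that import, each example could be a small object that knows where its files live. This is a sketch under assumptions: the `Example` class, the `DATA_DIR` constant and the file name `trajectory.xtc` are illustrative, and in the installed package `DATA_DIR` would point inside the installation directory rather than to a relative path.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical layout: <DATA_DIR>/<NAME>/trajectory.xtc
# In a real package this would be anchored at the installation directory.
DATA_DIR = Path("data")


@dataclass(frozen=True)
class Example:
    """A named example system with paths to its data files."""

    name: str

    @property
    def directory(self) -> Path:
        return DATA_DIR / self.name

    @property
    def trajectory(self) -> Path:
        return self.directory / "trajectory.xtc"


GIRK = Example("GIRK")
```

New example systems would then be one-line additions, and every example exposes the same attributes.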
