charlesdavid / jedi

JEDi: Java Essential Dynamics Inspector

License: GNU General Public License v3.0

Java 100.00%

essential-dynamics free-energy-surface pca pca-mode-visualization principal-angles rmsip subspace-analysis

jedi's Issues

Compiling

The manual states that only Jama is needed as a dependency to compile the source code. I found that several other packages are also needed: JFree and bits.fft.FastCosineTransform2d. Providing links to these packages may be useful.

Frame Selection Feature

Let's think about how this should work. I envision this feature as part of the production run of JEDi, so the entire coordinates matrix will be available. We need to specifically address the syntax for selecting COLUMNS of the matrix, compatible with the Jama getMatrix methods.
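One possible shape for the selection syntax, as a minimal sketch only: a comma-separated list of zero-based frame indices and inclusive ranges (for example `0-2, 5, 7-8`) parsed into an `int[]` of column indices. The class name, method name, and the syntax itself are assumptions for discussion, not anything that exists in JEDi.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: none of these names come from the JEDi source.
final class FrameSelectionSketch {

    // Parses a spec such as "0-2, 5, 7-8" into column indices.
    // Indices are assumed to be zero-based and ranges inclusive.
    static int[] parseColumns(String spec, int numFrames) {
        List<Integer> cols = new ArrayList<>();
        for (String token : spec.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue; // tolerate stray commas
            }
            int dash = t.indexOf('-');
            int first = Integer.parseInt(dash < 0 ? t : t.substring(0, dash).trim());
            int last = dash < 0 ? first : Integer.parseInt(t.substring(dash + 1).trim());
            if (first < 0 || last >= numFrames || first > last) {
                throw new IllegalArgumentException("Bad frame range: " + t);
            }
            for (int c = first; c <= last; c++) {
                cols.add(c);
            }
        }
        int[] out = new int[cols.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = cols.get(i);
        }
        return out;
    }
}
```

The resulting array has the form expected by Jama's `Matrix.getMatrix(int i0, int i1, int[] c)`, which takes a row range and an array of column indices, so all rows of the selected frames could be extracted in one call.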

Separate runnable program for doing alignment

Giving users the ability to output the aligned A matrix for post-processing use.

kPCA Logical Switch not working?

Possible bug... can't turn off kPCA...

Commas in text files

Large numbers in text files such as eigenvalues have commas separating them, making them difficult to use with other software.
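For illustration, the difference usually comes down to the grouping flag and the locale passed to the formatter. The sketch below is not taken from the JEDi source; the class and method names are hypothetical. It contrasts a machine-readable format with a human-readable one.

```java
import java.util.Locale;

// Hypothetical helper: shows two ways of formatting the same eigenvalue.
final class EigenvalueFormatSketch {

    // Machine-readable: no grouping separator, '.' as the decimal mark
    // regardless of the default locale of the machine running the code.
    static String raw(double value) {
        return String.format(Locale.ROOT, "%.6f", value);
    }

    // Human-readable: the ',' flag inserts grouping separators,
    // which other software typically cannot parse as a number.
    static String grouped(double value) {
        return String.format(Locale.US, "%,.3f", value);
    }
}
```

Passing an explicit locale also avoids decimal commas appearing on systems whose default locale uses them.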

Very Large Variance in HPCA

The top Eigenvalues for Hierarchical PCA are very large in comparison with the other types in JEDi. Code is being checked for this issue.

Development Update June 2022

JEDi has been running well as most bugs have been found and squashed.
Things to think about for this year:

- Implement better stats for MI kernel. This could be done using PDF-Estimator when there is a 2-D version available.
- SPLOC module
- Neural Net module

Development Update December 2020

JEDi Paper submitted to BMC Bioinformatics December 24, 2020: Now in peer review!
BUGS SQUASHED:
- All parameters now behaving well
- PDB file issues solved by re-formatting files that are not in STD format
  - The FORTRAN Format: current format string implements the STD PDB format, but NOT tool specific variations.
CODE REFACTORING:
- New drivers for POOL, VIZ, ALIGN, KPCA, and SSA
- Messages to STD OUT and STD ERR provide more details
- Log file structure and clarity improved
- The aligned coordinates (transformed) for each subset processed can be written out to file
- The output from the Free Energy Surface method now includes a 3-D Scatter-plot
- All JEDi flat-file output (except coordinate matrices) is automatically plotted
Development:
- Multi Threading fully implemented: Thread Pool
- All output is high quality PNG format for immediate inspection of results
- Outlier processing fully implemented with comparative analysis
- Sparsification fully implemented with identification of activator and suppressor variables
- Verbose option to output all text files with most in BZ2 format
- ML modules planned including SPLOC

Alignment in HPCA

We were comparing the AA and HPCA results on the small subsets of residues as discussed, and we noticed that for a single residue (taking all modes for both PCAs) the eigenvectors were different. We did not expect this for a single residue when taking all modes (we used subsets of glycine because it was the smallest). Of course the whole space was equal, but our working guess as to why the individual vectors were different is that there is an alignment discrepancy between the two PCA types.

We were wondering if you knew off the top of your head how the alignment works for them. Is the whole structure globally aligned in the pre-processing run, and then the same alignment is used in each PCA type, or is there another alignment in each PCA which is unique to that PCA? For example, do you align all the residues locally and individually for HPCA?

Code Review: Reduced C and Dyn Matrices

I am requesting a code review of the formulae used to compute the reduced c-matrices and reduced dynamical matrices. The two methods are in the class PCA.java: static Matrix get_reduced_C_matrix(Matrix Q), and static Matrix get_reduced_DYN_matrix(Matrix DYN).
The key lines of code are:
double cov = (Q.get(a1, b1) + Q.get(a2, b2) + Q.get(a3, b3));
double dyn = -(DYN.get(a1, b1) + DYN.get(a2, b2) + DYN.get(a3, b3));
The original formulae were written in a vectorized form for Matlab, so I want to be sure that the conversion to the component version here is correct.
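To make the component form easier to review, here is a self-contained sketch of the same reduction on plain arrays. It assumes (this is not confirmed above) that the 3N x 3N matrix is ordered as all x coordinates, then all y, then all z, so that for atoms i and j the indices are a1 = i, a2 = i + N, a3 = i + 2N and likewise for b. Under that assumption each reduced element is the trace of the 3 x 3 block coupling the two atoms. The class and method names are hypothetical.

```java
// Hypothetical sketch: reduces a 3N x 3N matrix to N x N by summing the
// xx, yy and zz terms for each atom pair (the trace of each 3 x 3 block).
// Assumed coordinate ordering: [x_1..x_N, y_1..y_N, z_1..z_N].
final class ReducedMatrixSketch {

    static double[][] reduce(double[][] full, boolean negate) {
        int n = full.length / 3; // number of atoms
        double[][] reduced = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double trace = full[i][j]              // xx term
                        + full[i + n][j + n]           // yy term
                        + full[i + 2 * n][j + 2 * n];  // zz term
                // negate = true mirrors the sign used for the dynamical matrix
                reduced[i][j] = negate ? -trace : trace;
            }
        }
        return reduced;
    }
}
```

If the coordinates are instead interleaved per atom (x, y, z, x, y, z, ...), the indices would be 3i, 3i + 1, 3i + 2, so the ordering is the first thing worth checking against the Matlab version.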

Problem selecting atoms from PDBs with more than 99999 atoms

I'm dealing with a big system, part of a virus capsid. It has more than 99999 atoms, so the PDB numbering for the atom field (limited to 5 characters) goes up to 99999 and then wraps back to 1.
The problem arises when running doATOM_LIST, which reads the atom list from the file specified by the ATOMS_LIST variable. The format of that file is:

A 1
A 2
A 3

but the program is ignoring the chain and keeps the coordinates of the last atom with the specified atom number.
For example, in my ATOMS_LIST I have:

A 1419

At the reference pdb with atom number 1419, I have:

ATOM 1419 CA PRO A 93 102,561 103,879 248,131 1,00 63,18 C
ATOM 1419 CA VAL c6504 71,255 177,935 91,271 1,00 26,11 C

But from the application log, the selected atom is:

ATOM 1419 CA VAL c 6504 71,255 177,935 91,271 0,00 C

Could it be fixed, please?
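A sketch of the lookup being requested, matching on chain ID and atom number together rather than on atom number alone. It relies on the standard fixed-width PDB layout (atom serial in columns 7-11, chain ID in column 22) and is not the JEDi implementation; the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch: selects an ATOM record by (chain ID, atom serial).
final class AtomSelectionSketch {

    // Columns 7-11 of a standard PDB ATOM record hold the atom serial.
    static int serial(String line) {
        return Integer.parseInt(line.substring(6, 11).trim());
    }

    // Column 22 holds the chain identifier.
    static char chain(String line) {
        return line.charAt(21);
    }

    // Returns the first ATOM record matching both keys, or null if none does.
    static String select(List<String> pdbLines, char chainId, int atomSerial) {
        for (String line : pdbLines) {
            if (line.length() < 22 || !line.startsWith("ATOM")) {
                continue; // skip non-ATOM or truncated records
            }
            if (chain(line) == chainId && serial(line) == atomSerial) {
                return line;
            }
        }
        return null;
    }
}
```

With the two records shown above, selecting `A 1419` this way would return the PRO record on chain A, while `c 1419` would return the VAL record.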

Development Update for May 2019

I have made a number of changes to JEDi and fixed a number of bugs:

BUGS SQUASHED:
- KPCA now works as expected
- No commas in raw text files... (only in the formatted file for eigenvalue + %var + cum var)
CODE REVISION:
- There is no longer a distinction between SINGLE and MULTI chain PDB files.
  - Single chain PDB files with no chain ids are fixed by adding a default chain id: "A"
  - Residue Lists formatted for Single use will still work for a single chain pdb.
  - Multi chain PDB files work as usual...
    - Multi chain PDB files still MUST have unique chain IDs!
  - Multi residue lists work as normal...
- Down sampling of frames is now implemented
  - Frame range selection may be implemented soon ...
- Messages to STD OUT now more clear and organized
- Log file structure and clarity improved
- The aligned coordinates (transformed) for each subset processed are written out to file.
- The output from the Free Energy Surface method now includes a 3-D Scatterplot
Development:
- Covariance shrinkage --> DONE!
- Clustering of frames --> DONE!
- Multi Threading --> DONE!
- SPLOC --> IN PROGRESS

Multi-Threading and Parallelization

Now that JEDi can perform so many tasks, it seems we should be thinking about parallelization. This would require implementing multi-threading and serialization of the Java code. Thoughts?
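As a starting point for discussion, independent per-subset calculations could be handed to a fixed-size thread pool from `java.util.concurrent`. The sketch below uses a stand-in task (the variance of each data series) purely for illustration; the class and method names are hypothetical and say nothing about how JEDi's analyses would actually be split up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: runs one independent task per data series on a thread pool.
final class ThreadPoolSketch {

    // Stand-in workload: population variance of one series.
    static double variance(double[] values) {
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / values.length;
    }

    // Submits one task per series and collects results in submission order.
    static double[] variancesInParallel(double[][] series, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (double[] s : series) {
                Callable<Double> task = () -> variance(s);
                futures.add(pool.submit(task));
            }
            double[] out = new double[series.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = futures.get(i).get(); // blocks until task i finishes
            }
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }
}
```

Because each task only reads its own input and results are gathered through futures, no shared mutable state or explicit locking is needed in this pattern.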
