
jedi's People

Contributors

charlesdavid

jedi's Issues

Compiling

The manual states that Jama is the only dependency needed to compile the source code. I found that several other packages are also needed, including JFree and bits.fft.FastCosineTransform2d. Providing links to these packages would be useful.

Frame Selection Feature

Let's think about how this should work. I envision this feature as part of the production run of JEDi, so the entire coordinates matrix will be available. We need to specifically address the syntax for selecting COLUMNS of the matrix, in a way that is compatible with the Jama getMatrix methods.
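
As a starting point for discussion, here is a minimal sketch of one possible selection syntax. Everything in it is hypothetical: the class name `FrameSelection`, the method `parse`, and the `"1-100,250,300-400"` string format are assumptions, not part of JEDi. It turns 1-based, inclusive frame ranges into a 0-based `int[]` of column indices, which is the form accepted by Jama's `Matrix.getMatrix(int i0, int i1, int[] c)`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses a frame selection such as "1-3,7,10-12". */
final class FrameSelection {

    /**
     * Converts 1-based, inclusive frame ranges into 0-based column indices.
     * numFrames is the column count of the coordinates matrix.
     */
    static int[] parse(String spec, int numFrames) {
        List<Integer> cols = new ArrayList<>();
        for (String token : spec.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue; // tolerate stray commas
            }
            String[] ends = t.split("-");
            if (ends.length > 2) {
                throw new IllegalArgumentException("Bad range: " + t);
            }
            int first = Integer.parseInt(ends[0].trim());
            int last = ends.length == 2 ? Integer.parseInt(ends[1].trim()) : first;
            if (first < 1 || last > numFrames || first > last) {
                throw new IllegalArgumentException("Range out of bounds: " + t);
            }
            for (int f = first; f <= last; f++) {
                cols.add(f - 1); // Jama indices are 0-based
            }
        }
        int[] out = new int[cols.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = cols.get(i);
        }
        return out;
    }
}
```

With Jama on the classpath and a coordinates matrix `coords` (rows are variables, columns are frames), the selected frames could then be extracted with `coords.getMatrix(0, coords.getRowDimension() - 1, FrameSelection.parse("1-100,250", coords.getColumnDimension()))`.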

Commas in text files

Large numbers in text files, such as eigenvalues, are written with comma separators, which makes the files difficult to use with other software.
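
Commas like these usually come from locale-aware formatting with grouping enabled (for example a `%,.3f` format string or a `NumberFormat` with grouping on). How JEDi actually writes its files is not stated here, so the following is only a sketch of two standard-library ways to write plain numbers; the class name `PlainNumbers` and its methods are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical helper for writing numbers that other tools can parse. */
final class PlainNumbers {

    /** Fixed-point text with '.' as the decimal separator and no grouping. */
    static String fixed(double value, int decimals) {
        // Locale.ROOT avoids locale-specific separators; no ',' flag means no grouping.
        return String.format(Locale.ROOT, "%." + decimals + "f", value);
    }

    /** Same idea with DecimalFormat, with grouping explicitly disabled. */
    static String viaDecimalFormat(double value) {
        DecimalFormat df =
                new DecimalFormat("0.000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        df.setGroupingUsed(false);
        return df.format(value);
    }
}
```

A formatted, human-readable summary file can still use grouping, while the raw files meant for downstream tools use one of the plain forms.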

Very Large Variance in HPCA

The top eigenvalues for Hierarchical PCA are very large compared with those from the other PCA types in JEDi. The code is being checked for this issue.

Development Update June 2022

JEDi has been running well, as most bugs have been found and squashed.
Things to think about for this year:

  • Implement better stats for MI kernel. This could be done using PDF-Estimator when there is a 2-D version available.
  • SPLOC module
  • Neural Net module

Development Update December 2020

  • JEDi Paper submitted to BMC Bioinformatics December 24, 2020: Now in peer review!
  • BUGS SQUASHED:
    • All parameters now behaving well
    • PDB file issues solved by re-formatting files that are not in STD format
      • The FORTRAN format: the current format string implements the STD PDB format, but NOT tool-specific variations.
  • CODE REFACTORING:
    • New drivers for POOL, VIZ, ALIGN, KPCA, and SSA
    • Messages to STD OUT and STD ERR provide more details
    • Log file structure and clarity improved
    • The aligned coordinates (transformed) for each subset processed can be written out to file
    • The output from the Free Energy Surface method now includes a 3-D scatter plot
    • All JEDi flat-file output (except coordinate matrices) is automatically plotted
  • Development:
    • Multi Threading fully implemented: Thread Pool
    • All output is high quality PNG format for immediate inspection of results
    • Outlier processing fully implemented with comparative analysis
    • Sparsification fully implemented with identification of activator and suppressor variables
    • Verbose option to output all text files with most in BZ2 format
    • ML modules planned including SPLOC

Alignment in HPCA

We were comparing the AA and HPCA results on the small subsets of residues, as discussed, and we noticed that for a single residue (taking all modes for both PCAs) the eigenvectors were different. We did not expect this for a single residue when taking all modes (we used subsets of glycine because it was the smallest). Of course, the whole space was equal, but our working guess as to why the individual vectors were different is that there is an alignment discrepancy between the two PCA types.

We were wondering if you knew off the top of your head how the alignment works for them. Is the whole structure globally aligned in the pre-processing run, with the same alignment then used in each PCA type, or is there another alignment within each PCA that is unique to that PCA? For example, do you align all the residues locally and individually for HPCA?

Code Review: Reduced C and Dyn Matrices

I am requesting a code review of the formulae used to compute the reduced c-matrices and reduced dynamical matrices. The two methods are in the class PCA.java: static Matrix get_reduced_C_matrix(Matrix Q), and static Matrix get_reduced_DYN_matrix(Matrix DYN).
The key lines of code are:
double cov = (Q.get(a1, b1) + Q.get(a2, b2) + Q.get(a3, b3));
double dyn = -(DYN.get(a1, b1) + DYN.get(a2, b2) + DYN.get(a3, b3));
The original formulae were written in a vectorized form for Matlab, so I want to be sure that the conversion to the component version here is correct.
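
To help reviewers, here is a sketch of what the two quoted lines appear to compute: each reduced entry for an atom pair (a, b) is the trace of the corresponding 3x3 block, that is, the sum of the xx, yy, and zz components, with the sign flipped for the dynamical matrix. The sketch makes two assumptions that should be checked against PCA.java: the full matrix is 3N x 3N, and its variables are ordered as an x block, then a y block, then a z block, so that a1 = a, a2 = a + N, a3 = a + 2N. It uses plain `double[][]` instead of a Jama `Matrix` so that it is self-contained, and the class name `ReducedMatrixSketch` is hypothetical.

```java
/** Hypothetical sketch: reduces a 3N x 3N matrix to N x N by block traces. */
final class ReducedMatrixSketch {

    /**
     * sign = +1.0 mirrors the quoted covariance line,
     * sign = -1.0 mirrors the quoted dynamical-matrix line.
     */
    static double[][] reduce(double[][] full, double sign) {
        if (full.length % 3 != 0) {
            throw new IllegalArgumentException("Dimension must be a multiple of 3");
        }
        int n = full.length / 3;
        double[][] reduced = new double[n][n];
        for (int a = 0; a < n; a++) {
            for (int b = 0; b < n; b++) {
                // Assumed layout: rows/cols 0..n-1 are x, n..2n-1 are y, 2n..3n-1 are z.
                double trace = full[a][b]
                        + full[a + n][b + n]
                        + full[a + 2 * n][b + 2 * n];
                reduced[a][b] = sign * trace;
            }
        }
        return reduced;
    }
}
```

If the actual layout interleaves coordinates per atom (x1, y1, z1, x2, ...), the index arithmetic would instead be 3a, 3a + 1, 3a + 2, so the layout is the first thing to confirm in the review.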

Problem selecting atoms from PDBs with more than 99999 atoms

I'm dealing with a big system, part of a virus capsid. It has more than 99999 atoms, so the PDB numbering in the atom field (limited to 5 characters) goes up to 99999 and then wraps back to 1.
The problem arises when running doATOM_LIST, which reads the atom list from the file specified by the ATOMS_LIST variable. The format of that file is:

A 1
A 2
A 3

but the program ignores the chain and keeps the coordinates of the last atom with the specified atom number.
For example, in my ATOMS_LIST I have:

A 1419

In the reference PDB, two records have atom number 1419:

ATOM 1419 CA PRO A 93 102,561 103,879 248,131 1,00 63,18 C
ATOM 1419 CA VAL c6504 71,255 177,935 91,271 1,00 26,11 C

But from the application log, the selected atom is:

ATOM 1419 CA VAL c 6504 71,255 177,935 91,271 0,00 C

Could it be fixed, please?
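
One possible direction, sketched under assumptions rather than taken from the JEDi source: index the ATOM records by chain ID and atom number together, so that a wrapped atom number in another chain cannot overwrite an earlier record. The column positions used below are those of the standard PDB fixed-column format (atom serial in columns 7-11, chain ID in column 22). The class `PdbAtomIndex` and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical index of PDB ATOM/HETATM records keyed by chain ID plus atom serial. */
final class PdbAtomIndex {

    private final Map<String, String> byChainAndSerial = new LinkedHashMap<>();

    PdbAtomIndex(List<String> pdbLines) {
        for (String line : pdbLines) {
            if (line.length() < 22) {
                continue; // too short to hold a chain ID
            }
            if (!(line.startsWith("ATOM") || line.startsWith("HETATM"))) {
                continue;
            }
            String serial = line.substring(6, 11).trim(); // columns 7-11
            char chain = line.charAt(21);                 // column 22
            // putIfAbsent keeps the FIRST record for a key, so a later record
            // with the same chain and serial does not replace it.
            byChainAndSerial.putIfAbsent(chain + ":" + serial, line);
        }
    }

    /** Returns the matching record line, or null if none exists. */
    String find(char chain, int serial) {
        return byChainAndSerial.get(chain + ":" + serial);
    }
}
```

With this keying, the entry `A 1419` resolves to the PRO record in chain A rather than the VAL record in chain c. If atom numbers can wrap within a single chain, the key would need a further field, such as the residue number.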

Development Update for May 2019

I have made a number of changes to JEDi and fixed a number of bugs:

  • BUGS SQUASHED:
    • KPA now works as expected
    • No commas in raw text files... (only in the formatted file for eigenvalue + %var + cum var)
  • CODE REVISION:
    • There is no longer a distinction between SINGLE and MULTI chain PDB files.
      • Single chain PDB files with no chain ids are fixed by adding a default chain id: "A"
      • Residue Lists formatted for Single use will still work for a single chain pdb.
      • Multi chain PDB files work as usual...
        • Multi chain PDB files still MUST have unique chain IDs!
      • Multi residue lists work as normal...
    • Down sampling of frames is now implemented
      • Frame range selection may be implemented soon ...
    • Messages to STD OUT now more clear and organized
    • Log file structure and clarity improved
    • The aligned coordinates (transformed) for each subset processed are written out to file.
    • The output from the Free Energy Surface method now includes a 3-D scatter plot
  • Development:
    • Covariance shrinkage --> DONE!
    • Clustering of frames --> DONE!
    • Multi Threading --> DONE!
    • SPLOC --> IN PROGRESS

Multi-Threading and Parallelization

Now that JEDi can perform so many tasks, it seems we should be thinking about parallelization. This would require implementing multi-threading and serialization of the Java code. Thoughts?
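
To make the idea concrete, here is a minimal sketch of running independent jobs on a fixed-size thread pool with `java.util.concurrent`. It is not JEDi code: the class `ParallelJobs`, the method `runAll`, and the job names are hypothetical, and it assumes the jobs do not share mutable state.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical runner: executes named, independent jobs on a thread pool. */
final class ParallelJobs {

    static <T> Map<String, T> runAll(Map<String, Callable<T>> jobs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the jobs can run concurrently.
            Map<String, Future<T>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<T>> e : jobs.entrySet()) {
                pending.put(e.getKey(), pool.submit(e.getValue()));
            }
            // Collect results in submission order; get() blocks until a job finishes.
            Map<String, T> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<T>> e : pending.entrySet()) {
                results.put(e.getKey(), e.getValue().get());
            }
            return results;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for jobs", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("A job failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each analysis type could be wrapped as one `Callable`, for example `jobs.put("cartesian", () -> runCartesianPca())`, where `runCartesianPca` is a placeholder for whatever method performs that analysis.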
