ausaxs's Introduction

Main features

  • Simple foundation: We have implemented the methods in the simplest possible way, making as few assumptions about your data as possible. With the Debye equation as the basis for the scattering profiles, the only loss of accuracy is through the histogram approximation, where we support using both weighted and unweighted bins depending on your preferences. By implementing the technique in modern C++ with efficiency in mind, we have managed to achieve some of the best performance available.
  • Fitting of high-resolution models to SAXS curves: Fit atomic structure files to experimental SAXS data using an efficient implementation of the Debye equation. Various options are available regarding the handling of both the excluded volume and hydration shell.
  • Validation of electron microscopy maps: Validate EM maps using experimental SAXS data. By using the information contained within the EM map itself, dummy structures can be constructed and compared against the SAXS data, serving as a quick quality check on the conformation of the map.
  • Rigidbody optimization: (Still under development) Perform self-consistent and customizable rigidbody optimizations, generating a new hydration shell for each step. Optional calibration with scattering curves predicted by molecular dynamics simulations can limit the number of free parameters to just 2, dramatically reducing the risk of overfitting.

User guides for all of these programs can be found in the wiki.
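
As a rough illustration of the histogram approximation to the Debye equation mentioned in the feature list, here is a minimal sketch. It is not the project's implementation: the `Point`, `DistanceHistogram`, `make_histogram`, and `intensity` names are hypothetical, the scattering weights are taken to be constant in q, and "weighted bins" is assumed here to mean evaluating each bin at the weighted mean of the distances that landed in it rather than at the bin centre.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point scatterer: position plus a constant, q-independent weight.
struct Point {
    std::array<double, 3> r;
    double w;
};

// Per-bin accumulators for the pairwise distances.
struct DistanceHistogram {
    double bin_width;
    std::vector<double> weight;   // sum of w_i * w_j for the pairs in each bin
    std::vector<double> distance; // sum of w_i * w_j * r_ij for the pairs in each bin
};

// Bin all pairwise distances (self-terms included); bin k is centred on k * bin_width.
DistanceHistogram make_histogram(const std::vector<Point>& atoms, double bin_width, std::size_t bins) {
    assert(bin_width > 0.0 && bins > 0);
    DistanceHistogram h{bin_width, std::vector<double>(bins, 0.0), std::vector<double>(bins, 0.0)};
    for (const Point& a : atoms) {
        for (const Point& b : atoms) {
            const double dx = a.r[0] - b.r[0];
            const double dy = a.r[1] - b.r[1];
            const double dz = a.r[2] - b.r[2];
            const double d = std::sqrt(dx * dx + dy * dy + dz * dz);
            const auto k = static_cast<std::size_t>(std::lround(d / bin_width));
            if (k >= bins) { continue; } // this sketch silently drops distances beyond the last bin
            h.weight[k] += a.w * b.w;
            h.distance[k] += a.w * b.w * d;
        }
    }
    return h;
}

// I(q) = sum_k p_k * sin(q r_k) / (q r_k), where r_k is either the bin centre (unweighted)
// or the weighted mean of the distances that landed in the bin (weighted).
double intensity(const DistanceHistogram& h, double q, bool weighted) {
    double sum = 0.0;
    for (std::size_t k = 0; k < h.weight.size(); ++k) {
        if (h.weight[k] == 0.0) { continue; }
        const double r = weighted ? h.distance[k] / h.weight[k]
                                  : static_cast<double>(k) * h.bin_width;
        const double x = q * r;
        sum += h.weight[k] * (std::abs(x) < 1e-9 ? 1.0 : std::sin(x) / x);
    }
    return sum;
}
```

In this sketch the unweighted variant only needs one accumulator per bin, while the weighted variant stores a second one and in return removes the rounding of each distance to its bin centre.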

Installation

Download precompiled binaries

The fastest way to get started is using the most recent precompiled executables available in the releases. Alternatively you can follow the next section to compile the library yourself.

Compile from source

The software can be compiled from source in only a few steps. GCC v11+, Clang v15+, and MSVC 2022+ are supported, though GCC is the preferred option for optimal efficiency.

Linux

  1. Make sure you have the prerequisites installed
    apt-get install cmake make g++ libcurl4-openssl-dev

  2. Clone this repository
    git clone https://github.com/klytje/AUSAXS.git

  3. Prepare the project for compilation
    cmake -B build -S .

  4. Compile your choice of executable
    cmake --build build --target em_fitter

The possible targets are em_fitter, intensity_fitter, and em_fitter_gui. Use the multithreading flag -jX for significantly shorter compilation times.

Windows

  1. Make sure CURL and OpenSSL are available on your system, e.g. through vcpkg

  2. Download or clone this repository
    git clone https://github.com/klytje/AUSAXS.git

  3. Open the project with Visual Studio and compile your choice of executable. Note that this is very memory-intensive with the MSVC compiler, requiring 12GB+ of available memory due to its inefficient handling of constant expressions.

References

Several articles documenting the methods used in this project are currently in various stages of development. The first, on the EM validation methods, is expected to be published soon. Direct links will be provided in this section once they are publicly available.

This project is licensed under the GNU General Public License v3. Alternative licensing arrangements can be discussed upon request. Supported by grant 1026-00209B from the Independent Research Fund Denmark.

ausaxs's Issues

Optimization: Implement custom fitting algorithm

Currently we use dlib only for its fitting routines. This is a big dependency to have just for performing fitting. We should consider using a more specialized library, or even implementing our own algorithm.

Windows: Exception handling

In contrast to GCC, the MSVC compiler apparently does not output unhandled exception messages to the terminal, but instead outputs some cryptic crash dumps. Since we rely heavily on these exceptions for error messaging, we probably should add a try/catch block around the main code blocks in the executables.
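
One possible shape for such a try/catch wrapper is sketched below. `run_guarded` is a hypothetical helper name, not something that exists in the project; the idea is simply that every executable routes its real entry point through one function that prints the exception message to stderr before returning a non-zero exit code.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Runs the body of an executable and reports any escaping exception on stderr,
// so the message is visible regardless of how the runtime treats unhandled exceptions.
int run_guarded(const std::function<int()>& body) {
    try {
        return body();
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << std::endl;
    } catch (...) {
        std::cerr << "Error: unknown exception" << std::endl;
    }
    return 1; // only reached when an exception was caught
}

// Intended use inside an executable (real_main is a placeholder for the existing logic):
//   int main(int argc, char** argv) {
//       return run_guarded([&] { return real_main(argc, argv); });
//   }
```

Keeping the wrapper in one place means the behaviour stays identical across all executables and compilers.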

sasview: Add dynamic binning and q ranges

To be more generally useful in sasview, we need to be able to change the bin widths dynamically for highly ordered structures, and to extend the q-range to both lower and higher values than what is currently allowed.

Implement simple unit class

Implementing a simple unit class would get rid of conversion errors and make it clearer to potential users what kind of values are being returned. We don't perform many unit conversions, though.
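
A minimal sketch of what such a unit class could look like, assuming lengths are the main quantity of interest. The `Length` name and its interface are hypothetical. The value is stored in a single internal unit, and the named constructors and accessors force every conversion to be spelled out at the call site.

```cpp
// Hypothetical length type: the value is always stored in angstrom internally.
class Length {
public:
    // Named constructors make the unit of the input explicit.
    static constexpr Length angstrom(double v) { return Length(v); }
    static constexpr Length nanometer(double v) { return Length(v * 10.0); }

    // Named accessors make the unit of the output explicit.
    constexpr double in_angstrom() const { return value_; }
    constexpr double in_nanometer() const { return value_ / 10.0; }

    constexpr Length operator+(Length other) const { return Length(value_ + other.value_); }
    constexpr bool operator<(Length other) const { return value_ < other.value_; }

private:
    // Private so that a bare double can never silently become a Length.
    constexpr explicit Length(double v) : value_(v) {}
    double value_;
};
```

With this design, a function returning `Length` cannot be misread as returning nanometers or angstrom, since the caller has to pick `in_angstrom()` or `in_nanometer()` explicitly.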

Expand parsing support of PDB atom names

Currently we download the specification file for each residue from the RCSB database, and expect the atom names (e.g. CA, OXT) to match these perfectly. However, some programs seem to use the remoteness indicators instead, which may not necessarily match the names in these specification files.

We should consider adding support for this by reconstructing each residue when it is downloaded and adding these additional aliases to the parsed master file.
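
The alias idea could be sketched as a small lookup table that maps an alternative atom name to the canonical one for a given residue. `AtomAliasTable` and its methods are hypothetical names, and the sketch does not attempt to derive aliases from the residue geometry; it only shows the lookup that the parser would perform once the aliases are known.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical alias table: (residue name, alternative atom name) -> canonical atom name.
class AtomAliasTable {
public:
    void add_alias(const std::string& residue, const std::string& alias, const std::string& canonical) {
        aliases_[std::make_pair(residue, alias)] = canonical;
    }

    // Returns the canonical name if an alias is registered for this residue,
    // otherwise the input name unchanged.
    std::string resolve(const std::string& residue, const std::string& name) const {
        const auto it = aliases_.find(std::make_pair(residue, name));
        return it == aliases_.end() ? name : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, std::string> aliases_;
};
```

For instance, registering `add_alias("ILE", "CD", "CD1")` (an illustrative entry, not taken from the project) would let `resolve("ILE", "CD")` return `CD1`, while names without a registered alias pass through untouched.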

Windows: Broken em_fitter GUI

The em_fitter_gui is partially broken on Windows, and should probably be tested more thoroughly. Currently the following errors are present:

  • Adding the map file first seems to cause crashing.
  • At the end of a fit the program crashes, likely due to being unable to plot the results.
  • Removing the resource folder from scope causes weird graphical issues. This will likely happen on Linux as well. Perhaps the most important resource files should be compiled into bytecode and distributed directly with the executable.

Rigidbody optimization: improve BodyCounterCulling implementation

By request I implemented an early version of body-specific hydration targets, to allow for fewer hydration molecules on the flexible strands of a protein. While the implementation is working, it could be more efficient. Specifically, using distance calculations to determine which body a given hydration molecule belongs to is horribly inefficient, and should be reworked.

ImageStack: Reduce memory consumption

Currently the ImageStack fitting process can consume a lot of memory, so much that the operating system may terminate the program. The main culprit is the generation of the dummy structure, which contains a large number of Atom instances. We can alleviate this by introducing a reduced SimpleAtom class, since we do not need most of the additional information contained within the complete Atom class.
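
To give a feeling for the potential savings, here is a hedged sketch comparing a full atom record with a reduced one. `FullAtom` is a hypothetical stand-in with PDB-like fields, not the project's actual Atom class, and both the use of single-precision floats and the choice of occupancy as the weight are assumptions made only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a complete atom record with PDB-like metadata.
struct FullAtom {
    double x, y, z;
    double occupancy, temperature_factor;
    int serial, residue_sequence;
    std::string name, residue_name, element, chain_id;
};

// Reduced record: only what a dummy-structure scattering calculation would need.
struct SimpleAtom {
    float x, y, z;
    float weight;
};

static_assert(sizeof(SimpleAtom) == 4 * sizeof(float), "SimpleAtom should be tightly packed");
static_assert(sizeof(SimpleAtom) < sizeof(FullAtom), "SimpleAtom should be smaller than FullAtom");

// Copies the coordinates and a single weight, dropping all other metadata.
std::vector<SimpleAtom> reduce(const std::vector<FullAtom>& atoms) {
    std::vector<SimpleAtom> out;
    out.reserve(atoms.size());
    for (const FullAtom& a : atoms) {
        out.push_back(SimpleAtom{static_cast<float>(a.x), static_cast<float>(a.y),
                                 static_cast<float>(a.z), static_cast<float>(a.occupancy)});
    }
    return out;
}
```

Besides the smaller footprint per instance, the reduced record owns no heap-allocated strings, so a vector of them is a single contiguous allocation.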

HistogramManager consistency

There seems to be a factor of 2 difference in the fitted hydration scattering power between the PartialHistogramManagers and the usual HistogramManagers. This should be investigated further.

Built-in plotting functionalities

Currently plots are created through custom "plot script files" handled by an external Python invocation of the plot.py script. Doing everything internally would be a significant improvement: it would simplify the code, and it would likely also help security-wise, since I imagine Windows SmartScreen is not too happy about the external command-line invocations we are currently making.

Possible alternatives:
https://github.com/alandefreitas/matplotplusplus

ARM NEON intrinsics

Currently we only support x86-64 intrinsics. We should consider adding support for arm64 by expanding the CompactCoordinatesData class with additional ARM NEON intrinsics.

Independent GUI executables

Currently the resource folder must be located in the same directory as the executable itself, otherwise the interface will look weird due to missing symbols. Perhaps the most important resource files should be compiled into bytecode and be distributed within the executable itself.

Rigidbody optimization: reduce grid size

As an emergency solution, a minimum size of 1000x1000x1000 has been added for the grid when used in conjunction with the rigidbody optimizer. This was necessary to avoid potential crashes with long flexible proteins, where the initial closed size is not indicative of the total open size. When the structure is opened up during the optimization, the grid will be too small and trigger crashes.

There must be a better way to solve this. Perhaps it should be a user-adjustable parameter, in which case a better error message should be triggered.
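
A sketch of the user-adjustable variant with a clearer error message is shown below. The `Grid` class here is a hypothetical simplification, not the project's grid: it is a cube centred on the origin whose number of bins per axis and cell width are supplied by the caller, and any coordinate that falls outside raises an exception that names the offending value and suggests the remedy.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <stdexcept>

// Hypothetical cubic grid centred on the origin with a caller-supplied size.
class Grid {
public:
    Grid(int bins_per_axis, double cell_width) : bins_(bins_per_axis), width_(cell_width) {
        if (bins_per_axis <= 0 || cell_width <= 0.0) {
            throw std::invalid_argument("Grid: bins_per_axis and cell_width must be positive");
        }
    }

    // Maps a coordinate to bin indices, or throws a descriptive error if it lies outside the grid.
    std::array<int, 3> to_bins(const std::array<double, 3>& r) const {
        std::array<int, 3> bins{};
        for (std::size_t i = 0; i < 3; ++i) {
            const double shifted = r[i] / width_ + bins_ / 2.0;
            if (shifted < 0.0 || shifted >= bins_) {
                std::ostringstream msg;
                msg << "Grid: coordinate " << r[i] << " on axis " << i
                    << " lies outside the grid (" << bins_ << " bins of width " << width_
                    << "). Consider increasing the grid size.";
                throw std::out_of_range(msg.str());
            }
            bins[i] = static_cast<int>(shifted);
        }
        return bins;
    }

private:
    int bins_;
    double width_;
};
```

Combined with a top-level exception handler, a structure that opens up beyond the grid would then produce an actionable message instead of a crash.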
