esmpair's Introduction

ESMPair

Requirement

  • Install the latest version of AlphaFold.

Installation

  • pip install fair-esm
  • cd msa_pair; pip install -e .

ESMPair pipeline

  • To save GPU memory, replace esm/model/msa_transformer.py under {your python path}/site-packages/ with the provided msa_transformer.py, which removes redundant operations.
  • Run the ColAttn MSA pairing. Taking 2p01A from the dataset as an example, build a paired MSA for a heterodimer as follows:
    • Create the subdirectories A and B in the 2p01A directory;
    • Rename 2p01A_domain_0_start_21_end_91.a3m to uniref90.a3m and move it into subdirectory A. Similarly, rename 2p01A_domain_1_start_111_end_215.a3m to uniref90.a3m and move it into B;
    • Run: python colattn_pair.py ./dataset/ {device_id} to get the scoring output col_scores_512.json and the final paired output col_pr_512.json.
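
The directory preparation described above can be scripted. The following is a minimal sketch, not part of the ESMPair code: the helper name prepare_target_dir is hypothetical, and the example path ./dataset/2p01A is an assumption based on the command shown above.

```python
from pathlib import Path
import shutil


def prepare_target_dir(target_dir, chain_a_a3m, chain_b_a3m):
    """Move the two per-chain a3m files into <target_dir>/A and <target_dir>/B.

    Hypothetical helper (not provided by ESMPair). Each file is renamed to
    uniref90.a3m, as the steps above require.
    """
    target_dir = Path(target_dir)
    placed = {}
    for chain, name in (("A", chain_a_a3m), ("B", chain_b_a3m)):
        chain_dir = target_dir / chain
        chain_dir.mkdir(parents=True, exist_ok=True)  # create A/ or B/
        dest = chain_dir / "uniref90.a3m"
        shutil.move(str(target_dir / name), str(dest))  # rename and move
        placed[chain] = dest
    return placed


if __name__ == "__main__":
    # Assumed location of the example target inside the dataset directory.
    prepare_target_dir(
        "./dataset/2p01A",
        "2p01A_domain_0_start_21_end_91.a3m",
        "2p01A_domain_1_start_111_end_215.a3m",
    )
```

After this, python colattn_pair.py ./dataset/ {device_id} can be run as described.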

Output format

  • Scoring output: col_scores_512.json

    {
      "A": {
        "{msa_index}": {
          "blocknum": xxx,
          "description": msa description,
          "score": colattn score
        }
      },
      "B": {
        "{msa_index}": {
          "blocknum": xxx,
          "description": msa description,
          "score": colattn score
        }
      }
    }

  • Paired output: col_pr_512.json

    {
      "A": [
        0,    # 0 is the index of the primary sequence.
        3,    # index of other msas.
        ...
      ],
      "B": [
        0,
        5,
        ...
      ]
    }

Notably, MSA sequences with the same rank in the two chain lists should be paired. In the example above, the (3+1)th sequence of chain A is paired with the (5+1)th sequence of chain B.
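
The rank-wise pairing rule can be illustrated with a short sketch. It assumes that col_pr_512.json on disk is plain JSON (without the explanatory # comments shown above) and that the "A" and "B" lists have equal length. The function names are hypothetical and are not part of the ESMPair code.

```python
import json


def paired_indices(pairing):
    """Pair entries of the same rank from the "A" and "B" index lists."""
    a, b = pairing["A"], pairing["B"]
    if len(a) != len(b):
        # Assumption: both chains list the same number of paired rows.
        raise ValueError("chain A and chain B lists differ in length")
    return list(zip(a, b))


def load_pairs(path):
    """Read a col_pr_512.json-style file and return (index_A, index_B) tuples."""
    with open(path) as f:
        return paired_indices(json.load(f))


# The example values from the output format above.
example = {"A": [0, 3], "B": [0, 5]}
print(paired_indices(example))  # [(0, 0), (3, 5)]
```

Each tuple holds 0-based row indices into the two chain MSAs, so (3, 5) refers to the fourth sequence of chain A and the sixth sequence of chain B.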

esmpair's People

Contributors

allanchen95

esmpair's Issues

esm does not have model.py

Hi,

There is no model.py file under the esm folder after installing fair-esm. How do we replace the esm/model.py in the directory of site-packages/ with the provided model.py?

Reasoning behind use of uniref90.a3m MSA files for pairing

Hi,

Your paper claims:

"We use of the default AlphaFold-Multimer MSA search setting with JackHMMER to search the UniProt database for MSA pairing."

However, after examining the provided code and example, it appears that the code deals exclusively with 'uniref90.a3m' MSA files. This means that the code relies on MSAs generated from the UniRef90 database in 'a3m' format for pairing, which diverges from AlphaFold's default approach, where the MSA used for pairing comes from a JackHMMER search of UniProt and is stored as 'uniprot.sto'.

Additionally, the code employs TaxID for species grouping. However, TaxIDs from the UniRef90 database cover taxonomy ranks other than species, which may introduce challenges in the pairing process.

I kindly request clarification regarding these disparities and the reasoning behind them.
Thank you for your attention to these concerns, and I eagerly await your response.

Sincerely,
Maryam
