matsearch

Overview

matsearch provides an API for searching materials by composition, combining deep learning and materials science techniques. It uses a FAISS (Facebook AI Similarity Search) index for efficient similarity search and DeepChem to extract features from material compositions. The system is designed to aid the discovery and analysis of new material compositions, drawing on recent advances in AI-driven materials science research.

Convenience Note

The faiss.index and feature_vectors.npy files were pre-generated from a dataset of 380,000 materials released by DeepMind (GNoME project), so the api can be used directly without first running vectorize and create_index.

Components

The project consists of three services: api, vectorize, and create_index.

api

Running the API

To run the api, execute the following command:

./start.sh api

This will build a Docker image and start the API service in a container, accessible on port 8080.

Usage

To search for materials similar to a given composition, send a POST request to the /search endpoint with the composition data:

curl -X POST http://localhost:8080/search -H "Content-Type: application/json" -d '{"composition": "KCl"}'
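The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names build_search_request and search are hypothetical and not part of matsearch, and the URL assumes the API is running locally on port 8080 as described above.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust if the API runs elsewhere.
API_URL = "http://localhost:8080/search"


def build_search_request(composition: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request for /search (hypothetical helper)."""
    payload = json.dumps({"composition": composition}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(composition: str, url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body."""
    request = build_search_request(composition, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the API running, search("KCl") should return a dictionary with the keys described under Response Structure.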

Response Structure

The response includes two key pieces of information:

  • distances: A list of distances from the query composition to the similar materials found. Lower values indicate closer similarity to the queried composition.
  • similar: A list of similar material compositions.

Example response:

{
    "distances": [
        0.0023
    ],
    "similar": [
        "NaCl"
    ]
}
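A client will usually want each composition paired with its distance. The sketch below assumes the two lists are index-aligned (the i-th distance belongs to the i-th composition), which the example response suggests but the README does not state explicitly. The function name rank_matches is hypothetical.

```python
import json


def rank_matches(response: dict) -> list:
    """Pair each similar composition with its distance, closest first.

    Assumes "similar" and "distances" are index-aligned lists.
    """
    similar = response["similar"]
    distances = response["distances"]
    if len(similar) != len(distances):
        raise ValueError("'similar' and 'distances' differ in length")
    # Lower distance means closer to the query, so sort ascending.
    return sorted(zip(similar, distances), key=lambda pair: pair[1])


# The example response from this README.
example = json.loads('{"distances": [0.0023], "similar": ["NaCl"]}')
matches = rank_matches(example)
```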

vectorize

The vectorize service is responsible for processing the material compositions and converting them into feature vectors. This is done using the ElementPropertyFingerprint from DeepChem, which creates a fingerprint based on elemental stoichiometry.
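To illustrate what "elemental stoichiometry" means here, the sketch below converts a formula string into atomic fractions. This is not DeepChem's featurizer and not the code used by vectorize; it is a toy illustration of the stoichiometric input that a composition fingerprint is built on. The function name element_fractions is hypothetical, and only flat formulas such as "KCl" or "Fe2O3" are handled.

```python
import re
from collections import Counter

# An element symbol followed by an optional amount, e.g. "Fe2" or "O".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def element_fractions(formula: str) -> dict:
    """Return the atomic fraction of each element in a flat formula.

    Parentheses, hydrates, and charges are out of scope for this sketch.
    """
    counts = Counter()
    for symbol, amount in _TOKEN.findall(formula):
        # A missing amount means one atom of that element.
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no elements found in {formula!r}")
    return {symbol: n / total for symbol, n in counts.items()}
```

For example, element_fractions("KCl") gives equal fractions for K and Cl, which is also what "NaCl" gives for Na and Cl.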

Running the Service

Execute:

./start.sh vectorize

This will read material compositions from a CSV file, featurize each composition, and save the resulting feature vectors as a NumPy array.

create_index

The create_index service creates a FAISS index from the feature vectors generated by vectorize. This index is used for efficient similarity searches in the api.

Running the Service

Execute:

./start.sh create_index

This will load the feature vectors, create a FAISS index, and save it for use by the api.
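Conceptually, a similarity index answers the question "which stored vectors are closest to this query?". The sketch below shows the exhaustive version of that search in plain Python. It is not FAISS and not the project's code: the function name nearest is hypothetical, and Euclidean distance is an assumption, since the README does not state which index type or metric create_index uses.

```python
import math


def nearest(query, vectors, k=1):
    """Return the k closest vectors as (distance, index) pairs.

    Brute-force search: compares the query against every stored vector.
    An index such as FAISS exists to make this lookup efficient at scale.
    """
    scored = [(math.dist(query, vector), i) for i, vector in enumerate(vectors)]
    # Tuples sort by distance first, then by index to break ties.
    return sorted(scored)[:k]


# Three toy 2-D "feature vectors"; real fingerprints have many more dimensions.
vectors = [[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]]
closest = nearest([0.0, 0.0], vectors, k=2)
```

The returned indices play the role of row numbers into the saved feature-vector array, which the api can then map back to material compositions.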

Technologies

  • DeepChem: Used for featurizing material compositions.
  • FAISS: Provides efficient similarity search for high-dimensional vectors.
  • Flask: Serves the API for searching material compositions.
  • Pandas & NumPy: For data manipulation and array operations.
  • Docker: For containerizing and orchestrating the services.

Contact us for clarifications or contributions.
