

CLAM: Clustered Learning of Approximate Manifolds (v0.12.3)

CLAM is a Rust/Python library for learning approximate manifolds from data. It is designed to be fast, memory-efficient, easy to use, and scalable for big data applications.

CLAM provides utilities for fast search (CAKES) and anomaly detection (CHAODA).

At the time of writing, the project is still pre-1.0: the API is not yet stable, and breaking changes may occur frequently.

Usage

CLAM is a library crate, so you can add it to your crate using:

> cargo add abd_clam@0.12.3

Here is a simple example of how to use CLAM to perform nearest neighbors search:

use abd_clam::cluster::PartitionCriteria;
use abd_clam::dataset::VecVec;
use abd_clam::cakes::CAKES;
use abd_clam::utils::synthetic_data;

fn euclidean(x: &[f32], y: &[f32]) -> f32 {
    x.iter()
        .zip(y.iter())
        .map(|(a, b)| (a - b).powi(2))
        .sum::<f32>()
        .sqrt()
}

fn search() {
    // Get the data and queries.
    let seed = 42;
    let data: Vec<Vec<f32>> = synthetic_data::random_f32(100_000, 10, 0., 1., seed);
    let queries: Vec<Vec<f32>> = synthetic_data::random_f32(1_000, 10, 0., 1., 0);

    let dataset = VecVec::new(data, euclidean, "demo".to_string(), false);
    let criteria = PartitionCriteria::new(true).with_min_cardinality(1);
    let model = CAKES::new(dataset, Some(seed)).build(&criteria);
    // The CAKES struct provides the functionality described in our
    // [CHESS paper](https://arxiv.org/abs/1908.08551).

    let (query, radius, k) = (&queries[0], 0.05, 10);

    let rnn_results: Vec<(usize, f32)> = model.rnn_search(query, radius);
    // This is how we perform ranged nearest neighbors search with radius 0.05
    // around the query.

    let knn_results: Vec<(usize, f32)> = model.knn_search(query, k);
    // This is how we perform k-nearest neighbors search for the 10 nearest
    // neighbors of query.

    // Both results are a Vec of 2-tuples, where each tuple holds the index of a
    // point in the data and its distance from the query.

    todo!()
}
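
To make the shape of these results concrete, here is a minimal brute-force sketch that returns the same kind of `Vec<(usize, f32)>` for ranged and k-nearest neighbors search. It is not how CAKES works internally; it is a naive linear scan that can serve as a reference when sanity-checking results on small inputs. The names `brute_force_rnn` and `brute_force_knn` are hypothetical, and the choices to include points exactly at `radius` and to sort k-NN hits by increasing distance are assumptions, not documented CLAM behavior.

```rust
/// Euclidean distance between two points of equal dimension.
fn euclidean(x: &[f32], y: &[f32]) -> f32 {
    x.iter()
        .zip(y.iter())
        .map(|(a, b)| (a - b).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Naive ranged search (hypothetical helper): every point within `radius`
/// of `query`, returned as (index, distance) in data order.
/// Assumption: points exactly at distance `radius` are included.
fn brute_force_rnn(data: &[Vec<f32>], query: &[f32], radius: f32) -> Vec<(usize, f32)> {
    data.iter()
        .enumerate()
        .map(|(i, p)| (i, euclidean(p, query)))
        .filter(|&(_, d)| d <= radius)
        .collect()
}

/// Naive k-NN search (hypothetical helper): the `k` closest points to
/// `query`, returned as (index, distance) sorted by increasing distance.
fn brute_force_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut hits: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, p)| (i, euclidean(p, query)))
        .collect();
    // `total_cmp` gives a total order on f32, so sorting cannot panic on NaN.
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits.truncate(k);
    hits
}
```

Both helpers compute the distance from the query to every point, so each query costs time linear in the size of the data. That makes them suitable only for small datasets or for spot-checking.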

License

MIT

Citation

TODO

clam's People

Contributors

nishaq503, thoward27, dependabot[bot], morganprior, olwmc
