Code Monkey home page Code Monkey logo

slurp's Introduction

Slurp

Slurp is a command-line tool that allows you to fetch, list, rank, and summarize research papers from arXiv based on your interests. It provides functionality to rank research papers based on keywords, and generate simplified summaries of paper abstracts using OpenAI or Anthropic APIs.

Install

  1. Create and activate a virtual env:
python -m venv myvenv
source myvenv/bin/activate
  1. Install:
pip install git+https://github.com/0v00/slurp.git
  1. Alternatively, for developing:
git clone https://github.com/0v00/slurp.git
cd slurp
pip install -e .

Setup

Run the setup command to configure your research interests and generate a config.toml file:

slurp setup

Enter your research interests and keywords when prompted, separated by commas.

Add interests according to arXiv category taxonomy e.g. ro for robotics, cv for computer vision etc.

Keywords can be any strings e.g. llm, reinforcement learning etc.

Open the generated config.toml file and add your OpenAI and/or Anthropic API keys.

Usage

Scrape Papers

To scrape info from papers based on your configured interests, run the following command:

slurp up

This will scrape arXiv for papers matching your interests and save them to a local db.

List Papers

To list the scraped papers, use the list command:

slurp list [--interest INTEREST] [--count COUNT]

  • --interest: Filter papers by a specific interest. Uses arXiv category taxonomy.
  • --count: Specify the number of papers to list (default: 10). Use all to list all papers.

Example:

slurp list --interest "ro" --count 3

Output:

Title: GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary  Robotic Grasping
URL: https://arxiv.org/abs/2403.09637
PDF URL: https://arxiv.org/pdf/2403.09637

Title: Scalable Autonomous Drone Flight in the Forest with Visual-Inertial SLAM  and Dense Submaps Built without LiDAR
URL: https://arxiv.org/abs/2403.09596
PDF URL: https://arxiv.org/pdf/2403.09596

Title: ExploRLLM: Guiding Exploration in Reinforcement Learning with Large  Language Models
URL: https://arxiv.org/abs/2403.09583
PDF URL: https://arxiv.org/pdf/2403.09583

Rank Papers

To rank the scraped papers based on title and abstract similarity to config.toml keywords, use the rank command:

slurp rank COUNT

  • COUNT: Specify the number of top papers to rank and display.

These rankings will be generated from analyzing all papers (e.g. machine learning, robotics, etc.) in the db. This uses all-MiniLM-L6-v2 and util.cos_sim. It might take awhile for this command to run for the first time as it will download the model.

Example:

slurp rank 3

Output:

Keyword: reinforcement
--------------------------------------------------
Top 1
Title: Rethinking Adversarial Inverse Reinforcement Learning: From the Angles  of Policy Imitation and Transferable Reward Recovery
URL: https://arxiv.org/abs/2403.14593
PDF URL: https://arxiv.org/pdf/2403.14593
Interest: ml

Top 2
Title: Physics-Based Causal Reasoning for Safe & Robust Next-Best Action  Selection in Robot Manipulation Tasks
URL: https://arxiv.org/abs/2403.14488
PDF URL: https://arxiv.org/pdf/2403.14488
Interest: ro

Top 3
Title: Tell Me What You Want (What You Really, Really Want): Addressing the  Expectation Gap for Goal Conveyance from Humans to Robots
URL: https://arxiv.org/abs/2403.14344
PDF URL: https://arxiv.org/pdf/2403.14344
Interest: ro

Generate Abstract Summary

To generate a simplified summary of a paper's abstract, as well as discover key words and terms useful for understanding the paper, use the abstract command:

slurp abstract PAPER_URL --service SERVICE

  • PAPER_URL: The arXiv URL of the paper.
  • --service: Select the service to use for generating the summary (openai or anthropic).

If you use an arXiv URL that is not yet in the DB, it will be scraped and added before generating a summary.

Example:

slurp abstract https://arxiv.org/abs/2403.14626 --service anthropic

Output:

Key Terms and Ideas:
1. Synchronisation: The process of making multiple systems or agents work together in a coordinated manner.
2. Consensus methods: Techniques used to achieve agreement among multiple agents in a distributed system.
3. Peaking phenomenon: An undesirable transient behaviour where the system's output exhibits a large overshoot before settling to the desired value.

# ...more key terms and ideas

Key Takeaway:
This study proposes a synchronisation-oriented approach to adaptive control, 
which views model reference adaptation as a synchronisation problem between actual and virtual dynamic systems.
By designing a coupling input to achieve desired closed-loop error dynamics
and shaping the collective behaviour through input allocation,
the proposed approach enables a more systematic and generalised way
to design adaptive control systems with improved transient response characteristics
and mitigated peaking phenomenon.

Download Paper

To download a paper's PDF based on its URL, use the download command:

slurp download PAPER_URL [--path PATH]

  • PAPER_URL: The arXiv URL of the paper.
  • --path: Specify the download path (default: current working directory).

Example:

slurp download https://arxiv.org/abs/2403.14626 --path /PATH/TO/DIR/

Output:

Paper downloaded successfully: /PATH/TO/DIR/TITLE.pdf

TODO:

  • Add option for slurp up to only fetch the N most recent papers
  • Summarize entire paper
  • Summarize page in a paper
  • Use arXiv API so we don't get rate-limited or blocked
  • Add paper overlap feature - use embeddings
  • Do the meme, rewrite in Rust and make into a TUI

slurp's People

Contributors

0v00 avatar

Watchers

Kostas Georgiou avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.