VisionScript

VisionScript is an abstract programming language for doing common computer vision tasks, fast.

VisionScript is built in Python, offering a simple syntax for running object detection, classification, and segmentation models. Read the documentation.

View the demo.

Get Started 🚀

First, install VisionScript:

pip install visionscript

You can then run VisionScript using:

visionscript

This will open a VisionScript REPL in which you can type commands.

Run a File 📁

To run a VisionScript file, use:

visionscript ./your_file.vic

Use VisionScript in a Notebook 📓

VisionScript offers an interactive web notebook through which you can run VisionScript code.

To use the notebook, run:

visionscript --notebook

This will open a notebook in your browser. Notebooks are ephemeral: to save your work, copy your code to a file.

Quickstart 🚀

Find people in an image using object detection

Load["./photo.jpg"]
Detect["person"]
Say[]

Find people in all images in a folder using object detection

In["./images"]
    Detect["person"]
    Say[]
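For readers who think in Python, the per-image loop that In[] expresses can be pictured roughly as follows. This is only an illustrative sketch, not VisionScript's implementation: the helper names iter_images and run_block and the list of file extensions are assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Iterator

# Assumed set of image extensions; the real In[] may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def iter_images(folder: str) -> Iterator[Path]:
    """Yield image files in `folder` in sorted order (hypothetical helper)."""
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield path


def run_block(folder: str, block: Callable[[Path], None]) -> int:
    """Call `block` once per image, like the indented body under In[]."""
    count = 0
    for path in iter_images(folder):
        block(path)
        count += 1
    return count
```

Calling run_block("./images", print) would print each image path once, in the same way the indented Detect and Say statements run once per image.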

Replace people in a photo with an emoji

Load["./abbey.jpg"]
Size[]
Say[]
Detect["person"]
Replace["emoji.png"]
Save["./abbey2.jpg"]

Classify an image

Load["./photo.jpg"]
Classify["apple", "banana"]

Installation 👷

To install VisionScript, clone this repository and run pip install -r requirements.txt.

Then, make a file ending in .vic in which to write your VisionScript code.

When you have written your code, run:

visionscript ./your_file.vic

Run in debug mode

Running in debug mode shows the full Abstract Syntax Tree (AST) of your code.

visionscript ./your_file.vic --showtree=True

Debug mode is useful for debugging code while adding new features to the VisionScript language.

Inspiration 🌟

The inspiration behind this project was to build a simple way of doing one-off tasks.

Consider a scenario where you want to run zero-shot classification on a folder of images. With VisionScript, you can do this in three lines of code:

In["./images"]
    Classify["cat", "dog"]
    Say[]

VisionScript is not meant to be a full programming language for all vision tasks, but rather an abstract way of doing common tasks.

VisionScript is ideal if you are new to concepts like "classify" and "segment" and want to explore what they do to an image.

Syntax

The syntax is inspired by both Python and the Wolfram Language. VisionScript is an interpreted language, run line-by-line like Python. Statements use the format:

Statement[argument1, argument2, ...]

This is the same format as the Wolfram Language.
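To make the statement format concrete, here is a minimal Python sketch that splits one line of the form Statement[argument1, argument2, ...] into a name and a list of arguments. It is not VisionScript's actual parser: the regular expressions and the parse_statement function are assumptions for illustration, and the sketch only handles quoted strings and plain numbers, not nested statements.

```python
import re

# Illustrative patterns only; VisionScript's real grammar may differ.
STATEMENT = re.compile(r'^\s*([A-Za-z_]\w*)\[(.*)\]\s*$')
ARGUMENT = re.compile(r'"([^"]*)"|(-?\d+(?:\.\d+)?)')


def parse_statement(line):
    """Return (name, args) for a line such as Classify["apple", "banana"]."""
    match = STATEMENT.match(line)
    if match is None:
        raise ValueError(f"not a statement: {line!r}")
    name, raw_args = match.groups()
    args = []
    # findall yields (text, "") for quoted strings and ("", digits) for numbers.
    for text, number in ARGUMENT.findall(raw_args):
        if number:
            args.append(float(number) if "." in number else int(number))
        else:
            args.append(text)
    return name, args


print(parse_statement('Classify["apple", "banana"]'))
print(parse_statement('Say[]'))
```

Leading whitespace is ignored here, so an indented line such as one under In[] parses the same way; a real interpreter would use that indentation to group statements.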

Lexical Inference and Memory

A unique feature (I think!) of VisionScript compared to other languages is lexical inference.

You don't need to declare variables to store images, etc. Rather, you can let VisionScript do the work. Consider this example:

Load["./photo.jpg"]
Size[]
Say[]

Here, Size[] and Say[] are not given any arguments. Rather, they use the result of the preceding statement. The Wolfram Language has a feature to get the last output using %. VisionScript uses the same concept, but with a twist: Size[] and Say[] don't accept any arguments at all.
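One way to picture this behaviour is as an interpreter that keeps a single "last result" slot which every statement reads and then overwrites. The Python sketch below is a toy model under that assumption, not VisionScript's real implementation; the Session class, its method names, and the fixed image dimensions are all made up for illustration.

```python
class Session:
    """Toy interpreter state: each statement reads and replaces the last result."""

    def __init__(self):
        self.last = None   # result of the most recent statement
        self.output = []   # record of everything Say[] has reported

    def load(self, path):
        # Stand-in for Load[]: a dict with made-up dimensions instead of real pixels.
        self.last = {"path": path, "width": 640, "height": 480}

    def size(self):
        # Size[] takes no arguments; it reads the image left by the previous statement.
        image = self.last
        self.last = (image["width"], image["height"])

    def say(self):
        # Say[] takes no arguments either; it reports whatever came last.
        self.output.append(str(self.last))
        print(self.last)


session = Session()
session.load("./photo.jpg")   # Load["./photo.jpg"]
session.size()                # Size[]
session.say()                 # Say[]
```

Because the slot is implicit, the order of statements carries the data flow that variables would carry in other languages.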

Developer Setup 🛠

If you want to add new features or fix bugs in the VisionScript language, you will need to set up a developer environment.

To do so, clone the language repository:

git clone https://github.com/capjamesg/VisionScript

Then, install the required dependencies and VisionScript:

pip install -r requirements.txt
pip install -e .

Now, you can run VisionScript using:

visionscript

Supported Models 📚

VisionScript provides abstract wrappers around:

  • CLIP by OpenAI (Classification)
  • Ultralytics YOLOv8 (Object Detection Training, Segmentation Training)
  • FastSAM by CASIA-IVA-Lab (Segmentation)
  • GroundedSAM (Object Detection, Segmentation)
  • BLIP (Caption Generation)
  • ViT (Classification Training)

License 📝

This project is licensed under an MIT license.

Contributors

capjamesg, mahimairaja
