VisualNexus

VisualNexus is an open-source training pipeline designed to facilitate the segmentation and labeling of visual datasets using a single model. It allows you to generate richly detailed labeled datasets for downstream fine-tuning of models. With VisualNexus, you can streamline the process of data preparation and enhance the efficiency of training computer vision models.

Getting Started

To get started with VisualNexus, please follow the introductory instructions below; more detailed, per-class instructions are provided further down.

  1. Clone the VisualNexus repository:
git clone https://github.com/kyegomez/VisualNexus.git
cd VisualNexus
  2. Install the required dependencies:
pip install -r requirements.txt
  3. Change into the models directory:
cd models
  4. Enter your dataset in either sag_img.py or sag_video.py, then run the script:
python3 sag_img.py
  5. The labeled dataset will be stored in the specified output folder, ready for downstream fine-tuning.

Key Features

  • Seamless segmentation and labeling of visual datasets.
  • Efficient data preprocessing and augmentation techniques.
  • Support for multiple input data formats.
  • Richly detailed labeled datasets for downstream model training.
  • Easy integration with existing computer vision workflows.
  • A share-with-friends feature to spread awareness across social media platforms.

Architecture

VisualNexus employs a simple yet effective architecture to perform segmentation and labeling tasks. The pipeline consists of the following steps:

  1. Data Input: VisualNexus accepts input visual datasets in a variety of formats, including images, videos, and other compatible data formats.

  2. Segmentation and Labeling: The dataset is processed using a pre-trained segmentation model, which accurately identifies and segments objects or regions of interest within the visuals. This step produces richly detailed labels for each data instance.

  3. Dataset Storage: The labeled dataset is stored in a structured format, ensuring the preservation of the original data along with the corresponding segmentation and labeling information. This format facilitates easy integration with downstream model training processes (a hypothetical record layout is sketched after this list).

  4. Downstream Fine-tuning: The generated labeled dataset can be seamlessly used for fine-tuning other computer vision models. The labeled data provides valuable annotations, enabling the models to learn from high-quality labeled examples and improve their performance.
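
This README does not pin down the exact storage schema, so the following is only a rough sketch of what a stored record for step 3 might look like, assuming a JSON-lines layout. Every field name and path below is an illustrative assumption, not the actual output of sag_img.py or sag_video.py.

import json
from pathlib import Path

# Hypothetical record layout (assumption): one JSON line per source file,
# pairing the preserved original data with its segmentation masks and labels.
# The real schema is whatever sag_img.py / sag_video.py write and may differ.
record = {
    "source_path": "data/images/0001.png",              # original input, preserved
    "masks": ["masks/0001_0.png", "masks/0001_1.png"],   # one mask file per segmented region
    "labels": ["person", "bicycle"],                     # one label per mask
}

out_file = Path("output/labeled_dataset.jsonl")
out_file.parent.mkdir(parents=True, exist_ok=True)
with out_file.open("a") as f:
    f.write(json.dumps(record) + "\n")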

Detailed Instructions:

For Images Only

If you are looking to use VisualNexus for images only, the SAG_IMG class is suitable for this purpose. Follow these steps (a consolidated example follows the list):

  1. Import the SAG_IMG class from visualnexus.py:
from visualnexus import SAG_IMG
  2. Create an instance of SAG_IMG with the necessary parameters. These parameters include the paths to your images, model parameters, and optional settings. Below is an example:
image_file_paths = load_hf_dataset(dataset_name="your_dataset") # Replace "your_dataset" with your dataset's name
img_seg = SAG_IMG(image_file_paths)
  3. Call the segment method on the SAG_IMG instance to perform the segmentation:
img_seg.segment()
  4. If you want to save the resulting dataset, call the create_dataset method:
img_seg.create_dataset()
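
Putting these steps together, here is a minimal end-to-end sketch. This README does not show where load_hf_dataset comes from, so the example below passes a plain list of local image paths instead; whether SAG_IMG accepts such a list directly is an assumption.

from visualnexus import SAG_IMG

# Assumption: a plain list of local image paths can stand in for the
# load_hf_dataset(...) call shown above, whose import path is not given here.
image_file_paths = ["data/images/0001.png", "data/images/0002.png"]

img_seg = SAG_IMG(image_file_paths)  # model parameters and optional settings go here
img_seg.segment()                    # run segmentation and labeling over the images
img_seg.create_dataset()             # save the labeled dataset for downstream fine-tuning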

For Videos Only

If you are dealing with videos, the SAG_VID class can be used. Follow these steps (a consolidated example follows the list):

  1. Import the SAG_VID class from visualnexus.py:
from visualnexus import SAG_VID
  2. Create an instance of SAG_VID with the necessary parameters. These parameters include the paths to your videos, model parameters, and optional settings. Below is an example:
video_file_paths = load_hf_dataset(dataset_name="your_dataset") # Replace "your_dataset" with your dataset's name
vid_seg = SAG_VID(video_file_paths)
  3. Call the segment method on the SAG_VID instance to perform the segmentation:
vid_seg.segment()
  4. If you want to save the resulting dataset, call the create_dataset method:
vid_seg.create_dataset()
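
As with images, the video workflow condenses to a few lines. This sketch assumes SAG_VID accepts a plain list of local video paths in place of the load_hf_dataset output, mirroring the image example above.

from visualnexus import SAG_VID

# Assumption: a plain list of local video paths stands in for load_hf_dataset(...).
video_file_paths = ["data/videos/clip_0001.mp4", "data/videos/clip_0002.mp4"]

vid_seg = SAG_VID(video_file_paths)  # model parameters and optional settings go here
vid_seg.segment()                    # segment every video in the list
vid_seg.create_dataset()             # save the labeled video dataset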

For Both Images and Videos

If your dataset contains both images and videos, the SAG_MEDIA class can be used. Follow these steps (a consolidated example follows the list):

  1. Import the SAG_MEDIA class from visualnexus.py:
from visualnexus import SAG_MEDIA
  2. Create an instance of SAG_MEDIA with the necessary parameters. These parameters include the paths to your images and videos, model parameters, and optional settings. Below is an example:
image_file_paths = load_hf_dataset(dataset_name="your_dataset") # Replace "your_dataset" with your dataset's name
video_file_paths = load_hf_dataset(dataset_name="your_dataset") # Replace "your_dataset" with your dataset's name
media_seg = SAG_MEDIA(image_file_paths, video_file_paths)
  3. Call the segment method on the SAG_MEDIA instance to perform the segmentation:
media_seg.segment()
  4. If you want to save the resulting dataset, call the create_dataset method:
media_seg.create_dataset()
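
For mixed datasets, the same pattern applies with both path lists passed to SAG_MEDIA. As above, passing plain local path lists rather than the load_hf_dataset output is an assumption.

from visualnexus import SAG_MEDIA

# Assumption: plain local path lists work in place of load_hf_dataset(...).
image_file_paths = ["data/images/0001.png"]
video_file_paths = ["data/videos/clip_0001.mp4"]

media_seg = SAG_MEDIA(image_file_paths, video_file_paths)
media_seg.segment()         # segment both the images and the videos
media_seg.create_dataset()  # save the combined labeled dataset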

Now your dataset is ready for downstream tasks like model fine-tuning. Please note that you will need to replace "your_dataset" with the actual name of your dataset when loading the images and videos.

For more details, please refer to the documentation in the visualnexus.py script. The script is designed to be flexible and customizable, so feel free to modify it according to your needs.

Share with Friends

Help us spread awareness about VisualNexus by sharing it on various social media platforms:

Let's share VisualNexus and help the community benefit from this open-source training pipeline for visual dataset segmentation and labeling!

Contributing

Contributions are welcome! If you have any ideas, suggestions, or bug reports, please feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License.

Acknowledgements

We would like to express our gratitude to the open-source community for their invaluable contributions and inspiration.

Contact

Email kye at [email protected]

2 Pipelines

2 potential pipelines, what do you think? We should make one for robotic datasets

  • Infinigen -> Segment Anything Video -> very rich and detailed dataset.

  • Segment Anything for image and/or video -> iterate over the dataset and segment -> very rich, detailed, and structured dataset for pretraining?

  • Video pipeline: video instruction dataset -> Segment Anything Video -> Orca-like "explain your reasoning" instruction-tuning creation -> super-SOTA visual pretraining dataset for vision models.

  • From a Discord chat in LAION with Christoph:

*"This looks wonderful
reclaimer — Today at 7:33 AM
We could make like an multi modal dataset with images or video dataset like Orca’s where the model explains its multi modal reasoning 
Christoph — Today at 10:13 AM
Yes, exactly. Or we take lavar captions and Feed them in GPT 4 to reason based on the Text about the Image contents, the causes, consequeces.... 😉
reclaimer — Today at 11:41 AM
We could also sprinkle in video segment anything to show the model the segmentations with detailed reasoning explanations"

Roadmap

  • Create documentation for all classes

  • Create examples for Robotic Vision, Medical Imaging, Autonomous Driving, and Satellite Imagery

  • Update examples

visualnexus's Issues

Request to add to FastSAM's README.md

Hey @kyegomez,

I'm the co-author of FastSAM. Thank you very much for your promising pipeline and for choosing FastSAM to help with it!

Would you like to link to FastSAM's README.md to show everyone this promising application?

For the potential pipelines, I support the second one

2 Pipelines
2 potential pipelines, what do you think? We should make one for robotic datasets

1. Infinigen -> Segment Anything Video -> Very Rich and Detailed Dataset.

2. Segment Anything for Image and or Video -> Iterate over Dataset and segment-> Very Rich and Detailed Dataset structured dataset for pretraining?

I support option 2. Segment Anything (SAM) from Facebook has shown its power in image tasks, and fine-tuning and acceleration of SAM are the two main directions for its future. Creating more and more good-quality datasets is therefore valuable work, built on fine-tuning SAM combined with hand labeling. Future labeling pipelines will be based on a cycle of the fine-tune-SAM workflow, with hand labeling shifting toward fixing the edges of the labels created by SAM.
