aus10powell / mit-fishery-counter

Applying Image Recognition to Enhance Fisheries Management Capabilities

Jupyter Notebook 76.40% Python 23.60%

mit-fishery-counter's Introduction

Hi there 👋

Current

I'm currently working as an applied data scientist.

mit-fishery-counter's Issues

Create Jupyter Notebook for Demo Video Analysis

We need a Jupyter Notebook that can process a demo video to generate and display counts of objects (e.g., fish) detected in the video. This notebook will serve as a demonstration of our video analysis capabilities and should be user-friendly for new users to understand our process and results.

Requirements

The notebook should take a demo video as input. For testing, you can use the video located at /Users/aus10powell/Documents/Projects/MIT-Fishery-Counter/data/gold_dataset/videos/irwa/1_2016-04-22_12-36-58.mp4.
Utilize the existing inference pipeline as implemented in InferenceCounter for processing the video.
Display the total count of objects detected, differentiating between objects entering and exiting the frame if applicable.
Plot a graph showing the count of objects over time to visualize the flow within the video.
Include comments and explanations in the notebook to guide new users through the analysis process.
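
As a rough illustration of the "count over time" requirement, the sketch below bins crossing events into fixed time windows and keeps a running net count. The event format (a timestamp in seconds plus an "in"/"out" direction) and the function name `counts_over_time` are assumptions made for this example; the actual output of InferenceCounter may look different and would need to be adapted.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical event record: (timestamp_seconds, direction), direction is "in" or "out".
# This is an assumed format, not the documented output of InferenceCounter.
Event = Tuple[float, str]


def counts_over_time(events: Iterable[Event], duration_s: float, bin_s: float = 10.0) -> List[dict]:
    """Bin crossing events into fixed-width windows and track the cumulative net count."""
    n_bins = max(1, math.ceil(duration_s / bin_s))
    ins: Counter = Counter()
    outs: Counter = Counter()
    for t, direction in events:
        # Clamp events that fall exactly at the end of the video into the last bin.
        idx = min(int(t // bin_s), n_bins - 1)
        if direction == "in":
            ins[idx] += 1
        elif direction == "out":
            outs[idx] += 1

    rows = []
    net = 0
    for i in range(n_bins):
        net += ins[i] - outs[i]
        rows.append({"start_s": i * bin_s, "in": ins[i], "out": outs[i], "net_cumulative": net})
    return rows


demo_events = [(1.0, "in"), (2.5, "in"), (12.0, "out"), (25.0, "in")]
demo_rows = counts_over_time(demo_events, duration_s=30.0, bin_s=10.0)
```

Each row can be passed directly to a plotting library in the notebook, with `start_s` on the x-axis and `net_cumulative` (or the separate `in` and `out` columns) on the y-axis.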

Additional Context

The analysis should leverage the tracking configuration specified in /Users/aus10powell/Documents/Projects/MIT-Fishery-Counter/code/src/utils/tracking_configs/botsort.yaml.
The model for detection is located at /Users/aus10powell/Documents/Projects/MIT-Fishery-Counter/code/notebooks/runs/detect/train133/weights/best.pt.
It would be beneficial to include a section on how to adjust parameters for different videos or scenarios.
The output should be easily interpretable, with visualizations where possible.
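
One possible way to organize the "adjust parameters" section is to gather every tunable value in a single configuration object at the top of the notebook. The sketch below is only a suggestion: the class name `DemoConfig`, its field names, the default values, and the placeholder paths are all assumptions for illustration and do not come from the existing code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical bundle of the settings a user would most likely change."""
    video_path: str
    model_weights: str
    tracker_config: str
    conf_threshold: float = 0.25  # assumed default, tune per site
    frame_stride: int = 1         # assumed: process every frame


# Placeholder paths; substitute the real video, weights, and botsort.yaml locations.
base = DemoConfig(
    video_path="path/to/demo_video.mp4",
    model_weights="path/to/best.pt",
    tracker_config="path/to/botsort.yaml",
)

# Derive a variant for a different scenario without mutating the original.
low_visibility = replace(base, conf_threshold=0.15, frame_stride=2)
```

Keeping the configuration immutable and deriving variants with `replace` makes it easy to compare runs side by side in the same notebook.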

Expected Outcome

A Jupyter Notebook that can be run end-to-end, providing insights into the object counts within a given video.
Documentation within the notebook that explains each step of the process and how users can adapt it to their needs.
This notebook will be a valuable addition to our documentation, helping to showcase our video analysis technology and providing a practical tool for users to experiment with.

Currently running on YOLOv2: subsequent versions of YOLO have significant performance and speed gains

Investigate next appropriate YOLO version

Differences between YOLOv2 (current) and YOLOv3 (YOLOv6 was also considered):

Architectural Changes: YOLOv3 introduced several architectural changes compared to YOLOv2:

YOLOv3 uses three different scales or sizes of the detection grid, whereas YOLOv2 had only one scale. This allows YOLOv3 to detect objects of different sizes more effectively.
YOLOv3 utilizes residual blocks inspired by the ResNet architecture, enabling better feature extraction and representation.
YOLOv3 introduces skip connections, which facilitate the flow of information from earlier layers to later layers, aiding in the detection of objects at different scales.

Improved Accuracy and Performance: YOLOv3 (or YOLOv6) generally offers improved accuracy and performance compared to YOLOv2:

YOLOv3 achieves higher average precision and better localization accuracy due to its architectural improvements and multi-scale detection strategy.
YOLOv3 has a higher number of convolutional layers and more parameters, allowing it to capture more intricate features and improve detection accuracy.
YOLOv3 maintains real-time performance, although it may be slightly slower than YOLOv2 due to its increased complexity.
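
To make the multi-scale point concrete, the short calculation below compares how many candidate boxes each model predicts for a 416x416 input. It assumes the commonly published defaults: YOLOv2 with a single stride-32 grid and 5 anchors per cell, and YOLOv3 with strides 32, 16, and 8 and 3 anchors per cell at each scale. The helper names are made up for this example.

```python
from typing import List, Sequence


def grid_sizes(input_size: int, strides: Sequence[int]) -> List[int]:
    """Side length of the detection grid at each stride."""
    return [input_size // s for s in strides]


def num_predictions(input_size: int, strides: Sequence[int], anchors_per_cell: int) -> int:
    """Total candidate boxes across all detection scales."""
    return sum(g * g * anchors_per_cell for g in grid_sizes(input_size, strides))


# Assumed defaults for a 416x416 input.
yolov2_grids = grid_sizes(416, [32])          # one coarse grid
yolov3_grids = grid_sizes(416, [32, 16, 8])   # coarse, medium, and fine grids

yolov2_boxes = num_predictions(416, [32], anchors_per_cell=5)
yolov3_boxes = num_predictions(416, [32, 16, 8], anchors_per_cell=3)
```

The finer 52x52 grid is what gives YOLOv3 more opportunities to localize small objects, at the cost of evaluating roughly an order of magnitude more candidate boxes.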

Pros and Cons of Upgrading from YOLOv2 to YOLOv3 (or YOLOv6):

Pros of upgrading from YOLOv2 to YOLOv3 (or YOLOv6):

Improved detection accuracy: YOLOv3 offers better localization accuracy and higher average precision, which can result in improved object detection performance.
Better handling of objects at different scales: YOLOv3's multi-scale detection strategy and skip connections enable more effective detection of objects of various sizes.
Support for more intricate features: YOLOv3's deeper architecture and residual blocks allow it to capture more complex and detailed features, potentially improving the detection of challenging objects.

Cons of upgrading from YOLOv2 to YOLOv3 (or YOLOv6):

Training time and computational requirements: YOLOv3 has a higher number of layers and more parameters, which may result in increased training time and computational resource requirements compared to YOLOv2.
Potential need for reannotation: Upgrading to YOLOv3 (or YOLOv6) might require revisiting the annotation process to ensure compatibility with the multi-scale detection approach.

REST API Creation

Fishery Counter API

System Design Outline

1. Data Collection

Camera setup at fishery locations
Video/image capture of fish passages

2. Data Processing

Object detection and tracking pipeline
- Phase 1: Detection (every N frames)
- Phase 2: Tracking (between detections)
Image labeling and annotation
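
The two-phase pipeline above (detect every N frames, track in between) can be sketched as a simple scheduling loop. This is a minimal illustration under stated assumptions: `detect` and `track` are hypothetical callables standing in for the real detector and tracker, and the box format is an arbitrary tuple chosen for the example.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Box = Tuple[float, float, float, float]  # assumed (x, y, width, height)


def run_pipeline(
    frames: Iterable[Frame],
    detect: Callable[[Frame], List[Box]],
    track: Callable[[Frame, List[Box]], List[Box]],
    detect_every: int = 5,
) -> List[Tuple[int, str, List[Box]]]:
    """Run the detector on every Nth frame and the tracker on the frames in between."""
    boxes: List[Box] = []
    results = []
    for i, frame in enumerate(frames):
        if i % detect_every == 0:
            boxes = detect(frame)        # Phase 1: fresh detections
            mode = "detect"
        else:
            boxes = track(frame, boxes)  # Phase 2: propagate the previous boxes
            mode = "track"
        results.append((i, mode, boxes))
    return results


# Stub callables so the sketch runs without a model.
def fake_detect(frame: int) -> List[Box]:
    return [(float(frame), 0.0, 1.0, 1.0)]


def fake_track(frame: int, previous: List[Box]) -> List[Box]:
    return [(x + 1.0, y, w, h) for x, y, w, h in previous]


schedule = run_pipeline(range(7), fake_detect, fake_track, detect_every=3)
```

A larger `detect_every` lowers compute cost but gives the tracker more time to drift before the next detection corrects it.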

3. Machine Learning Model

YOLOv8 model for object detection
Custom trained on herring images

4. API Backend

REST API endpoints
Fish counting logic
Data storage and retrieval

5. Frontend Interface

User dashboard for non-programmers
Visualization of fish counts and statistics

6. Deployment and Infrastructure

Cloud hosting (e.g. AWS, GCP)
Containerization with Docker

Prediction REST API Details

1. Endpoints

POST /predict
- Accepts image/video upload
- Returns fish count and species classification
GET /stats
- Returns summary statistics (e.g. daily/weekly counts)
GET /health
- API health check
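
The three endpoints can be sketched as a framework-agnostic routing table, which keeps the example independent of whichever web framework is eventually chosen. Everything here is illustrative: the handler names, the response bodies, and the `dispatch` function are assumptions, and the `/predict` handler is a stub because inference is not wired up in this sketch.

```python
import json
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]  # (HTTP status code, JSON body)


def health() -> Response:
    return 200, json.dumps({"status": "ok"})


def stats() -> Response:
    # Placeholder payload; real values would come from stored counts.
    return 200, json.dumps({"daily": {}, "weekly": {}})


def predict() -> Response:
    # Stub: the real handler would accept an upload and run the model.
    return 501, json.dumps({"error": "inference not implemented in this sketch"})


ROUTES: Dict[Tuple[str, str], Callable[[], Response]] = {
    ("GET", "/health"): health,
    ("GET", "/stats"): stats,
    ("POST", "/predict"): predict,
}


def dispatch(method: str, path: str) -> Response:
    """Look up the handler for a method/path pair, or return 404/405."""
    handler = ROUTES.get((method.upper(), path))
    if handler is not None:
        return handler()
    known_paths = {route_path for _, route_path in ROUTES}
    if path in known_paths:
        return 405, json.dumps({"error": "method not allowed"})
    return 404, json.dumps({"error": "not found"})
```

In a real deployment the same handlers would be registered with the chosen web framework rather than called through `dispatch`.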

2. Request/Response Format

JSON for metadata
Binary for image/video data

3. Authentication

API key for access control
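
A minimal sketch of an API key check is shown below. The header name `X-API-Key` and the function name `is_authorized` are assumptions for this example. The comparison uses `hmac.compare_digest` from the Python standard library so that the check takes the same time regardless of how many leading characters match.

```python
import hmac
from typing import Mapping

API_KEY_HEADER = "X-API-Key"  # assumed header name, not specified in the design


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True only if the request carries the expected API key."""
    if not expected_key:
        # Refuse everything if the server has no key configured.
        return False
    # HTTP header names are case-insensitive, so match them that way.
    supplied = next(
        (value for name, value in headers.items() if name.lower() == API_KEY_HEADER.lower()),
        "",
    )
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

The expected key would normally be read from an environment variable or secret store rather than hard-coded.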

4. Processing Pipeline

Image preprocessing
YOLOv8 model inference
Post-processing of detections
Counting logic (e.g. directional filtering)
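
One simple interpretation of "directional filtering" is to count a track only if it starts on one side of a virtual counting line and ends on the other. The sketch below assumes a vertical line at `line_x` and tracks given as lists of centroid coordinates keyed by track ID; both the data layout and the function name are hypothetical and may not match the project's actual counting logic.

```python
from typing import Dict, List, Tuple

Centroid = Tuple[float, float]  # assumed (x, y) in pixels


def count_crossings(tracks: Dict[int, List[Centroid]], line_x: float) -> Dict[str, int]:
    """Count each track at most once, based on which side of the line it starts and ends."""
    counts = {"left_to_right": 0, "right_to_left": 0}
    for centroids in tracks.values():
        if len(centroids) < 2:
            continue  # a single observation has no direction
        starts_right = centroids[0][0] >= line_x
        ends_right = centroids[-1][0] >= line_x
        if not starts_right and ends_right:
            counts["left_to_right"] += 1
        elif starts_right and not ends_right:
            counts["right_to_left"] += 1
        # Tracks that end on the side they started on are ignored,
        # which filters out fish that cross the line and turn back.
    return counts


example_tracks = {
    1: [(10.0, 5.0), (40.0, 5.0), (60.0, 5.0)],  # crosses left to right
    2: [(80.0, 5.0), (30.0, 5.0)],               # crosses right to left
    3: [(10.0, 5.0), (55.0, 5.0), (20.0, 5.0)],  # crosses and returns
    4: [(70.0, 1.0)],                            # too short to classify
}
example_counts = count_crossings(example_tracks, line_x=50.0)
```

Comparing only the first and last positions avoids double counting when a centroid jitters back and forth across the line.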

5. Output

Fish count
Species classification (River Herring vs Not River Herring)
Confidence scores
Timestamp
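
The output fields listed above could be serialized along the following lines. The class and field names are assumptions for illustration, as is the choice to count only detections labeled "River Herring" toward `fish_count`; the real API may define the count differently.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Detection:
    label: str         # "River Herring" or "Not River Herring"
    confidence: float  # model confidence in [0, 1]


@dataclass
class PredictionResponse:
    fish_count: int
    detections: List[Detection]
    timestamp: str  # ISO 8601, UTC


def build_response(detections: List[Detection], now: Optional[datetime] = None) -> PredictionResponse:
    """Assemble the response body; only River Herring detections are counted (assumption)."""
    now = now or datetime.now(timezone.utc)
    fish_count = sum(1 for d in detections if d.label == "River Herring")
    return PredictionResponse(fish_count, list(detections), now.isoformat())


sample = build_response(
    [Detection("River Herring", 0.91), Detection("Not River Herring", 0.67)],
    now=datetime(2016, 4, 22, 12, 36, 58, tzinfo=timezone.utc),
)
sample_json = json.dumps(asdict(sample))
```

Returning per-detection confidence scores alongside the aggregate count lets clients apply their own thresholds after the fact.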

6. Additional Features

Batch processing for historical data
Real-time processing for live video feeds
Integration with environmental sensors (temperature, current speed, etc.)

7. Scalability Considerations

Load balancing for multiple concurrent requests
Caching of frequent queries
Asynchronous processing for long-running tasks
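
For long-running tasks such as full-video inference, one common pattern is to accept the job, return an ID immediately, and let the client poll for the result. The sketch below uses a thread pool from the Python standard library; the `JobQueue` class, its method names, and the state labels are hypothetical, and a production system would more likely use a dedicated task queue with persistent storage.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobQueue:
    """In-memory job tracker backed by a thread pool (illustrative only)."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Schedule fn(*args) in the background and return a job ID right away."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> Dict[str, Any]:
        """Report whether a job is pending, done, failed, or unknown."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"state": "unknown"}
        if not future.done():
            return {"state": "pending"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def shutdown(self) -> None:
        """Wait for outstanding jobs to finish and release the worker threads."""
        self._pool.shutdown(wait=True)
```

A `POST /predict` handler for video uploads could call `submit` and return the job ID, with a separate status endpoint wrapping `status`.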

8. Monitoring and Logging

Performance metrics
Error tracking
Usage statistics

Getting Started

[Instructions for setting up and running the API locally or in a development environment]

Deployment

[Instructions for deploying the API to production]

Contributing

[Guidelines for contributing to the project]

License

[License information]