Computer Vision Course Assignments

This repository hosts my solutions to three practical programming assignments from a Computer Vision course taught by Professor Renato Martins. The assignments cover the geometric and semantic analysis of real-world scenes, with a focus on image formation, feature extraction, 3D reconstruction, and the integration of machine learning techniques.

Course Goals

  • Understand geometric and semantic properties of real-world scenes from images.
  • Learn fundamental low-level vision topics, including image formation, feature extraction, and 3D reconstruction.
  • Enable practical development and training of computer vision models, paving the way for advanced studies in Deep Learning.

Assignments and Contents

Each assignment folder (Assignment1_tracker, Assignment2_recognition, Assignment3_epipolar) contains:

  • A Jupyter Notebook with detailed solutions.
  • The dataset or images used.
  • A comprehensive PDF report summarizing findings and methodologies.
  • A .yml file to recreate the environment used (myharris_track.yml, cv_recognition.yml, epipolar.yml).
  • PDF instructions provided by Professor Renato Martins for the assignments.

Assignment I - Corner Detection & Feature Tracking

Folder: Assignment1_tracker

Implements a Harris corner detector and tracks the detected keypoints over time, using patch templates and SIFT descriptors, on the first 200 frames of the KITTI Visual Odometry dataset.

Environment: myharris_track.yml
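
As an illustration of the corner-detection step, here is a minimal sketch of the Harris response computation. It is not the notebook's implementation: it assumes NumPy rather than PyTorch, uses a plain box window instead of a Gaussian, and the names box_filter and harris_response, as well as the defaults k=0.04 and radius=1, are hypothetical choices made for this example.

```python
import numpy as np


def box_filter(img, radius):
    """Sum each pixel's (2r+1)x(2r+1) neighbourhood, with zero padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="constant")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out


def harris_response(img, k=0.04, radius=1):
    """Harris corner response R = det(M) - k * trace(M)^2 per pixel."""
    img = img.astype(np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    iy, ix = np.gradient(img)
    # Entries of the structure tensor M, summed over a local window.
    sxx = box_filter(ix * ix, radius)
    syy = box_filter(iy * iy, radius)
    sxy = box_filter(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2


if __name__ == "__main__":
    # Synthetic test image: a bright square on a dark background.
    img = np.zeros((20, 20))
    img[5:15, 5:15] = 1.0
    r = harris_response(img)
    print("corner:", r[5, 5], "edge:", r[5, 10], "flat:", r[10, 10])
```

On the synthetic square, the response is positive at a corner, negative along an edge, and zero in flat regions. A full detector would still need thresholding and non-maximum suppression, which this sketch leaves out.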

Assignment II - Object Recognition & Augmented Reality with Homographies

Folder: Assignment2_recognition

Focuses on object recognition and robust homography estimation with RANSAC, including an augmented reality application that replaces Van Gogh's "Nuit étoilée" painting with the ESIREM logo in various MoMA museum images.

Environment: cv_recognition.yml
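
To make the idea of robust homography estimation concrete, the sketch below combines the Direct Linear Transform (DLT) with a basic RANSAC loop. It is an illustration under stated assumptions, not the notebook's code: it uses NumPy rather than PyTorch, skips point normalisation, and the function names, the iteration count, and the 2-pixel inlier threshold are all hypothetical.

```python
import numpy as np


def fit_homography(src, dst):
    """DLT estimate of H (dst ~ H @ src) from N >= 4 point pairs of shape (N, 2)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    return vt[-1].reshape(3, 3)


def project(h, pts):
    """Apply homography h to (N, 2) points and return (N, 2) points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]


def ransac_homography(src, dst, iters=500, thresh=2.0, seed=0):
    """Return (H, inlier_mask) using 4-point samples and a reprojection threshold."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[idx], dst[idx])
        # Degenerate samples can produce divisions by zero; treat them as misses.
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(h, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("RANSAC did not find a model with at least 4 inliers")
    # Refit on all inliers of the best model for a more stable estimate.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The returned matrix is defined only up to scale, so divide it by its bottom-right entry before comparing it with another homography. For the augmented reality step, the estimated homography would then be used to warp the replacement image onto the painting's location.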

Assignment III - Epipolar Geometry & 8-Point Algorithm

Folder: Assignment3_epipolar

Estimates the fundamental matrix relating two views from an uncalibrated camera using the 8-Point Algorithm, covering the basics of epipolar geometry in stereo vision.

Environment: epipolar.yml
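
For reference, the following is a minimal sketch of the normalised 8-Point Algorithm. It is an assumption-laden illustration rather than the notebook's solution: it uses NumPy instead of PyTorch, adopts the convention x2^T F x1 = 0, and the function names normalize_points and eight_point are hypothetical.

```python
import numpy as np


def normalize_points(pts):
    """Hartley normalisation: centroid at the origin, mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    t = np.array([[scale, 0, -scale * centroid[0]],
                  [0, scale, -scale * centroid[1]],
                  [0, 0, 1]])
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return homog @ t.T, t


def eight_point(pts1, pts2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 matches of shape (N, 2)."""
    x1, t1 = normalize_points(pts1)
    x2, t2 = normalize_points(pts2)
    # Each row holds the coefficients of the row-major entries of F.
    a = np.einsum("ni,nj->nij", x2, x1).reshape(len(x1), 9)
    _, _, vt = np.linalg.svd(a)
    f = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    u, s, vt = np.linalg.svd(f)
    s[2] = 0
    f = u @ np.diag(s) @ vt
    # Undo the normalisation so F applies to the original pixel coordinates.
    f = t2.T @ f @ t1
    return f / np.linalg.norm(f)
```

With noisy real matches, this estimator is usually wrapped in a robust scheme such as RANSAC. The result can be checked by verifying that the epipolar residual x2^T F x1 is close to zero for the input matches.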

Technologies

  • Python
  • PyTorch (for tensor operations)
  • OpenCV (for debugging and result verification)

Setup and Execution

Each assignment folder contains a .yml file with the necessary environment setup. To create and activate the environment for an assignment, run:

conda env create -f environment_file.yml
conda activate environment_name

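Replace environment_file.yml with the .yml file of the assignment you're working on, and environment_name with the environment name defined in the name field of that file. Running conda env list after creation shows the available environment names.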

Contributions

These assignments are my original work, applying and extending the computer vision techniques learned during the course. Feedback and discussions on the methodologies and results are welcome.

Acknowledgements

Special thanks to Professor Renato Martins for his invaluable guidance and to my peers for their constructive critiques throughout the course.

Note

High-level OpenCV implementations are used solely for debugging purposes and result verification. The core tasks rely on fundamental computer vision and machine learning concepts as per course requirements.
