College of Engineering, University of Wisconsin - Madison, WI
Neel Kelkar, Suchith Suresh, Nawal Dua
Students: [email protected], [email protected], [email protected]
Professor: [email protected]
In this project, we first generate depth maps from the input video. Using these depth maps, we create refocusable videos.
With these refocusable videos, we can simulate human vision but with a more cinematic DoF: using eye-tracking technology, the video places more emphasis on whichever subject the viewer looks at.
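The eye-tracking idea can be sketched in a few lines: given a gaze point and the frame's depth map, pick a focus depth from a small window around the gaze. The function name, window size, and use of a median here are illustrative assumptions, not the project's actual code:

```python
import numpy as np

def gaze_to_focus_depth(depth, gx, gy, r=5):
    # Median depth in a (2r+1)-pixel window around the gaze point (gx, gy);
    # the median resists outliers at object boundaries.
    win = depth[max(0, gy - r):gy + r + 1, max(0, gx - r):gx + r + 1]
    return float(np.median(win))

# Synthetic depth map: a left-to-right gradient from near (0) to far (1).
depth = np.tile(np.linspace(0.0, 1.0, 100), (100, 1))
focus = gaze_to_focus_depth(depth, 50, 50)
```

In a real pipeline the gaze point would come from the eye tracker each frame, and the resulting focus depth would drive the refocusing step.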
To get the premade depth maps that work with our code, go to Google's Mannequin Challenge repository, download it, and follow their setup instructions. Replace the run_and_save_DAVIS
method in mannequinchallenge/models/pix2pix_model.py
with the code in pix2pix_replacement.py.
Comment and uncomment the parts of the replacement code as necessary.
To generate the depth maps, follow the run instructions in Google's repo.
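Once the Mannequin Challenge code has written its depth maps to disk, our pipeline only needs them in a consistent range. A minimal sketch of that normalization, assuming the maps are loaded as 8-bit grayscale images (e.g. via `cv2.imread(..., cv2.IMREAD_GRAYSCALE)`; the synthetic array below stands in for a real file):

```python
import numpy as np

def normalize_depth(depth):
    # Scale a raw depth map to [0, 1] so later per-pixel focus decisions
    # don't depend on the network's output range.
    depth = depth.astype(np.float32)
    span = depth.max() - depth.min()
    return (depth - depth.min()) / (span + 1e-8)

# Stand-in for a 480x640 depth map read from disk as an 8-bit image.
raw = (np.arange(480 * 640) % 256).astype(np.uint8).reshape(480, 640)
d = normalize_depth(raw)
```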
Download this repository and run /code/main.py
NOTES: In the GUI, choose the ‘Dynamic-Depth-of-Field-with-Eye-Tracking’ folder (the project folder) as the directory, as it contains the required input frames and depth-map data. The functions in the ‘outputfuncs.py’ file are accessed/executed by clicking the following buttons from the main window:
- To access the Preview function and choose the variance and blurring parameters for processing the video:
Directory --> Preview --> Generate
(Accesses ‘mouse_move’, ‘genpreview’, ‘preview_win’)
- To process a new video based on the chosen parameters:
Directory --> View Video --> Process Video --> Output Video
(Accesses ‘mouse_move’, ‘output_win’)
- To view a previously processed video:
- a. Using the default output folder:
Directory --> View Video --> Output Video
- b. Using a different output folder:
Directory (Optional Step) --> View Video --> Output Folder --> Output Video
The Process Video step will take anywhere from 30 to 200 seconds for a 480p video; check the Python console for progress.
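The core of the processing step can be sketched as a depth-weighted blend between the sharp frame and a blurred copy. The names (`refocus`, `variance`) and the naive box blur below are illustrative stand-ins (a real pipeline would likely use `cv2.GaussianBlur`), but the falloff around the focus depth is the idea the variance parameter controls:

```python
import numpy as np

def box_blur(img, k=9):
    # Naive k x k box blur with reflected edges (a stand-in for
    # cv2.GaussianBlur; grayscale frames for simplicity).
    pad = k // 2
    p = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def refocus(frame, depth, focus_depth, variance=0.01):
    # Weight each pixel by how close its depth is to the focus depth:
    # w -> 1 keeps the sharp frame, w -> 0 uses the blurred copy.
    w = np.exp(-((depth - focus_depth) ** 2) / (2.0 * variance))
    return w * frame + (1.0 - w) * box_blur(frame)

rng = np.random.default_rng(0)
frame = rng.random((64, 64)).astype(np.float32)
depth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1)).astype(np.float32)
out = refocus(frame, depth, focus_depth=float(depth[32, 32]))
```

A smaller `variance` gives a narrower in-focus band and a stronger cinematic effect; a larger one approaches the all-in-focus original.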
The required libraries are (time, tkinter, glob, copy, and math ship with Python's standard library; the rest are installable via pip):
- tqdm
- time
- cv2 (opencv-python)
- numpy
- tkinter
- glob
- copy
- math
For a camera, depth of field (DoF) is the distance between the nearest and farthest objects in focus. The depth of field in a video depends on numerous factors such as focal length, aperture, and distance to the subject. The human eye has a depth of field too. For objects more than a few meters away, DoF is generally extremely wide, meaning that everything is in focus at the same time. For near objects, the DoF is very shallow, giving rise to the cinematic focus effect we see in movies. The DoF in cinematic videography is achieved using specialized lenses and scripted positioning of subjects. Mobile phone videos do not have this positioning planned beforehand, so the cameras default to a wide DoF. Mobile phone cameras don't have readily available interchangeable lenses either, preventing everyday users from using the optimal lens for the subject.
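The dependence on focal length, aperture, and subject distance can be made concrete with the standard thin-lens / hyperfocal-distance approximation. The numbers below (a 50 mm f/1.8 portrait lens vs. a ~4 mm phone lens, with a 0.03 mm circle of confusion) are illustrative, not measurements from this project:

```python
def depth_of_field(f_mm, N, s_mm, c_mm=0.03):
    # Near/far limits of acceptable focus from the hyperfocal distance H.
    # f_mm: focal length, N: f-number, s_mm: subject distance,
    # c_mm: circle of confusion (all lengths in millimetres).
    H = f_mm ** 2 / (N * c_mm) + f_mm
    near = H * s_mm / (H + (s_mm - f_mm))
    far = H * s_mm / (H - (s_mm - f_mm)) if s_mm < H else float("inf")
    return near, far

# A 50 mm f/1.8 lens focused at 2 m: a shallow DoF well under a metre.
near, far = depth_of_field(50, 1.8, 2000)

# A ~4 mm phone lens at the same distance: the subject sits beyond the
# hyperfocal distance, so everything out to infinity is acceptably sharp.
near_p, far_p = depth_of_field(4, 1.8, 2000)
```

This is exactly the contrast the paragraph describes: the long, fast lens isolates the subject, while the short phone lens keeps nearly the whole scene in focus.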
To learn more about the project, see our Project Proposal or our Final Project Report.
- Google's Mannequin Challenge Dataset and Depth Mapping Code