diogenesanalytics / frame-sampling
Python library devoted to various frame sampling strategies.
License: MIT License
Sometimes in a large dataset of videos (especially if they're freshly scraped from the internet) you will have some video files that are corrupted, or just did not complete downloading. How should the sampler handle these scenarios?
Here is the line that needs to be addressed:
frame-sampling/src/frame_sampling/strategy.py
Lines 155 to 156 in 13696ed
Wrap a try/except block around the _sample_single_video method:
def sample(
    self, video_dataset: VideoDataset, output_dir: Path, exist_ok: bool = False
) -> None:
    """Loop through frames and store based on certain criteria."""
    # notify of sampling
    print(f"Sampling frames at every {self.sample_rate} frames ...")
    # loop over videos and their id
    for video_idx, video_path in enumerate(tqdm(video_dataset)):
        # get name of sample subdir
        sample_subdir = self._create_subdir_path(output_dir, video_idx)
        # check if dir exists
        if exist_ok or not sample_subdir.exists():
            # create dir
            sample_subdir.mkdir(parents=True, exist_ok=exist_ok)
            # get frame samples from single video
            try:
                self._sample_single_video(video_path, sample_subdir)
            except av.InvalidDataError as e:
                print(f"InvalidDataError while processing {video_path}: {e}")
                # Handle the exception as needed, e.g., log the error or skip the video
            except Exception as ex:
                print(f"An unexpected error occurred while processing {video_path}: {ex}")
                # Handle other exceptions if needed
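If the failed videos should also be reported once the run finishes, one option is to collect them instead of only printing. The sketch below is a standalone illustration, not the package's code: sample_with_failures and sample_fn are hypothetical names, and sample_fn stands in for whatever does the per-video work (e.g. a wrapper around _sample_single_video).

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def sample_with_failures(
    video_paths: Iterable[Path],
    sample_fn: Callable[[Path], None],
) -> List[Tuple[Path, Exception]]:
    """Run sample_fn on every video, skipping and recording any that fail.

    Both this function and sample_fn are hypothetical stand-ins used only
    for illustration.
    """
    failures: List[Tuple[Path, Exception]] = []
    for video_path in video_paths:
        try:
            sample_fn(video_path)
        except Exception as exc:  # corrupted or incomplete files end up here
            logger.warning("Skipping %s: %s", video_path, exc)
            failures.append((video_path, exc))
    return failures
```

The returned list can then be written to a log file or used to re-download the affected videos.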
Currently the base frame sampler class has no strategy in place to improve performance (e.g. take advantage of multi-core processors).
Introduce some form of concurrency and/or parallelism into the base class.
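As a rough starting point, the per-video work could be submitted to a pool from the standard library's concurrent.futures module. This is only a sketch under assumptions: sample_concurrently and sample_fn are hypothetical names, and a thread pool is used for simplicity. Whether threads actually speed up decoding depends on how much of the work releases the GIL; a ProcessPoolExecutor may be a better fit for CPU-bound sampling.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def sample_concurrently(
    video_paths: Iterable[Path],
    sample_fn: Callable[[Path], None],
    max_workers: int = 4,
) -> Dict[Path, Optional[BaseException]]:
    """Run sample_fn on each video in a thread pool.

    Returns a mapping from video path to the exception it raised,
    or None if the video was processed without error.
    Hypothetical helper, not part of frame_sampling.
    """
    results: Dict[Path, Optional[BaseException]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map each submitted future back to the video it belongs to
        futures = {pool.submit(sample_fn, path): path for path in video_paths}
        for future in as_completed(futures):
            # exception() returns None when the call succeeded
            results[futures[future]] = future.exception()
    return results
```

Because each video writes to its own output subdirectory, the tasks should not need to share state, which keeps this kind of fan-out simple.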
Need to get a sense of the progress of the ffmpeg subprocess that is creating the video clip for the VideoClipSampler class.
Basically something similar to this
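Independently of the approach referenced above, one possible way to observe progress is ffmpeg's -progress option, which writes key=value lines and ends each update with a progress=continue or progress=end line. The sketch below assumes ffmpeg is on the PATH and that the command is passed as a list starting with the ffmpeg executable; parse_progress and run_with_progress are hypothetical helper names, not part of the package.

```python
import subprocess
from typing import Dict, Iterable, Iterator, List


def parse_progress(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group ffmpeg -progress key=value lines into one dict per update."""
    block: Dict[str, str] = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # ignore anything that is not key=value
        block[key] = value
        if key == "progress":  # "continue" or "end" closes an update
            yield block
            block = {}


def run_with_progress(cmd: List[str]) -> int:
    """Run an ffmpeg command and print each progress update as it arrives."""
    # insert the global options right after the executable name
    full_cmd = cmd[:1] + ["-progress", "pipe:1", "-nostats"] + cmd[1:]
    proc = subprocess.Popen(
        full_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    assert proc.stdout is not None
    for update in parse_progress(proc.stdout):
        print(update.get("out_time"), update["progress"])
    return proc.wait()
```

The out_time value could be compared against the known clip duration to drive a tqdm progress bar instead of printing.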
While the MinimalSampler class is a good, simple general-purpose frame sampling strategy, there are instances where more advanced strategies may be required.
Currently the frame_sampling package implements its own Dataset class, which it subclasses to create the VideoDataset class. This class has value outside of the frame_sampling package.
The code being discussed:
frame-sampling/src/frame_sampling/dataset.py
Lines 1 to 75 in da2671e
Move code into the DiogenesAnalytics/dataset repo.
Currently no unit tests exist to ensure the proper function of the frame_sampling module.
Just need to develop some minimum viable unit tests for the module.