
frame-sampling's Introduction

Alexander the Great cuts the Gordian Knot

Problems well-defined are problems solved.

The standard that our analytic work aspires to achieve can best be illustrated through our namesake: Diogenes. A famous (or infamous) Greek philosopher, Diogenes the Cynic is the paragon of the two virtues that best represent our standard: minimalism and logical rigor 🎲

frame-sampling's People

Contributors

diogenesanalytics


frame-sampling's Issues

Refactor: Refactor Existing Dataset Class/Subclasses Into Package

Problem

Currently the frame_sampling package implements its own Dataset class, which it subclasses to create the VideoDataset class. This class has value outside of the frame_sampling package.

Code

The code being discussed:

"""Implements classes for dealing with video data."""
from abc import ABC
from abc import abstractmethod
from dataclasses import InitVar
from dataclasses import dataclass
from pathlib import Path
from typing import Generator
from typing import Iterator
from typing import List
from typing import Union


@dataclass
class Dataset(ABC):
    """Defines the abstract base class for all datasets."""

    data_dir: InitVar[Union[str, Path]]

    def __post_init__(self, data_dir: Union[str, Path]) -> None:
        """Apply post constructor processing to args."""
        # get data path object
        self._data_path: Path = Path(data_dir)

        # make sure path exists
        self._dataset_exists()

        # create index
        self.index: List[Path] = list(self._get_filepaths())

    def __iter__(self) -> Iterator[Path]:
        """Defining the iteration behavior."""
        return iter(self.index)

    def __len__(self) -> int:
        """Defining how to calculate length of dataset."""
        return len(self.index)

    def __getitem__(self, idx: int) -> Path:
        """Defining how data path objects will be accessed."""
        return self.index[idx]

    def _dataset_exists(self) -> None:
        """Make sure path to data dir exists."""
        assert self._data_path.exists()

    def _get_filepaths(self) -> Generator[Path, None, None]:
        """Scan target directory for file extensions and grab their file paths."""
        # iterate target file extensions
        for ext in self.file_extensions:
            # loop through video paths matching ext
            yield from self._data_path.glob(f"**/{ext}")

    @property
    @abstractmethod
    def type(self) -> str:
        """Defines the type of the data found in dataset."""
        pass

    @property
    @abstractmethod
    def file_extensions(self) -> List[str]:
        """Defines the file extensions accepted for the given data type."""
        pass

    @property
    def path(self) -> str:
        """Returns the data path as a string."""
        return str(self._data_path)


class VideoDataset(Dataset):
    """Dataset of video files."""

    type = "video"
    file_extensions = ["*.mp4", "*.avi", "*.mkv", "*.mov", "*.webm"]

Solution

Move code into the DiogenesAnalytics/dataset repo.
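
To illustrate why the base class could be useful outside frame_sampling, here is a minimal sketch of a second subclass. ImageDataset, its extension list, and the list_demo_images helper are hypothetical and not part of the repo; the Dataset class is a condensed copy of the code above, and the index is sorted here only to make the output deterministic (the original keeps glob order).

```python
"""Sketch: reusing the Dataset base class for a non-video data type."""
import tempfile
from abc import ABC, abstractmethod
from dataclasses import InitVar, dataclass
from pathlib import Path
from typing import Iterator, List, Union


@dataclass
class Dataset(ABC):
    """Condensed copy of the base class from the issue above."""

    data_dir: InitVar[Union[str, Path]]

    def __post_init__(self, data_dir: Union[str, Path]) -> None:
        self._data_path = Path(data_dir)
        assert self._data_path.exists()
        # index every file matching one of the subclass's glob patterns
        # (sorted here for deterministic output; an assumption of this sketch)
        self.index: List[Path] = sorted(
            path
            for ext in self.file_extensions
            for path in self._data_path.glob(f"**/{ext}")
        )

    def __iter__(self) -> Iterator[Path]:
        return iter(self.index)

    def __len__(self) -> int:
        return len(self.index)

    @property
    @abstractmethod
    def type(self) -> str:
        """Type of the data found in the dataset."""

    @property
    @abstractmethod
    def file_extensions(self) -> List[str]:
        """Glob patterns accepted for the given data type."""


class ImageDataset(Dataset):
    """Hypothetical subclass; only the two class attributes are needed."""

    type = "image"
    file_extensions = ["*.png", "*.jpg"]


def list_demo_images() -> List[str]:
    """Build a throwaway directory and list what ImageDataset indexes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "nested").mkdir()
        for name in ("a.png", "nested/b.jpg", "notes.txt"):
            (root / name).touch()
        return [p.relative_to(root).as_posix() for p in ImageDataset(root)]
```

Calling list_demo_images() indexes the two image files and ignores notes.txt, which suggests the file-discovery logic is independent of anything video-specific.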

Testing: Basic Tests

Problem

Currently no unit tests exist to ensure the proper function of the frame_sampling module.

Solution

Develop a minimal set of unit tests for the module.
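
As a starting point, a first test file might look like the sketch below. The module's public API is not shown in this issue, so sampled_frame_indices is a hypothetical stand-in for the "every N frames" selection logic (it assumes frame 0 is always kept); real tests would import from frame_sampling instead.

```python
import unittest
from typing import List


def sampled_frame_indices(total_frames: int, sample_rate: int) -> List[int]:
    """Hypothetical stand-in: indices kept when sampling every Nth frame."""
    if sample_rate < 1:
        raise ValueError("sample_rate must be a positive integer")
    # assumption of this sketch: sampling starts at frame 0
    return list(range(0, total_frames, sample_rate))


class TestSampledFrameIndices(unittest.TestCase):
    """Minimum viable checks for the stand-in selection logic."""

    def test_keeps_every_nth_frame(self) -> None:
        self.assertEqual(sampled_frame_indices(10, 3), [0, 3, 6, 9])

    def test_empty_video_yields_no_frames(self) -> None:
        self.assertEqual(sampled_frame_indices(0, 5), [])

    def test_rejects_non_positive_rate(self) -> None:
        with self.assertRaises(ValueError):
            sampled_frame_indices(10, 0)
```

Saved as a test module, this can be run with python -m unittest; the same three cases (typical input, empty input, invalid input) would carry over to the real sampler.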

Feature: Catch InvalidDataError from PyAV

Problem

Sometimes in a large dataset of videos (especially if they're freshly scraped from the internet) you will have some video files that are corrupted, or just did not complete downloading. How should the sampler handle these scenarios?

Code

Here is the line that needs to be addressed:

# get frame samples from single video
self._sample_single_video(video_path, sample_subdir)

Solution

Wrap the call to _sample_single_video in a try/except block:

def sample(
    self, video_dataset: VideoDataset, output_dir: Path, exist_ok: bool = False
) -> None:
    """Loop through frames and store based on certain criteria."""
    # notify of sampling
    print(f"Sampling frames at every {self.sample_rate} frames ...")

    # loop over videos and their id
    for video_idx, video_path in enumerate(tqdm(video_dataset)):
        # get name of sample subdir
        sample_subdir = self._create_subdir_path(output_dir, video_idx)

        # check if dir exists
        if exist_ok or not sample_subdir.exists():
            # create dir
            sample_subdir.mkdir(parents=True, exist_ok=exist_ok)

            # get frame samples from single video
            try:
                self._sample_single_video(video_path, sample_subdir)
            except av.InvalidDataError as e:
                print(f"InvalidDataError while processing {video_path}: {e}")
                # Handle the exception as needed, e.g., log the error or skip the video
            except Exception as ex:
                print(f"An unexpected error occurred while processing {video_path}: {ex}")
                # Handle other exceptions if needed
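
One possible way to "handle the exception as needed" is to log the failure, record the path, and continue, so the caller gets a list of skipped videos at the end. The sketch below shows that pattern in isolation. InvalidDataError here is a local stand-in for av.InvalidDataError so the example runs without PyAV, and sample_all and fake_sampler are hypothetical names, not part of frame_sampling.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List

logger = logging.getLogger(__name__)


class InvalidDataError(Exception):
    """Stand-in for av.InvalidDataError so the sketch runs without PyAV."""


def sample_all(
    videos: Iterable[Path], sample_one: Callable[[Path], None]
) -> List[Path]:
    """Run sample_one on each video and return the paths that were skipped."""
    failed: List[Path] = []
    for video_path in videos:
        try:
            sample_one(video_path)
        except InvalidDataError as err:
            # log and move on, so one bad file does not abort the whole run
            logger.warning("Skipping unreadable video %s: %s", video_path, err)
            failed.append(video_path)
    return failed


def fake_sampler(video_path: Path) -> None:
    """Hypothetical sampler that treats any *.part file as corrupted."""
    if video_path.suffix == ".part":
        raise InvalidDataError("simulated corrupt file")


skipped = sample_all(
    [Path("a.mp4"), Path("b.mp4.part"), Path("c.mp4")], fake_sampler
)
```

Here skipped ends up holding only b.mp4.part, while the other two videos are processed normally. Returning the failed paths rather than only printing them makes it easier to re-download or delete the bad files afterwards.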
