VideoDataGenerator: An easy data tool for machine learning with videos

This tool is similar to keras.ImageDataGenerator in that it takes the data from a folder and loads it sequentially.

Important: VideoDataGenerator works only with the channels-last notation, NFHWC (N - Batch, F - Frames, H - Height, W - Width and C - Channels).
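To make the NFHWC layout concrete, here is a small NumPy sketch. The batch dimensions are made-up values for illustration, and the transpose at the end is only a suggestion for models that expect channels first; it is not something VideoDataGenerator does for you.

```python
import numpy as np

# Hypothetical batch: 4 videos, 16 frames each, 112x112 pixels, 3 channels.
batch = np.zeros((4, 16, 112, 112, 3), dtype=np.uint8)
n, f, h, w, c = batch.shape  # N, F, H, W, C in that order

# If a model expects channels first (N, C, F, H, W), move the last axis forward.
channels_first = np.transpose(batch, (0, 4, 1, 2, 3))
```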

Current version: v2.2.2

Features
- video_transformation is now a list of all the transformations you want to apply to the videos. Use the "full" mode to apply them to all videos, or the "augmented" mode to duplicate the videos.
- Added the option to load datasets from a Python, NumPy or pandas matrix.
- Updated the example of how to use video_transformation in main.py.

Future features:
- Any ideas? Contribute now!

Documentation

Installation

Just copy the file into your project and import the class:

from DatasetsLoader import VideoDataGenerator

Dependencies

This file has only three dependencies: the OpenCV library (only imread, cvtColor and resize are used), NumPy and pandas. The version doesn't matter, so relax and install whatever you want ;)

Dataset Structure

It's simple. First we must understand how the directories must be organized for VideoDataGenerator to work:

Dataset_directory
- train
  - Classes (In folder)
    - Videos (In folders)
      - Frames (Files in jpg, png, tiff or ppm)
- test
  - Classes (In folder)
    - Videos (In folders)
      - Frames (Files in jpg, png, tiff or ppm)
- dev
  - Classes (In folder)
    - Videos (In folders)
      - Frames (Files in jpg, png, tiff or ppm)

As you can see, it only accepts the train, test and dev folders (dev is optional, but train and test are required), so organize your dataset and enjoy this tool in your projects.
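Before creating the generator, it can help to check that a dataset follows this layout. The helper below is a sketch and not part of VideoDataGenerator: the function name check_dataset_layout and the exact extension set are assumptions based on the structure described above.

```python
from pathlib import Path

# Extensions taken from the structure above; other spellings (.jpeg, .tif) are not assumed.
FRAME_EXTENSIONS = {".jpg", ".png", ".tiff", ".ppm"}


def check_dataset_layout(root):
    """Return a list of problems found; an empty list means the layout looks fine."""
    root = Path(root)
    problems = []
    for split in ("train", "test", "dev"):
        split_dir = root / split
        if not split_dir.is_dir():
            if split != "dev":  # dev is optional, train and test are required
                problems.append(f"missing required folder: {split}")
            continue
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            for video_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
                frames = [p for p in video_dir.iterdir()
                          if p.suffix.lower() in FRAME_EXTENSIONS]
                if not frames:
                    problems.append(f"no frames in {video_dir}")
    return problems
```

Calling check_dataset_layout("Dataset_directory") returns an empty list when the train and test folders exist and every video folder contains at least one frame file.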

How to use it?

dataset = VideoDataGenerator(<parameters>)

When you create the VideoDataGenerator, it accepts the following parameters:

directory_path: String with the dataset path. Required.
batch_size: Default 32. Specifies the size of the batches to generate.
original_frame_size: Default None. Resizes the original image before any transformation is applied to it. None keeps the original size; otherwise pass the size as a tuple like (width, height).
frame_size: Default None. Specifies the final image size returned after the transformations are applied. None keeps the original size; otherwise pass the size as a tuple like (width, height).
video_frames: Default None. Specifies the number of frames per video to return.
temporal_crop: Default (None, None). Specifies the type of operation to perform over the temporal axis. For more information, see the Temporal crop section below.
video_transformation: Default None. Specifies the transformations to apply to the video after it is loaded. For more information, see the Video transformation section below.
frame_crop: Default (None, None). Specifies the type of operation to perform over the spatial axis. For more information, see the Frame crop section below.
shuffle: Default False. Boolean that specifies whether the data must be shuffled.
conserve_original: Default False. Boolean that specifies whether the original form of the data should be kept for every transformation applied to it. For more information, see the Basic parameters section below.
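As an illustration, the parameters above could be collected like this. The keyword names come from the list above; every value (the path, the sizes, the crop choices) is a made-up example, and video_transformation is left out because its format is not documented yet.

```python
# Illustrative values only; adjust them to your own dataset.
params = dict(
    directory_path="path/to/Dataset_directory",
    batch_size=16,
    original_frame_size=(360, 250),   # (width, height)
    frame_size=(110, 110),            # (width, height)
    video_frames=10,
    temporal_crop=("random", 2),      # (type, additional_parameter)
    frame_crop=("sequential", None),
    shuffle=True,
    conserve_original=False,
)

# With DatasetsLoader.py copied into the project, the object would be created as:
#   from DatasetsLoader import VideoDataGenerator
#   dataset = VideoDataGenerator(**params)
```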

Attributes and Methods

The following attributes and methods are public and form the core of how to use a VideoDataGenerator object.

Attributes
- VideoDataGenerator.train_batches: The total number of steps (batches) in the train data.
- VideoDataGenerator.test_batches: The total number of steps (batches) in the test data.
- VideoDataGenerator.dev_batches: The total number of steps (batches) in the dev data.
- VideoDataGenerator.train_batch_index: The index of the batch at which the object currently is in the train data.
- VideoDataGenerator.test_batch_index: The index of the batch at which the object currently is in the test data.
- VideoDataGenerator.dev_batch_index: The index of the batch at which the object currently is in the dev data.

Methods
- VideoDataGenerator.to_class: List of the classes of VideoDataGenerator. Index it by class number to get the name of a class.
- VideoDataGenerator.to_number: Dictionary of the classes of VideoDataGenerator. Look up a class name in lowercase to get its number.
- VideoDataGenerator.get_next_train_batch(n_canales = 3): Returns a tuple (batch, labels) from the train data. The parameter n_canales specifies with how many channels the frames are loaded (default 3).
- VideoDataGenerator.get_next_test_batch(n_canales = 3): Returns a tuple (batch, labels) from the test data. The parameter n_canales specifies with how many channels the frames are loaded (default 3).
- VideoDataGenerator.get_next_dev_batch(n_canales = 3): Returns a tuple (batch, labels) from the dev data. The parameter n_canales specifies with how many channels the frames are loaded (default 3).
- VideoDataGenerator.get_train_generator(n_canales = 3): Returns a Python generator based on VideoDataGenerator.get_next_train_batch(n_canales = 3) that yields (batch, labels) elements to pass as arguments to the neural network.
- VideoDataGenerator.get_test_generator(n_canales = 3): Returns a Python generator based on VideoDataGenerator.get_next_test_batch(n_canales = 3) that yields (batch, labels) elements to pass as arguments to the neural network.
- VideoDataGenerator.get_dev_generator(n_canales = 3): Returns a Python generator based on VideoDataGenerator.get_next_dev_batch(n_canales = 3) that yields (batch, labels) elements to pass as arguments to the neural network.
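The attributes and methods above can be combined into a manual loop over the train data. This is only a sketch: run_one_epoch and train_step are hypothetical names, and it assumes that each call to get_next_train_batch advances to the following batch and that train_batches calls cover the train data once.

```python
def run_one_epoch(dataset, train_step, n_canales=3):
    """Call train_step(batch, labels) once per train batch and return the batch count.

    `dataset` is expected to expose train_batches and get_next_train_batch,
    as listed above; `train_step` is any callable supplied by the user.
    """
    for _ in range(dataset.train_batches):
        batch, labels = dataset.get_next_train_batch(n_canales=n_canales)
        train_step(batch, labels)
    return dataset.train_batches
```

The same pattern applies to the test and dev data through test_batches and get_next_test_batch, or dev_batches and get_next_dev_batch.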

Transformation and basics

Order of transformations

VideoDataGenerator applies the specified transformations in the following order:

Temporal crop
Frame crop
Video transformation

Basic parameters

An important consideration: all the videos in the dataset must have the same size (width and height). Otherwise, VideoDataGenerator will assume that the whole dataset has the original size of the first frame in the first video of the training videos.
In the dataset structure, all the videos must be folders.
original_frame_size: This parameter is fundamental when the videos must start from a given original frame size, which is common when you have to replicate experiments. If you don't specify it, VideoDataGenerator takes the size of the first frame in the first training video as the original size.
conserve_original: Transformations are generally applied to enlarge the dataset, but sometimes you need to transform the data completely, so the untransformed data is not needed. When this parameter is True, VideoDataGenerator first applies the 'sequential' option of temporal_crop and then the transformation you asked for (if you already selected 'sequential' to transform the data, it is not applied twice). For this conserved data, frame_crop is None, so the frames are resized to the frame_size specified by the user.

Temporal crop

Temporal crop works on the time axis of a video and supports 4 types of operations. To select the type of operation, pass a tuple (type, additional_parameter). The types (you must type them exactly as written here) and their parameters are:

Important note: If a video has fewer frames than required, it is completed by adding its initial frames until the required number is reached. For example, if video_frames is 10 and the video has 5 frames, VideoDataGenerator adds the first frames needed to complete 10 frames (yes, they can be added twice if 15 frames are required). This applies to all temporal crops.
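The padding rule can be sketched as follows. This is an illustration of the behaviour described in the note, not the library's code: the function name is hypothetical, and appending the repeated frames at the end of the video is an assumption.

```python
def pad_with_initial_frames(frames, video_frames):
    """Repeat frames from the start of the video until video_frames are available.

    Videos that already have enough frames are returned unchanged (as a copy).
    """
    if not frames:
        raise ValueError("a video needs at least one frame")
    padded = list(frames)
    i = 0
    while len(padded) < video_frames:
        padded.append(frames[i % len(frames)])  # wrap around for very short videos
        i += 1
    return padded
```

With a 5-frame video and video_frames set to 15, every frame ends up in the result three times: once as itself and twice as padding.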

None: With this option, VideoDataGenerator takes only the first video_frames frames of every video. It doesn't require a parameter, so you can pass None.
'sequential': With this option, VideoDataGenerator takes all the frames of every video, divided into portions of video_frames frames. For example, if a video has 10 frames and video_frames is 3, the dataset will have 3 samples of 3 frames each and the last frame will be ignored. It doesn't require a parameter, so you can pass None.
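The 'sequential' split from the example above can be written in a few lines. The function name is hypothetical; the sketch only reproduces the arithmetic described (consecutive portions, leftover frames ignored).

```python
def sequential_temporal_crops(frames, video_frames):
    """Split a video into consecutive portions of video_frames frames.

    Frames left over after the last full portion are ignored.
    """
    n_crops = len(frames) // video_frames
    return [frames[i * video_frames:(i + 1) * video_frames] for i in range(n_crops)]


# A 10-frame video with video_frames = 3 gives 3 samples; frame 9 is dropped.
samples = sequential_temporal_crops(list(range(10)), 3)
```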

'random': As simple as it sounds: this option makes random temporal crops between the start and the end of the frames. The additional parameter must be an integer that defines the number of random crops to apply.
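A possible reading of the 'random' option is sketched below. It assumes that each crop is a contiguous window of video_frames frames whose start position is drawn at random; the function name and the seed parameter are hypothetical and exist only to make the example reproducible.

```python
import random


def random_temporal_crops(frames, video_frames, n_crops, seed=None):
    """Return n_crops contiguous windows of video_frames frames at random positions."""
    last_start = len(frames) - video_frames
    if last_start < 0:
        raise ValueError("the video has fewer frames than video_frames")
    rng = random.Random(seed)
    crops = []
    for _ in range(n_crops):
        start = rng.randint(0, last_start)  # both ends inclusive
        crops.append(frames[start:start + video_frames])
    return crops
```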

'custom': The most powerful option in VideoDataGenerator, because you pass as the parameter a callback to a function of your own that applies the temporal crop. The function receives a Python list with all the frames of a video and must return a Python list of lists (a matrix) in which every row is a temporal crop to be added to the data.
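A custom temporal crop function could look like the sketch below. The function name and the CROP_LENGTH constant are hypothetical; only the contract (a list of frames in, a list of lists out) comes from the description above. Since the function only slices the list, it works regardless of how the frames themselves are represented.

```python
CROP_LENGTH = 4  # hypothetical; would normally match video_frames


def first_and_last_crops(frames):
    """Custom temporal crop: one crop from the start of the video, one from the end."""
    return [frames[:CROP_LENGTH], frames[-CROP_LENGTH:]]


# It would be passed as the additional parameter of the tuple:
#   temporal_crop=("custom", first_and_last_crops)
example = first_and_last_crops(list(range(10)))
```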

Frame crop

Frame crop works on the spatial axis of a video and supports 4 types of operations. To select the type of operation, pass a tuple (type, additional_parameter). The types (you must type them exactly as written here) and their parameters are:

None: With this option, VideoDataGenerator only resizes the frames of the video to frame_size. It doesn't require a parameter, so you can pass None.
'sequential': With this option, VideoDataGenerator takes all the possible frame crops in the image. For example, if the frame has an original size of 360x250 and the final frame size is 110x110, there are only 6 (3x2) sequential crops, which are applied to all the frames of all the videos. It doesn't require a parameter, so you can pass None.
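The 3x2 count in the example follows from integer division of the original size by the final size. The sketch below enumerates such a grid; the function name, the non-overlapping tiling from the top-left corner, the row-by-row order and the (x_start, y_start, x_end, y_end) box format are all assumptions made for illustration.

```python
def sequential_frame_crops(original_size, frame_size):
    """List non-overlapping crop boxes that fit in the frame, row by row.

    Sizes are (width, height); boxes are (x_start, y_start, x_end, y_end).
    """
    original_w, original_h = original_size
    crop_w, crop_h = frame_size
    crops = []
    for row in range(original_h // crop_h):
        for col in range(original_w // crop_w):
            x, y = col * crop_w, row * crop_h
            crops.append((x, y, x + crop_w, y + crop_h))
    return crops


# 360 // 110 = 3 columns and 250 // 110 = 2 rows, so 6 crops in total.
boxes = sequential_frame_crops((360, 250), (110, 110))
```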

'random': This option makes random frame crops over the image. It does not work like the random temporal crop; here the crops are simply random. The additional parameter must be an integer that defines the number of random crops to apply.
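Random frame crops can be pictured as boxes of the final frame size placed at random positions inside the original frame. The sketch below assumes exactly that; the function name, the seed parameter and the (x_start, y_start, x_end, y_end) box format are hypothetical.

```python
import random


def random_frame_crops(original_size, frame_size, n_crops, seed=None):
    """Return n_crops random boxes of frame_size that fit inside original_size.

    Sizes are (width, height); boxes are (x_start, y_start, x_end, y_end).
    """
    original_w, original_h = original_size
    crop_w, crop_h = frame_size
    if crop_w > original_w or crop_h > original_h:
        raise ValueError("frame_size does not fit inside original_size")
    rng = random.Random(seed)
    crops = []
    for _ in range(n_crops):
        x = rng.randint(0, original_w - crop_w)  # both ends inclusive
        y = rng.randint(0, original_h - crop_h)
        crops.append((x, y, x + crop_w, y + crop_h))
    return crops
```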

'custom': Another powerful option in VideoDataGenerator, because you pass as the parameter a callback to a function of your own that applies the frame crop. The function receives a NumPy array with the loaded frame and must return a Python list of lists (a matrix) in which every row has exactly 4 columns and describes one frame crop to be added to the data. Note that x refers to the width dimension and y to the height dimension.
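A custom frame crop function could look like the sketch below, which returns a single centered crop. The function name, the CROP_SIZE constant and, above all, the column order [x_start, y_start, x_end, y_end] are assumptions: the description only states that each row has 4 columns, so check the order against the library source before relying on it.

```python
import numpy as np

CROP_SIZE = (110, 110)  # hypothetical (width, height) of the crop


def center_crop(frame):
    """Custom frame crop: one centered crop.

    Assumed row format: [x_start, y_start, x_end, y_end].
    """
    height, width = frame.shape[:2]  # frames are H x W x C (channels last)
    crop_w, crop_h = CROP_SIZE
    x_start = (width - crop_w) // 2
    y_start = (height - crop_h) // 2
    return [[x_start, y_start, x_start + crop_w, y_start + crop_h]]


# It would be passed as the additional parameter of the tuple:
#   frame_crop=("custom", center_crop)
rows = center_crop(np.zeros((250, 360, 3), dtype=np.uint8))
```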

Video transformation

Under construction (almost done)

Do you want to contribute?

Core explanation

Under construction

Data structure explanation

Under construction

How it loads the data

Under construction
