video_autoencoder

Video LSTM auto-encoder built with PyTorch. Reference paper: https://arxiv.org/pdf/1502.04681.pdf

This project auto-encodes videos: an LSTM auto-encoder maps each sequence of consecutive video frames to a vector.
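A minimal sketch of such an LSTM auto-encoder in PyTorch is given here for orientation. It is not the project's actual model: the class name `LSTMAutoencoder`, the hidden size of 256, and the decoder design (the encoded vector repeated at every time step) are assumptions chosen for brevity, and the decoder in the referenced paper differs in detail. The input size of 2048 and the sequence length of 6 follow the figures given in the sections below.

```python
import torch
import torch.nn as nn


class LSTMAutoencoder(nn.Module):
    """Hypothetical sequence auto-encoder: frames in, one vector per sequence out."""

    def __init__(self, input_size: int = 2048, hidden_size: int = 256):
        super().__init__()
        self.encoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, input_size)

    def forward(self, x: torch.Tensor):
        # x has shape (batch, seq_len, input_size).
        _, (h, _) = self.encoder(x)
        code = h[-1]  # final hidden state, shape (batch, hidden_size)
        # Feed the same code to the decoder at every time step (a simplification).
        dec_in = code.unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(dec_in)
        return self.output(dec_out), code


model = LSTMAutoencoder()
batch = torch.randn(4, 6, 2048)  # 4 samples, 6 frames each, 2048 features per frame
recon, code = model(batch)
loss = nn.functional.mse_loss(recon, batch)  # reconstruction loss
print(recon.shape, code.shape)
```

In this sketch, `code` plays the role of the learned movement vector that the later sections visualize and cluster.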

Training data

The training data is a collection of images of cows sampled from videos. One image is sampled every 50 frames, and 6 consecutive sampled images form one training sample. Since the videos run at 30 frames per second, one training sample spans 6 × 50 = 300 frames, i.e. it summarizes about 10 seconds of movement. We analyzed the movements of cows by analyzing the movement vectors output by the model.
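The sampling arithmetic above can be sketched as follows. The function name `sample_windows` is hypothetical, and grouping the sampled frames into non-overlapping windows is an assumption; the source does not say whether consecutive training samples overlap.

```python
from typing import List


def sample_windows(num_frames: int, stride: int = 50, window: int = 6) -> List[List[int]]:
    """Pick one frame index every `stride` frames, then group them `window` at a time."""
    sampled = list(range(0, num_frames, stride))
    # Assumption: windows do not overlap; incomplete trailing windows are dropped.
    return [sampled[i:i + window] for i in range(0, len(sampled) - window + 1, window)]


fps = 30
windows = sample_windows(num_frames=900)  # a 30-second clip at 30 frames per second
print(windows[0])                 # [0, 50, 100, 150, 200, 250]
print(len(windows[0]) * 50 / fps)  # 10.0 seconds covered by one training sample
```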

Preprocessing

The images are vectorized with a CNN such as ResNet before being fed to the LSTM auto-encoder. Here, the output vector of the last fully connected layer of ResNet50 is used, so every image is first transformed into a 2048-dimensional vector. The ResNet50 is also fine-tuned on the training data.

Learning curve

The auto-encoder is well trained according to the following learning curve.

Learned vector patterns

The model vectorizes the movement of cows across consecutive frames. Changes in color indicate different movements of the cows.

t-SNE dimension reduction

Dimensionality reduction of the learned vectors with t-SNE, using different perplexity values.
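As an illustration, t-SNE can be run at several perplexity values with scikit-learn. The random `vectors` array stands in for the learned vectors, and the perplexity values 5, 30, and 50 are assumptions; the source does not list the values it used.

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder data: 120 learned vectors of dimension 64 (both sizes are assumptions).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(120, 64))

embeddings = {}
for perplexity in (5, 30, 50):  # perplexity must be smaller than the number of samples
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=0)
    embeddings[perplexity] = tsne.fit_transform(vectors)

for perplexity, points in embeddings.items():
    print(perplexity, points.shape)
```

Low perplexity emphasizes local neighborhoods and higher perplexity emphasizes broader structure, which is why comparing several values is useful.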

Clustering of dimension reduction results

The labels, shown in different colors, are generated automatically by various clustering algorithms. Different labels correspond to different movements of cows.
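A minimal sketch of generating such labels with scikit-learn is shown below. Ward clustering is named in this README; k-means is included only as an assumed second algorithm for comparison. The three synthetic blobs stand in for the two-dimensional t-SNE output, and the choice of 3 clusters matches the three labels listed in the summary below.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

# Placeholder data: three well-separated 2D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = ((0.0, 0.0), (5.0, 5.0), (0.0, 5.0))
points = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in centers])

# Ward-linkage agglomerative clustering and k-means, both with 3 clusters.
ward_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(points)
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(points)

print(np.bincount(ward_labels))    # number of points per Ward cluster
print(np.bincount(kmeans_labels))  # number of points per k-means cluster
```

The label numbers themselves are arbitrary; relating them to movements requires inspecting the images assigned to each label.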

Relations between generated labels and original image data

The second clustering in the Ward column is used to analyze the labels and their corresponding images. It can be seen that different labels correspond to different movements of cows.

The summary of movements can be listed as follows:

label   movement
0       head and limbs
1       head and belly (including cow-birth)
2       head

Contributors

shuuchen
