
zamba's Introduction

Zamba


(Demo video: monkey-vid.mp4)

Zamba means "forest" in Lingala, a Bantu language spoken throughout the Democratic Republic of the Congo and the Republic of the Congo.

zamba is a tool built in Python that uses machine learning and computer vision to automatically detect and classify animals in camera trap videos. You can use zamba to:

  • Identify which species appear in each video
  • Filter out blank videos
  • Create your own custom models that identify your species in your habitats
  • Estimate the distance between animals in the frame and the camera
  • And more! 🙈 🙉 🙊

The official models in zamba can identify blank videos (where no animal is present) along with 32 species common to Africa and 11 species common to Europe. Users can also finetune models using their own labeled videos to then make predictions for new species and/or new ecologies.

zamba can be used both as a command-line tool and as a Python package. It is also available as a user-friendly website application, Zamba Cloud.

We encourage people to share their custom models trained with Zamba. If you train a model and want to make it available, please add it to the Model Zoo Wiki so that others can use it!

Visit https://zamba.drivendata.org/docs/ for full documentation and tutorials.

Installing zamba

First, make sure you have the prerequisites installed:

  • Python 3.8 or 3.9
  • FFmpeg > 4.3

Then run:

pip install https://github.com/drivendataorg/zamba/releases/latest/download/zamba.tar.gz

See the Installation page of the documentation for details.

Getting started

Once you have zamba installed, the Quickstart page and the user tutorials in the documentation are good starting points.

Example usage

Once zamba is installed, you can see the basic command options with:

$ zamba --help

 Usage: zamba [OPTIONS] COMMAND [ARGS]...

 Zamba is a tool built in Python to automatically identify the species seen in camera trap
 videos from sites in Africa and Europe. Visit https://zamba.drivendata.org/docs for more
 in-depth documentation.

╭─ Options ─────────────────────────────────────────────────────────────────────╮
│ --version                     Show zamba version and exit.                    │
│ --install-completion          Install completion for the current shell.       │
│ --show-completion             Show completion for the current shell, to copy  │
│                               it or customize the installation.               │
│ --help                        Show this message and exit.                     │
╰───────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────╮
│ densepose      Run densepose algorithm on videos.                             │
│ depth          Estimate animal distance at each second in the video.          │
│ predict        Identify species in a video.                                   │
│ train          Train a model on your labeled data.                            │
╰───────────────────────────────────────────────────────────────────────────────╯

zamba can be used "out of the box" to generate predictions or train a model using your own videos. zamba supports the same video formats as FFmpeg. Any videos that fail a set of FFmpeg checks will be skipped during inference or training.

Classifying unlabeled videos

$ zamba predict --data-dir path/to/videos

By default, predictions will be saved to zamba_predictions.csv. Run zamba predict --help to list all possible options to pass to predict.
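Once the predictions CSV is written, it is easy to post-process with standard tooling. A minimal sketch, assuming the output has a "filepath" column plus one probability column per class (including "blank") — check your own file's header, since the exact columns depend on the model used:

```python
import csv

def blank_videos(predictions_csv, blank_column="blank", threshold=0.5):
    """Return the videos whose 'blank' probability meets the threshold."""
    blanks = []
    with open(predictions_csv, newline="") as f:
        for row in csv.DictReader(f):
            if float(row[blank_column]) >= threshold:
                blanks.append(row["filepath"])
    return blanks
```

This is one way to implement the "filter out blank videos" use case listed above.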

See the Quickstart page or the user tutorial on classifying videos for more details.

Training a model

$ zamba train --data-dir path/to/videos --labels path_to_labels.csv --save-dir my_trained_model

The newly trained model will be saved to the specified save directory. The folder will contain a model checkpoint as well as training configuration, model hyperparameters, and validation and test metrics. Run zamba train --help to list all possible options to pass to train.
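For reference, a minimal labels file pairs each video with a label, one row per video. The column names shown here ("filepath" and "label") are an assumption; check the training tutorial for the exact schema your version expects:

```
filepath,label
videos/vid_001.mp4,elephant
videos/vid_002.mp4,blank
```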

You can use your trained model on new videos by editing the train_configuration.yaml that is generated by zamba. Add a predict_config section to the yaml that points to the checkpoint file that is generated:

...
# generated train_config and video_loader_config
...

predict_config:
  checkpoint: PATH_TO_YOUR_CHECKPOINT_FILE

Now you can pass this configuration to the command line. See the Quickstart page or the user tutorial on training a model for more details.

You can then share your model with others by adding it to the Model Zoo Wiki.

Estimating distance between animals and the camera

$ zamba depth --data-dir path/to/videos

By default, predictions will be saved to depth_predictions.csv. Run zamba depth --help to list all possible options to pass to depth.

See the depth estimation page for more details.

Contributing

We would love your contributions of code fixes, new models, additional training data, docs revisions, and anything else you can bring to the project!

See the docs page on contributing to zamba for details.

zamba's People

Contributors

allendowney, caseyfitz, dependabot[bot], drivendata, ejm714, iamshankhadeep, jayqi, klwetstone, pdima, pjbull, r-b-g-b, tamara-glazer, wookietreiber


zamba's Issues

Updates to error handling

We need to update error handling with two items:

(1) Catch and skip videos that error so that a single problem does not tank the entire job.
(2) Save errors in a machine readable format (file, message, and traceback) so that it is consumable by zamba cloud.
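The two items above could be sketched as follows; `predict` here is a hypothetical stand-in for whatever per-video inference call zamba makes internally:

```python
import json
import traceback

def predict_all(video_paths, predict):
    """Run predict() on each video; skip failures instead of aborting the job."""
    predictions, errors = {}, []
    for path in video_paths:
        try:
            predictions[path] = predict(path)
        except Exception as exc:
            # machine-readable record: file, message, and traceback
            errors.append({
                "file": path,
                "message": str(exc),
                "traceback": traceback.format_exc(),
            })
    return predictions, errors
```

The `errors` list can then be persisted with `json.dump` so that zamba cloud can consume it.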

Don't fail entire process on invalid videos

Currently, our video loading has a number of limitations:

  • Does not support passing a list of videos rather than an entire directory
  • Assumes everything in a folder is a video
  • Does not check file is a video before trying to load it

If a video or file is invalid, the entire process throws an error and ends. This is probably not desirable in the case where someone is batch processing a large number of videos.
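A sketch of how the loader could address the first and third limitations — the suffix allow-list is an assumption for illustration, and a real implementation would also probe each file with ffprobe before accepting it:

```python
from pathlib import Path

# Assumption: a rough allow-list of common camera trap container formats
VIDEO_SUFFIXES = {".mp4", ".avi", ".mov", ".mkv"}

def candidate_videos(source):
    """Accept either a directory or an explicit list of paths, and keep only
    files that look like videos, so one stray file cannot sink the batch."""
    if isinstance(source, (str, Path)):
        paths = sorted(Path(source).rglob("*"))
    else:
        paths = [Path(p) for p in source]
    return [p for p in paths
            if p.is_file() and p.suffix.lower() in VIDEO_SUFFIXES]
```

Filtering up front means any remaining failures can be skipped per-file rather than crashing the run.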

Model: Refactoring where possible.

For example, it seems that second_stage.py and second_stage_nn.py share some function signatures, such as loading data. Would it be possible to refactor here and/or in other places?

download weights if not available

  • Upload zipped weights to a PUBLIC S3 bucket; include mirror links for Europe and Asia
  • If the weights directory does not exist: ask to download, then download
    -- warn about the size!
    -- unzip into the weights dir
    -- goes in the init of CnnEnsemble
    -- use requests; os.environ.get for the weights URL, defaulting to ours
    -- zipfile is in the standard library
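Those notes could be sketched like this. The default URL and environment variable name are hypothetical, and the network client is injected as a callable (e.g. built on requests, as the notes suggest) so the sketch itself stays free of network calls:

```python
import os
import zipfile
from pathlib import Path

# Hypothetical default -- the real one would point at the public S3 bucket
DEFAULT_WEIGHTS_URL = "https://example.com/zamba-weights.zip"

def weights_url():
    # allow an override via the environment, defaulting to ours
    return os.environ.get("ZAMBA_WEIGHTS_URL", DEFAULT_WEIGHTS_URL)

def ensure_weights(weights_dir, download=None):
    """Return the weights dir, downloading and unzipping the archive first
    if it is missing. `download` fetches a URL and returns a local zip path."""
    weights_dir = Path(weights_dir)
    if weights_dir.exists():
        return weights_dir
    if download is None:
        raise RuntimeError("weights missing; pass a downloader (warn about size first!)")
    archive = download(weights_url())
    weights_dir.mkdir(parents=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(weights_dir)  # unzip into the weights dir
    return weights_dir
```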

end to end with raw vids

  • no hard-coded data paths!
  • pull some known classes from S3
  • add a test on a single nano vid (to ship with the package)

Unblock dev by using cifar classifier as toy model

The Problem

Now that we have simple functionality, we need to home in on specific features. To do this efficiently, we need

  • a lightweight model pipeline
  • a pipeline similar enough to the Pri-matrix classifier that core functionality can be developed
  • a pipeline simple enough that development can proceed at a reasonable pace

A Solution

Building a convolutional network to classify the CIFAR-10 data fits the above criteria. The CIFAR-10 problem is to classify 32x32 pixel RGB images into one of 10 classes. It is a dataset often used in academia, but I think it can be very useful for application development as well, at least in our case.

The idea is to use CIFAR-10 as the context, and a simple convolutional net as the "default model" (eventually Dmytro's) that we can use to develop the predict, train, and tune functionality. Since the inputs (images) and outputs (a row of probabilities for one image) are similar, minimizing the code's bias towards CIFAR-10 should be less complicated than if we keep using random number data and non-predictors as a development model.

The goal is to frame as many Pri-matrix problems as we can as CIFAR-10 problems. My claim is that a non-trivial number of Pri-matrix problems allow this framing. For example,

  1. Handling real input directories, pre-processing their contents into temporary storage, and sending them into a model for prediction.
  2. Handling output that is similarly structured to the chimps output, i.e., csv files indexed by a filename with class probabilities as columns.
  3. Training, tuning, saving, and loading functionality for classifiers, not just tensorflow graphs (although that was a good start).
  4. Development of any additional performance metrics, such as loss over time, validation scoring, etc.

As the TensorFlow advanced tutorial on CIFAR-10 says,

The reason CIFAR-10 was selected was that it is complex enough to exercise much of TensorFlow's ability to scale to large models. At the same time, the model is small enough to train fast, which is ideal for trying out new ideas and experimenting with new techniques.

In short, this approach seems to me a good solution for working asynchronously on the codebase with Dmytro before his models are ready for integration.

Advantages

  • The above-referenced tutorial comes with a lot of code that can be edited and used to speed up development of our project, such as network architecture, training criteria, data-fetching scripts, pre-processing tasks, and various other "advanced" features that we can consciously decide to develop or not as we work on the tool.
  • The fact that the inputs to Dmytro's model are images and these data are images will allow us to develop a lot of the preprocessing pipeline (except the parts that convert the videos into images).
  • We can experiment with some of the ensembling that we'll eventually need to handle, such as using tensorflow and xgboost to generate predictions.

Disadvantages

  • We will inevitably have to deal with some idiosyncrasies of the CIFAR-10 data, but given its simplicity and the fact that we have lots of starter code for manipulating it, this shouldn't set us back too much.

Timeline

Since the code skeleton is now fairly operational, I think it's reasonable to shoot for integrating this by the end of the week. End of first rev at the latest.

Consider implementing MegaDetector for images

That is, if you have images, we'll run MegaDetector instead and give you results.
We could also consider other ways to further integrate MegaDetector for videos as part of the Zamba backend.

improve tests

Convert testing configs to pytest fixtures (@pytest.fixture). Remove the temporary tf testing of the workflow.

Implement ModelManager object to mediate loading, configuration, and logic of model calls.

Make cli a thin wrapper around model object. cli should be focused on user io only. Use ModelManager object to mediate loading, configuration, and logic of model calls.

Eg,

class Model(object):
    def __init__(self):
        pass


class ModelManager(object):
    def __init__(self, modelpath, tempdir=None, predict_thresh=None):
        # load_model and load_data are placeholders for the real loaders
        self.model = load_model(modelpath)
        self.tempdir = tempdir
        self.predict_thresh = predict_thresh

    def predict(self, datapath, outputpath):
        data = load_data(datapath)
        preds = self.model.predict(data)
        # compare against None so a threshold of 0 is still honored
        if self.predict_thresh is not None:
            preds = preds >= self.predict_thresh
        preds.to_csv(outputpath)

    def train(self):
        pass

    def tune(self):
        pass

Write getting started

  • for weights in PUBLIC s3 bucket - include mirror links for europe and asia

  • ffmpeg: include overview of commands, 15 sec resize example
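For the requested example, a 15-second resize one-liner might look like the following (filenames are placeholders; `-t 15` keeps the first 15 seconds, and `scale=640:-2` resizes to 640 px wide while keeping the aspect ratio and an even height):

```shell
# Trim to the first 15 seconds and resize to 640px wide; copy audio unchanged
ffmpeg -i input.mp4 -t 15 -vf "scale=640:-2" -c:a copy output.mp4
```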

Tests Failing on master branch

I set up an automated test run with Codeship so that we have continuous integration. Tests on master are currently failing.

(@caseyfitz I know there's WIP to update those, so this is just to track that work.)

Use config.py imports instead of Path objects where possible

The file zamba.models.cnnensemble.src.config sets a lot of paths. Yet many files, such as zamba.models.cnnensemble_model, import both Path and config but use Path objects local to the file instead of config.

Use config where possible throughout the cnnensemble model.

Update docs for sphinx

Do this after initial ModelManager is merged. Doc everything that exists so far.

This includes updating README.md post djamba.

Add Zamba Cloud to Zamba docs

https://zamba.drivendata.org/docs/

Options for Running zamba could include Zamba Cloud web application (available for conservation researchers upon request) with a link to zambacloud.com

@pjbull I had a note on this from when this has come up a couple times in the past. I think it's worth making explicit but don't feel strongly. Assigning to you for now, feel free to reassign/remove

Model: Remove unused code

We're not looking for full coverage, but there is a lot of commented-out code throughout the src directory.

If we are processing one video that ffmpeg cannot load, we fail with a cryptic error.

We should have a better message for users rather than failing in this case:

(1) Ask zamba to run on 1 file that is not a valid video
(2) zamba will say "skipping"

Logs look like:

2019-01-11 02:05:00,757 - root - INFO - making predictions
Predicting on 1 L1 models:   0%|          | 0/1 [00:00<?, ?it/s]
Processing 1 videos:   0%|          | 0/1 [00:00<?, ?it/s]
Processing 1 videos: 100%|██████████| 1/1 [00:07<00:00,  7.82s/it]
Predicting on 1 L1 models: 100%|██████████| 1/1 [00:38<00:00, 38.37s/it]
Skipping file that is not a valid video:  /paperspace/videos/123d4d52-858c-49b2-82d1-ae8d490df66f_12090006.AVI
Traceback (most recent call last):
  File "script.py", line 30, in <module>
    manager.predict(str(download_dir), save=True, pred_path=output_path)
  File "/src/zamba/zamba/models/manager.py", line 95, in predict
    preds = self.model.predict(data)
  File "/src/zamba/zamba/models/cnnensemble_model.py", line 67, in predict
    l2_results = second_stage.predict(l1_results, profile=self.profile)
  File "/src/zamba/zamba/models/cnnensemble/src/second_stage.py", line 273, in predict
    xgboost_predictions = xgboost_model.predict(l1_model_results)
  File "/src/zamba/zamba/models/cnnensemble/src/second_stage.py", line 141, in predict
    return self._predict(X)
  File "/src/zamba/zamba/models/cnnensemble/src/second_stage.py", line 201, in _predict
    return self.model.predict_proba(X)
  File "/usr/local/lib/python3.6/dist-packages/xgboost/sklearn.py", line 575, in predict_proba
    ntree_limit=ntree_limit)
  File "/usr/local/lib/python3.6/dist-packages/xgboost/core.py", line 1050, in predict
    self._validate_features(data)
  File "/usr/local/lib/python3.6/dist-packages/xgboost/core.py", line 1308, in _validate_features
    data.feature_names))
ValueError: feature_names mismatch: ['f0', 'f1', 'f2 
...

re-train the first level models on the full dataset

In the original CnnEnsemble implementation, an ensemble of L1 models trained on 4 folds was used.

This provides very limited accuracy gain, if any, while being 4x slower; it's better to re-train the models on the whole dataset, including test data.

Model: Replace shell calls with python wrappers.

We will be developing a simple command line tool entirely in python, so being able to call training, testing, predicting, and submission processes from within Python (rather than through bash scripts) makes that more reliable.

Name the project so we can replace `cmd` throughout

As mentioned in the README.md, cmd is a placeholder.

Usage at the command line will look like

cmd predict --datapath /path/to/my/new/cameratrap-footage

People will install the tool from PyPI using

pip install cmd

The issue will be resolved when we replace cmd with something... better 😉

Suggestions so far (rest in comments):

  • cam2chimp
  • snapchimp
  • SpeciesSpotter
  • camtrap.ai
  • chimply
