This is the repository for the following study:
Taylor S.D., Browning D.M. 2022. Classification of Daily Crop Phenology in PhenoCams Using Deep Learning and Hidden Markov Models. Remote Sensing. 14(2):286. https://doi.org/10.3390/rs14020286 [Preprint, Data & Code Archive]
Note the initial version was titled "Deep learning models for identifying crop and field attributes from near surface cameras" and was changed during the review process.
A. `phenocam_data_prep/`

- `generate_site_list.R` - with the phenocamr package, make a list of cameras to use in the study, written to `site_list.csv`.
- `download_phenocam_data.R` - for each camera, download the Gcc and transition dates to `data/phenocam_gcc/`.
- `generate_training_image_list.R` - for each camera, use the Gcc transition dates to partition each calendar year into distinct periods of senesced, growth, peak, and senescing, then randomly choose images among these periods for each camera to make a list of training images. Creates the file `data/images_for_annotation.csv`. Also creates the file `data/full_image_list.csv`, which is the mid-day image for every available day for all sites in `site_list.csv`.
- `download_training_phenocam_images.R` - download all images in `data/images_for_annotation.csv`.
- `generate_extra_image_list.R` - for each image in `data/images_for_annotation.csv`, get the download link of all images from the respective day. These are the ~80k additional images taken from 0900-1500 described in the text. Creates the file `data/extra_images_for_fitting.csv`.
- `download_extra_phenocam_images.R` - download all images in `data/extra_images_for_fitting.csv` to the folder `data/extra_phenocam_train_images/`.
- `download_all_phenocam_images.R` - download all images in `data/full_image_list.csv` to `data/phenocam_all_images/`.
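The year-partitioning step above can be sketched in a few lines. The real script is R and derives the transition dates from phenocamr; this Python toy uses hypothetical fixed dates and an invented function name, purely to illustrate mapping a date onto the four periods:

```python
from datetime import date

def phenophase(d, rising_start, rising_end, falling_start, falling_end):
    """Assign a date to one of the four periods used when sampling
    training images. The four transition dates stand in for the
    Gcc-derived transition dates from phenocamr."""
    if d < rising_start or d >= falling_end:
        return "senesced"
    if d < rising_end:
        return "growth"
    if d < falling_start:
        return "peak"
    return "senescing"

# Example season: greenup Apr 15 - Jun 1, senescence Sep 1 - Oct 15
print(phenophase(date(2020, 7, 4),
                 date(2020, 4, 15), date(2020, 6, 1),
                 date(2020, 9, 1), date(2020, 10, 15)))  # -> peak
```

Images would then be sampled at random within each period per camera-year.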
B. `train_image_annotation/`

- `imageant_config2.ias` - configuration file for the annotation software, imageant (https://gitlab.com/stuckyb/imageant). Note I used an older version than is currently available, and this ias file will not work with the current version. In fact, the version I used is so old it does not have a version number, but it lives at commit 3c2fd39.
- `imageant_session2.csv` - a session file for imageant.
- `image_classes.csv` - pairing of the numeric and text annotation labels (e.g., dominant cover class 1 = vegetation).
- `merge_new_crop_types.R` - a little needed data munging; see file for details.
- `image_annotations.csv` - the final annotation file from imageant, i.e., the file with all the training/validation data labels. This is used in model fitting and final evaluation.
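The numeric-to-text pairing can be applied with a simple lookup. A minimal sketch: the column names and all class values besides "dominant cover class 1 = vegetation" are hypothetical stand-ins, not the actual contents of `image_classes.csv`:

```python
import csv
import io

# Hypothetical stand-in for image_classes.csv; only the
# "class 1 = vegetation" pairing comes from this README.
raw = """category,value,label
dominant_cover,1,vegetation
dominant_cover,2,residue
dominant_cover,3,soil
"""

# Map (category, numeric code) -> text label.
labels = {(r["category"], int(r["value"])): r["label"]
          for r in csv.DictReader(io.StringIO(raw))}

print(labels[("dominant_cover", 1)])  # -> vegetation
```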
C. Model fitting and prediction

- `fit_keras_model.py` - VGG16 model fitting. Uses all annotated and "extra" images to fit the model and writes the keras file `data/vgg16_v4_20epochs.h5`. Excluding images due to low class prevalence, the train/test split, and resampling using weights are all done here.
- `apply_keras_model.py` - using the fitted VGG16 model, make predictions on everything in `data/extra_phenocam_train_images/` and `data/phenocam_all_images/`. Writes those predictions to `data/vgg16_v4_20epochs_predictions.csv`.
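A common recipe for the weight-based resampling mentioned above is inverse-frequency class weights (the same heuristic as scikit-learn's "balanced" mode). This is a generic sketch of that idea, not necessarily the authors' exact scheme:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class inversely to its prevalence so rare classes
    count more during fitting; normalized so a perfectly balanced
    dataset would give every class a weight of 1."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# 10 samples across 3 classes; the rare class gets the largest weight.
w = inverse_frequency_weights(["growth"] * 6 + ["peak"] * 3 + ["senesced"])
print(w["senesced"])  # -> 3.33... (10 / (3 * 1))
```

A dict like this can be passed to keras via the `class_weight` argument of `Model.fit`.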
D. `classification_postprocessing/`

- `prep_predictions_for_hmm.R` - takes the file `data/vgg16_v4_20epochs_predictions.csv` and preps the predictions for the HMM step (see `apply_hmm_model.py`). Creates several `data/image_predictions_for_*` files.
- `final_processing.R` - takes the output from the HMM model (`./data/hmm_output.csv`) and applies the final post-processing steps (see text), producing `./data/final_predictions.csv`.
- `postprocessing_tools.R` - helper functions.
E. Hidden Markov Model (HMM)

- `hmm_stuff/hmm_model_definitions.py` - describes the HMM models using the pomegranate package (https://pomegranate.readthedocs.io).
- `hmm_stuff/apply_hmm_model.py` - applies the HMM models to the `data/image_predictions_for_*` files and creates `./data/hmm_output.csv`.
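The repo's HMMs are built with pomegranate, but the core idea of temporal smoothing can be shown with a self-contained Viterbi decoder. Everything below (states, transition and emission probabilities, the observation series) is made up for illustration; the point is that "sticky" transitions suppress one-day flickers in the daily classifications:

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most-likely hidden state sequence for a series of daily
    observations. Works in log space to avoid underflow on long runs."""
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
          for s in states}]
    path = {s: [s] for s in states}
    for o in obs[1:]:
        V.append({})
        new_path = {}
        for s in states:
            prob, prev = max(
                (V[-2][p] + math.log(trans_p[p][s]) + math.log(emit_p[s][o]), p)
                for p in states)
            V[-1][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Sticky transitions (0.95 self-transition) vs. a noisier classifier.
states = ("growth", "peak")
start = {"growth": 0.5, "peak": 0.5}
trans = {"growth": {"growth": 0.95, "peak": 0.05},
         "peak": {"growth": 0.05, "peak": 0.95}}
emit = {"growth": {"growth": 0.8, "peak": 0.2},
        "peak": {"growth": 0.2, "peak": 0.8}}

obs = ["growth", "growth", "peak", "growth", "growth"]
print(viterbi(obs, states, start, trans, emit))
# The lone "peak" day is smoothed away: all five days decode as "growth".
```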
F. `analysis/`

- `evaluate_predictions.R` - calculate error metrics and create the manuscript F1/precision/recall figures. This uses all files in the process: `data/vgg16_v4_20epochs_predictions.csv`, `image_annotations.csv`, and `./data/final_predictions.csv`.
- `timeseries_figures.R` - produces the colorful timeseries figures for each site year.
- `site_map_and_table.R` - produces the supplemental map and site table.
- `single_image_diagnostic_plots.R` - produces the supplemental figures where prediction probabilities for single images are displayed.
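For reference, the F1/precision/recall metrics are the standard per-class definitions; a minimal sketch with made-up labels (the actual computation lives in `evaluate_predictions.R`):

```python
def precision_recall_f1(truth, pred, positive):
    """Per-class precision, recall, and F1 for one class of interest."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, pred))
    fp = sum(t != positive and p == positive for t, p in zip(truth, pred))
    fn = sum(t == positive and p != positive for t, p in zip(truth, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: 2 true positives, 1 false positive, 1 false negative.
truth = ["growth", "growth", "peak", "peak", "peak"]
pred  = ["growth", "peak",   "peak", "peak", "growth"]
p, r, f = precision_recall_f1(truth, pred, "peak")
print(p, r, f)  # -> all 2/3 for this example
```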
The modelling workflow went as follows:

- Determine the needed images and download all of them (A).
- Annotate all the images (B).
- Fit the VGG16 model and predict on the full image dataset (C).
- Apply post-processing (D, E). With the VGG16 output, post-processing runs in the following order: `prep_predictions_for_hmm.R`, then `apply_hmm_model.py`, then `final_processing.R`.
- Analyze and visualize (F).
None of the phenocam images are in the repo, but they can all be downloaded with the scripts in `phenocam_data_prep/`.

The following files are not in the GitHub repo because they are too large, but they can be found in the Zenodo archive (https://doi.org/10.5281/zenodo.5579796):

- `data/vgg16_v4_20epochs.h5` - the fitted keras classification model.
- `data/vgg16_v4_20epochs_predictions.csv` - the initial image classifications prior to post-processing.
- `data/final_predictions.csv` - the final predictions after post-processing.
If you'd like to use the predictions in remote sensing models or elsewhere, you need the following two files:

- `data/final_predictions.csv` - the final predictions after post-processing.
- `site_list.csv` - site metadata.

`final_predictions.csv` has, for all available sites, a date and the predicted status for each of the three categories. `site_list.csv` has the latitude and longitude of all sites, along with other metadata from the phenocam database. The files are joined via the `phenocam_name` column.
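The join itself can be done with any table tool. A pure-Python sketch: only the `phenocam_name` join column is taken from this README; every other column name, site name, and value below is a hypothetical stand-in:

```python
import csv
import io

# In-memory stand-ins for the two CSVs; real files have more columns.
preds = """phenocam_name,date,dominant_cover
examplesite,2020-07-04,vegetation
"""
sites = """phenocam_name,lat,lon
examplesite,45.0,-95.0
"""

# Index site metadata by phenocam_name, then attach it to each prediction.
site_meta = {r["phenocam_name"]: r for r in csv.DictReader(io.StringIO(sites))}
joined = [{**r, **site_meta[r["phenocam_name"]]}
          for r in csv.DictReader(io.StringIO(preds))
          if r["phenocam_name"] in site_meta]

print(joined[0]["lat"])  # -> 45.0
```

The same one-line join is a `merge(..., by = "phenocam_name")` in R or a `pandas.merge` in Python.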
The `final_predictions.csv` file has predictions for sites in `site_list.csv`, given the constraints described in the paper, through 2021-09-27.