This is the repository for the following study:

Taylor S.D., Browning D.M. 2022. Classification of Daily Crop Phenology in PhenoCams Using Deep Learning and Hidden Markov Models. Remote Sensing. 14(2):286. https://doi.org/10.3390/rs14020286 [Preprint, Data & Code Archive]

Note that the initial version was titled "Deep learning models for identifying crop and field attributes from near surface cameras"; the title was changed during the review process.

File Structure

A. Initial Preparation

phenocam_data_prep/

generate_site_list.R - uses the phenocamr package to make a list of cameras to use in the study and writes it to site_list.csv.
download_phenocam_data.R - for each camera, download the Gcc and transition dates to data/phenocam_gcc/.
generate_training_image_list.R - for each camera, use the Gcc transition dates to partition each calendar year into distinct periods of senesced, growth, peak, and senescing. Randomly choose images among these periods for each camera to make a list of training images. Creates the file data/images_for_annotation.csv. Also creates the file data/full_image_list.csv, which is the mid-day image for every available day for all sites in site_list.csv.
download_training_phenocam_images.R - download all images in data/images_for_annotation.csv.
generate_extra_image_list.R - for each image in data/images_for_annotation.csv, get the download link of all images from the respective day. These are the ~80k additional images taken from 0900-1500 described in the text. Creates the file data/extra_images_for_fitting.csv.
download_extra_phenocam_images.R - download all images in data/extra_images_for_fitting.csv to the folder data/extra_phenocam_train_images/
download_all_phenocam_images.R - download all images in data/full_image_list.csv to data/phenocam_all_images/
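
The period partitioning and random image selection described for generate_training_image_list.R can be illustrated with a minimal Python sketch. This is not the repository's R code: the function names, the four boundary dates, and the number of images per period are all hypothetical, and the real script derives its boundaries from the Gcc transition dates.

```python
import random
from datetime import date, timedelta


def label_periods(year, growth_start, peak_start, senescing_start, senesced_start):
    """Label every day of `year` with one of four periods.

    The four boundary dates are hypothetical inputs standing in for
    values derived from Gcc transition dates.
    """
    labels = {}
    day = date(year, 1, 1)
    while day.year == year:
        if growth_start <= day < peak_start:
            labels[day] = "growth"
        elif peak_start <= day < senescing_start:
            labels[day] = "peak"
        elif senescing_start <= day < senesced_start:
            labels[day] = "senescing"
        else:
            # Days before growth starts and after senescence completes.
            labels[day] = "senesced"
        day += timedelta(days=1)
    return labels


def sample_days(labels, n_per_period, seed=0):
    """Randomly choose up to `n_per_period` days from each period."""
    rng = random.Random(seed)
    by_period = {}
    for day, period in labels.items():
        by_period.setdefault(period, []).append(day)
    return {
        period: sorted(rng.sample(days, min(n_per_period, len(days))))
        for period, days in by_period.items()
    }


# Hypothetical boundary dates for a single camera and year.
labels = label_periods(
    2019,
    growth_start=date(2019, 5, 1),
    peak_start=date(2019, 7, 1),
    senescing_start=date(2019, 9, 1),
    senesced_start=date(2019, 10, 15),
)
chosen = sample_days(labels, n_per_period=5)
```

Sampling within each period, rather than across the whole year, keeps short periods such as senescing from being underrepresented in the list of images to annotate.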

B. Training Image Annotation

train_image_annotation/

imageant_config2.ias - configuration file for the annotation software, imageant (https://gitlab.com/stuckyb/imageant). Note that I used an older version than is currently available, and this ias file will not work with the current version. In fact, the version I used is so old that it does not have a version number, but it lives at commit 3c2fd39.
imageant_session2.csv - this is a session file for imageant
image_classes.csv - pairs the numeric annotation labels with their text labels (e.g. dominant cover class 1 = vegetation).
merge_new_crop_types.R - A little needed data munging. See file for details.
image_annotations.csv - the final annotation file from imageant, i.e. the file with all the training/validation data labels. This is used in model fitting and final evaluation.

C. Model Fitting and prediction

fit_keras_model.py - VGG16 model fitting. Uses all annotated and "extra" images to fit the model and writes the keras file data/vgg16_v4_20epochs.h5. Excluding images due to low class prevalence, the train/test split, and weighted resampling are all done here.
apply_keres_model.py - Using the fitted VGG16 model make predictions on everything in data/extra_phenocam_train_images/ and data/phenocam_all_images/. Writes those predictions to data/vgg16_v4_20epochs_predictions.csv
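
As a rough illustration of resampling using weights, the sketch below draws a fixed-size training sample in which rare classes are picked more often. It is an assumption-laden example, not the code in fit_keras_model.py: the inverse-frequency weighting, the function name, and the toy labels are all hypothetical.

```python
import random
from collections import Counter


def resample_balanced(labels, n_samples, seed=0):
    """Return indices sampled with replacement, weighted by inverse class frequency.

    Assumption: each image is weighted by 1 / (count of its class), so every
    class is expected to contribute roughly equally to the resampled set.
    """
    counts = Counter(labels)
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=n_samples)


# Toy, imbalanced label list standing in for the annotated images.
labels = ["vegetation"] * 80 + ["soil"] * 15 + ["water"] * 5
indices = resample_balanced(labels, n_samples=20000)
resampled_counts = Counter(labels[i] for i in indices)
```

With these weights, each of the three toy classes ends up with roughly a third of the 20000 draws, even though water makes up only 5% of the original list.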

D. Postprocessing

classification_postprocessing/

prep_predictions_for_hmm.R - takes the file data/vgg16_v4_20epochs_predictions.csv and preps the predictions for the HMM aspect (see apply_hmm_model.py). Creates several files data/image_predictions_for_*.
final_processing.R - takes output from the HMM model (./data/hmm_output.csv) and applies the final post-processing steps (see text), producing ./data/final_predictions.csv.
postprocessing_tools.R - helper functions.

E. Hidden Markov Model (HMM)

hmm_stuff/hmm_model_definitions.py - This describes the HMM models using the pomegranate package. https://pomegranate.readthedocs.io.
apply_hmm_model.py - applies the HMM models to the data/image_predictions_for_* files and creates ./data/hmm_output.csv.
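
To show why an HMM helps here, the following sketch runs Viterbi decoding over a short series of daily class probabilities. It is written in plain Python rather than pomegranate, and the two states, the transition probabilities, and the daily probabilities are hypothetical; the actual models are defined in hmm_model_definitions.py.

```python
import math


def _log(p):
    """Log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs_probs, states, start_p, trans_p):
    """Most likely state sequence given per-day classifier probabilities.

    obs_probs: list of dicts, one per day, mapping state -> probability
               (treated here as the emission likelihood for that day).
    trans_p:   trans_p[previous][current] transition probabilities.
    """
    # best[t][s] is the best log score of any path ending in state s on day t.
    best = [{s: _log(start_p[s]) + _log(obs_probs[0][s]) for s in states}]
    back = []
    for obs in obs_probs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + _log(trans_p[r][s]))
            scores[s] = best[-1][prev] + _log(trans_p[prev][s]) + _log(obs[s])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the final day.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


states = ["growth", "senescing"]
start_p = {"growth": 0.5, "senescing": 0.5}
# Hypothetical transitions: a field cannot go from senescing back to growth.
trans_p = {
    "growth": {"growth": 0.95, "senescing": 0.05},
    "senescing": {"growth": 0.0, "senescing": 1.0},
}
# Hypothetical daily probabilities; day 3 is a spurious "senescing" call.
obs_probs = [
    {"growth": 0.9, "senescing": 0.1},
    {"growth": 0.8, "senescing": 0.2},
    {"growth": 0.3, "senescing": 0.7},
    {"growth": 0.9, "senescing": 0.1},
    {"growth": 0.2, "senescing": 0.8},
    {"growth": 0.1, "senescing": 0.9},
]
path = viterbi(obs_probs, states, start_p, trans_p)
```

Because the transition matrix forbids moving from senescing back to growth, the isolated senescing call on day 3 is overridden and the decoded sequence switches to senescing only once the later days support it.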

F. Analysis

analysis/

evaluate_predictions.R - calculates error metrics and creates the manuscript F1/precision/recall figures. This uses all files in the process: data/vgg16_v4_20epochs_predictions.csv, image_annotations.csv, and ./data/final_predictions.csv.
timeseries_figures.R - produces the colorful timeseries figures for each site year.
site_map_and_table.R - produces supplemental map and site table.
single_image_diagnostic_plots.R - produces the supplemental figures where prediction probabilities for single images are displayed.

Workflow

The modelling workflow was as follows.

Determine the needed images and download all of them (A).
Annotate all the images (B).
Fit the VGG16 model and predict on the full image dataset (C).
Apply post-processing (D, E). Post-processing of the VGG16 output runs in the following order:
1. prep_predictions_for_hmm.R
2. apply_hmm_model.py
3. final_processing.R
Analyze and visualize (F).

Data

None of the phenocam images are in the repo but can be downloaded with the scripts in phenocam_data_prep/.
The following files are not in the GitHub repo because they are too large, but can be found in the Zenodo repo (https://doi.org/10.5281/zenodo.5579796):

data/vgg16_v4_20epochs.h5 - this is the fitted keras classification model.
data/vgg16_v4_20epochs_predictions.csv - this contains the initial image classifications prior to post-processing.
data/final_predictions.csv - the final predictions after post-processing.

Using the predictions

If you'd like to use the predictions in remote sensing models or elsewhere you need the following two files:

data/final_predictions.csv - the final predictions after post-processing.
site_list.csv - site metadata.

final_predictions.csv has, for all available sites, a date and predicted status for the three categories. site_list.csv has the latitude and longitude of all sites, and other metadata from the phenocam database. The files are joined via the phenocam_name column.
The final_predictions.csv file has predictions for sites in site_list.csv, given the constraints described in the paper, through 2021-09-27.
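
A minimal sketch of joining the two files on phenocam_name, using only the Python standard library. Only the phenocam_name column is confirmed above; the other column names (date, crop_status, lat, lon), the site names, and the values in the in-memory stand-in data are hypothetical.

```python
import csv
import io


def join_predictions_with_sites(predictions_file, sites_file):
    """Attach site metadata to each prediction row via phenocam_name."""
    sites = {row["phenocam_name"]: row for row in csv.DictReader(sites_file)}
    joined = []
    for row in csv.DictReader(predictions_file):
        site = sites.get(row["phenocam_name"])
        if site is None:
            # Skip predictions for sites missing from the metadata.
            continue
        joined.append({**site, **row})
    return joined


# Tiny in-memory stand-ins for data/final_predictions.csv and site_list.csv.
# Column names other than phenocam_name are assumptions.
predictions_csv = io.StringIO(
    "phenocam_name,date,crop_status\n"
    "examplesite1,2020-06-01,growth\n"
    "examplesite2,2020-06-01,no_crop\n"
)
sites_csv = io.StringIO(
    "phenocam_name,lat,lon\n"
    "examplesite1,40.0,-96.5\n"
)
rows = join_predictions_with_sites(predictions_csv, sites_csv)
```

To use the real files, pass handles from open("data/final_predictions.csv", newline="") and open("site_list.csv", newline="") in place of the StringIO objects.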

multi-label model classification notes

Vgg16 model using 6714 annotated images. 20k train image sample size, 0.2 validation fraction, (224,224) image size.

lr = 0.01, epsilon = 0.1, 50 epochs

In the confusion matrices below, the columns are predicted labels and the rows are true labels.

dominant cover

              precision    recall  f1-score   support

     unknown       0.59      0.71      0.65        14
  vegetation       0.90      0.91      0.91       793
     residue       0.67      0.65      0.66       293
        soil       0.57      0.53      0.55       144
        snow       0.90      0.77      0.83        82
       water       0.41      0.81      0.54        16

    accuracy                           0.80      1342
   macro avg       0.67      0.73      0.69      1342
weighted avg       0.81      0.80      0.80      1342

class_description  unknown  vegetation  residue  soil  snow  water
class_description                                                 
unknown                 10           1        3     0     0      0
vegetation               0         725       40    19     5      4
residue                  1          61      190    33     2      6
soil                     0          15       45    77     0      7
snow                     6           1        5     5    63      2
water                    0           0        2     1     0     13
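
The precision, recall, and accuracy values in the report above can be recomputed from the dominant cover confusion matrix. The sketch below copies that matrix (rows are true labels, columns are predicted labels); the function and variable names are only illustrative.

```python
classes = ["unknown", "vegetation", "residue", "soil", "snow", "water"]

# Dominant cover confusion matrix: rows are true labels, columns are predicted.
matrix = [
    [10, 1, 3, 0, 0, 0],
    [0, 725, 40, 19, 5, 4],
    [1, 61, 190, 33, 2, 6],
    [0, 15, 45, 77, 0, 7],
    [6, 1, 5, 5, 63, 2],
    [0, 0, 2, 1, 0, 13],
]


def per_class_metrics(matrix, classes):
    """Compute precision, recall, and support for each class."""
    metrics = {}
    for i, name in enumerate(classes):
        true_pos = matrix[i][i]
        row_total = sum(matrix[i])                  # all images truly in this class
        col_total = sum(row[i] for row in matrix)   # all images predicted as this class
        metrics[name] = {
            "precision": true_pos / col_total if col_total else 0.0,
            "recall": true_pos / row_total if row_total else 0.0,
            "support": row_total,
        }
    return metrics


metrics = per_class_metrics(matrix, classes)
total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(len(classes))) / total
```

For example, vegetation has 725 correct predictions out of 803 predicted (precision 0.90) and 793 true images (recall 0.91), matching the report.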

crop type

               precision    recall  f1-score   support

       uknown       0.56      0.71      0.63        14
unknown_plant       0.74      0.63      0.68       327
  large_grass       0.67      0.94      0.78       238
  small_grass       0.78      0.80      0.79       258
        other       0.67      0.54      0.60       188
       fallow       0.38      0.87      0.53        15
         none       0.86      0.75      0.80       302

     accuracy                           0.73      1342
    macro avg       0.67      0.75      0.69      1342
 weighted avg       0.75      0.73      0.73      1342

class_description  uknown  unknown_plant  large_grass  small_grass  other  fallow  none
class_description                                                                      
uknown                 10              1            0            0      0       0     3
unknown_plant           0            205           48           15     22       7    30
large_grass             0              9          223            4      2       0     0
small_grass             0             12           16          206     21       2     1
other                   0             18           38           27    101       1     3
fallow                  0              0            0            0      1      13     1
none                    8             33            7           12      3      11   228

crop status

             precision    recall  f1-score   support

     unknown       0.56      0.71      0.63        14
   emergence       0.70      0.58      0.63       190
      growth       0.80      0.91      0.85       438
     flowers       0.76      0.69      0.72       160
   senescing       0.58      0.64      0.60       138
    senesced       0.56      0.55      0.55       100
     no_crop       0.87      0.78      0.83       302

    accuracy                           0.75      1342
   macro avg       0.69      0.70      0.69      1342
weighted avg       0.75      0.75      0.75      1342

class_description  unknown  emergence  growth  flowers  senescing  senesced  no_crop
class_description                                                                   
unknown                 10          0       1        0          0         0        3
emergence                0        110      53        1          4         2       20
growth                   0         22     400        8          5         3        0
flowers                  0          0      26      110         23         1        0
senescing                0          0      11       25         88        12        2
senesced                 0          5       2        1         28        55        9
no_crop                  8         20       6        0          5        26      237

class value	name	description
0	Emergence	First shoots and/or leaves are visible.
1	Growth State	Plants have several distinct leaves and/or tillers visible, but no visible tassels, flowers, or fruit.
2	Tassels or Flowers	Plants have distinct tassels, flowers, or fruit.
3	Senescing	10% or more of visible plants are senescing.
4	Fully Senesced	90% or more of visible plants are senesced.
5	Harvested and/or Plowed Field	Over 50% of the primary field has been harvested or plowed.
6	Snow Covered Field	Over 10% of the camera field of view has snow.
7	Flooded Field	Over 10% of the camera field of view has standing water.
8	Unknown	The image is blurry, obstructed, or otherwise indiscernible.
