
phenology_estimators's Issues

new readme file

Estimating transition dates from status-based phenology observations: a test of methods

This is the code repository for the following manuscript:

Taylor SD. 2019. Estimating transition dates from status-based phenology observations: a test of methods. PeerJ Preprints 7:e27629v1 https://doi.org/10.7287/peerj.preprints.27629v1

File structure

data/:
This holds the original data from the Waananen et al. 2018 study, obtained from https://doi.org/10.5061/dryad.487db24.

derived_data/:
This holds the phenology data prepared for the analysis and the output from the estimators. Due to its size this data is not in the GitHub repo, but it is available in the Zenodo repository (https://doi.org/10.5281/zenodo.2616771).

manuscript/:
The manuscript Rmarkdown files and figures.

tests/:
Tests to ensure the code in the estimators.R script is functioning correctly, as well as some small example data to run the tests with.
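
Assuming the tests follow standard testthat conventions (the package is listed under Required packages below), they can be run from the project root with something like:

    # Run the estimator tests from the project root.
    library(testthat)

    source("estimators.R")  # load the estimator functions, if the tests don't do so themselves
    test_dir("tests/")      # run every test file under tests/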

estimators.R
weibull.R
These 2 scripts contain the functions for all estimators.
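
For orientation, the simplest estimators amount to picking observed dates directly from the status records. A minimal illustrative sketch (the function and column names here are placeholders, not necessarily those used in estimators.R):

    # Illustrative versions of the simplest estimators: onset as the first
    # "flowering" observation and end as the last. `doy` is day of year,
    # `status` is 1/0 flowering.
    first_observed <- function(doy, status) {
      min(doy[status == 1])
    }

    last_observed <- function(doy, status) {
      max(doy[status == 1])
    }

    # first_observed(c(100, 110, 120, 130), c(0, 1, 1, 0))  # returns 110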

initial_data_formatting_individual.R
initial_data_formatting_population.R
These 2 scripts generate the phenology data used by the estimators, taking the raw data from data/ and generating the random Monte Carlo samples with the parameters specified in the text. They generate the following files (a rough sketch of the sampling step follows the list):

- derived_data/population_flowering_data_for_estimators.csv
- derived_data/individual_flowering_data_for_estimators.csv
- derived_data/population_true_flowering_dates.csv
- derived_data/individual_true_flowering_dates.csv
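
The sampling step itself is conceptually simple: repeatedly draw a fixed number of observations from the full flowering records. A rough dplyr sketch of the idea (the column and parameter names are placeholders, not the exact ones used in the scripts):

    library(dplyr)

    # Draw `n_obs` random observations per bootstrap replicate from the full
    # status records, mimicking sparser sampling of the flowering season.
    monte_carlo_sample <- function(flowering_data, n_obs, n_bootstraps) {
      lapply(seq_len(n_bootstraps), function(b) {
        flowering_data %>%
          slice_sample(n = n_obs) %>%
          mutate(bootstrap = b)
      }) %>%
        bind_rows()
    }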

run_estimators_individual.R
run_estimators_population.R
These 2 scripts apply the estimators to the derived data and write the following result files (a rough sketch of the pattern follows the list):

- population_results_from_estimators.csv
- individual_results_from_estimators.csv
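
Conceptually these scripts map every estimator over every Monte Carlo sample and collect the results. A sketch of the pattern, reusing the illustrative first_observed()/last_observed() helpers from above (the grouping columns stand in for the real parameter columns):

    library(dplyr)
    library(readr)

    samples <- read_csv("derived_data/population_flowering_data_for_estimators.csv")

    # Apply each estimator within every bootstrap/parameter combination.
    results <- samples %>%
      group_by(bootstrap, sample_size) %>%
      summarise(
        onset_first_observed = first_observed(doy, status),
        end_last_observed    = last_observed(doy, status),
        .groups = "drop"
      )

    write_csv(results, "derived_data/population_results_from_estimators.csv")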

analysis_and_figures.R
analysis_proportion_of_obs_kept.R
analysis_gam_logistic_supplement.R
These 3 scripts generate all the figures and statistics in the analysis, including supplements.

install_packages.R
This script installs the packages used throughout the analysis.
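
If you prefer not to run the script, the same packages (listed under Required packages below) can be installed directly:

    # Equivalent to running install_packages.R.
    install.packages(c("tidyverse", "mgcv", "survival", "testthat",
                       "ggridges", "progress"))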

config.R
The configuration file specifies the parameters used in the Monte Carlo analysis as well as all file names.
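
A rough illustration of what the config might look like; the two bootstrap variable names appear in the issue notes below, but the default values and file-name entries shown here are placeholders:

    # Illustrative config.R layout (placeholder values).
    population_num_bootstraps <- 250   # reduce (e.g. to 15) for a quick run
    individual_num_bootstraps <- 50    # reduce (e.g. to 3) for a quick run

    population_observation_file <- "derived_data/population_flowering_data_for_estimators.csv"
    population_results_file     <- "derived_data/population_results_from_estimators.csv"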

Required packages

The following R packages are used in this analysis and can be installed by running the install_packages.R script.

  • tidyverse
  • mgcv
  • survival
  • testthat
  • ggridges
  • progress

Running the analysis

To run the analysis from scratch, run the scripts in the following order (or source them all at once, as sketched after the list):

initial_data_formatting_individual.R
initial_data_formatting_population.R
run_estimators_individual.R
run_estimators_population.R
analysis_and_figures.R
analysis_proportion_of_obs_kept.R
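
For example, from the project root they can all be sourced in sequence:

    # Run the full pipeline by sourcing the scripts in order.
    scripts <- c(
      "initial_data_formatting_individual.R",
      "initial_data_formatting_population.R",
      "run_estimators_individual.R",
      "run_estimators_population.R",
      "analysis_and_figures.R",
      "analysis_proportion_of_obs_kept.R"
    )
    for (s in scripts) source(s)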

This will take 12-24 hours in total to run, depending on the system. To decrease this time, set the bootstrap counts in config.R to something smaller, such as 15 for the population analysis and 3 for the individual analysis.

Alternatively, you can obtain the results of the computationally intensive steps from the derived_data folder in the Zenodo repository (https://doi.org/10.5281/zenodo.2616771). With these in place, run analysis_and_figures.R and analysis_proportion_of_obs_kept.R to generate the figures.

address First Observed being the best sometimes at Individual level

It has the best estimate for sample size 20, percent present 75%. This is because 15 samples cover nearly the entire flowering period for a lot of individuals, so the earliest of those is very close to the true onset.

"Average duration is XXX, so with an effective sample size of 15 the true onset date can be very accuratly estimates"
"If you have very dense sampling than estimating onset is not a problem..."

time to run things

Running through the full analysis on serenity with the following config takes about 15 minutes:

population_num_bootstraps = 15
individual_num_bootstraps = 3

note on expanding intro/discussion

This discussion is nicely written; however, I think it would be beneficial to delve a bit more into why, exactly, it matters to have accurate estimates of distinct transition dates. What aspects of a plant's adaptation to its local environment, or its ability to adjust to future climate change, can be gleaned from accurate information about its phenology, and perhaps severely misinterpreted if we have inaccurate information? I believe the Miller-Rushing paper delves into the ecological and evolutionary importance of correct phenological estimates a bit, as do some of your other references, and I think it would be beneficial to convince readers of the importance of accurate estimates a bit more here and in the introduction... We know this is a big deal, but not everyone does. You could also ponder and write about: if a method overestimates a phenological stage by 10 days, then so what? What exactly does that mean for our interpretation of the data and the state of our science? This explanation will help place this important work in the larger context of global change ecology and evolutionary ecology!

gam/logistic threshold notes

Logistic won in:

  • onset: sample size 50, percent yes 0.25, threshold 0.50
  • end: sample size 50, percent yes 0.25, threshold 0.25
  • end: sample size 50, percent yes 0.5, threshold 0.50
  • end: sample size 100, percent yes 0.5, threshold 0.50 (tied with GAM)

GAM won in:

  • onset: sample size 100, percent yes 0.25, threshold 0.25
  • end: sample size 100, percent yes 0.25, threshold 0.05
  • end: sample size 100, percent yes 0.5, threshold 0.05 (tied with logistic)
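
For reference, both methods reduce to fitting a presence/absence curve over day of year and reporting where it crosses a threshold. A minimal sketch of the GAM version using mgcv (column names and the default threshold are placeholders, not necessarily what the analysis scripts use):

    library(mgcv)

    # Fit a binomial GAM of flowering status vs. day of year, then return the
    # first day where the predicted probability of flowering crosses `threshold`.
    # `status` is 1/0 flowering, `doy` is day of year (placeholder names).
    gam_threshold_onset <- function(obs, threshold = 0.5) {
      fit <- gam(status ~ s(doy), family = binomial, data = obs)
      doy_grid <- data.frame(doy = seq(min(obs$doy), max(obs$doy), by = 1))
      prob <- predict(fit, newdata = doy_grid, type = "response")
      doy_grid$doy[which(prob >= threshold)[1]]
    }

    # The logistic version is identical except the model is
    # glm(status ~ doy, family = binomial, data = obs).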

2nd resubmission todo

  • Put A - I labels on all plots
  • knock down the DPI so they're < 3000 pixels across
  • reference new A-I labels in text

maybe some figure adjustments

J suggested (a rough ggplot2 theming sketch follows the list):

  • Larger text: peeps can zoom
  • No grid lines: need grid lines so the x axis can be aligned with the top-most figs
  • White background: check
  • Borders on all panels: unclear how to do this with density_ridges
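
A possible way to cover most of these in one place with ggplot2 theming (an untested sketch; panel borders with geom_density_ridges may still need experimentation):

    library(ggplot2)

    # Larger text, white background, panel borders, and only the major grid
    # lines kept so the x axes can be aligned across panels.
    fig_theme <- theme_bw(base_size = 14) +
      theme(
        panel.grid.minor = element_blank(),
        panel.border     = element_rect(colour = "black", fill = NA)
      )

    # e.g.:
    # ggplot(results, aes(x = error, y = estimator)) +
    #   ggridges::geom_density_ridges() +
    #   fig_theme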

TODO

  • finalize abstract
  • normalize which side the errors are on for all figs
  • figure showing how many estimates were dropped
  • software citation paragraph
  • end estimates First Observed -> Last Observed
  • deal with wiggly peak error density
  • underestimate/overestimate arrows
  • figure "Percent Yes" -> "Presence Percent"
  • figure "Weibull Curve" -> "Weibull"
  • put figures in
  • run thru grammar checker
  • make markdown/pdf version
  • make zenodo repo
  • italicise spp names
  • R2 -> R^2
  • order sample size in figures
  • supp figures
  • remove titles from figures

preprint version

  • 10pt font
  • single space
  • no line numbers
  • sup figures at end
