new readme file
Estimating transition dates from status-based phenology observations: a test of methods
This is the code repository for the following manuscript:
Taylor SD. 2019. Estimating transition dates from status-based phenology observations: a test of methods. PeerJ Preprints 7:e27629v1 https://doi.org/10.7287/peerj.preprints.27629v1
File structure
data/
:
This holds the original data from the Waananen et al. 2018 study, obtained from https://doi.org/10.5061/dryad.487db24.
derived_data/
:
This holds the phenology data prepared for the analysis, and the output from the estimators. Due to its size, this data is not in the GitHub repo, but is in the Zenodo repository (https://doi.org/10.5281/zenodo.2616771).
manuscript/
:
The manuscript Rmarkdown files and figures.
tests/
:
Tests to ensure the code in the estimators.R script is functioning correctly, as well as some small example data to run the tests with.
estimators.R
weibull.R
These 2 scripts contain the functions for all estimators.
initial_data_formatting_individual.R
initial_data_formatting_population.R
These 2 scripts generate the phenology data used by the estimators: they take the raw data from data/ and draw the random Monte Carlo samples with the parameters specified in the text. They generate the following files:
- derived_data/population_flowering_data_for_estimators.csv
- derived_data/individual_flowering_data_for_estimators.csv
- derived_data/population_true_flowering_dates.csv
- derived_data/individual_true_flowering_dates.csv
run_estimators_individual.R
run_estimators_population.R
These 2 scripts apply the estimators to the derived data, and write the following result files:
- population_results_from_estimators.csv
- individual_results_from_estimators.csv
analysis_and_figures.R
analysis_proportion_of_obs_kept.R
analysis_gam_logistic_supplement.R
These 3 scripts generate all the figures and statistics in the analysis, including supplements.
install_packages.R
This script installs the packages used throughout the analysis.
config.R
The configuration file specifies the parameters used in the Monte Carlo analysis as well as all file names.
Required packages
The following R packages are used in this analysis and can be installed by running the install_packages.R
script.
- tidyverse
- mgcv
- survival
- testthat
- ggridges
- progress
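As a rough sketch of what such an install step could look like (the actual contents of install_packages.R are not shown here, and `missing_packages` and `install_missing` are hypothetical helper names), the following installs only the listed packages that are not already present:

```r
# Packages listed in this README.
required_packages <- c("tidyverse", "mgcv", "survival",
                       "testthat", "ggridges", "progress")

# Hypothetical helper: return the packages that are not yet installed.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical helper: install whatever is missing, and report what was installed.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage:
# install_missing(required_packages)
```

Checking first avoids reinstalling large packages such as tidyverse on every run.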
Running the analysis
To run the analysis from scratch, run the scripts in the following order:
initial_data_formatting_individual.R
initial_data_formatting_population.R
run_estimators_individual.R
run_estimators_population.R
analysis_and_figures.R
analysis_proportion_of_obs_kept.R
This will take 12-24 hours in total, depending on the system. To decrease this time, set the bootstrap amounts in config.R to something smaller, such as 15 for the population and 3 for the individual.
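For a quick test run, the relevant lines in config.R might look like the following. The variable names are taken from the timing note in this repository's issues and are assumed to match config.R, so check the file before editing.

```r
# Reduced bootstrap counts for a fast test run.
# Variable names are assumed to match those in config.R.
population_num_bootstraps = 15
individual_num_bootstraps = 3
```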
Alternatively, you can obtain the results from the computationally intensive steps from the Zenodo repository (https://doi.org/10.5281/zenodo.2616771) in the derived_data folder. With these in place, run analysis_and_figures.R and analysis_proportion_of_obs_kept.R to generate the figures.
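The run order can also be scripted. The following is a minimal sketch and not part of the repository: `run_pipeline` is a hypothetical helper that runs each script in its own R process via Rscript (assumed to be on the PATH) and stops at the first failure.

```r
# Scripts in the order given in this README.
analysis_scripts <- c(
  "initial_data_formatting_individual.R",
  "initial_data_formatting_population.R",
  "run_estimators_individual.R",
  "run_estimators_population.R",
  "analysis_and_figures.R",
  "analysis_proportion_of_obs_kept.R"
)

# Hypothetical helper: run each script in a separate R process, in order.
# Scripts are launched from the current working directory.
run_pipeline <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("Missing scripts: ", paste(not_found, collapse = ", "))
  }
  for (path in paths) {
    message("Running ", path)
    status <- system2("Rscript", shQuote(path))
    if (status != 0) {
      stop("Script failed: ", path)
    }
  }
  invisible(paths)
}

# Usage, from the repository root:
# run_pipeline(analysis_scripts)        # full analysis from scratch
# run_pipeline(analysis_scripts[5:6])   # figures only, with Zenodo results in derived_data/
```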
address First Observed being the best sometimes at Individual level
It has the best estimate for sample size 20, percent present 75%. This is because the 15 presence observations (20 x 0.75) cover nearly the entire flowering period for a lot of individuals, so the earliest of those is super close to the true onset.
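A small simulation illustrates the point. This is a sketch under simplifying assumptions that are not from the manuscript: presence observations are drawn uniformly over the flowering period, and the onset date and 30-day duration are made-up values. Under those assumptions the expected error of First Observed is duration / (n + 1).

```r
set.seed(1)

# Hypothetical values, for illustration only.
true_onset <- 100   # day of year
duration   <- 30    # length of the flowering period in days
n_present  <- 15    # presence observations (sample size 20 x 75% present)
n_sim      <- 10000 # number of simulated individuals

# First Observed error: earliest presence observation minus the true onset.
first_observed_error <- replicate(n_sim, {
  obs <- runif(n_present, min = true_onset, max = true_onset + duration)
  min(obs) - true_onset
})

mean_error     <- mean(first_observed_error)
expected_error <- duration / (n_present + 1)  # mean of the minimum of n uniform draws

cat("Simulated mean error:", round(mean_error, 2), "days\n")
cat("Expected mean error: ", round(expected_error, 2), "days\n")
```

With dense sampling the earliest presence observation always lands within a few days of the true onset, which leaves little room for a model-based estimator to improve on it.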
"Average duration is XXX, so with an effective sample size of 15 the true onset date can be very accurately estimated"
"If you have very dense sampling then estimating onset is not a problem..."
time to run things
Running the full analysis on serenity with the following config takes 15 min:
population_num_bootstraps = 15
individual_num_bootstraps = 3
note on expanding intro/discussion
This discussion is nicely written, however - I think it would be beneficial to delve a bit more into why, exactly, it matters to have accurate estimates of distinct transition dates. What aspects of a plant's adaptation to local environment, or ability to adjust to future climate change, can be gleaned from accurate information about its phenology? and perhaps severely misinterpreted if we have inaccurate information? I believe the Miller-Rushing paper delves into ecological and evolutionary importance of correct phenological estimates a bit, as do some of your other references, and I think it would be beneficial to convince readers of the importance of accurate estimates a bit more here and in the introduction... We know this is a big deal - but not everyone does. You could also ponder and write about: if a method overestimates a phenological stage by 10 days, then so what? What exactly does that mean for our interpretation of the data and the state of our science? This explanation will help place this important work in the larger context of global change ecology and evolutionary ecology!
gam/logistic threshold notes
Logistic won in
onset: sample size 50, percent yes 0.25, threshold 0.50
end: sample size 50, percent yes 0.25, threshold 0.25
sample size 50, percent yes 0.5, threshold 0.50
sample size 100, percent yes 0.5, threshold 0.50, tied with GAM
GAM won in
onset: sample size 100, percent yes 0.25, threshold 0.25
end: sample size 100, percent yes 0.25, threshold 0.05
sample size 100, percent yes 0.5, threshold 0.05 tied with logistic
peak estimate errors
add stuff about other taxa
Obviously this same idea applies to birds, and insects (bee paper here does first observed https://doi.org/10.1111/gcb.14358, and moths https://doi.org/10.1371/journal.pone.0202850)
2nd resubmission todo
- Put A - I labels on all plots
- knock down the DPI so they're < 3000 pixels across
- reference new A-I labels in text
maybe some figure adjustments
J suggested:
- Larger text: peeps can zoom
- no grid lines: need grid lines so the x axis can be aligned with top most figs
- white background: check
- borders on all panels: unclear how to do with density_ridges
TODO
- finalize abstract
- normalize which side the errors are on for all figs
- figure showing how many estimates were dropped
- software citation paragraph
- end estimates First Observed -> Last Observed
- deal with wiggly peak error density
- underestimate/overestimate arrows
- figure "Percent Yes" -> "Presence Percent"
- figure "Weibull Curve" -> "Weibull"
- put figures in
- run thru grammar checker
- make markdown/pdf version
- make zenodo repo
- italicise spp names
- R2 -> R^2
- order sample size in figures
- supp figures
- remove titles from figures
preprint version
- 10pt font
- single space
- no line numbers
- sup figures at end