
streamMetabolizer's Introduction

streamMetabolizer: Models for Estimating Aquatic Photosynthesis and Respiration

! In summer or fall 2023, this package will move from
! https://github.com/USGS-R/streamMetabolizer to
! https://github.com/DOI-USGS/streamMetabolizer.
! Please update your links accordingly.

The streamMetabolizer R package uses inverse modeling to estimate aquatic photosynthesis and respiration (collectively, metabolism) from time series data on dissolved oxygen, water temperature, depth, and light. The package assists with data preparation, handles data gaps during modeling, and provides tabular and graphical reports of model outputs. Several time-honored methods are implemented along with many promising new variants that produce more accurate and precise metabolism estimates.

This package has been described, with special focus on the Bayesian model options, by Appling et al. 2018a. An application to 356 streams across the U.S. is described in Appling et al. 2018b.

Appling, A. P., Hall, R. O., Yackulic, C. B., & Arroita, M. (2018a). Overcoming equifinality: Leveraging long time series for stream metabolism estimation. Journal of Geophysical Research: Biogeosciences, 123(2), 624–645. https://doi.org/10.1002/2017JG004140

Appling, A. P., Read, J. S., Winslow, L. A., Arroita, M., Bernhardt, E. S., Griffiths, N. A., Hall, R. O., Harvey, J. W., Heffernan, J. B., Stanley, E. H., Stets, E. G., & Yackulic, C. B. (2018b). The metabolic regimes of 356 rivers in the United States. Scientific Data, 5(1), 180292. https://doi.org/10.1038/sdata.2018.292

To see the recommended citation for this package, please run citation('streamMetabolizer') at the R prompt.

citation('streamMetabolizer')
## 
## To cite streamMetabolizer in publications, please use:
## 
##   Appling, Alison P., Robert O. Hall, Charles B. Yackulic, and Maite
##   Arroita. “Overcoming Equifinality: Leveraging Long Time Series for
##   Stream Metabolism Estimation.” Journal of Geophysical Research:
##   Biogeosciences 123, no. 2 (February 2018): 624–45.
##   https://doi.org/10.1002/2017JG004140.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     author = {Alison P. Appling and Robert O. {Hall Jr.} and Charles B. Yackulic and Maite Arroita},
##     title = {Overcoming Equifinality: Leveraging Long Time Series for Stream Metabolism Estimation},
##     journal = {Journal of Geophysical Research: Biogeosciences},
##     year = {2018},
##     volume = {123},
##     number = {2},
##     doi = {10.1002/2017JG004140},
##     url = {https://github.com/USGS-R/streamMetabolizer},
##   }

Installation

To install the streamMetabolizer package, use the remotes package (running install.packages('remotes') first if needed). To use remotes::install_github() it is convenient to set a GitHub Personal Access Token (PAT). There are several methods for setting a PAT within R; the simplest is to call Sys.setenv(GITHUB_PAT="yyyy"), replacing yyyy with the PAT you established on the GitHub website.
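For example (a minimal sketch; yyyy is a placeholder, not a real token):

install.packages('remotes')      # if remotes is not yet installed
Sys.setenv(GITHUB_PAT = "yyyy")  # replace yyyy with your own PAT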

You may first need to install the unitted dependency:

remotes::install_github('appling/unitted')

You can then install the most cutting edge version of streamMetabolizer with this command:

remotes::install_github(
  "USGS-R/streamMetabolizer", # soon to be "DOI-USGS/streamMetabolizer"
  build_vignettes = TRUE)

Software dependencies for Bayesian models

The major dependency for Bayesian models is the rstan package, and installation of that package is rarely as simple as a call to install.packages(). Start at the rstan wiki page for the most up-to-date installation instructions, which differ by operating system.

Getting started

After installing and loading streamMetabolizer, run vignette() in R to see tutorials on getting started and customizing your metabolism models.

vignette(package='streamMetabolizer')
## displays a list of available vignettes

vignette('get_started', package='streamMetabolizer')
## displays an html or pdf rendering of the 'get_started' vignette

You can also view pre-built html versions of these vignettes in the “inst/doc” folder in the source code, e.g., inst/doc/get_started.html, which you can download and then open in a browser.

Development and Maintenance Status

streamMetabolizer is a USGS Archive Research Package.

Project funding has ended and our maintenance time is limited, but we do attempt to provide bug fixes and lightweight support as we are able. Submit questions or suggestions to https://github.com/USGS-R/streamMetabolizer/issues.

Contributing

We want to encourage a warm, welcoming, and safe environment for contributing to this project. See CODE_OF_CONDUCT.md for more information.

For technical details on how to contribute, see CONTRIBUTING.md.

Development History

streamMetabolizer was developed 2015-2018 with support from the USGS Powell Center (through a working group on Continental Patterns of Stream Metabolism), the USGS National Water Quality Program, and the USGS Office of Water Information.

Model Archive

The following versions of R and package dependencies were used most recently to pass the embedded tests within this package. There is no guarantee of reproducible results using future versions of R or updated versions of package dependencies; however, we aim to test this package in, and update it for, future modeling environments.

sessioninfo::session_info()

## ─ Session info ───────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.3 (2023-03-15)
##  os       macOS Ventura 13.4.1
##  system   x86_64, darwin17.0
##  ui       RStudio
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2023-07-02
##  rstudio  2023.06.0+421 Mountain Hydrangea (desktop)
##  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────
##  package           * version  date (UTC) lib source
##  cli                 3.6.1    2023-03-23 [1] CRAN (R 4.2.0)
##  deSolve             1.35     2023-03-12 [1] CRAN (R 4.2.0)
##  digest              0.6.32   2023-06-26 [1] CRAN (R 4.2.0)
##  dplyr               1.1.2    2023-04-20 [1] CRAN (R 4.2.0)
##  evaluate            0.21     2023-05-05 [1] CRAN (R 4.2.0)
##  fansi               1.0.4    2023-01-22 [1] CRAN (R 4.2.0)
##  fastmap             1.1.1    2023-02-24 [1] CRAN (R 4.2.0)
##  generics            0.1.3    2022-07-05 [1] CRAN (R 4.2.0)
##  glue                1.6.2    2022-02-24 [1] CRAN (R 4.2.0)
##  htmltools           0.5.5    2023-03-23 [1] CRAN (R 4.2.0)
##  knitr               1.43     2023-05-25 [1] CRAN (R 4.2.0)
##  LakeMetabolizer     1.5.5    2022-11-15 [1] CRAN (R 4.2.0)
##  lazyeval            0.2.2    2019-03-15 [1] CRAN (R 4.2.0)
##  lifecycle           1.0.3    2022-10-07 [1] CRAN (R 4.2.0)
##  lubridate           1.9.2    2023-02-10 [1] CRAN (R 4.2.0)
##  magrittr            2.0.3    2022-03-30 [1] CRAN (R 4.2.0)
##  pillar              1.9.0    2023-03-22 [1] CRAN (R 4.2.0)
##  pkgconfig           2.0.3    2019-09-22 [1] CRAN (R 4.2.0)
##  plyr                1.8.8    2022-11-11 [1] CRAN (R 4.2.0)
##  purrr               1.0.1    2023-01-10 [1] CRAN (R 4.2.0)
##  R6                  2.5.1    2021-08-19 [1] CRAN (R 4.2.0)
##  Rcpp                1.0.10   2023-01-22 [1] CRAN (R 4.2.0)
##  rLakeAnalyzer       1.11.4.1 2019-06-09 [1] CRAN (R 4.2.0)
##  rlang               1.1.1    2023-04-28 [1] CRAN (R 4.2.0)
##  rmarkdown           2.22     2023-06-01 [1] CRAN (R 4.2.0)
##  rstudioapi          0.14     2022-08-22 [1] CRAN (R 4.2.0)
##  sessioninfo         1.2.2    2021-12-06 [1] CRAN (R 4.2.0)
##  streamMetabolizer * 0.12.1   2023-07-02 [1] local
##  tibble              3.2.1    2023-03-20 [1] CRAN (R 4.2.0)
##  tidyr               1.3.0    2023-01-24 [1] CRAN (R 4.2.0)
##  tidyselect          1.2.0    2022-10-10 [1] CRAN (R 4.2.0)
##  timechange          0.2.0    2023-01-11 [1] CRAN (R 4.2.0)
##  unitted             0.2.9    2023-06-05 [1] Github (appling/unitted@d1f1172)
##  utf8                1.2.3    2023-01-31 [1] CRAN (R 4.2.0)
##  vctrs               0.6.3    2023-06-14 [1] CRAN (R 4.2.0)
##  xfun                0.39     2023-04-20 [1] CRAN (R 4.2.0)
##  yaml                2.3.7    2023-01-23 [1] CRAN (R 4.2.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library

Disclaimer

This software is preliminary or provisional and is subject to revision. It is being provided to meet the need for timely best science. The software has not received final approval by the U.S. Geological Survey (USGS). No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty. The software is provided on the condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from the authorized or unauthorized use of the software.

streamMetabolizer's People

Contributors

aappling-usgs, appling, arroita, jesse-ross, jpadilla-usgs, ldecicco-usgs, waffle-iron, weisoon


streamMetabolizer's Issues

allow K600 to be specified as instantaneous values

You can currently (soon) specify K600 as daily values in metab_mle. But why not also allow instantaneous values, given that K600 will probably be predicted from Q with the nighttime regression method anyway? This would be cool but is nonessential, so marking it as a task for down the road.

clean up use of mm_model_by_ply

The same pattern applies, with minor modifications, to all descendants of metab_model and other uses of mm_model_by_ply. Here's the formatting I'll use:

#the metab_model function has args:
#' @inheritParams metab_model_prototype
#' @inheritParams mm_is_valid_day
#' @inheritParams model_specific_fun
metab_model <- function(
  data, data_daily, info, day_start=xx, day_end=xx, # inheritParams metab_model_prototype
  tests=c('full_day', 'even_timesteps', 'complete_data'), # inheritParams mm_is_valid_day
  model-specific args # inheritParams model_specific_fun
) {
  ...
  metab_model("metab_xxx", ...
    args=list(day_start=day_start, day_end=day_end, tests=tests, model-specific args) ...)
  ...
}

#it calls mm_model_by_ply with:
mm_model_by_ply(
  model_fun, data=data, data_daily=data_daily, # for mm_model_by_ply
  day_start=day_start, day_end=day_end, # for mm_model_by_ply and mm_is_valid_day
  tests=tests, # for mm_is_valid_day
  model-specific args # for model_specific_fun
)

#the model_fun function has args:
#' @inheritParams mm_model_by_ply_prototype
#' @inheritParams mm_is_valid_day
#' @inheritParams model_specific_fun
model_fun <- function(
  data_ply, data_daily_ply, day_start=xx, day_end=xx, local_date, # inheritParams mm_model_by_ply_prototype
  tests=c('full_day', 'even_timesteps', 'complete_data'), # inheritParams mm_is_valid_day
  model-specific args # inheritParams model_specific_fun
) {
  ...
  model_specific_fun(model-specific args)
  ...
}

This all gets simplified when mm_is_valid_day is not called and the first function is not a model constructor.

#the multi-day function has args:
#' @inheritParams model_specific_fun
metab_model <- function(
  data, data_daily, day_start=xx, day_end=xx, # inheritParams metab_model_prototype
  model-specific args # inheritParams model_specific_fun
) {
  ...
  metab_model("metab_xxx", ...
    args=list(day_start=day_start, day_end=day_end, model-specific args) ...)
  ...
}

#it calls mm_model_by_ply with:
mm_model_by_ply(
  model_fun, data=data, data_daily=data_daily, # for mm_model_by_ply
  day_start=day_start, day_end=day_end, # for mm_model_by_ply
  model-specific args # for model_specific_fun
)

#the model_fun function has args:
#' @inheritParams mm_model_by_ply_prototype
#' @inheritParams model_specific_fun
model_fun <- function(
  data_ply, data_daily_ply, day_start=xx, day_end=xx, local_date, # inheritParams mm_model_by_ply_prototype
  model-specific args # inheritParams model_specific_fun
) {
  ...
  model_specific_fun(model-specific args)
  ...
}

runjags and rjags installation

On Travis-CI, e.g., https://travis-ci.org/USGS-R/streamMetabolizer/builds/70088636#L1478:

* installing *source* package ‘runjags’ ...
** package ‘runjags’ successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... no
configure: error: "Location of JAGS headers not defined. Use configure arg '--with-jags-include' or environment variable 'JAGS_INCLUDE'"
ERROR: configuration failed for package ‘runjags’
* removing ‘/usr/local/lib/R/site-library/runjags’

Saw similar issues with rjags on Condor before I [temporarily] removed it.

align default dates for all inter-operating models

For metab_mle, e.g., we may be using 6am to 6am for multiple days in a row. If this is the case, then the date for a metabolism estimate should refer to the 6am-12am period of that day, and the date for a separately estimated K600 value (e.g., from metab_night) should refer to the ~8pm-12am period of that night (i.e., the first rather than the last date represented by local.time for a time series used in metab_night).

add K600 argument to metab_mle

K600 can be a ts-style data.frame with Date as the first column. It needs to be passed through mm_model_by_ply to mle_1ply, filtered there to the date of the ply, and passed as a single number to the negloglik function. If it's possible to share code between the PRK and PR versions of metab_mle, that'll be great.
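A minimal sketch of what such an input and call might look like (the column names and the K600 argument are assumptions about the proposed API, not the implemented one):

# hypothetical ts-style daily K600 input, with Date as the first column
K600.daily <- data.frame(
  Date = as.Date(c("2015-06-01", "2015-06-02", "2015-06-03")),
  K600 = c(24, 27, 25))  # d^-1
metab_mle(data = dat, K600 = K600.daily)  # proposed, not yet implemented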

data interpolation functions

Data interpolation is currently outside the scope of this package; users need to provide pre-interpolated data for the models to work. (See the thread in #79.)

  1. It would be nice to help users with interpolation by providing functions to do it well. Some functionality is already in mda.streams (e.g., combine_ts) and could be ported over.

  2. This is a problem that has been solved for many other cases already; there's probably a good existing package for it. Look into existing options before getting fancy.
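For reference, base R can already do simple linear gap-filling; a minimal sketch (dat and its columns are hypothetical):

# approx() drops NA pairs, then interpolates back onto the full time axis
dat$doobs <- approx(x = as.numeric(dat$DateTime), y = dat$doobs,
                    xout = as.numeric(dat$DateTime))$y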

Create html_vignette for basic usage

This is probably lower priority, but once we have a full pass-through of data, it would be good to start documenting the patterns in a vignette.

calc_DO_mod

Port from core_model_metab_mle. Create a wrapper and/or shared internal function to keep this function efficient but also make it possible to add noise (observation, process, or autocorrelated process) for simulating data.
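A rough sketch of the three noise options this would enable (DO.mod is a fake modeled series; all values are illustrative):

DO.mod <- 8 + sin(seq(0, 2 * pi, length.out = 48))  # fake modeled DO curve
DO.obs <- DO.mod + rnorm(48, sd = 0.1)              # iid observation noise
DO.prc <- DO.mod + cumsum(rnorm(48, sd = 0.03))     # accumulating process noise
DO.ar1 <- DO.mod + as.numeric(arima.sim(list(ar = 0.7), n = 48, sd = 0.05))  # AR(1) process noise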

Include NA rows when there are missing data

Here's what I do to make sure the time lag from row to row is constant. Let me know if you want me to modify it or check the data sets! Hope it helps.

#Just to use an example
library(powstreams)
t <- "nwis_07239450"
d <- c("depth_calcDischHarvey","doobs_nwis","dosat_calcGGbts","wtr_nwis","par_calcLat")
dat <- get_ts(d, t)

#Create a reference date-time column. I used POSIXct format because this is the format of the data set.
#For each stream, starting and ending dates and frequency should be checked.
dtime <- seq(from=as.POSIXct("2007-10-01 05:00:00", tz="GMT"), to=as.POSIXct("2015-06-06 23:30:00", tz="GMT"), by="30 min")

#I checked the dimensions for both, and it seems in this particular case there are a lot of rows missing.
#Plotting the DateTime column works too.
dim(dat)
length(dtime)
plot(dat$DateTime)

#Match the DateTime column in your data set with the reference dtime column
depth <- dat$depth[match(dtime, dat$DateTime)]
doobs <- dat$doobs[match(dtime, dat$DateTime)]
#....

#I usually create a new data frame. It's also great to check that it matches the number of rows you expect given the number of days and frequency you have.
dat <- data.frame(dtime, depth, doobs)  # ...plus the other matched variables

implement hierarchical bayesian options

(expansion on issue #56 for hierarchical bayesian models)

All the hierarchical models I think we might implement include an expectation for a distribution of daily values of GPP, ER, and K600. They therefore usually include these lines:

for(d in 1:nday) {
  GPP.daily[d] ~ dnorm(GPP.daily.mu, GPP.daily.tau)
  ER.daily[d] ~ dnorm(ER.daily.mu, ER.daily.tau)
  K600.daily[d] ~ dnorm(K600.daily.mu, K600.daily.tau)
}

For non-hierarchical models, GPP.daily.mu, GPP.daily.tau, etc. are constants supplied by the user, and so the above lines can be the end of the model description.

For hierarchical variants, additional lines need to be included.

  • Constrain overall average P, R, and/or K, i.e., fit values for the probability distributions of GPP.daily.mu and/or GPP.daily.tau, etc.
# constrain the means
GPP.daily.mu ~ dnorm(GPP.daily.mu.mu, GPP.daily.mu.tau)
ER.daily.mu ~ dnorm(ER.daily.mu.mu, ER.daily.mu.tau)
K600.daily.mu ~ dnorm(K600.daily.mu.mu, K600.daily.mu.tau)
# and/or constrain the taus
GPP.daily.tau ~ dgamma(GPP.daily.tau.r, GPP.daily.tau.lambda)
ER.daily.tau ~ dgamma(ER.daily.tau.r, ER.daily.tau.lambda)
K600.daily.tau ~ dgamma(K600.daily.tau.r, K600.daily.tau.lambda)
  • Constrain day-to-day variation in P, R, and/or K, i.e., specify a normal distribution with mu=0 for the diffs between consecutive days, e.g., (GPP.daily[d] - GPP.daily[d-1])
for(d in 2:nday) {
  # equivalent to putting dnorm(0, GPP.daily.diff.tau) on the day-to-day diff;
  # JAGS does not allow a node to be both deterministic and stochastic
  GPP.daily[d] ~ dnorm(GPP.daily[d-1], GPP.daily.diff.tau)
  # and so on for ER and/or K600
}
  • Constrain K to be near the daily K values estimated by, e.g., nighttime regression, i.e., accept daily prior means for K600.daily (K600.daily.mu[d]) and either set or fit the value of K600.daily.tau for K600.daily[d] ~ dnorm(K600.daily.mu[d], K600.daily.tau)
# replace the first code chunk in this issue comment with these lines:
for(d in 1:nday) {
  GPP.daily[d] ~ dnorm(GPP.daily.mu, GPP.daily.tau)
  ER.daily[d] ~ dnorm(ER.daily.mu, ER.daily.tau)
  # K600.daily[d] ~ dnorm(K600.daily.mu, K600.daily.tau) # replace this with the following:
  K600.daily.fQ[d] <- K600.A + K600.B*Q[d]
  K600.daily[d] ~ dnorm(K600.daily.fQ[d], K600.daily.tau)
  # K600.A and K600.B are K vs. Q coefficients that are either given as inputs or are fit here, e.g., by
  #   K600.A ~ dnorm(K600.A.mu, K600.A.tau)
  #   K600.B ~ dnorm(K600.B.mu, K600.B.tau)
}

Could do any combination of the above. Choosing among these options requires selecting different values for:

  • the JAGS txt file, which specifies the hierarchy to assume. See the code chunks above.
  • the constants and/or daily values supplied to metab_bayes, then passed to prepjags_bayes_simple, then included in the jags dataList, and ultimately made available within the JAGS model, because the priors differ for each option. The necessary priors are implied by the above code chunks.
  • the outputs stored in the metab_bayes@fit slot, because different parameters are fit in each option. The useful outputs will include any parameters that are fit in a given variant.

function[s] to interpolate K600

model_K600 - produce a model to predict daily K from a smaller number of daily K estimates (e.g., from metab_night). Start with a spline of K vs. Q, with mean(K) replacing Q bins that have few observations.

calc_K600_interp - accept an output from model_K600 and a ts of discharge and/or dates; produce a ts of daily K600 estimates

Alternatively, calc_K600_interp could do it all, first modeling and then predicting. This would be simpler for multi-part models like the spline + means, but it makes it harder to access the model fits themselves. Maybe give an option to also return the model[s]?
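A minimal sketch of the two-function flavor, assuming a smoothing spline on log(Q) is adequate (the transform and spline choice are illustrative, not decided):

# fit a K vs. Q model from a small set of daily K estimates (e.g., metab_night)
model_K600 <- function(K600, Q) {
  smooth.spline(log(Q), K600)
}
# use the fitted model plus a ts of daily discharge to produce daily K600
calc_K600_interp <- function(K600.model, Q) {
  predict(K600.model, x = log(Q))$y
}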

take advantage of data_daily handling for mm_predict_1ply

mm_predict_1ply currently picks out the right row from metab_ests. But now that mm_model_by_ply splits both instantaneous and daily data, mm_predict_1ply should be able to pass metab_ests as the daily data and expect the row to be picked out for it.

is mm_is_valid_day::need_complete obsolete?

Think about it. It might be, now that mm_data records which columns are optional.

If it's not, then more functions ought to be accepting and passing this argument. metab_bayes, metab_mle, etc. would be candidates.

switch from local.time to solar.time

It doesn't actually matter which time we use as long as it has regular time steps (i.e., not apparent solar time), is interpretable (UTC is a poor choice because in plots/tables you'll have to guess at the timing of peak sun), and comes close to representing solar time (because model_by_ply splits data up into days according to the hours specified in this time column).

Both local.time (standard) and solar.time (mean) are decent candidates, but solar.time is best because it's even more closely aligned with true solar time. Users will generally have to calculate it anyway in order to model light.

Does this mean we should also switch from 'local.date' to 'solar.date'? Or just 'date', maybe.
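For reference, a sketch of the conversion using the helper the package now exports (check ?calc_solar_time for the current signature):

library(streamMetabolizer)
utc.time <- as.POSIXct("2015-06-01 12:00:00", tz = "UTC")
solar.time <- calc_solar_time(utc.time, longitude = -106.3)  # mean solar time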

limiting dependencies?

How careful do we want to be about additional dependencies? For example, in PR #34 I just added lubridate to aid in a single function, but I could probably remove that dependency if needed. Not sure yet whether lubridate will be useful in functions we'll be writing later.

handle tryCatch warnings better

I've learned how to catch warnings better (something about the muffleWarning restart). This needs to be done in mle_1ply and bayes_simple_1ply in particular.
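A minimal sketch of that base-R mechanism, for reference:

result <- withCallingHandlers(
  log(-1),                                    # emits NaN plus a warning
  warning = function(w) {
    message("caught: ", conditionMessage(w))  # record the warning
    invokeRestart("muffleWarning")            # then suppress it
  })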

calc_depth.R

Need a function that takes discharge (or something fancier, later) and returns an estimate of depth in m.
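A minimal sketch of the hydraulic-geometry approach such a function could take (depth = c * Q^f; the coefficient defaults below are illustrative assumptions, not calibrated values):

calc_depth <- function(Q, c = 0.4, f = 0.3) {
  c * Q ^ f  # Q in m^3 s^-1, depth in m
}
calc_depth(Q = 10)  # ~0.8 m with these illustrative coefficients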

travis & appveyor issues

As of 6a3afb2:

AppVeyor is currently failing because unitted has that unitted_ordered S4 problem in the new (devel?) version[s] of R.

Travis is failing here, in a testthat check: https://travis-ci.org/USGS-R/streamMetabolizer#L7019

Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
         time.type = "standard"))
  13: with_tz(local.time, "GMT")
  14: is.POSIXlt(time)
  15: is(x, "POSIXlt")
  16: convert_GMT_to_localtime(adate, latitude = 40, longitude = -103.8, time.type = "standard")
  17: stop("sorry, could not find time zone for specified lat/long")

This works on my computer, so I suspect it has to do with the call from a Travis process to Google to get the local time zone.

plan interface for the suite of expected models

(This issue will be modified as I continue to think about it)

Desired models:

  • Day-by-day MLE with observation error to estimate P+R+K
  • Day-by-day Bayes with observation error to estimate P+R+K
  • Day-by-day MLE with process error to estimate P+R+K
  • Day-by-day Bayes with process error to estimate P+R+K
  • Nighttime regression by OLS to estimate K
  • Bayesian hierarchical approach to estimate K vs Q function?
  • Day-by-day MLE with observation error to estimate P+R given K
  • Day-by-day Bayes with observation error to estimate P+R given K
  • Day-by-day MLE with process error to estimate P+R given K
  • Day-by-day Bayes with process error to estimate P+R given K
  • Hierarchical Bayes with observation error to estimate P+R+K with which hierarchy? So many options. Could do any combination of the following. See issue #57 for more.
    • Constrain overall average (mean and/or tau) P, R, and/or K.
    • Constrain day-to-day variation in P, R, and/or K.
    • Constrain K to be near the daily K values estimated by nighttime regression

Options shared across MLE and Bayesian models

  • observation vs process error: calc_DO_fun = c('calc_DO_mod', 'calc_DO_mod_by_diff')
  • date delineation: c(start_hour, end_hour)
  • if taking K as given, then ts of K values should be supplied as an arg to metab_xxx
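A hypothetical illustration of how these shared options might combine in a single call (argument names follow the wording above, not necessarily the final API):

metab_mle(
  data = dat,
  calc_DO_fun = 'calc_DO_mod_by_diff',  # process rather than observation error
  day_start = 6, day_end = 30,          # 6am-to-6am date delineation
  K600 = K600.daily)                    # ts of K values if taking K as given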

MLE models: metab_mle

  • constant parameters: inits are c(GPP=3, ER=-5, K600=5)
  • if taking K as given, then we should use a variant on onestation_negloglik that doesn't expect K600.daily among the params

Bayesian models: metab_bayes

  • if non-hierarchical (independent days), constant parameters: DO.err.tau.shape=0.001, DO.err.tau.rate=0.001, GPP.daily.mu = 10, GPP.daily.tau = 1/(10^2), ER.daily.mu = -10, ER.daily.tau = 1/(10^2), K600.daily.mu = 10, K600.daily.tau = 1/(10^2)
  • if hierarchical, use mm_model_by_ply to produce new input data with non-overlapping (partially copied) plys
  • if taking K as given, then we should use variants on prepjags_bayes_simple and runjags_bayes_simple that don't expect K600.daily among the params to estimate

Questions

  • Is it OK to have overlapping days when modeling consecutive days? Does it matter whether the model is distinct for each day vs hierarchical using the distribution of daily estimates?
  • What to do about hierarchical models for which we have missing days? Can we ignore that there are gaps?

simulate ER to compare the effectiveness of different daily time windows

See #56. From that thread, here's a copy of Bob's thoughts on simulating ER data to evaluate the utility of ~31 hour days in metabolism estimation:

I think we should base how we solve for ER on data and not a hunch one way or the other. The thing to do would be to generate a month-long time series with varying and known ER and then try both approaches and see which gives back the best ER. The key question is how to generate the fake data. We do not want ER varying randomly. Maybe allow it to wander, or put in a shock (say, a flood) lowering ER, after which it recovers.

One thing about ER is that it is particularly tricky to measure. Unlike GPP, which is a relative change in O2, ER is an absolute difference. So from an estimation perspective it is probably best to use both nights to get more data. Yes, it adds autocorrelation, but if ER is not biologically autocorrelated, then we have big problems. The one way that we could make a mistake with using both nights is in high-GPP streams where the ER on any one night is a function of GPP the day before, so that daily variation in ER responds to the daily variation in GPP. Furthering the problem, ER will change through the night as the stream temperature changes or as the yummy carbon from the day's photosynthesis gets eaten up. And ER during the day might be 2-10 times higher than at night, but there is not much we can do about that.

Hmm, that might be a way to vary ER with fake data: vary GPP and make ER a fraction of GPP above some base, as did Hall and Beaulieu.
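A minimal sketch of that simulation idea (all parameter values are illustrative):

set.seed(1)
nday <- 30
GPP <- pmax(0, 5 + cumsum(rnorm(nday, sd = 0.5)))  # let GPP wander day to day
ER <- -(2 + 0.4 * GPP)                             # ER = base + fraction of GPP
ER[15:17] <- ER[15:17] * 0.5                       # flood shock lowers ER,
                                                   # then ER recovers with GPP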

Model to simulate data

This might not make sense, but a question: I agree that simulated data should be created including the process error, but can the model used to create the data influence the results of the comparison between this model and another (the one without process error)? I mean, could someone say that results obtained with model A are better because the data were created with model A? This is why the objective of confirming we've made the right assumption about the presence of process error in real data was included.

We need to figure out what phi should be to create the simulated data.
