bfast2 / bfast Goto Github PK
View Code? Open in Web Editor NEWBreaks For Additive Season and Trend
Home Page: https://bfast2.github.io
License: GNU General Public License v2.0
Breaks For Additive Season and Trend
Home Page: https://bfast2.github.io
License: GNU General Public License v2.0
The current plot method in plot.bfast
shows the components one by one, but there is no way to plot everything in one plot. It would be very convenient to have that.
Here's a small snippet of my custom code that produces pretty nice results:
MyTS = bfastts(ChangedEVITS[9,], dates, "10-day") # Some time series
MyTS = na.approx(MyTS) # This step is optional, just for break-free visualisation
MyBF = bfast(MyTS, GetBreakNumber(dates), season="harmonic", max.iter=3)
MyIter = length(MyBF$output)
MyFit = MyBF$output[[MyIter]]$St + MyBF$output[[MyIter]]$Tt
MyTrend = MyBF$output[[MyIter]]$Tt
MyTimes = as.numeric(time(MyBF$output[[MyIter]]$Tt))
MyTimes = MyTimes[!is.na(MyBF$output[[MyIter]]$Tt)]
MyBreaks = MyTimes[MyBF$output[[MyIter]]$bp.Vt$breakpoints]
plot(MyTS); lines(MyFit, col="blue"); lines(MyTrend, col="green"); abline(v=MyBreaks, col="red")
What's missing is a legend. Suggestions are also welcome!
There is lack of documentation in ?bfastpp
about how external regressors are handled. It is described, but an example of how to use them is missing.
It seems that there has been enough extensive testing of Marius' code that the fast options should be enabled by default, with the ability to disable them if needed.
This depends on fixing the upstream issue appelmar/bfast#2, however.
The description at the top is still from when it moved to GitHub and is no longer relevant. Suggestions welcome!
Make an overview of the latest pending changes as a tutorial, so then we don't need a big list of changes but we have it more visual. Useful also for AEO as a tutorial.
bfastpp produces output where season
is a factor. I think the original idea is that seasonality should be repetitive, and thus every year you have the same factors, right? But that breaks down if you have a non-integer frequency of the ts
object, because then the season
each year is different, therefore each point becomes unique, which is not useful at all...
But why not just leave it as a number? A season of 1.6 is valid as far as I can tell (the season is in between 1 and 2 of the first year). Or just use DOY altogether?
Is there an easy way to get the RMSE (of the residuals) out, as an extra band (returnLayers) in Bfastmonitor? Many thanks!!!
Equivalent of bfast2/strucchangeRcpp#18 in bfast01, with a trivial reproducer:
library(bfast)
bfast01(data = Nile)
#> Error in if (is.nan(p0) || p0 < a2 || p0 > (1 - a2)) {: missing value where TRUE/FALSE needed
Created on 2020-03-27 by the reprex package (v0.3.0)
Blame:
Line 462 in f58ba97
But really should just depend on strucchange to not duplicate code.
goodpractice package, pkgdown (documentation) en rhub (for testing on different systems)
The master branch should be stable and aimed at CRAN submission, and other branches should be implementing new features (potentially breaking CRAN requirements).
Right now, running fitted()
on a bfast
object returns NULL
(that's returned for any unknown class). It would be better if it returned x$output[[iteration]]$St + x$output[[iteration]]$Tt
instead.
By default bfastts
will fail with a cryptic message if no type
is supplied. It should either give a reasonable message or assume irregular
(or regular
if #2 is implemented).
We'd want to use e.g. testthat
to test "bad examples" or very basic things that are not well-suited to go into the examples section of function documentation.
Nevertheless, the useful tests should all go into the examples section. Even simple use cases are useful for people to see immediately, so they could run it and see the output. The examples section gets tested by Travis automatically as part of R CMD check
.
See also: #24
This call to Fstat
in bfast01()
causes an error for some inputs:
https://github.com/GreatEmerald/bfast/blob/6e86571601510ea33af4daa16b0e36564995bd54/R/bfast01.R#L156
The error is inadmissable change points: 'from' is larger than 'to'
. Setting trim
to a low number (~7) makes it not give this error any more.
I'd need some input data for reproducing it easier, but overall I'm not sure why that call to Fstat
is there? In the regular bfast, this functionality is handled by bfastts
and bfastpp
instead.
here is the reprex.html of this problem:
http://localhost:31875/session/reprex63e869fb6762/reprex_reprex.html
Right now the docs on all the bfast
-related issues are scattered all across, generating quite a bit of confusion (case in point: https://doi.org/10.1016%2Fj.isprsjprs.2017.06.013 considers bfastMonitor
to be renamed bfast
, which is not true). We could use the wiki here or such to make an overview of the docs and capabilities of each function.
A lot of code is the same between .bfastpp.full()
and .bfastpp.formula()
, but unfortunately not everything. That leads to inconsistent behaviour, e.g. season
is calculated as factor(cycle(y))
in the former and as 1 + seq(as.numeric(cycle(y[1]))-1
in the latter, which leads to incorrect type (numeric instead of the expected factor).
Following DRY principles, the code should be unified.
As an aside but related: bfastpp gives a radically different object when a formula is given, vs when only a ts
is given. It's extremely confusing, as the user normally assumes that giving a formula would simply filter out the undesired columns from the result.
At the moment the sbins parameter is documented but is lacking an example.
Perhaps it's a strucchange issue instead, but bfastmonitor returns the error:
time < start: comparison (3) is possible only for atomic and list types
When given input with a frequency of 23 and monitoring of (I believe) 29 points:
3, 5, 6, 8, 10, 13, 24, 32, 29, 29, 27, 27, 28, 21, 21, 18, 14, 11, 9, 6, 4, 5, 5, 4, 5, 6, 5, 9, 20, 24, 33, 36, 31, 36, 40, 35, 43, 33, 33, 21, 15, 11, 9, 8, 5, 5, NA, 3, 5, 6, 7, 13, 19, 28, 36, 36, 35, 29, 30, 26, 25, 22, 17, 14, 9, 9, 6, 6, 6, 8, NA, 7, 6, 11, 14, 21, 33, 39, 38, 32, 26, 27, 27, 20, 20, 18, 15, 14, 8, 7, 7, 6, 8, NA, 5, 6, 8, 13, 25, 36, 37, 36, 38, 41, 38, 32, 28, 25, 21, 16, 12, 11, 6, 4, 5, NA, NA, 6, 5, 5, 8, 11, 30, 42, 42, 40, 35, 33, 31, 29, 28, 24, 20, 17, 12, NA, 7, 6, 5, 6, 5, 8, 12, 10, 24, 33, 42, 42, 44, 48, 41, 43, 37, 31, 23, 20, 15, NA, 9, 7, 7, 7, 7, 7, 14, 14, 16, 30, 39, 43, 45, 39, 33, 29, 32, 31, 23, 20, 15, 11, 8, NA, 6, 5, NA, NA, NA, 9, 9, 12, 17, 33, 41, 42, 54, 49, 46, 34, 27, 25, 17, 14, 11, 7, 7, 6, 6, 6, 6, 7, 9, 10, 16, 28, 37, 46, 40, 37, 31, 35, 33, 29, 21, 19, 13, 10, 8, NA, 6, 4, NA, NA, NA, 10, 7, 10, 24, 39, 39, 50, 41, 42, 44, 40, 37, 35, 23, 17, 15, 13, 10, 7, 6, 10, NA, 4, 9, 11, 18, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Running bfast01(stl="both") always results in an error:
library(bfast)
mts = ts(1:10, 2000, 2011, frequency=20)
bfast01(mts, stl="both")
#> Error in parse(text = x, keep.source = FALSE): <text>:2:0: unexpected end of input
#> 1: response ~
#> ^
Created on 2019-12-04 by the reprex package (v0.2.0).
This is because the corresponding component is removed from the formula, and "both" removes both of them, resulting in an empty formula.
I don't know what the fix should be: either not allow passing "both", or perhaps not filter the components out.
Some of the documentation generated by Roxygen2 is not consistent with the original documentation in terms of formatting. All docs should be compared against the originals.
The dependency on the raster package is spurious at the moment; it is only used for man pages and to load the example dataset. It should be moved to optional requires.
Though the dependency may come back if we do #3 and integrate bfastSpatial. Or maybe by that time we'll be using stars.
GPL licenses suggest adding a license notice at the top of each source file, like this:
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This file is part of Foobar.
Foobar is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.
Foobar is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This is where it is explicitly allowed to use later versions of GPL (the main LICENSE text doesn't specify it). So it's a good idea to add these headers to each file to make sure it's really GPLv2+ and not GPLv2 only.
Relates to #4 as the header can be added along the Roxygen comments.
To update the package on CRAN, there are a few options. The first way, the simplest one, would be:
bfast
package only.This is doable because none of the updates specifically depend on an update to strucchange
. The downside is that for much of the new functionality (e.g. increase in speed and most of the new features, plus bug fixes), the users are very interested in the updates done to strucchange
.
If we want to use the new functionality, we also want to have an update to strucchange
, on which bfast
could depend. Here the roadmap would be one of the two:
bfast2/strucchange
into its own package, e.g. strucchangeArmadillo
strucchangeArmadillo
to CRANbfast2/bfast
to upstream strucchange
bfast
for whether strucchangeArmadillo
is loaded and then if so, use it, otherwise warn users that a slow codepath is being usedbfast
to CRANThis plan has a downside in that splitting the whole codebase of bfast2/strucchange
would take a lot of time that we currently most likely do not have (volunteers welcome). Alternatively:
bfast2/strucchange
to strucchangeArmadillo
and mention that it is a fork of strucchange
at a certain revisionstrucchangeArmadillo
to CRANbfast
from strucchange
to strucchangeArmadillo
bfast
to CRANIt's useful to discuss which way we should choose to proceed.
Since we moved the repository, the README needs to be updated with new instructions about how to install it. Speaking of which, do we need devtools, or is there a more lightweight package that provides install_github()
?
When the time series consists of a straight line, p.Vt <- sctest(efp(Vt ~ ti, h = h, type = type))
returns NaN
, which then fails down the line. bfast should throw an error in case sd(Vt) == 0, and possibly catch any further errors when p.Vt
is invalid.
bfastmonitor silently fails when inputs such as date, or startdate are not logically valid.
Currently the bfast0n function doesn't include the sctest functionality, but it would be useful. Things to think about: what to output if there is no change according to the sctest (a breakpoints
object with breaks=0? NA
? -9999?).
Come up with easier-to-understand names of the existing functions for users (suggested by Andrei). Needs some creative thinking. Of course, the old names should become aliases in that case.
At the moment, bfastpp
assumes that the user wants to estimate one parameter per time step of the input time series for a year. However, that makes it impossible to use season
with h
below one year. In addition, bfastpp
also assumes that the frequency is an integer, which is not necessarily the case.
In some cases, lowering h might actually result in fewer breakpoints being detected. That is quite curious and may be good to check why that happens. An example was provided:
library(bfast)
set.seed(1234567)
opar = par(mfrow=c(1,2))
x <- rnorm(168,sd=1) # a vector without any inserted abrupt change
ts.plot(x)
bfast(ts(x),max.iter=1,season="none")
#> [1] "No seasonal model will be fitted!"
#>
#> TREND BREAKPOINTS: None
#>
#> SEASONAL BREAKPOINTS: None
y <- c(x[1:20],x[21:168]+1.5) # we insert an abrupt change from layer 20 onward
ts.plot(y)
h <- seq(.05,.2,by=0.05) # a sequance of h to use in bfast function
R <- lapply(X=1:length(h), function(i){
bf = bfast(ts(y),h = h[i],max.iter=1,season="none")
if (!is.na(bf$output[[1]]$bp.Vt))
plot(bf$output[[1]]$bp.Vt)
bf
})
#> [1] "No seasonal model will be fitted!"
#> [1] "No seasonal model will be fitted!"
#> Warning in if (!is.na(bf$output[[1]]$bp.Vt)) plot(bf$output[[1]]$bp.Vt):
#> the condition has length > 1 and only the first element will be used
#> [1] "No seasonal model will be fitted!"
#> [1] "No seasonal model will be fitted!"
R
#> [[1]]
#>
#> TREND BREAKPOINTS: None
#>
#> SEASONAL BREAKPOINTS: None
#>
#>
#> [[2]]
#>
#> TREND BREAKPOINTS
#> Confidence intervals for breakpoints
#> of optimal 2-segment partition:
#>
#> Call:
#> confint.breakpointsfull(object = bp.Vt, het.err = FALSE)
#>
#> Breakpoints at observation number:
#> 2.5 % breakpoints 97.5 %
#> 1 19 20 26
#>
#> Corresponding to breakdates:
#> 2.5 % breakpoints 97.5 %
#> 1 19 20 26
#>
#> SEASONAL BREAKPOINTS: None
#>
#>
#> [[3]]
#>
#> TREND BREAKPOINTS: None
#>
#> SEASONAL BREAKPOINTS: None
#>
#>
#> [[4]]
#>
#> TREND BREAKPOINTS: None
#>
#> SEASONAL BREAKPOINTS: None
Rt = sapply(R, function(x)as.numeric(x$Time))
par(opar)
data.frame(h=h,changepoint=Rt)
#> h changepoint
#> 1 0.05 NA
#> 2 0.10 20
#> 3 0.15 NA
#> 4 0.20 NA
Created on 2019-10-16 by the reprex package (v0.3.0)
NA
values are a problem in time series analysis for seasonality and trend calculation. What is worse is that they tend to be temporally biased towards the rainy period, and we might have no observations whatsoever over the entire time series during particular times. This means that if we just omit the time steps with NA
s, the trend estimation will also be biased towards values in the dry season.
There was also the matter of the stl
package not handling data with NA
values altogether.
One workaround (suggested by Achim) was to use bfastpp
instead of stl
to get simplified covariates, and then calculate the trend manually. Another is to use stlplus
, but that imputes values. Discussion about the pros and cons and how to communicate that would be welcome.
At the moment, attempting to plot a bfast() object that has NAs in the response results in an error:
Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values
Which is actually right because the plot method doesn't expect range()
to return NA
.
If we have the leap day (Feb 29) and also the day after (March 1) in the time series, bfastts
fails with an error:
Error in merge.zoo(zoo(coredata(x), tt), zoo(, tt2)) :
series cannot be merged with non-unique index entries in a series
In addition: Warning messages:
1: In zoo(data, 1900 + as.POSIXlt(dates)$year + (yday365(dates) - 1)/365, :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
MWE:
Fails:
a = c(as.Date("2016-02-29"), as.Date("2016-03-01"))
b = 1:2
bfastts(b, a, type="irregular")
Works if we set it to be one day later:
a = c(as.Date("2016-02-29"), as.Date("2016-03-01"))
b = 1:2
bfastts(b, a+1, type="irregular")
So far we didn't have a big need to change the code of strucchange
, but I'm slowly but surely finding more and more bugs there as well, so a fork may be inevitable. In that case, we'll also need to do #35 also for the new fork, so should we fork it sooner rather than later?
Also, should we do the same for strucchange
as we have for bfast
with the contributor guidelines etc.? And what about new CRAN releases for strucchange
?
The function read.zoo()
is very convenient for plotting the results of bfastpp, and the examples section is already using the zoo package, so I suggesting adding this to it:
plot(zoo::read.zoo(d1))
This is very easy and gives a very nice plot for the bfastpp result:
library(bfast)
#> Registered S3 method overwritten by 'xts':
#> method from
#> as.zoo.xts zoo
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
library(zoo)
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
ndvi <- as.ts(zoo(cbind(a = som$NDVI.a, b = som$NDVI.b), som$Time))
ndvi <- window(ndvi, start = c(2006, 1), end = c(2009, 23))
## parametric season-trend model
d1 <- bfastpp(ndvi, order = 2)
plot(zoo::read.zoo(d1))
Created on 2020-02-24 by the reprex package (v0.3.0)
Can be based on http://bfast.r-forge.r-project.org/
Travis CI can be used (at least as a starting point). It should be enabled for the repo by making a travis.yml
file.
zoo
is able to convert regular time series zoo
objects into ts
objects, so it is possible to automate making a bfastts
from a zoo
object that way. The problem is that frequency(bfastts)
is then < 1, so all code using that function needs to use a separate parameter instead. That requires some code refactoring.
README.md
fileCONTRIBUTING.md
fileIf I use type="16-day", even if the underlying time series is exactly 16-day, the resulting ts
has a frequency of 1, which is not what we need at all.
# Unexpected
bfast::bfastts(1:240, seq.Date(as.Date("2009-01-01"), by=16, length.out=240), type="16-day")
#> Warning in as.ts.zoo(zz): 'x' does not have an underlying regularity
#> Time Series:
#> Start = 1
#> End = 240
#> Frequency = 1
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
#> [18] 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
#> [35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
#> [52] 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
#> [69] 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
#> [86] 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102
#> [103] 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
#> [120] 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136
#> [137] 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153
#> [154] 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170
#> [171] 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187
#> [188] 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204
#> [205] 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221
#> [222] 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
#> [239] 239 240
# Expected
ts(1:240, 2009, frequency=356.25/16)
#> Time Series:
#> Start = 2009
#> End = 2019.73403508772
#> Frequency = 22.265625
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
#> [18] 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
#> [35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
#> [52] 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
#> [69] 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
#> [86] 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102
#> [103] 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
#> [120] 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136
#> [137] 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153
#> [154] 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170
#> [171] 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187
#> [188] 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204
#> [205] 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221
#> [222] 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
#> [239] 239 240
Created on 2019-10-19 by the reprex package (v0.3.0)
Since most of R libs are licensed under GNU GPL v2/v3, we should use (one of) these before development.
If we have the leap day (Feb 29) and also the day after (March 1) in the time series, bfastts
fails with an error:
Error in merge.zoo(zoo(coredata(x), tt), zoo(, tt2)) :
series cannot be merged with non-unique index entries in a series
In addition: Warning messages:
1: In zoo(data, 1900 + as.POSIXlt(dates)$year + (yday365(dates) - 1)/365, :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
MWE:
Fails:
a = c(as.Date("2016-02-29"), as.Date("2016-03-01"))
b = 1:2
bfastts(b, a, type="irregular")
Works if we set it to be one day later:
a = c(as.Date("2016-02-29"), as.Date("2016-03-01"))
b = 1:2
bfastts(b, a+1, type="irregular")
The method suggested by Achim to deal with NAs (breakpoints()
over bfastpp()
) should be added as a function.
For ease of applicability, having bfast
generate spatial maps rather than only pixel time series would be good to have. bfastSpatial
has some of that functionality, specifically with bfmSpatial
, but it only deals with bfastMonitor
output. Also, bfastSpatial
is useful as a generic preprocessing framework, the output does not necessarily have to be used with bfast
.
So an idea would be to add something like bfSpatial
for the core bfast
function. Perhaps with coordination with Loïc.
Thanks to pull #1, we have an opportunity to consolidate all the changes across the different forks of bfast, with a view of eventually updating the package on CRAN.
For ease of reviewing and consolidating code, it would be nice to have collaboration across the different code authors on a single codebase. GitHub supports this somewhat, using protected branches so that the code does not conflict/does not get lost.
Suggestions about how best to go ahead with this are welcome!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.