Comments (5)
Thanks both for the constructive suggestions.
I guess an alternative option is using na_ma to fill gaps with a moving average. A colleague pointed out that this might even be more robust than linear interpolation in settings with highly variable wind direction, provided the window is specified correctly. It would be wonderful to have this supported for degrees in imputeTS ('na_win_dir_ma'?).
I agree that linear interpolation of circular coordinates seems sensible for short gaps in many cases. As you say, it would ideally handle the evenly split special cases (e.g. 90° to 270°). It would be wonderful to have this supported in imputeTS, but I might see what I can come up with here.
A more sophisticated approach might seek to account for temporal patterns associated with diel and seasonal differences in wind direction, trained against the available information. This would probably be more robust for longer gaps, although it would probably be more complex to implement.
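To make the moving-average idea above a bit more concrete, here is a minimal base R sketch. It is not an imputeTS function: na_circ_ma is a hypothetical helper, the window is a plain unweighted mean of the k positions on each side, and the test vector is made up. The averaging is done on the sine and cosine components so that directions around north (e.g. 350 and 10) average to roughly north rather than south.

```r
# Hypothetical helper (not part of imputeTS): fill NA wind directions with a
# circular moving average over the k positions on each side of the gap.
na_circ_ma <- function(deg, k = 2) {
  rad <- deg * pi / 180                      # sin()/cos() expect radians
  out <- deg
  for (i in which(is.na(deg))) {
    idx <- max(1, i - k):min(length(deg), i + k)   # window around position i
    sn <- mean(sin(rad[idx]), na.rm = TRUE)
    cs <- mean(cos(rad[idx]), na.rm = TRUE)
    # NaN means the window held no observations; leave the NA in place
    if (!is.nan(sn)) out[i] <- (atan2(sn, cs) * 180 / pi) %% 360
  }
  out
}

# Made-up example data, degrees
wind <- c(350, NA, 10, 20, NA, NA, NA, NA, NA, 40)
filled <- na_circ_ma(wind, k = 2)
filled
```

Positions whose window contains no observations stay NA in this sketch, so a second pass or a wider window would be needed for long gaps.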
from imputets.
@AndrewCunliffe - I don't develop imputeTS, but I do work with wind data. One trick I've found is to convert wind direction into unit circle coordinates. A good description is here: https://blog.tomgibara.com/post/11016504425/clustering-angles This would create a multivariate dataset of two coordinates. In that light, perhaps consider a multivariate imputation package such as mice: https://www.rdocumentation.org/packages/mice/versions/3.13.0
Interesting problem.
The solution from @glitt13, transforming to unit circle coordinates, sounds pretty good.
I guess you want to do e.g. linear interpolation, which you could then do on these coordinates.
You will have one time series for the x values and one for the y values (you would run na_interpolation on both).
Afterwards you transform back to a single time series in degrees.
(Some special cases will also need handling.)
I'll try to post a code example later if I find the time.
I wouldn't switch to mice here, though, since it would ignore the time aspect.
That is, if you have 10 degrees, NA, NA, NA, NA, 100 degrees, you probably want a result like 10, 28, 46, 64, 82, 100.
That would only be possible with mice if you additionally model the time aspect through extra variables.
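The steps described above (transform, interpolate x and y, transform back) can be sketched in base R roughly like this. Note the assumptions: interp_direction is a hypothetical helper, and stats::approx stands in for na_interpolation so that the snippet runs without extra packages.

```r
# Hypothetical helper: linear interpolation of wind direction (degrees) via
# unit circle coordinates. stats::approx is a stand-in for na_interpolation.
interp_direction <- function(deg) {
  rad <- deg * pi / 180                      # sin()/cos() expect radians
  t   <- seq_along(deg)
  obs <- !is.na(deg)
  # Interpolate each Cartesian component separately; rule = 2 carries the
  # nearest observation outward to any leading or trailing NAs
  x <- approx(t[obs], cos(rad[obs]), xout = t, rule = 2)$y
  y <- approx(t[obs], sin(rad[obs]), xout = t, rule = 2)$y
  (atan2(y, x) * 180 / pi) %% 360            # back to degrees in [0, 360)
}

res   <- interp_direction(c(10, NA, NA, NA, NA, 100))
north <- interp_direction(c(350, NA, 10))    # gap is filled near north, not 180
res
north
```

Because the interpolation runs along the straight chord between the two points on the circle, the filled angles increase steadily from 10 to 100 but are not exactly the evenly spaced 28, 46, 64, 82. For short gaps and moderate direction changes the difference is probably small.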
Here is a simple code example:
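library("useful")    # provides pol2cart() and cart2pol()
library("imputeTS")  # provides na_interpolation()
# Test data: wind direction in degrees, with gaps
data <- c(0, NA, 30, 100, NA, NA, 200, 300, NA, 100, 359, NA, NA, 5, 90, NA, 270)
# Convert each direction to a point on the unit circle (columns x and y)
cartesian <- pol2cart(r = 1, theta = data, degrees = TRUE)
# Interpolate the columns separately
imp <- na_interpolation(cartesian, option = "linear")
# Convert the interpolated x and y values back to degrees
result <- cart2pol(imp$x, imp$y, degrees = TRUE)$theta
result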
Linear interpolation should produce reasonable results. You could also swap na_interpolation in the code for another algorithm, although not every algorithm may make sense in this specific setting.
The transformation above is not so complicated; luckily there are pol2cart and cart2pol functions in the package 'useful'. You have to be a little careful when doing it on your own instead of using this package, since the R functions sin() and cos() expect radians rather than degrees as input. In my very shallow tests the evenly split special cases (e.g. 90° to 270°) still produced output. That might be because the x and y values were never exactly the same. I guess the function from package 'useful' first does a degree-to-radian conversion, and the degree * pi/180 step produces the minimal variation needed to avoid 0 / 0 points after interpolation.
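For anyone doing the conversion by hand, a small base R sketch of the evenly split case might look like the following. The ambiguity flag and its threshold are my own assumptions, not something the 'useful' package or imputeTS provides, and stats::approx again stands in for na_interpolation. The idea is that the length of the interpolated (x, y) vector drops to roughly zero when the two neighbours point in opposite directions, so the angle at that point is determined only by floating point noise.

```r
# Sketch: manual degree <-> radian conversion, plus the length of the
# interpolated vector as a (hypothetical) ambiguity flag.
deg <- c(90, NA, 270)
rad <- deg * pi / 180              # sin()/cos() expect radians
obs <- !is.na(deg)
t   <- seq_along(deg)
x <- approx(t[obs], cos(rad[obs]), xout = t)$y
y <- approx(t[obs], sin(rad[obs]), xout = t)$y
r <- sqrt(x^2 + y^2)               # about 1 at observations, near 0 if ambiguous
direction <- (atan2(y, x) * 180 / pi) %% 360
ambiguous <- r < 1e-8              # threshold is an arbitrary assumption
direction[ambiguous] <- NA         # leave such points for another method
direction
```

R's atan2() returns a finite value even for inputs at or near (0, 0), which would explain why the special case "still produces output"; flagging on r makes it explicit that this output should not be trusted.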
I don't know how often and how quickly the wind direction changes. But you are probably right, interpolation does not seem like a good option for very long gaps. There is the maxgap option in every imputation function of imputeTS. With na_interpolation(x, option = "linear", maxgap = 3) you would only interpolate NA gaps that are no longer than 3. Longer gaps will be left NA. Maybe after performing interpolation for the short gaps you could use another imputation function of imputeTS for the long gaps. Maybe something like na_mean(x, option = "median"), or na_replace, where you just impute the most common wind direction. Or maybe na_seasplit(x, algorithm = "mean") to have the mean per season. I guess a lot is possible there; it probably depends a lot on the data. Also, your colleague might be right: if there are a lot of sudden back-and-forth changes instead of gradual changes in wind direction, a moving average might give better results than pure interpolation.
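One caveat when filling long gaps with a mean on raw degrees: the arithmetic mean of 350 and 10 is 180, which points the wrong way. A rough base R sketch of a staged approach is below, with two hypothetical helpers (long_gap and circ_mean, neither is part of imputeTS) and made-up data. It fills only the gaps longer than maxgap with a circular mean and leaves the short gaps as NA for interpolation.

```r
# Hypothetical helper: TRUE for NAs that sit in a run longer than maxgap
long_gap <- function(x, maxgap) {
  runs <- rle(is.na(x))
  rep(runs$values & runs$lengths > maxgap, runs$lengths)
}

# Hypothetical helper: circular mean of the observed directions (degrees)
circ_mean <- function(deg) {
  rad <- deg[!is.na(deg)] * pi / 180
  (atan2(mean(sin(rad)), mean(cos(rad))) * 180 / pi) %% 360
}

# Made-up example data: one short gap (length 1) and one long gap (length 4)
wind <- c(350, NA, 10, NA, NA, NA, NA, 20, 30)
mask <- long_gap(wind, maxgap = 3)

staged <- wind
staged[mask] <- circ_mean(wind)    # long gap gets the overall mean direction
staged                             # the short gap is still NA at this stage
```

Whether a single overall mean direction is a sensible fill for long gaps depends a lot on the data; a per-season or per-hour circular mean would follow the same pattern.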
I think you might have to do some testing to see what works best (maybe by simulating missing data for complete parts of your dataset, as described in issue #52).
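A minimal base R sketch of such a test could look like this. Everything here is illustrative: the series is synthetic, the missingness pattern is random, and naive is just a placeholder method (plain linear interpolation on the raw degrees) to show how candidates would be plugged in. The error is measured as the smallest angular difference so that 350 versus 10 counts as 20 degrees, not 340.

```r
# Smallest absolute angular difference in degrees, always in [0, 180]
circ_diff <- function(a, b) abs((a - b + 180) %% 360 - 180)

set.seed(1)                                    # reproducible mask
truth <- seq(200, 540, length.out = 60) %% 360 # synthetic complete series
holes <- sample(2:59, 15)                      # keep both ends observed
gappy <- truth
gappy[holes] <- NA                             # simulated missing data

# Score any imputation function by its mean angular error on the holes
score <- function(impute_fun) {
  mean(circ_diff(impute_fun(gappy)[holes], truth[holes]))
}

# Placeholder method: naive linear interpolation on the raw degrees
naive <- function(x) {
  obs <- !is.na(x)
  approx(which(obs), x[obs], xout = seq_along(x))$y
}
naive_error <- score(naive)
naive_error
```

Any other candidate (unit circle interpolation, a moving average, a seasonal mean) can be passed to score() in the same way and compared on the same holes.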
I have also now thought for a while about adding the solution outlined above to the package, but I came to the conclusion that the use case is probably a little too specific. Every additional feature also has the downside that the package gets more complicated for the average user (mental overload from too many parameter/function options to choose from). Instead, I think I'll try to add additional documentation / a vignette to the package in one of the next versions. A vignette about handling special cases / problems, with code examples provided, could be quite nice and helpful.