
Comments (5)

AndrewCunliffe commented on May 25, 2024

Thanks both for the constructive suggestions.

I guess an alternative option is using na_ma to fill the gaps with a moving average. A colleague pointed out that this might even be more robust than linear interpolation in settings with highly variable wind direction, if the window is correctly specified. It would be wonderful to have this supported for degrees in imputeTS ('na_win_dir_ma'?).
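To make the moving-average idea concrete, here is a minimal base R sketch of a windowed circular mean. The helper name circ_ma_fill and the window parameter k are hypothetical and not part of imputeTS, and this is a simplified stand-in rather than the algorithm na_ma uses: it averages the sine and cosine components of the observed neighbours and converts the mean vector back to degrees.

```r
# Hypothetical helper, not part of imputeTS: fill each NA with the circular
# mean of the observed values within k positions on either side.
circ_ma_fill <- function(deg, k = 2) {
  rad <- deg * pi / 180                      # sin()/cos() expect radians
  out <- deg
  for (i in which(is.na(deg))) {
    w  <- max(1, i - k):min(length(deg), i + k)
    sn <- mean(sin(rad[w]), na.rm = TRUE)    # mean y component
    cs <- mean(cos(rad[w]), na.rm = TRUE)    # mean x component
    # If the window holds no observed values, both means are NaN: leave NA.
    if (is.finite(sn) && is.finite(cs)) {
      out[i] <- (atan2(sn, cs) * 180 / pi) %% 360
    }
  }
  out
}

# Values straddling north: an arithmetic mean of the neighbours would give 180.
circ_ma_fill(c(350, 355, NA, 5, 10))
```

Averaging the components rather than the raw degrees is what keeps the result near north here instead of pointing south.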

I agree that linear interpolation of circular coordinates seems sensible for short gaps in many cases. As you say, it would ideally handle the evenly split special cases (e.g. 90° to 270°). It would be wonderful to have this supported in imputeTS, but I might see what I can come up with here.

A more sophisticated approach might seek to account for temporal patterns associated with diel and seasonal differences in wind direction, trained against the available information. This would probably be more robust for longer gaps, although it would probably be more complex to implement.

from imputets.

glitt13 commented on May 25, 2024

@AndrewCunliffe - I don't develop imputeTS, but I do work with wind data. One trick I've found is to convert wind direction into unit circle coordinates. A good description is here: https://blog.tomgibara.com/post/11016504425/clustering-angles This would create a multivariate dataset of two coordinates. In that light, perhaps consider a multivariate imputation package such as mice: https://www.rdocumentation.org/packages/mice/versions/3.13.0
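As a rough illustration of the unit circle conversion, the following base R sketch maps degrees to x/y coordinates and back. The helper names deg_to_xy and xy_to_deg are made up for this example and do not come from any package.

```r
# Hypothetical helpers for illustration only.
deg_to_xy <- function(deg) {
  rad <- deg * pi / 180                 # sin()/cos() expect radians
  data.frame(x = cos(rad), y = sin(rad))
}

xy_to_deg <- function(x, y) {
  # atan2() returns radians in (-pi, pi]; wrap the result into [0, 360)
  (atan2(y, x) * 180 / pi) %% 360
}

wind <- c(0, 90, 180, 270, 359)
xy   <- deg_to_xy(wind)
back <- xy_to_deg(xy$x, xy$y)
round(back, 6)
```

The two columns of xy are the bivariate dataset that could then be handed to an imputation routine.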


SteffenMoritz commented on May 25, 2024

Interesting problem.
The solution of @glitt13 with the transformation into unit circle coordinates sounds pretty good.
I guess you want to do e.g. linear interpolation - which you then could do on these coordinates.
You will have a time series for the x and one for the y values (on both you would run na_interpolation).
Afterwards you transform back to one time series with degrees.
(also handling for some special cases is needed)

I'll try to post a code example later if I find the time.

Although I wouldn't switch to mice here (since it would ignore the time aspects).
Meaning, if you have 10 degrees, NA, NA, NA, NA, 100 degrees, you probably want a result like 10, 28, 46, 64, 82, 100.
That would only be possible with mice if you additionally model the time aspects as extra variables somehow.
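To illustrate the time-ordered fill for the 10 to 100 degree example, here is a small base R sketch that uses approx() in place of na_interpolation, so it runs without extra packages. It interpolates the x and y components along the series index. Note that this walks along the chord between the two end points, so the filled angles rise steadily from 10 to 100 but are not exactly evenly spaced in degrees.

```r
deg <- c(10, NA, NA, NA, NA, 100)
rad <- deg * pi / 180                  # degrees to radians
idx <- seq_along(deg)                  # position in the series stands in for time
obs <- !is.na(deg)

# Linear interpolation of each component over the series index
x <- approx(idx[obs], cos(rad[obs]), xout = idx)$y
y <- approx(idx[obs], sin(rad[obs]), xout = idx)$y

# Back to degrees in [0, 360)
filled <- (atan2(y, x) * 180 / pi) %% 360
round(filled, 1)
```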


SteffenMoritz commented on May 25, 2024

Here is a simple code example:

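library("useful")    # provides pol2cart() and cart2pol()
library("imputeTS")  # provides na_interpolation()

# Test data: wind directions in degrees with missing values
data <- c(0,NA,30,100,NA,NA,200,300,NA,100,359,NA,NA,5,90,NA,270)

# Degrees -> x/y coordinates on the unit circle (NA stays NA)
cartesian <- pol2cart(r = 1, theta = data, degrees = TRUE)
# Linear interpolation, applied column-wise to x and y
imp <- na_interpolation(cartesian, option = "linear")
# Interpolated x/y -> degrees
result <- cart2pol(imp$x, imp$y, degrees = TRUE)$theta
result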

Linear interpolation should produce reasonable results. You could also swap na_interpolation in the code for another algorithm, although not every algorithm may make sense in this specific setting.

The transformation above is not complicated, and luckily package 'useful' provides a pol2cart and a cart2pol function. If you do it on your own instead of using this package, you have to be a little careful, since the R functions sin() and cos() expect radians instead of degrees as input. In my very shallow tests the evenly split special cases (e.g. 90° to 270°) still produced output. That might be because the interpolated x and y values were never exactly zero. My guess is that the function from package 'useful' does a degree-to-radian conversion first, and the degree * pi/180 step produces the minimal variation needed to avoid 0 / 0 points after interpolation.
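For anyone doing the transformation by hand, a minimal base R sketch could look like the following. The helper name interp_wind_dir and the tolerance tol are assumptions made for this example and are not part of 'useful' or imputeTS. Instead of relying on floating point noise, it marks a filled point as NA when the interpolated vector is (almost) of zero length, because the direction is undefined there.

```r
# Hypothetical helper: linear interpolation of wind direction on the unit circle.
interp_wind_dir <- function(deg, tol = 1e-8) {
  rad <- deg * pi / 180                       # sin()/cos() expect radians
  idx <- seq_along(deg)
  obs <- !is.na(deg)
  x <- approx(idx[obs], cos(rad[obs]), xout = idx)$y
  y <- approx(idx[obs], sin(rad[obs]), xout = idx)$y
  out <- (atan2(y, x) * 180 / pi) %% 360
  # Evenly split case: x and y both collapse to ~0, so no direction is defined.
  out[which(sqrt(x^2 + y^2) < tol)] <- NA
  out
}

interp_wind_dir(c(350, NA, 10))   # gap across north: filled close to 0/360, not 180
interp_wind_dir(c(90, NA, 270))   # exactly opposite neighbours: middle stays NA
```

Leaving the evenly split point as NA is only one possible choice; it could then be passed on to whatever fallback is used for the long gaps.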

I don't know how often and how quickly the wind direction changes. But you are probably right, interpolation does not seem like a good option for very long gaps. There is the maxgap option in every imputation function of imputeTS. With na_interpolation(x, option = "linear", maxgap = 3) you would only interpolate NA gaps that are no longer than 3. Longer gaps will be left NA. Maybe after performing interpolation for the short gaps you could use another imputation function of imputeTS for the long gaps. Maybe something like na_mean(x, option = "median"), or na_replace where you just impute the most common wind direction. Or maybe na_seasplit(x, algorithm = "mean") to have the mean per season. I guess a lot is possible there; it probably depends a lot on the data. Also, your colleague might be right: if there are a lot of sudden back-and-forth changes instead of gradual changes of wind direction, a moving average might give better results than pure interpolation.
I think you might have to do some testing to see what works best
(maybe by simulating missing data for complete parts of your dataset, as described in issue #52).
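A rough sketch of such a test, in base R and on synthetic data (the series, the number of holes and the helper circ_err are all made up for illustration): mask some known values, impute them with each candidate, and compare against the truth using an angular error, so that 359 versus 1 counts as 2 degrees rather than 358.

```r
# Hypothetical helper: smallest angular difference in degrees, in [0, 180]
circ_err <- function(a, b) {
  d <- abs(a - b) %% 360
  pmin(d, 360 - d)
}

set.seed(1)
n <- 50
# Synthetic wind direction drifting across north (300 -> 60 degrees) with noise
truth  <- (seq(300, 420, length.out = n) + rnorm(n, sd = 5)) %% 360
masked <- truth
holes  <- sample(2:(n - 1), 10)         # positions to hide, end points kept
masked[holes] <- NA

idx <- seq_len(n)
obs <- !is.na(masked)
rad <- masked * pi / 180

# Candidate 1: linear interpolation on unit circle coordinates
x <- approx(idx[obs], cos(rad[obs]), xout = idx)$y
y <- approx(idx[obs], sin(rad[obs]), xout = idx)$y
circular <- (atan2(y, x) * 180 / pi) %% 360

# Candidate 2: linear interpolation on the raw degrees
naive <- approx(idx[obs], masked[obs], xout = idx)$y

err_circular <- mean(circ_err(circular[holes], truth[holes]))
err_naive    <- mean(circ_err(naive[holes], truth[holes]))
c(circular = err_circular, naive = err_naive)
```

The same loop could be repeated with other candidates and other gap lengths on complete stretches of the real data.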


SteffenMoritz commented on May 25, 2024

I have now also thought for a while about adding the solution outlined above to the package, but I came to the conclusion that the use case is probably a little bit too specific. Every additional feature also has the downside that the package gets more complicated for the average user (mental overload from too many parameter/function options to choose from). Instead, I think I'll try to add additional documentation / a vignette to the package in one of the next versions. A vignette about handling special cases / problems, with code examples provided, could be quite nice and helpful.

