the functions filter and full_join do not work properly with numbers with 4 decimals or more.

PStaus commented on June 11, 2024

The functions filter() and full_join() do not work properly with numbers that have 4 or more decimal places.

Comments (2)

DavisVaughan commented on June 11, 2024

This is a typical problem with floating point numbers. Manually typing in 7.0001 and generating the supposedly equivalent value with seq() results in two different numbers under the hood:

options(digits = 22)

seq(0,10, 0.0001)[70002]
#> [1] 7.00010000000000065512

7.0001
#> [1] 7.000099999999999766942

They are close, but not quite the same, so an exact comparison with == treats them as different values. There is a nice Python article about this:
https://docs.python.org/3/tutorial/floatingpoint.html
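
As a minimal sketch of what this means in practice, the base R snippet below compares the two values from above exactly and within a tolerance. The variable names `from_seq` and `typed` are only illustrative, and the `1e-8` threshold is an assumed tolerance, not a recommended default.

```r
# Two values that both print as 7.0001 but differ in their last bits
from_seq <- seq(0, 10, 0.0001)[70002]
typed <- 7.0001

# Exact comparison: the stored doubles are not identical
exact_match <- from_seq == typed

# Tolerance-based comparisons: both treat the values as equal
tolerant_match <- isTRUE(all.equal(from_seq, typed))
manual_match <- abs(from_seq - typed) < 1e-8  # assumed tolerance

print(c(exact = exact_match, all_equal = tolerant_match, manual = manual_match))
```

The exact comparison is the one that filter(time == 7.0001) relies on, which is why the row appears to be missing.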

In general, filtering or joining on fractional numbers like this is going to be problematic. For filtering, one solution is to use near():

library(dplyr)
longseq <- data.frame(time = seq(0,10, 0.0001))
longseq[70002,]
#> [1] 7.0001
longseq %>% filter(near(time, 7.0001))
#>     time
#> 1 7.0001

For joins, you should try to avoid using floating point numbers if you can, or at least round them to the nearest second first if they are times. If you really need fractional seconds, you could look into nanotime or clock, which have "true" subsecond types that aren't backed by floating point types, so they would work better for that.
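
One way to act on the rounding advice, sketched here under assumptions, is to join on an integer key built at the resolution you care about. The helper `to_key()`, the `key` column, the toy data frames, and the 1e-4 resolution are all hypothetical choices for illustration, not part of dplyr.

```r
library(dplyr)

# Toy data: the same nominal time, produced in two different ways
a <- data.frame(time = seq(0, 10, 0.0001)[70002], x = "from seq()")
b <- data.frame(time = 7.0001, y = "typed")

# Joining on the raw doubles compares the keys exactly
raw <- full_join(a, b, by = "time")

# Hypothetical helper: scale to the assumed resolution (1e-4) and
# round, so both values map to the same integer key
to_key <- function(t) as.integer(round(t * 1e4))

a_key <- mutate(a, key = to_key(time))
b_key <- mutate(b, key = to_key(time))

# Join on the integer key; the original time columns get suffixes
joined <- full_join(a_key, b_key, by = "key", suffix = c("_a", "_b"))

print(nrow(raw))
print(nrow(joined))
```

With the raw doubles the two rows stay unmatched, while the integer key lets them line up in a single row. Integer keys only work while the scaled values fit in R's integer range, so very long time spans or finer resolutions would need a different key type.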

Other than that, there isn't much dplyr can do here.

PStaus commented on June 11, 2024

Thank you Davis Vaughan!
This helped me a lot. Thank you for the very fast reply and for pointing to the other two packages. I will look into using different classes for joining data frames.
