Comments (28)

statquant avatar statquant commented on May 19, 2024

I will try to track it down this weekend. Thanks for answering

from anytime.

eddelbuettel avatar eddelbuettel commented on May 19, 2024

That's a bug, and I can replicate it here when I use "TZ=Europe/London". Oddly, it does not happen for either my TZ or UTC.

If you have a moment, can you chase it down? As you can see, the anydate() function does very little. We probably "just" have to add another tz=... somewhere.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

It behaves better when you use utcdate(). I don't quite understand why it is otherwise off by an hour.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

Appreciate the help and second set of eyes.

Some context: There is a (really long) issue thread #5 -- incidentally with the same problem right in the title -- happening for one locale in Australia (and I got a lot of testing help from @jason-turner2). This now makes it two. I am starting to suspect that it may be a Boost Date_Time issue. I tried some variants around TZ=Europe/London yesterday and got really weird results, with it being either an hour ahead or an hour behind. Converting via as.numeric() and looking at a (web-based) epoch converter helps a little.

A simple test may just be to use strptime from the system, or from R, and see what happens. Maybe I need to refactor that part of the package. I should at least do some more testing on this.

I'll probably release 0.1.2 in the meantime as it fixes one other corner case bug.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I am still not entirely sure where we lose that hour, but I think the best way forward is to make anydate() a proper accessor and to convert from Boost Posix Time to Boost Gregorian::Date on the C++ side and then just export it. That does mean rejigging some code so it won't be immediate, but I hope to get to it "at some point".


statquant avatar statquant commented on May 19, 2024

Hello, I tried to create a pull request but could not do it.
The following change works for me. I might be doing something stupid that breaks everything else, but I am not sure why the epoch had to be in local time.
I am new to working on R packages; I ran your tests and did not see anything that I would have broken.
Something else: using RStudio and rebuilding the sources, I randomly get nonsensical results like:

> anytime("2016-07-11")
[1] "1400-01-01 09:12:08 LMT"

and I need to close RStudio, rebuild to get it working again (I cannot reproduce)...

// given a ptime object, return (fractional) seconds since epoch
// account for localtime, and also account for dst
double ptToDouble(const bt::ptime & pt) {

    const bt::ptime timet_start(boost::gregorian::date(1970,1,1));
    bt::time_duration tdiff = pt - timet_start;

    // hack-ish: go back to struct tm to use its tm_isdst field
    time_t secsSinceEpoch = tdiff.total_seconds();
    struct tm* localAsTm = localtime(&secsSinceEpoch);
    //Rcpp::Rcout << "Adj is " << localAsTm->tm_isdst << std::endl;

    // Define BOOST_DATE_TIME_POSIX_TIME_STD_CONFIG to use nanoseconds
    // (and then use diff.total_nanoseconds()/1.0e9;  instead)
    //
    // note dst correction here -- needed as UTC offset is correct but does not
    // contain the additional DST adjustment
    double totsec = tdiff.total_microseconds()/1.0e6, dstadj = 0;
#if defined(_WIN32)
    if (totsec > 0) {           // on Windows, for dates before 1970-01-01: segfault
        dstadj = localAsTm->tm_isdst*60*60;
    }
#else
    dstadj = localAsTm->tm_isdst*60*60;
#endif
    return totsec - dstadj;
}
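
A hedged side note on the `tm_isdst` arithmetic above: the C standard defines the field as positive when DST is in effect, zero when it is not, and negative when the information is unavailable, and `localtime` can return a null pointer on failure. A minimal sketch of a guarded helper follows. The name `dstAdjustmentSeconds` is made up for illustration and is not part of anytime, and a one-hour DST shift is an assumption.

```cpp
#include <ctime>

// Hypothetical helper, not part of anytime: DST correction in seconds for a
// given time_t, assuming that DST always shifts the clock by one hour.
double dstAdjustmentSeconds(std::time_t secsSinceEpoch) {
    const std::tm* localAsTm = std::localtime(&secsSinceEpoch);
    if (localAsTm == nullptr) {
        return 0.0;   // conversion failed, so apply no correction
    }
    // tm_isdst is > 0 under DST, 0 otherwise, and < 0 when unknown
    return localAsTm->tm_isdst > 0 ? 3600.0 : 0.0;
}
```

The result depends on the TZ of the process, so the same input can give 0 in one zone and 3600 in another.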

Stuff from Australia/Sydney:

Sys.setenv(TZ = "Australia/Sydney")
library(anytime)
anydate(20150101)
[1] "2015-01-01"

Current issue Europe/London:

anydate(20150101)
[1] "2015-01-01"
anydate('2016-12-12')
[1] "2016-12-12"
anytime('2016-12-12 15:00:00')
[1] "2016-12-12 15:00:00 GMT"
anytime('2016-05-12 15:00:00')
[1] "2016-05-12 15:00:00 BST"


eddelbuettel avatar eddelbuettel commented on May 19, 2024

Let's take this one step at a time:

  • if in doubt do NOT use RStudio with anytime. Maybe build, but do not run
  • it is known to crash on some operations
  • see an issue ticket here: #25
  • see how two functions abort (at the R level) when we see we are in RStudio, that was #27
  • likely due to us using Boost, and .... RStudio being built against a different (ancient version) of Boost
  • they know about this, I know about this, there is no immediate fix other than 'do not do this'

Updated:

  • I can still run 'check' fine too in RStudio


eddelbuettel avatar eddelbuettel commented on May 19, 2024

That out of the way, you can still try to build in RStudio if you don't know how to build otherwise.

Just try to run more tests on the command line, maybe via Rscript.

Now: can you detail what you changed where? Did you commit something somewhere?


eddelbuettel avatar eddelbuettel commented on May 19, 2024

Ok, I created a branch with the (shorter) version of ptToDouble() above. You may be on to something as I just noticed this in R itself yesterday:

R> as.POSIXct.numeric
function (x, tz = "", origin, ...) 
{
    if (missing(origin)) 
        stop("'origin' must be supplied")
    .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)
}
<bytecode: 0x5868e40>
<environment: namespace:base>
R> 

Here too the 'epoch timepoint' is computed with tz="GMT". I had the local adjustment code in all my versions starting from some Boost Date_Time examples ... including in the RcppBDT package.
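
What "the epoch timepoint in GMT" amounts to can be shown without Boost or a time zone database: the seconds count is plain calendar arithmetic on a date-time read as UTC. The following is a minimal sketch, not anytime's code. Both function names are hypothetical, and the day count uses Howard Hinnant's public-domain days_from_civil algorithm.

```cpp
#include <cstdint>

// Days between 1970-01-01 and the proleptic Gregorian date y-m-d
// (Howard Hinnant's public-domain days_from_civil algorithm).
std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d) {
    y -= m <= 2;   // treat January and February as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Seconds since 1970-01-01 00:00:00 UTC for a date-time that is itself read as UTC
std::int64_t secondsSinceUtcEpoch(std::int64_t y, unsigned m, unsigned d,
                                  unsigned hh, unsigned mm, unsigned ss) {
    return daysFromCivil(y, m, d) * 86400 + hh * 3600 + mm * 60 + ss;
}
```

For example, `secondsSinceUtcEpoch(2016, 12, 12, 15, 0, 0)` gives 1481554800 regardless of the TZ setting of the machine. Any local-time adjustment is then a separate step.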


statquant avatar statquant commented on May 19, 2024

RStudio: I usually do not use it at all, and this is yet another reason not to. I had a Boost version mismatch bug on some unrelated project a week ago, so I get it.

What I changed: Only the first two lines of ptToDouble in anytime.cpp, where you were setting the tz of the 1970-01-01 epoch to local (using code you got from the Boost help). My view was that the epoch should be in UTC and not in local, for the offset to be accurate (cf. how R does it).

Did I commit: No I did not, I am not sure how to, I could clone, create a copy-repo and commit, but I thought the idiomatic way was to create a pull request.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

And the trouble is if I do what you suggest, it will only work in Greenwich, UK:

edd@max:~/git/anytime(feature/improved_pttodouble)$ date
Sun Dec 18 10:16:43 CST 2016
edd@max:~/git/anytime(feature/improved_pttodouble)$ Rscript -e 'anytime::anytime("2016-12-18 10:16:43")'
[1] "2016-12-18 04:16:43 CST"
edd@max:~/git/anytime(feature/improved_pttodouble)$ 

That's just plain wrong by six hours.
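
The six hours can be reproduced with offset arithmetic alone. In this sketch (hypothetical helper names, fixed UTC offsets only, no DST rules), a wall-clock reading is represented by its seconds count taken as if it were UTC. Using that count directly as the epoch value and then displaying it in CST shifts the result by the zone's offset, and only a zone with offset zero is unaffected.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; offsets are seconds east of UTC,
// so US Central Standard Time is -6 * 3600.

// Epoch seconds for a wall-clock reading, given that reading's count of
// seconds since 1970-01-01 taken as if it were UTC.
std::int64_t epochFromWallClock(std::int64_t wallClockAsUtc, std::int64_t utcOffset) {
    return wallClockAsUtc - utcOffset;
}

// Wall-clock reading (again as a seconds count) displayed for an epoch value
std::int64_t wallClockFromEpoch(std::int64_t epochSeconds, std::int64_t utcOffset) {
    return epochSeconds + utcOffset;
}
```

With `wall = 1482056203` (2016-12-18 10:16:43 read as UTC) and an offset of `-6 * 3600`, `wallClockFromEpoch(wall, offset)` is six hours earlier than `wall`, which matches the 04:16:43 shown above, while `wallClockFromEpoch(epochFromWallClock(wall, offset), offset)` returns `wall` unchanged.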


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I had a Boost version mismatch bug on some unrelated project a week ago so I get it.

Super-annoying as hard to fix. In all these years with Rcpp and BH this is the first one from those projects. There must be a Date_Time object instantiation somewhere on each side.

What I changed:

Got that now, see above.

Did I commit: No I am not sure how to, I could clone, create a copy-repo and commit, but I thought the idiomatic way was to create a pull request.

That is how you create a pull request. You clone (or fork), commit your change, and the pull request is based on the difference between your repo (or branch) and the repo you send the PR to, i.e. my master.

Useful to learn that, and this repo is small enough that you may as well learn on it.


statquant avatar statquant commented on May 19, 2024

Ok, I feel this is along the lines of "Boost sets the local timezone on construction, R expects UTC as epoch, I want another timezone". Please bear with me, I am on a blocked Eurostar with limited wifi; I'll get back to it.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

What you suggest is something we may already have in the package. Did you ever look at utctime() and utcdate() ?


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I have it fixed now -- by converting to Date internally in the C++ code:

edd@max:~/git/anytime(feature/improved_pttodouble)$ TZ="Europe/London" r -lanytime -p -e'at <- anytime:::anytime_cpp("2016-12-18 00:00", asUTC=FALSE, asDate=TRUE)'
[1] "2016-12-18"
edd@max:~/git/anytime(feature/improved_pttodouble)$ TZ="Europe/London" r -lanytime -p -e'at <- anydate("2016-12-18 00:00")'
[1] "2016-12-17"
edd@max:~/git/anytime(feature/improved_pttodouble)$ 

Here the first one correctly returns Dec 18 in your case of a TZ for London. It uses the new (internal) argument asDate.

For comparison anydate() (which has simply not yet been updated) shows the wrong earlier behaviour.
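
One plausible reading of why a one-hour error turns into a one-day error for dates: a date parsed as midnight sits exactly on a day boundary, so losing 3600 seconds before truncating to a day lands on the previous date. A minimal sketch with a hypothetical helper follows; 17153 is the day number of 2016-12-18 counted from 1970-01-01.

```cpp
#include <cstdint>

// Hypothetical helper: day number (days since 1970-01-01) containing a
// given count of seconds, rounding toward negative infinity.
std::int64_t dayNumber(std::int64_t seconds) {
    std::int64_t q = seconds / 86400;
    if (seconds % 86400 < 0) {
        --q;   // C++ integer division truncates toward zero, so adjust
    }
    return q;
}
```

Here `dayNumber(17153LL * 86400)` is 17153 (2016-12-18), while `dayNumber(17153LL * 86400 - 3600)` is 17152 (2016-12-17). Building the date from year, month and day on the C++ side avoids this truncation step.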


eddelbuettel avatar eddelbuettel commented on May 19, 2024

This is now fixed in the master branch.


statquant avatar statquant commented on May 19, 2024

Running the latest anytime I still see the following behaviour:

library(anytime)
anytime:::getTZ()
[1] "Europe/London"
anytime('2016-05-12 15:00:00')
[1] "2016-05-12 14:00:00 BST" ## which I would have expected to be "2016-05-12 15:00:00 BST"
anytime('2016-12-12 15:00:00')
[1] "2016-12-12 14:00:00 GMT" ## which I would have expected to be "2016-12-12 15:00:00 GMT"

Do I misunderstand the expected behaviour ?


statquant avatar statquant commented on May 19, 2024

Ah it looks like it is related to Europe/London TZ Goes back an Hour #51


statquant avatar statquant commented on May 19, 2024

@eddelbuettel Could you kindly confirm if the 2 examples I show 2 comments above are expected behavior, a yes/no will do.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I think I have said all I am going to say in the matter.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

See #51 and the discussion there. Appears to be a bug with Boost.


eddelbuettel avatar eddelbuettel commented on May 19, 2024

You could try and insert another line just before this line and create something like (untested, not compiled)

int bstFudgeFactor = 3600 * (totsec >= 5779800) * (tz == "Europe/London");

(or maybe at a different spot where we have tz) so that the missing 3600 seconds can be added for British dates after Nov 1, 1971.

It's not a high priority item for me, but I would consider a careful pull request.
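
For anyone attempting that pull request, the one-liner can be spelled out as a small function. This is only a sketch and has not been tried against anytime. The function signature and the `cutoffSecs` parameter are assumptions made for illustration, with the cutoff passed in rather than hard-coded so that the threshold quoted above can be checked before use.

```cpp
#include <string>

// Hypothetical spelled-out form of the one-liner above, untested against
// anytime itself. cutoffSecs is the threshold on seconds since the epoch;
// the comment above quotes 5779800 for it.
int bstFudgeFactor(double totsec, const std::string& tz, double cutoffSecs) {
    const bool isLondon = (tz == "Europe/London");
    const bool afterCutoff = (totsec >= cutoffSecs);
    // add the missing hour only for London times at or after the cutoff
    return (isLondon && afterCutoff) ? 3600 : 0;
}
```

A call such as `bstFudgeFactor(totsec, tz, 5779800.0)` returns 3600 for Europe/London values at or past the cutoff and 0 for everything else.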


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I made such a pull request (or rather, a branch ready for a pull request). If you could test that, I would appreciate it.


statquant avatar statquant commented on May 19, 2024

Before

me@nyz:~$ Rscript -e 'library(anytime); anytime:::getTZ(); anytime("2016-05-12 15:00:00"); anytime("2016-12-12 15:00:000")'
[1] "America/New_York"                                                                                                                
[1] "2016-05-12 15:00:00 EDT"                                                                                                         
[1] "2016-12-12 15:00:00 EST"   
me@ldz:~$ Rscript -e 'library(anytime); anytime:::getTZ(); anytime("2016-05-12 15:00:00"); anytime("2016-12-12 15:00:000")'
[1] "Europe/London"                                                                                                                   
[1] "2016-05-12 14:00:00 BST"                                                                                                         
[1] "2016-12-12 14:00:00 GMT"

After

remove.packages('anytime')
library(devtools)
install_github(repo='eddelbuettel/anytime', ref='feature/bst_correction')
me@nyz:~$ Rscript -e 'library(anytime); anytime:::getTZ(); anytime("2016-05-12 15:00:00"); anytime("2016-12-12 15:00:000")'
[1] "America/New_York"                                                                                                                
[1] "2016-05-12 15:00:00 EDT"                                                                                                         
[1] "2016-12-12 15:00:00 EST"
me@ldz:~$ Rscript -e 'library(anytime); anytime:::getTZ(); anytime("2016-05-12 15:00:00"); anytime("2016-12-12 15:00:000")'
[1] "Europe/London"                                                                                                                   
[1] "2016-05-12 15:00:00 BST"                                                                                                         
[1] "2016-12-12 15:00:00 GMT"   

However I would have expected the as.POSIXct and anytime to match in the following situation:

suppressMessages({                                                                              
    library(anytime)                                                                            
    library(data.table)                                                                         
})                                                                                              
anytime:::getTZ() 
[1] "America/New_York"                                                                              
set.seed(1)                                                                                     
DT <- data.table(dt1=as.POSIXct('1970-01-01 00:00:00')+floor(1e6*rnorm(1e6,sd=1000)), key='dt1')
DT[,dt1Str:=as.character(dt1)]                                                                  
DT[,dt2:=anytime(dt1Str)]                                                                       
DT[abs(dt1-dt2)>1][,delta:=as.numeric(dt1-dt2)][]                                                                           

                       dt1              dt1Str                 dt2   delta
    1: 1816-10-10 12:37:53 1816-10-10 12:37:53 1816-10-10 12:41:51  -3.967
    2: 1817-06-03 05:11:50 1817-06-03 05:11:50 1817-06-03 04:15:48  56.033
    3: 1824-06-01 10:53:34 1824-06-01 10:53:34 1824-06-01 09:57:32  56.033
    4: 1828-05-12 20:40:44 1828-05-12 20:40:44 1828-05-12 19:44:42  56.033
    5: 1830-02-23 13:08:10 1830-02-23 13:08:10 1830-02-23 13:12:08  -3.967
   ---                                                                    
20731: 2103-05-14 01:43:53 2103-05-14 01:43:53 2103-05-14 02:43:53 -60.000
20732: 2104-12-02 21:53:02 2104-12-02 21:53:02 2104-12-02 20:53:02  60.000
20733: 2107-05-26 11:42:30 2107-05-26 11:42:30 2107-05-26 12:42:30 -60.000
20734: 2116-11-01 05:50:52 2116-11-01 05:50:52 2116-11-01 04:50:52  60.000
20735: 2118-04-23 21:54:57 2118-04-23 21:54:57 2118-04-23 22:54:57 -60.000


eddelbuettel avatar eddelbuettel commented on May 19, 2024

Interesting. I would encourage you to translate this into a pure C++ bug report and file it with Boost.

(Also: you probably want a more uniform distribution of random dates, rather than a N(0, largeSd) around the epoch.)


eddelbuettel avatar eddelbuettel commented on May 19, 2024

newdt <- DT[abs(dt1-dt2)>1][,delta:=as.numeric(dt1-dt2)]
newdt[, year:=year(as.IDate(dt2))]
newdt[, .(count=.N), by=year]

There is clearly a lot of error before 1901, and after 2038. But it is not clear that I can do anything about it.
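
A side observation, offered as a hint rather than a diagnosis: 1901 and 2038 are also the years in which a signed 32-bit count of seconds since 1970 runs out. Whether that is what happens inside Boost or the system library here is not established in this thread. The sketch below (hypothetical function name, Howard Hinnant's public-domain civil_from_days algorithm) computes the UTC year for a given seconds count so the two bounds can be checked.

```cpp
#include <cstdint>

// Calendar year (UTC) containing a count of seconds since 1970-01-01,
// via Howard Hinnant's public-domain civil_from_days algorithm.
std::int64_t utcYearOf(std::int64_t seconds) {
    std::int64_t z = seconds / 86400;
    if (seconds % 86400 < 0) --z;                       // floor division
    z += 719468;                                        // shift epoch to 0000-03-01
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const std::int64_t doe = z - era * 146097;          // [0, 146096]
    const std::int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const std::int64_t mp = (5 * doy + 2) / 153;        // March-based month, [0, 11]
    return yoe + era * 400 + (mp >= 10 ? 1 : 0);        // Jan and Feb belong to the next year
}
```

`utcYearOf(2147483647)` is 2038 and `utcYearOf(-2147483648LL)` is 1901, the largest and smallest values of a signed 32-bit integer.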


eddelbuettel avatar eddelbuettel commented on May 19, 2024

Plus 1940s which is weird.

R> newdt[, .(count=.N), by=10*round(year/10,0)]
    round count
 1:  1820     4
 2:  1830    10
 3:  1840    31
 4:  1850    96
 5:  1860   334
 6:  1870   866
 7:  1880  2362
 8:  1890  3132
 9:  1900  4536
10:  1920     3
11:  1930     9
12:  1940  5026
13:  1950    11
14:  1960    11
15:  1970    11
16:  1980    11
17:  1990    15
18:  2000    10
19:  2010    11
20:  2020     7
21:  2040  5228
22:  2050  2853
23:  2060   650
24:  2070   216
25:  2080    70
26:  2090    24
27:  2100    14
28:  2110     1
29:  2120     1
    round count
R> 


eddelbuettel avatar eddelbuettel commented on May 19, 2024

I played with this a little more -- when we use Boost to turn a numeric offset to the epoch into a string and parse that, we have fewer issues. The main problem here seems to be that we go to Boost, and then come back to R, which seems to introduce some small friction.
