Comments (12)

L-M-Sherlock commented on June 23, 2024

Thanks for this report. I will fix it soon.

from srs-benchmark.

imrryr commented on June 23, 2024

@L-M-Sherlock

L-M-Sherlock commented on June 23, 2024

image

I get the correct result from your code. It seems to be an environment problem.

L-M-Sherlock commented on June 23, 2024

I found that only review_th is inconsistent with my result. That shouldn't matter, because the order is correct.

imrryr commented on June 23, 2024

Unfortunately, it does matter for my analysis. I need review_th to be right in order to sort the entire file. The delta_t column was also different. I really need to figure out what is going on in my environment.

L-M-Sherlock commented on June 23, 2024

The review_th is calculated here:

https://github.com/open-spaced-repetition/fsrs-benchmark/blob/ea493cf91900d9c8fd3bd05c42518373875c799f/revlogs2dataset.py#L54

I recommend checking the pandas documentation for this function.

Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html
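
For readers unfamiliar with rank, here is a minimal sketch of how two of its tie-breaking methods differ. The timestamps are made up, and which method the script uses at the linked line is not shown in this thread, so treat this only as an illustration of the pandas API.

```python
import pandas as pd

# Hypothetical review timestamps in milliseconds (values are made up).
df = pd.DataFrame({"review_time": [3000, 1000, 2000, 1000]})

# method="dense": tied values share a rank, and ranks increase by 1 with no gaps.
df["dense_rank"] = df["review_time"].rank(method="dense").astype("int64")

# method="first": ties are broken by order of appearance in the column.
df["first_rank"] = df["review_time"].rank(method="first").astype("int64")

print(df)
```

Either way the rank depends only on the values in review_time, so if those values are corrupted upstream, the ranks will differ too.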

I can't be of much help here because I can't reproduce the bug.

For debugging, it helps to store the intermediate results during the conversion. You can save the df to a CSV file after each step. That should let you locate the step where the bug appears.
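
A minimal sketch of that approach, assuming a small helper named checkpoint (hypothetical, not part of the benchmark script) and made-up data. Each step's output lands in its own CSV so two environments can be compared file by file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def checkpoint(df, name, out_dir):
    """Write df to <out_dir>/<name>.csv and return df unchanged."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    df.to_csv(out_dir / f"{name}.csv", index=False)
    return df


# A temporary directory is used here; any writable directory would do.
debug_dir = Path(tempfile.mkdtemp(prefix="revlog_debug_"))

# Made-up stand-in for the revlog data.
df = pd.DataFrame({"review_time": [3000, 1000, 2000], "card_id": [0, 0, 1]})
df = checkpoint(df, "01_raw", debug_dir)
df = checkpoint(df.sort_values("review_time"), "02_sorted", debug_dir)
```

Comparing the CSV files from a working and a failing environment shows the first step at which the two diverge.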

imrryr commented on June 23, 2024

OK, that makes sense. It appears the densification is working but the original times coming from the revlog are different. Did you test with the revlog I provided? Not your original?

imrryr commented on June 23, 2024

Is there any chance the stats_pb2 file is the wrong version? I got it from @dae's package, and it was needed to run your file. @L-M-Sherlock

L-M-Sherlock commented on June 23, 2024

I also got that from dae. Could you show some cases where the review time differs?

imrryr commented on June 23, 2024

Here is an output before dropping the rows:

      review_time  card_id  rating  review_state  is_learn_start  sequence_group  last_learn_start  mask  relative_day  delta_t   i  review_th
0        97218963        0       3             0            True               1                 1  True        -19683       -1   1       4863
1        97224667        0       3             0           False               1                 1  True        -19683        0   2       4864
2       440742459        0       3             1           False               1                 1  True        -19679        4   3       4997
3       933416194        0       4             1           False               1                 1  True        -19674        5   4       5846
4      1046892324        0       2             3           False               1                 1  True        -19672        2   5       6105
...           ...      ...     ...           ...             ...             ...               ...   ...           ...      ...  ..        ...
7070  -1726999624      645       3             0           False             620               620  True        -19705        0   2       1367
7071  -1726339624      645       3             3           False             620               620  True        -19705        0   3       1380
7072  -1697912624      645       3             1           False             620               620  True        -19704        1   4       1639
7073  -1659497624      645       3             3           False             620               620  True        -19704        0   5       1959
7074  -1637230624      645       3             3           False             620               620  True        -19704        0   6       2077

[6966 rows x 12 columns]

      card_id  review_th  delta_t  rating
0           0       4863       -1       3
1           0       4864        0       3
2           0       4997        4       3
3           0       5846        5       4
4           0       6105        2       2
...       ...        ...      ...     ...
7070      645       1367        0       3
7071      645       1380        0       3
7072      645       1639        1       3
7073      645       1959        0       3
7074      645       2077        0       3

L-M-Sherlock commented on June 23, 2024

It's weird that the review_time is negative.

https://github.com/open-spaced-repetition/fsrs-benchmark/blob/ea493cf91900d9c8fd3bd05c42518373875c799f/revlogs2dataset.py#L31

Could you check whether the review_time values are correct after this line?
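
One way to run such a check is sketched below. The helper review_time_problems is hypothetical and not part of the benchmark script. It assumes that review_time is expected to be a non-negative int64 column, which matches the surprise at negative values above.

```python
import pandas as pd


def review_time_problems(df):
    """Return a list of human-readable problems found in df['review_time']."""
    problems = []
    col = df["review_time"]
    if str(col.dtype) != "int64":
        problems.append(f"dtype is {col.dtype}, expected int64")
    negative = int((col < 0).sum())
    if negative:
        problems.append(f"{negative} negative value(s)")
    return problems


# Made-up healthy frame, and a frame using one of the negative values reported above.
good = pd.DataFrame({"review_time": pd.Series([1_600_000_000_000], dtype="int64")})
bad = pd.DataFrame({"review_time": pd.Series([-1726999624], dtype="int32")})

print(review_time_problems(good))
print(review_time_problems(bad))
```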

imrryr commented on June 23, 2024

Ha! My environment demoted the int64 to int32 here, which corrupted the values. Problem solved. @L-M-Sherlock
I replaced
df["review_time"] = df["review_time"].astype(int)
with
df["review_time"] = df["review_time"].astype("int64")
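
The effect can be reproduced with a minimal NumPy sketch, assuming a made-up millisecond timestamp. A value that large does not fit in 32 bits, so casting to int32 silently wraps it, here to a negative number. One known way to end up with a 32-bit int is NumPy older than 2.0 on Windows, where the default integer type is 32-bit. Whether that was the environment in this thread is not stated.

```python
import numpy as np

# Made-up timestamp in milliseconds; far larger than the int32 maximum (2**31 - 1).
value = 1_600_000_000_000
ms = np.array([value], dtype=np.int64)

# Casting to a 32-bit integer keeps only the low 32 bits, with no error raised.
as_int32 = ms.astype(np.int32)

# Naming the width explicitly gives the same result on every platform.
as_int64 = ms.astype("int64")

print("int32:", as_int32[0])
print("int64:", as_int64[0])
```

Spelling out "int64" instead of int removes the dependence on the platform's default integer width.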
