Comments (13)

giacomoran commented on September 2, 2024

I'm not 100% sure the following code is the most up-to-date version or the one I used for the thesis results. I should have kept the code and repo cleaner... Hopefully it will be useful anyway.


I implemented DASH[MCM] in R by casting it as a Generalized Additive Model (GAM) and using the bam function from the mgcv package.

library(mgcv)   # bam()
library(dplyr)  # data pipeline
library(purrr)  # map_dbl()

pop_mean <- mean(data_train$ratingB)

# Decayed "seen" count: for each review, sum exp(-t / tau) over all reviews
# whose elapsed time t falls within `threshold` days
cntSeen <- function(intervalDays, threshold, tau) {
  N <- length(intervalDays)
  ret <- rep(0, N)
  for (n in 1:N) {
    # head(intervalDays, n)                    1 ,  4, 15
    # rev(head(intervalDays, n))               15,  4,  1
    ts <- cumsum(rev(head(intervalDays, n))) # 15, 19, 20
    ts <- ts[ts <= threshold]
    ret[n] <- sum(map_dbl(ts, ~ exp(- . / tau)))
  }
  ret
}

# Decayed "correct" count: same as cntSeen, but keeping only reviews rated
# GOOD; the most recent review is always counted as correct
cntCorrect <- function(intervalDays, rating, threshold, tau) {
  N <- length(intervalDays)
  ret <- rep(0, N)
  for (n in 1:N) {
    if (n == 1) {
      rs <- c(T)
    } else {
      rs <- c(T, head(rating, n - 1) == "GOOD")
    }
    ts <- cumsum(rev(head(intervalDays, n)))
    ts <- ts[rs & ts <= threshold]
    ret[n] <- sum(map_dbl(ts, ~ exp(- . / tau)))
  }
  ret
}

# Decay time constants (days) for the 1-day, 7-day, 30-day, and whole-history windows
tau_1 <- 0.2434
tau_7 <- 1.9739
tau_30 <- 16.0090
tau_infty <- 129.8426

data_train_mcm <- 
  data_train %>% 
  group_by(idHistory) %>%
  mutate(
    cntDaySeen = log1p(cntSeen(intervalDays, 1, tau_1)),
    cntDayCorrect = log1p(cntCorrect(intervalDays, rating, 1, tau_1)),
    cntWeekSeen = log1p(cntSeen(intervalDays, 7, tau_7)),
    cntWeekCorrect = log1p(cntCorrect(intervalDays, rating, 7, tau_7)),
    cntMonthSeen = log1p(cntSeen(intervalDays, 30, tau_30)),
    cntMonthCorrect = log1p(cntCorrect(intervalDays, rating, 30, tau_30)),
    cntHistorySeen = log1p(cntSeen(intervalDays, 10000, tau_infty)),
    cntHistoryCorrect = log1p(cntCorrect(intervalDays, rating, 10000, tau_infty))
  ) %>%
  ungroup()

# idPrompt enters as per-item fixed effects, idUser as a Gaussian random
# effect; `cluster` is a parallel cluster (e.g. from parallel::makeCluster())
fit_data_mcm <- bam(rating ~ -1 + idPrompt + s(idUser, bs="re") +
                             cntHistorySeen + cntHistoryCorrect +
                             cntDaySeen + cntDayCorrect +
                             cntWeekSeen + cntWeekCorrect +
                             cntMonthSeen + cntMonthCorrect,
                    family="binomial",
                    data=data_train_mcm,
                    cluster=cluster)
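
Written out, the bam call above fits roughly (a reconstruction from the formula, not a quote from the thesis) a logistic model

$$\Pr(\text{GOOD}) = \sigma\Big(d_{\text{prompt}} + u_{\text{user}} + \sum_{w} \theta_{2w-1}\,\log(1 + n_w) + \theta_{2w}\,\log(1 + c_w)\Big)$$

where $n_w$ and $c_w$ are the decayed seen/correct counts for window $w$, $d_{\text{prompt}}$ is a per-item fixed effect, and $u_{\text{user}}$ is the Gaussian random effect from s(idUser, bs="re").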

As for DASH[ACT-R], I implemented it in Python using Keras because of the tricky parameter appearing as an exponent.
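
Reading off the layers below, the model computes roughly (a reconstruction, not the thesis's exact equation 2.12)

$$p = \sigma\Big(w \cdot \ln\Big(1 + \sum_k \Delta t_k^{-\theta_2}\,(\theta_3 + y_k\,\theta_4)\Big) + a_{\text{user}} + d_{\text{card}}\Big)$$

where $\Delta t_k$ is the time since review $k$ and $y_k \in \{0, 1\}$ marks whether it was recalled; the learned exponent $\theta_2$ is what rules out a plain GLM fit.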

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class DASHACTR_review(layers.Layer):
    def __init__(self, **kwargs):
      super(DASHACTR_review, self).__init__(**kwargs)

      self.theta_2 = tf.Variable(initial_value=1., trainable=True, name="theta_2", constraint=tf.keras.constraints.NonNeg())
      self.theta_3 = tf.Variable(initial_value=1., trainable=True, name="theta_3")
      self.theta_4 = tf.Variable(initial_value=1., trainable=True, name="theta_4")

    def compute_output_shape(self, input_shape):
        return input_shape[:-1] + (1, )

    def call(self, inputs):
      input_delta, inputs_r = tf.split(inputs, [1, 1], axis=-1)

      return tf.math.pow(input_delta, (-self.theta_2) * (2 * tf.math.sign(input_delta) - 1)) * ( self.theta_3 + inputs_r * self.theta_4)

    def get_config(self):
      return {}


class DASHACTR_history(layers.Layer):
    def __init__(self, **kwargs):
      super(DASHACTR_history, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
      return input_shape[:-2] + (1, )

    def call(self, inputs):
      # Sum the per-review activations over the history axis, then log1p
      # (axis=-2, so the output matches compute_output_shape above)
      return tf.math.log1p(tf.math.reduce_sum(inputs, axis=-2))

    def get_config(self):
      return {}


inputs_sequences = layers.Input(shape=(max_history_length, 2), name="inputs_sequences")

h_1 = layers.TimeDistributed(DASHACTR_review())(inputs_sequences)
h = DASHACTR_history()(h_1)

input_user = layers.Input(shape=(1,), name="input_user", dtype="string")
layer_onehot_user = tf.keras.layers.StringLookup(output_mode='one_hot')
layer_onehot_user.adapt(x_train_user)
onehot_user = layer_onehot_user(input_user)

input_card = layers.Input(shape=(1,), name="input_card", dtype="string")
layer_onehot_card = tf.keras.layers.StringLookup(output_mode='one_hot')
layer_onehot_card.adapt(x_train_card)
onehot_card = layer_onehot_card(input_card)

concatenated = layers.concatenate([h, onehot_user, onehot_card])

output = layers.Dense(1, activation='sigmoid', use_bias=False, kernel_regularizer=tf.keras.regularizers.l2(1e-4), name="sigmoid_out")(concatenated)

model_0 = keras.Model(inputs=[inputs_sequences, input_user, input_card], outputs=output, name="model_dash_act_r")
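
For completeness, a hypothetical smoke test (the dummy data, shapes, and training settings here are assumptions, not the thesis code, which trained on the real x_train_* arrays):

import numpy as np

# Dummy batch matching the three input layers above (shapes assumed)
x_seq = np.random.rand(32, max_history_length, 2).astype("float32")
x_user = np.array([["user_1"]] * 32)
x_card = np.array([["card_1"]] * 32)
y = np.random.randint(0, 2, size=(32, 1)).astype("float32")

model_0.compile(optimizer="adam", loss="binary_crossentropy")
model_0.fit([x_seq, x_user, x_card], y, epochs=1, batch_size=8)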


Expertium commented on September 2, 2024

Interesting. But I suggest working on this issue first; ACT-R seems to be simpler.


L-M-Sherlock commented on September 2, 2024

According to my research, the basic DASH is very simple. I will take a look at ACT-R tomorrow.


Expertium commented on September 2, 2024

Now I'm curious whether DASH will be cheating. I'm looking forward to seeing the graphs!


L-M-Sherlock commented on September 2, 2024

Initial results:

Model: DASH
Total number of users: 537
Total number of reviews: 18685690
Weighted average by reviews:
DASH LogLoss (mean±std): 0.337±0.155
DASH RMSE(bins) (mean±std): 0.049±0.039

Weighted average by log(reviews):
DASH LogLoss (mean±std): 0.377±0.162
DASH RMSE(bins) (mean±std): 0.078±0.058

Weighted average by users:
DASH LogLoss (mean±std): 0.383±0.162
DASH RMSE(bins) (mean±std): 0.084±0.062

Model: FSRS-4.5
Total number of users: 537
Total number of reviews: 18685690
Weighted average by reviews:
FSRS-4.5 LogLoss (mean±std): 0.318±0.153
FSRS-4.5 RMSE(bins) (mean±std): 0.041±0.031

Weighted average by log(reviews):
FSRS-4.5 LogLoss (mean±std): 0.346±0.162
FSRS-4.5 RMSE(bins) (mean±std): 0.062±0.043

Weighted average by users:
FSRS-4.5 LogLoss (mean±std): 0.348±0.163
FSRS-4.5 RMSE(bins) (mean±std): 0.065±0.045

weights: [0.5614, 1.4046, 3.8707, 10.3723, 5.1491, 1.2271, 0.8804, 0.0465, 1.6598, 0.1405, 1.0407, 2.1135, 0.0886, 0.3247, 1.4143, 0.2151, 2.8857]
Model: FSRSv4
Total number of users: 537
Total number of reviews: 18685690
Weighted average by reviews:
FSRSv4 LogLoss (mean±std): 0.322±0.157
FSRSv4 RMSE(bins) (mean±std): 0.049±0.037

Weighted average by log(reviews):
FSRSv4 LogLoss (mean±std): 0.353±0.169
FSRSv4 RMSE(bins) (mean±std): 0.073±0.051

Weighted average by users:
FSRSv4 LogLoss (mean±std): 0.357±0.171
FSRSv4 RMSE(bins) (mean±std): 0.077±0.052

The calibration graph:

DASH.zip

By the way, it's pretty fast. It only takes 5 minutes to optimize 350 collections.

It's time to sleep in China. Good night.


Expertium commented on September 2, 2024

x = torch.log(x + 1)
If x can be small, then it's better to use torch.log1p(x) to avoid the loss of precision. Btw, I'm assuming this is the simplest version of DASH, not DASH[MCM] and not DASH[ACT-R]?
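
A quick illustration of the precision point (float32; the values are illustrative):

import torch

x = torch.tensor(1e-10)
print(torch.log(1 + x))  # tensor(0.) -- the 1e-10 is lost when adding 1
print(torch.log1p(x))    # tensor(1.0000e-10)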
EDIT: Your code doesn't really look like DASH. But I'm not sure; I find these formulas very difficult to read.
[screenshot: the DASH formulas from the thesis]
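
For reference, DASH is usually written roughly as (a paraphrase of the usual notation; the screenshot may differ in details):

$$\Pr(\text{recall}) = \sigma\Big(a_s - d_c + \sum_{w=1}^{W} \theta_{2w-1}\,\log(1 + c_{s,w}) + \theta_{2w}\,\log(1 + n_{s,w})\Big)$$

where $a_s$ is the student's ability, $d_c$ the card's difficulty, and $n_{s,w}$, $c_{s,w}$ the attempt and correct counts in time window $w$.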


L-M-Sherlock commented on September 2, 2024

If x can be small, then it's better to use torch.log1p(x) to avoid the loss of precision.

x is a non-negative integer.

I'm assuming this is the simplest version of DASH, not DASH[MCM] and not DASH[ACT-R]?

Yeah. I will implement DASH[MCM] and DASH[ACT-R] later.

your code doesn't really look like DASH.

The equation is very complicated, but I'm sure my code is correct. I just merged $a_s$ and $d_c$ into the bias term of the linear layer and removed the first time window.
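
A minimal sketch of that formulation (an illustration assuming eight raw time-window counts as input, not the benchmark's actual code):

import torch
import torch.nn as nn

class PlainDASH(nn.Module):
    # Logistic regression over log1p-transformed time-window counts; the
    # bias of the linear layer absorbs a_s and d_c
    def __init__(self, n_features: int = 8):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, counts: torch.Tensor) -> torch.Tensor:
        # counts: [batch_size, 8] seen/correct counts per time window
        return torch.sigmoid(self.linear(torch.log1p(counts))).squeeze(-1)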


L-M-Sherlock commented on September 2, 2024

@giacomoran, sorry for bothering you. Could you share your code for the DASH[MCM] and DASH[ACT-R] models? I know you compared them with your R-17 and DASH[RNN] models. I also want to compare them with FSRS. I have already implemented DASH.

Edit: I guess I have figured out the implementation of DASH[MCM]. The only difference between DASH and DASH[MCM] is the time-window features:
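
In DASH[MCM], each count is exponentially decayed with the window's time constant (a reading of the code below):

$$n_{s,w} = \sum_{j\,:\,t_j \le W_w} e^{-t_j / \tau_w}$$

where $t_j$ is the time elapsed since review $j$, $W_w$ the window width, and $\tau_w$ its decay constant (and likewise for $c_{s,w}$, restricted to correct reviews).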

import numpy as np

def dash_tw_features_optimized_no_accumulator(r_history, t_history, enable_decay=False):
    features = np.zeros(8)
    r_history = np.array(r_history) > 1  # ratings above 1 (Again) count as correct
    tau_w = np.array([0.2434, 1.9739, 16.0090, 129.8426])
    time_windows = np.array([1, 7, 30, np.inf])

    # Cumulative time from each review to the present (t_history holds the
    # inter-review intervals, most recent last)
    cumulative_times = np.cumsum(t_history[::-1])[::-1]

    for j, time_window in enumerate(time_windows):
        # Decay factors for the current time window
        if enable_decay:
            decay_factors = np.exp(-cumulative_times / tau_w[j])
        else:
            decay_factors = np.ones_like(cumulative_times)

        # Reviews whose cumulative times fall within the current time window
        valid_indices = cumulative_times <= time_window

        # Seen and correct counts, decayed where valid
        features[j * 2] += np.sum(decay_factors[valid_indices])
        features[j * 2 + 1] += np.sum(r_history[valid_indices] * decay_factors[valid_indices])

    return features


r_history = [1, 4, 3, 2, 1, 3]
t_history = [4, 4, 15, 10, 1, 3]
delta_t = 1

# The pending review is appended with elapsed time delta_t and treated as
# correct (rating 4), mirroring the cntCorrect convention above; this
# reproduces the printed features
features = dash_tw_features_optimized_no_accumulator(r_history + [4], t_history + [delta_t], True)
print(features)
features = dash_tw_features_optimized_no_accumulator(r_history + [4], t_history + [delta_t], False)
print(features)
[0.01643301 0.01643301 0.8137531  0.73433718 2.99542927 2.2636851
 6.12471083 4.41621267]
[1. 1. 3. 2. 5. 4. 7. 5.]

When enable_decay is True, the features are for DASH[MCM].


L-M-Sherlock commented on September 2, 2024

An attempt to implement DASH[ACT-R]. The initial weights are arbitrary.

import torch
import torch.nn as nn

class DASH_ACTR(nn.Module):
    init_w = [1, 1, 1, 1, 1]

    def __init__(self, w=init_w):
        super(DASH_ACTR, self).__init__()
        self.w = nn.Parameter(torch.tensor(w, dtype=torch.float32))
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        """
        :param inputs: shape [seq_len, batch_size, 2], 2 means r and t
        """
        # Each review contributes t ** -w[1], weighted by w[2] if it was
        # forgotten and w[3] if recalled; sum over the sequence dimension
        return self.sigmoid(self.w[0] * torch.log(
            1 + torch.sum(inputs[:, :, 1] ** -self.w[1] *
                          torch.where(inputs[:, :, 0] == 0, self.w[2], self.w[3]),
                          dim=0)
        ) + self.w[4])


t_history = [0, 1, 2, 4, 8]
r_history = [0, 1, 0, 1, 0]
delta_t = 1

# Convert the inter-review intervals into the time elapsed from each review
# to delta_t days after the last one: [16, 15, 13, 9, 1]
t_history = torch.tensor(t_history[1:] + [delta_t], dtype=torch.float32)
cumsum = torch.cumsum(t_history, dim=0)

inputs = torch.stack([torch.tensor(r_history, dtype=torch.float32),
                      t_history - cumsum + cumsum[-1]], dim=1)
inputs = inputs.unsqueeze(1)  # add batch dimension: [seq_len, 1, 2]
print(inputs)
print(inputs.shape)
model = DASH_ACTR()
output = model(inputs)
print(output.item())
tensor([[[ 0., 16.]],

        [[ 1., 15.]],

        [[ 0., 13.]],

        [[ 1.,  9.]],

        [[ 0.,  1.]]])
torch.Size([5, 1, 2])
0.862991213798523
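
A hypothetical training sketch (the optimizer, learning rate, and dummy label are illustrative choices, not the benchmark's actual loop):

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
label = torch.tensor([1.0])  # dummy recall target for illustration
for _ in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy(model(inputs), label)
    loss.backward()
    optimizer.step()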


L-M-Sherlock commented on September 2, 2024

return tf.math.pow(input_delta, (-self.theta_2) * (2 * tf.math.sign(input_delta) - 1)) * ( self.theta_3 + inputs_r * self.theta_4)

@giacomoran I think this line of code is inconsistent with equation 2.12:

[screenshot: equation 2.12 from the thesis]

By the way, I find that this term could be negative if $\theta_3$ is negative, so the ln would run into a math error.
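
One possible guard (a sketch, not the benchmark's actual fix) is to reparameterize the per-review weights through softplus so the sum inside the ln stays non-negative:

import torch
import torch.nn.functional as F

# Hypothetical unconstrained parameters, mapped to strictly positive values
raw_theta_3 = torch.tensor(0.0, requires_grad=True)
raw_theta_4 = torch.tensor(0.0, requires_grad=True)
theta_3 = F.softplus(raw_theta_3)  # > 0
theta_4 = F.softplus(raw_theta_4)  # > 0, so theta_3 + r * theta_4 > 0 for r in {0, 1}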


L-M-Sherlock commented on September 2, 2024

The initial results of DASH[ACT-R]:

Model: DASH[ACT-R]
Total number of users: 191
Total number of reviews: 5847195
Weighted average by reviews:
DASH[ACT-R] LogLoss (mean±std): 0.318±0.170
DASH[ACT-R] RMSE(bins) (mean±std): 0.039±0.032

Weighted average by log(reviews):
DASH[ACT-R] LogLoss (mean±std): 0.360±0.169
DASH[ACT-R] RMSE(bins) (mean±std): 0.057±0.050

Weighted average by users:
DASH[ACT-R] LogLoss (mean±std): 0.362±0.171
DASH[ACT-R] RMSE(bins) (mean±std): 0.060±0.053

weights: [1.5332, 0.4815, -0.452, 2.0, 1.0422]
Model: FSRS-4.5
Total number of users: 191
Total number of reviews: 5847195
Weighted average by reviews:
FSRS-4.5 LogLoss (mean±std): 0.310±0.168
FSRS-4.5 RMSE(bins) (mean±std): 0.044±0.032

Weighted average by log(reviews):
FSRS-4.5 LogLoss (mean±std): 0.351±0.160
FSRS-4.5 RMSE(bins) (mean±std): 0.064±0.044

Weighted average by users:
FSRS-4.5 LogLoss (mean±std): 0.354±0.160
FSRS-4.5 RMSE(bins) (mean±std): 0.067±0.046

weights: [0.5441, 1.4455, 3.8863, 11.5647, 5.1589, 1.2303, 0.8881, 0.0465, 1.629, 0.1588, 1.019, 2.1135, 0.0928, 0.337, 1.3907, 0.2225, 2.9044]
Model: DASH
Total number of users: 191
Total number of reviews: 5847195
Weighted average by reviews:
DASH LogLoss (mean±std): 0.333±0.160
DASH RMSE(bins) (mean±std): 0.058±0.057

Weighted average by log(reviews):
DASH LogLoss (mean±std): 0.382±0.156
DASH RMSE(bins) (mean±std): 0.081±0.058

Weighted average by users:
DASH LogLoss (mean±std): 0.386±0.157
DASH RMSE(bins) (mean±std): 0.085±0.060

The calibration graphs: DASH[ACT-R].zip

@Expertium, could you check them when you're available?


Expertium commented on September 2, 2024

Graphs look good to me.
EDIT: actually, I'm not so sure. We really need some sort of quantitative measure of cheating.
EDIT 2: @L-M-Sherlock #57


giacomoran commented on September 2, 2024

@giacomoran I think this line of code is inconsistent with the equation 2.12

Yeah, that looks like a mistake; I don't know what I was thinking.

