ctr_prediction's People

Contributors

johnson0722

ctr_prediction's Issues

A question

Hello. In the FFM algorithm, I don't quite understand why a field is introduced?
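
A minimal NumPy sketch (not from this repository; the names, shapes and feature-to-field mapping below are illustrative assumptions) of the difference: in FM each feature i has a single latent vector v_i, while in FFM feature i keeps one latent vector per field, and the pair (i, j) uses v[i, field(j)] and v[j, field(i)], so the interaction can adapt to which field the other feature belongs to.

import numpy as np

n, k, f = 5, 4, 2                # n features, latent size k, f fields (assumed)
x = np.random.rand(n)            # one already-encoded sample
field = [0, 0, 1, 1, 1]          # assumed feature -> field mapping

v_fm = np.random.rand(n, k)      # FM: one latent vector per feature
v_ffm = np.random.rand(n, f, k)  # FFM: one latent vector per (feature, field)

fm_term = sum(v_fm[i] @ v_fm[j] * x[i] * x[j]
              for i in range(n) for j in range(i + 1, n))
ffm_term = sum(v_ffm[i, field[j]] @ v_ffm[j, field[i]] * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))
print(fm_term, ffm_term)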

Data

Hello, is there any sample data available to go along with the code?

Where does the train_sparse_data_frac_0.01.pkl file referenced in FM/FM.py come from?

def train_model(sess, model, epochs=10, print_every=50):
    """training model"""
    # Merge all the summaries and write them out to train_logs
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('train_logs', sess.graph)
    # get sparse training data
    with open('../avazu_CTR/train_sparse_data_frac_0.01.pkl', 'rb') as f:
        sparse_data_fraction = pickle.load(f)
    # get number of batches
    num_batches = len(sparse_data_fraction)

    for e in range(epochs):
        num_samples = 0
        losses = []
        for ibatch in range(num_batches):
            # batch_size data
            batch_y = sparse_data_fraction[ibatch]['labels']
            batch_y = np.array(batch_y)
            actual_batch_size = len(batch_y)
            batch_indexes = np.array(sparse_data_fraction[ibatch]['indexes'], dtype=np.int64)
            batch_shape = np.array([actual_batch_size, feature_length], dtype=np.int64)
            batch_values = np.ones(len(batch_indexes), dtype=np.float32)
            # create a feed dictionary for this batch
            feed_dict = {model.X: (batch_indexes, batch_values, batch_shape),
                         model.y: batch_y,
                         model.keep_prob: 1.0}

            loss, accuracy, summary, global_step, _ = sess.run([model.loss, model.accuracy,
                                                                merged, model.global_step,
                                                                model.train_op], feed_dict=feed_dict)
            # aggregate performance stats
            losses.append(loss * actual_batch_size)
            num_samples += actual_batch_size
            # Record summaries and train-set accuracy
            train_writer.add_summary(summary, global_step=global_step)
            # print training loss and accuracy
            if global_step % print_every == 0:
                logging.info("Iteration {0}: with minibatch training loss = {1} and accuracy of {2}"
                             .format(global_step, loss, accuracy))
                saver.save(sess, "checkpoints/model", global_step=global_step)
        # print loss of one epoch
        total_loss = np.sum(losses) / num_samples
        print("Epoch {1}, Overall loss = {0:.3g}".format(total_loss, e + 1))

InvalidArgumentError (see above for traceback): k (303) from index[19,1] out of bounds (>=303)

Hello @Johnson0722, I get this error when I run FM.py. I use the same dataset as you. Could you give me some help?

Caused by op 'interaction_layer/SparseTensorDenseMatMul/SparseTensorDenseMatMul', defined at:
  File "FM.py", line 223, in <module>
    model.build_graph()
  File "FM.py", line 94, in build_graph
    self.inference()
  File "FM.py", line 61, in inference
    tf.pow(tf.sparse_tensor_dense_matmul(self.X, v), 2),
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\sparse_ops.py", line 1822, in sparse_tensor_dense_matmul
    adjoint_b=adjoint_b)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_sparse_ops.py", line 3213, in sparse_tensor_dense_mat_mul
    name=name)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3155, in create_op
    op_def=op_def)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): k (303) from index[19,1] out of bounds (>=303)
  [[Node: interaction_layer/SparseTensorDenseMatMul/SparseTensorDenseMatMul = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_2_0_2, _arg_Placeholder_1_0_1, _arg_Placeholder_0_0, interaction_layer/v/read)]]
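
Not part of the original report, but the message says a column index of 303 is not smaller than the declared width (>= 303), which usually means the feature_length used for batch_shape is smaller than the largest one-hot index stored in the pickle. A quick sanity check along these lines, assuming the same pickle layout as in train_model above:

import pickle

import numpy as np

with open('../avazu_CTR/train_sparse_data_frac_0.01.pkl', 'rb') as f:
    sparse_data_fraction = pickle.load(f)

# Largest column index that will be fed as a sparse coordinate.
max_col = max(np.array(batch['indexes'], dtype=np.int64)[:, 1].max()
              for batch in sparse_data_fraction)
print('largest column index in the data:', max_col)
print('feature_length must be at least:', max_col + 1)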

Is line 63 of FFM a typo?

tf.reduce_sum(tf.multiply(v[i,self.feature2field[i]], v[j,self.feature2field[j]])),
Shouldn't this be:
tf.reduce_sum(tf.multiply(v[i,self.feature2field[j]], v[j,self.feature2field[i]])),
instead?
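
For reference, the pairwise term in the original FFM formulation (Juan et al., 2016) indexes each latent vector by the other feature's field, which matches the corrected line above:

\phi_{\mathrm{FFM}}(w, x) = \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_{i, f_j},\, v_{j, f_i} \rangle\, x_i x_j

Here f_i denotes the field that feature i belongs to.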

DeepFM.py throws an error. Could this be a TensorFlow version issue (I'm testing with tf1.12-GPU)? Any advice would be appreciated.

Caused by op u'Ftrl/update_Variable/SparseApplyFtrl', defined at:
  File "DeepFM.py", line 325, in <module>
    model.build_graph()
  File "DeepFM.py", line 132, in build_graph
    self.train()
  File "DeepFM.py", line 124, in train
    self.train_op = optimizer.minimize(self.loss, global_step=self.global_step)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 410, in minimize
    name=name)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 610, in apply_gradients
    update_ops.append(processor.update_op(self, grad))
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 128, in update_op
    return optimizer._apply_sparse_duplicate_indices(g, self._v)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 1019, in _apply_sparse_duplicate_indices
    return self._apply_sparse(gradient_no_duplicate_indices, var)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/ftrl.py", line 224, in _apply_sparse
    use_locking=self._use_locking)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/training/gen_training_ops.py", line 3299, in sparse_apply_ftrl
    use_locking=use_locking, name=name)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/u2019101432/.conda/envs/tf1.12/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Index 131989 at offset 131989 in indices is out of range
  [[node Ftrl/update_Variable/SparseApplyFtrl (defined at DeepFM.py:124) = SparseApplyFtrl[T=DT_FLOAT, Tindices=DT_INT64, use_locking=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable, Variable/Ftrl, Variable/Ftrl_1, Ftrl/update_Variable/UnsortedSegmentSum, Ftrl/update_Variable/Unique, Ftrl/learning_rate, Ftrl/l1_regularization_strength, Ftrl/update_DNN/b1/Cast, Ftrl/learning_rate_power)]]

A question about discrete data features and their computation

Code in DeepFM.py:
"""
# shape of [None, 2]
self.linear_terms = tf.add(tf.matmul(self.X, w1), b)

# shape of [None, 1]
self.interaction_terms = tf.multiply(0.5,
                                     tf.reduce_mean(
                                         tf.subtract(
                                             tf.pow(tf.matmul(self.X, v), 2),
                                             tf.matmul(tf.pow(self.X, 2), tf.pow(v, 2))),
                                         1, keep_dims=True))
"""
Question: in the DeepFM paper, each discrete feature (before one-hot encoding) is represented as an embedding of length latent_size, which essentially can be viewed as a locally connected layer. Note that this is per discrete feature. After each discrete feature is mapped to its embedding, the embeddings are multiplied pairwise to form the interactions.
However, in the code, multiplying via "tf.matmul(self.X, v)" ends up turning all features together into a single embedding of size self.k, rather than each feature getting its own self.k-sized embedding.
Is there a problem here? I feel it should be tf.multiply(self.X, v); I'm not sure whether my understanding is wrong.
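
Not a reply from the maintainer, but the question above hinges on the standard O(kn) FM identity, so a small NumPy check (mine, not from the repository) may help: the matmul form computes the same pairwise-interaction sum as explicit per-feature embeddings would, although whether reduce_mean rather than reduce_sum over the latent dimension is intended is a separate point.

import numpy as np

n, k = 6, 4
x = np.random.rand(n)      # one sample
v = np.random.rand(n, k)   # one latent vector per feature

# Explicit pairwise interactions: sum_{i<j} <v_i, v_j> x_i x_j
pairwise = sum(v[i] @ v[j] * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))

# O(kn) identity used by the code: 0.5 * sum_k [ (x @ v)_k^2 - (x^2 @ v^2)_k ]
identity = 0.5 * np.sum((x @ v) ** 2 - (x ** 2) @ (v ** 2))

print(np.isclose(pairwise, identity))   # True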

How to merge the first-order and second-order terms

In the implementation, the second-order term is broadcast-added to the first-order term. I would like to know why they are added and what the meaning is.
In my opinion, an alternative would be to sum the first-order term to a scalar and then add it to the second-order part.
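
For reference, in the standard FM formulation (Rendle, 2010) every part of the prediction is a per-sample scalar and they are simply added, so any broadcasting is only about how those parts are shaped in a particular implementation:

\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j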

Hello, I'd like to ask a question

fields_train_dict = {}
for field in fields_train:
    with open('dicts/' + field + '.pkl', 'rb') as f:
        fields_train_dict[field] = pickle.load(f)
fields_test_dict = {}
for field in fields_test:
    with open('dicts/' + field + '.pkl', 'rb') as f:
        fields_test_dict[field] = pickle.load(f)

In this code, what is stored in the per-field files under the "dicts/" path?
And what is the relationship between feature_length and field_cnt?
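
Not an authoritative answer, but in the usual preprocessing pattern for this kind of code, each file under dicts/ stores a mapping from a field's raw values to one-hot column indexes; under that assumption feature_length would be the total one-hot width summed over all fields, while field_cnt is simply the number of fields. A hypothetical sketch of how such dictionaries might be built (file names, columns and structure are guesses, not taken from the repository):

import pickle

import pandas as pd

df = pd.read_csv('train.csv')      # assumed raw Avazu training file
fields = ['site_id', 'app_id']     # assumed subset of fields

offset = 0
for field in fields:
    values = df[field].unique()
    # Map each raw value of this field to a global one-hot column index.
    value2index = {val: offset + i for i, val in enumerate(values)}
    offset += len(values)
    with open('dicts/' + field + '.pkl', 'wb') as f:
        pickle.dump(value2index, f)

feature_length = offset   # total one-hot width across all fields
field_cnt = len(fields)   # number of fields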
