geffy / tffm


783.0 33.0 175.0 461 KB

TensorFlow implementation of an arbitrary order Factorization Machine

License: MIT License

Python 23.24% Jupyter Notebook 76.76%

factorization-machines tensorflow research-project


tffm's Issues

Multi-GPU support

Hi,

I was wondering how an FM can be parallelized effectively across multiple GPUs. I'm somewhat familiar with TF but not really with FMs. If you can give me some ideas or pointers, I would make the necessary modifications and then open a PR, since I'm currently interested in a parallel GPU FM implementation and your code seems to be a good base for it.

Best Regards,
Alexandr

Huge variations in predictions as the seed changes.

I ran the model on my data and saw huge variations in predictions compared with a previous training run on the same data. How can I tackle this so that the predictions are as good as possible?

Support for Bayesian Personalized Ranking

Hi there, I'm thinking of contributing code for BPR. In your opinion, what is the best way to extend the code to handle this optimization?

Why not in Keras?

Why don't you make your work compatible with Keras as a simple neural net layer?
keras-team/keras#4959

How can I get every interaction weight?

How can I get every interaction weight? Can you help me?
Thank you!
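
For a second-order FM, the weight of the interaction between features i and j is the dot product of their factor vectors. The following is a minimal sketch of that computation, assuming the factor matrix `V` (n_features x rank) has already been extracted from the trained model; how to obtain it from tffm is not covered here, and `pairwise_interaction_weights` is a hypothetical helper, not part of tffm.

```python
def pairwise_interaction_weights(V):
    """Return {(i, j): <V[i], V[j]>} for all feature pairs i < j.

    V is assumed to be an n_features x rank factor matrix (list of lists)
    taken from a trained second-order FM. Illustrative only.
    """
    n = len(V)
    weights = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Interaction weight = dot product of the two factor vectors.
            weights[(i, j)] = sum(a * b for a, b in zip(V[i], V[j]))
    return weights


# Toy factor matrix: 3 features, rank 2.
V = [[1.0, 0.0], [0.5, 2.0], [0.0, -1.0]]
w = pairwise_interaction_weights(V)
```

The number of pairs grows quadratically with the number of features, so for wide sparse data it is probably more practical to compute only the pairs of interest.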

get_shorter_decompositions in util.py

Hi, thank you for making this great package!
I just noticed this example for get_shorter_decompositions function in util.py:

    Example
    -------
    decompositions, counts = get_shorter_decompositions([1, 2, 3])
        decompositions == [(1, 5), (2, 4), (3, 3), (6,)]
        counts == [ 2.,  1.,  1.,  2.]

I ran this example myself and got the counts as [2, 2, 1, 1] instead of [2, 1, 1, 2].
It seems to be just a typo in the docstring example; nothing is wrong with the code itself.

ALS / MCMC

This is not an issue but more of a question. Is it correct that tffm supports SGD but not the ALS and MCMC algorithms? That was my understanding after a quick look at the code and README.

Upgrade tensorflow to v1.0, tensor shape issue in pow_wrapper

After upgrading tensorflow to tensorflow-1.0.0-cp27-cp27mu-manylinux1_x86_64.whl, the original code runs into an error:

Traceback (most recent call last):
  File "training.py", line 65, in <module>
    obj.training()
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/train/split.py", line 264, in training
    getattr(self.trainer, "{}_train".format(method))(**ins))
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/train/trainer.py", line 103, in tffm_train
    fm.fit(self.x_train, y_train, self.punishment, n_epochs=n_epochs, show_progress=True)
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/tffm/base.py", line 247, in fit
    self.core.build_graph()
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/tffm/core.py", line 214, in build_graph
    self.init_main_block()
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/tffm/core.py", line 180, in init_main_block
    self.pow_matmul(i, in_pows[pow_idx]),
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/tffm/core.py", line 122, in pow_matmul
    x_pow = pow_wrapper(self.train_x, pow, self.input_type)
  File "/data/home/vimos/Public/git/github/hotel-revenue/revenueml/revenueml/tffm/core.py", line 261, in pow_wrapper
    return tf.SparseTensor(X.indices, tf.pow(X.values, p), X.shape)
AttributeError: 'SparseTensor' object has no attribute 'shape'

I fixed this using

--- a/revenueml/tffm/core.py
+++ b/revenueml/tffm/core.py
@@ -258,6 +258,6 @@ def pow_wrapper(X, p, optype):
     if optype == 'dense':
         return tf.pow(X, p)
     elif optype == 'sparse':
-        return tf.SparseTensor(X.indices, tf.pow(X.values, p), X.shape)
+        return tf.SparseTensor(X.indices, tf.pow(X.values, p), X.dense_shape)
     else:
         raise NameError('Unknown input type in pow_wrapper')

I hope this fix is correct and helpful to others.

Question about data format

Hi!
I just want to ask whether we need to transform every feature column in the dataset to a 0/1 representation. I know we need to transform the categorical variables, but what about the numerical variables (like price)? Do we also need to transform them?

Besides, when I transformed all variables to 0/1 representations, I got 550+ columns, and I have 100,000 rows. When I train the model, I always get this error: NaN or Inf in w[2]. : Tensor had NaN values. But I am pretty sure there are no values other than 0/1. How does this happen? However, when I use only 90,000 rows of my dataset, the problem disappears. I really don't know why, and I really need your help!

Thanks a lot!
Weisi

Implement regression part

Thanks for the great repo, which inspired me to learn tensorflow. I am trying to understand the tffm code line by line.

I am planning to implement the regression part. It's not easy for me, but I would like to give it a try. If anybody has finished this, please give me some hints.

Is the output of the model the same as the original FM?

Is self.outputs in core.py the same as:

$p=w_0+\sum_{i=1}^{n}w_ix_i+1/2\sum_{f=1}^{k}((\sum_{i=1}^{n}v_{i,f}x_{i})^2-\sum_{i=1}^{n}v_{i,f}^{2}x_{i}^{2})$

from the original FM model paper?

I have trained a model using tffm and want to run inference with the saved model from C++ code, so I need to reimplement the output computation myself. I find the output in your model a little confusing. Is there an equation for the order-2 FM model like the one above, or is it the same equation?
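
As a reference point for such a port, here is a minimal sketch of the order-2 equation quoted above, written twice: once as the explicit sum over feature pairs and once in the linear-time form. It does not confirm what tffm's self.outputs computes; it only gives numbers against which a reimplementation can be compared. The function names and the toy parameters are assumptions made for illustration.

```python
def fm_order2_naive(w0, w, V, x):
    """Explicit form: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    n = len(x)
    out = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(V[i][f] * V[j][f] for f in range(len(V[i])))
            out += dot * x[i] * x[j]
    return out


def fm_order2_fast(w0, w, V, x):
    """Linear-time form quoted above: per factor f, 0.5 * (s^2 - sum of squares)."""
    n, k = len(x), len(V[0])
    out = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        out += 0.5 * (s * s - s_sq)
    return out


# Toy parameters: 3 features, rank 2.
w0, w = 0.1, [0.2, -0.3, 0.5]
V = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
x = [1.0, 0.0, 2.0]
naive, fast = fm_order2_naive(w0, w, V, x), fm_order2_fast(w0, w, V, x)
```

Both functions agree on this input, which is the identity the equation relies on; the same check can be repeated with weights exported from a trained model.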

Problem in cross validating the model using sklearn

The estimator TFFMClassifier() does not have a score method.
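
One possible workaround is to give scikit-learn an explicit scorer instead of relying on the estimator's own score method. The sketch below defines a scorer with the `(estimator, X, y)` signature that scikit-learn accepts for its `scoring` argument; `accuracy_scorer` and the stand-in model are hypothetical names, and whether TFFMClassifier works with cross-validation in every other respect is not verified here.

```python
def accuracy_scorer(estimator, X, y):
    """Fraction of correct predictions; usable as a `scoring` callable."""
    predictions = estimator.predict(X)
    correct = sum(1 for p, t in zip(predictions, y) if p == t)
    return correct / float(len(y))


class _ThresholdModel:
    """Tiny stand-in estimator used only to exercise the scorer."""

    def predict(self, X):
        return [1 if row[0] > 0 else 0 for row in X]


score = accuracy_scorer(_ThresholdModel(), [[1], [-1], [2], [-3]], [1, 0, 0, 0])
```

With scikit-learn installed, the scorer would be passed as `cross_val_score(model, X, y, scoring=accuracy_scorer)`; the built-in string `scoring='accuracy'` should serve the same purpose.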

How to save a trained model?

How can I save a trained tffm model for future predictions?
The model object cannot be pickled due to a thread.RLock type object, and it isn't JSON serializable either.

Scaling benchmarks

I've been looking at Spark implementations of Factorization Machines. I found that none of the existing open source implementations scale to a dataset with millions of features and hundreds of millions of examples. I'd be curious how this implementation is able to scale.

Errors while working with TensorFlow 1.3

I noticed the README mentions TF 1.0, but I thought I'd report this, and if it's easy I can fix it. Running test.py with TF 1.3 results in the errors below; it seems decision_function() has changed:

======================================================================
ERROR: test_dense_FM (__main__.TestFM)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 54, in test_dense_FM
    self.decision_function_order_4(input_type='dense', use_diag=False)
  File "test.py", line 48, in decision_function_order_4
    actual = model.decision_function(X)
TypeError: decision_function() takes exactly 3 arguments (2 given)

======================================================================
ERROR: test_dense_PN (__main__.TestFM)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 57, in test_dense_PN
    self.decision_function_order_4(input_type='dense', use_diag=True)
  File "test.py", line 48, in decision_function_order_4
    actual = model.decision_function(X)
TypeError: decision_function() takes exactly 3 arguments (2 given)

======================================================================
ERROR: test_sparse_FM (__main__.TestFM)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 60, in test_sparse_FM
    self.decision_function_order_4(input_type='sparse', use_diag=False)
  File "test.py", line 48, in decision_function_order_4
    actual = model.decision_function(X)
TypeError: decision_function() takes exactly 3 arguments (2 given)

======================================================================
ERROR: test_sparse_PN (__main__.TestFM)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 63, in test_sparse_PN
    self.decision_function_order_4(input_type='sparse', use_diag=True)
  File "test.py", line 48, in decision_function_order_4
    actual = model.decision_function(X)
TypeError: decision_function() takes exactly 3 arguments (2 given)

----------------------------------------------------------------------
Ran 4 tests in 8.957s

FAILED (errors=4)

About L2 regularization

Hello geffy,
tffm is very useful. I have a small question:
How do you deal with L2 regularization when the input is sparse? Looking at your code, the regularization term includes all of the parameters. But when the input data is very sparse, a factorization machine uses only a small part of the parameters (e.g. not all of the order-1 weights w are used). With the current approach, all of the parameters change instead of only the ones in use. Do you think this is an issue?

How to set loss_function to auc

Show a mean squared average between epochs like pyFM?

https://github.com/coreylynch/pyFM#getting-started

About multi-class support?

@geffy First, thank you for providing this good tool. In practice, though, some problems are multi-class. Could you give some advice?

Working with various optimizers

Hello,

I've tried the implementation with the ADAM optimizer and it works pretty well. I wanted to compare the results with AdaDelta or FTRL, but with those the algorithm always returns 0% recall, with the 'predictions' vector containing only negative samples.

Is anyone else experiencing this issue?

Thanks

Do you support sparse data format?

For example, this post
http://srome.github.io/Leveraging-Factorization-Machines-for-Sparse-Data-and-Supervised-Visualization/
uses big data.
Can you run your code on this data?
X_train.shape
#(7580, 1048576)
where the number of features is 1048576

Problem: NaN or Inf in w[0]. : Tensor had NaN values

Why is there no value in the bias tensor?

Custom Loss Function

Is there a reason that custom loss functions are not enabled for TFFMClassifier and TFFMRegressor? In particular, it would be helpful to be able to use weighted log-loss, so that there is a class_weight keyword argument a la sklearn.linear_model.LogisticRegression.

If appropriate, I'd be happy to implement this.
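
Until something like this exists, a class-weighted log-loss can be emulated with per-sample weights. The sketch below shows the idea under stated assumptions: binary labels in {0, 1}, predicted probabilities strictly between 0 and 1, and two hypothetical helper names that are not part of tffm.

```python
import math


def class_weights_to_sample_weights(y, class_weight):
    """Map each label in y to its class weight, giving one weight per sample."""
    return [float(class_weight[label]) for label in y]


def weighted_log_loss(y_true, p_pred, sample_weight):
    """Mean of per-sample weighted binary log-loss.

    y_true holds 0/1 labels and p_pred holds P(y = 1) for each sample.
    """
    total = 0.0
    for t, p, w in zip(y_true, p_pred, sample_weight):
        total += -w * (t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


sample_weight = class_weights_to_sample_weights([0, 1, 1], {0: 1.0, 1: 3.0})
loss = weighted_log_loss([1], [0.5], [2.0])
```

The fit signature visible in a traceback elsewhere on this page takes a `sample_weight` argument, so the resulting weights could probably be passed as `model.fit(X, y, sample_weight=...)`; that has not been checked against the library.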

model.fit(X_tr, y_tr, show_progress=True)

I used the sparse input type and the following error occurred:
AttributeError Traceback (most recent call last)
in
----> 1 model.fit(X_tr, y_tr, show_progress=True)

~\anaconda3\lib\site-packages\tffm\models.py in fit(self, X, y, sample_weight, n_epochs, show_progress)
124 def fit(self, X, y, sample_weight=None, n_epochs=None, show_progress=False):
125 sample_weight = np.ones_like(y) if sample_weight is None else sample_weight
--> 126 self.fit(X=X, y_=y, w_=sample_weight, n_epochs=n_epochs, show_progress=show_progress)
127
128 def predict(self, X, pred_batch_size=None):

~\anaconda3\lib\site-packages\tffm\base.py in fit(self, X, y_, w_, n_epochs, show_progress)
224 # iterate over batches
225 for bX, bY, bW in batcher(X_[perm], y_=y_[perm], w_=w_[perm], batch_size=self.batch_size):
--> 226 fd = batch_to_feeddict(bX, bY, bW, core=self.core)
227 ops_to_run = [self.core.trainer, self.core.target, self.core.summary_op]
228 result = self.session.run(ops_to_run, feed_dict=fd)

~\anaconda3\lib\site-packages\tffm\base.py in batch_to_feeddict(X, y, w, core)
79 # sparse case
80 X_sparse = X.tocoo()
---> 81 fd[core.raw_indices] = np.hstack(
82 (X_sparse.row[:, np.newaxis], X_sparse.col[:, np.newaxis])
83 ).astype(np.int64)

AttributeError: 'TFFMCore' object has no attribute 'raw_indices'

got NaN issue running on a sparse data

Hi,

I tried to run TFFMClassifier on sparse data (for example: https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt), but got an error when fitting. The data is loaded with load_svmlight_file.
========================= trace log ========================
tensorflow.python.framework.errors.InvalidArgumentError: NaN or Inf in target value : Tensor had NaN values
[[Node: target/CheckNumerics = CheckNumericsT=DT_FLOAT, _class=["loc:@add"], message="NaN or Inf in target value", _device="/job:localhost/replica:0/task:0/cpu:0"]]
Caused by op u'target/CheckNumerics', defined at:
File "/usr/local/lib/python2.7/dist-packages/tffm/testtffm.py", line 51, in
model.fit(X_tr.toarray(), y_tr, show_progress=True)
File "/usr/local/lib/python2.7/dist-packages/tffm/tffm/base.py", line 242, in fit
self.core.build_graph()
File "/usr/local/lib/python2.7/dist-packages/tffm/tffm/core.py", line 208, in build_graph
self.init_target()
File "/usr/local/lib/python2.7/dist-packages/tffm/tffm/core.py", line 191, in init_target
msg='NaN or Inf in target value', name='target')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/numerics.py", line 42, in verify_tensor_all_finite

verify_input = array_ops.check_numerics(t, message=msg)

It seems the problem is in self.loss = self.loss_function(self.outputs, self.train_y), which somehow generated NaN values.

Can someone look at this issue? Thanks.

Turning off warnings on tensorflow 1.7

When running tffm on tensorflow 1.7, the following warning appears:

WARNING:tensorflow:Variable += will be deprecated. Use variable.assign_add if you want assignment to the variable value or 'x = x + y' if you want a new python Tensor object

How can I turn it off? I think these warnings come from core.init_main_block() and core.init_regularization().
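
The `WARNING:tensorflow:` prefix suggests the message goes through the standard-library logger named "tensorflow", so one way to hide it is to raise that logger's level before building the model. This is a sketch based on that assumption and has not been verified against TensorFlow 1.7 specifically.

```python
import logging

# Messages formatted as "WARNING:tensorflow:..." come from the logger named
# "tensorflow"; raising its level to ERROR suppresses warnings and info output.
logging.getLogger("tensorflow").setLevel(logging.ERROR)
```

In TensorFlow 1.x, `tf.logging.set_verbosity(tf.logging.ERROR)` should have the same effect. Note that either call hides all TensorFlow warnings, not only this one.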

Use of self.train_w in TFFMCore.init_loss()

I am trying to understand how tffm is working but I can't figure out why self.loss is obtained by multiplying self.loss_function by self.train_w in the class TFFMCore. I would have thought that self.train_w shouldn't be there...

    def init_loss(self):
        with tf.name_scope('loss') as scope:
            self.loss = self.loss_function(self.outputs, self.train_y) * self.train_w
            self.reduced_loss = tf.reduce_mean(self.loss)
            tf.summary.scalar('loss', self.reduced_loss)
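
A likely reading is that self.train_w holds per-sample weights, so each sample's loss is scaled before the mean is taken. The plain-Python sketch below mirrors the structure of init_loss under that assumption; `reduced_weighted_loss` is a hypothetical name. The default of all-ones weights follows the `np.ones_like(y)` default visible in a traceback elsewhere on this page.

```python
def reduced_weighted_loss(per_sample_loss, sample_weight=None):
    """Mean of loss_i * w_i over the batch.

    With sample_weight=None every weight is 1.0, which reduces to the
    ordinary unweighted mean.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(per_sample_loss)
    weighted = [loss * w for loss, w in zip(per_sample_loss, sample_weight)]
    return sum(weighted) / len(weighted)


unweighted = reduced_weighted_loss([1.0, 2.0, 3.0])
# Doubling the weight of the last sample pulls the mean towards its loss.
weighted = reduced_weighted_loss([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
```

So when no weights are supplied, multiplying by the weights changes nothing; it only matters when some samples should count more than others.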

Save / load Model

Hi,

I've been using TFFM for prototyping and it is great, thank you. It is actually one of my first experiences with TF.

I found there is no direct save/load for a TFFM model. If there is, ignore this post completely :)

I understand you can save/load underlying TF objects. More precisely, say I have a
model = TFFMRegressor( ... )
that I want to save and load.

So, my workaround is:
-> Save: underlying TF objects via: model.core.saver.save(model.session, 'filename')

-> Load:

1. Create an (almost) identical model = TFFMRegressor( ..., n_epochs=1 )
2. Train for 1 epoch with model.fit()
3. Retrieve the TF graph and Session with
   saver = tf.train.import_meta_graph('filename.meta')
   saver.restore(model.session, tf.train.latest_checkpoint('./'))

First of all, since I am not very experienced in TF, I do not know if this is even enough / correct.

Then, I would like to suggest a .save() and .load() method for TFFM models.

How can I verify the output for a row after loading the model into memory?

I want to manually verify the prediction of a particular testing sample.

Non negativity constraints

I want to try out factorization machines for a problem in which I've been using non-negative matrix factorization. Can tffm be used with non negativity constraints?

Restore a trained model

I am trying to restore a model from saved state and got the error

Traceback (most recent call last):
  File "predict.py", line 116, in <module>
    df_predict[v] = predict_with(v)
  File "predict.py", line 95, in predict_with
    fm.load_state(model_file)
  File "/home/revenueml/git/revenueml/revenueml/tffm/base.py", line 300, in load_state
    self.core.build_graph()
  File "/home/revenueml/git/revenueml/revenueml/tffm/core.py", line 183, in build_graph
    assert self.n_features is not None
AssertionError

I fixed this with

    fm.core.set_num_features(fmap.index.size)

Hope this can help somebody else.

Please add some usage notes on restoring models to the README; this would be quite helpful.

Interaction factors

Hi,
Is it possible to display all the interaction factors between the variables?

Incorrect code when order >= 3

The code for computing predictions (https://github.com/geffy/tffm/blob/master/tffm.py#L212) is incorrect.

You naively applied Lemma 3.1 from Rendle's original paper (http://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf) but this is incorrect when order >= 3.

If you compare with the predictions obtained by Equation (5) in the paper, you'll see that the predictions are not the same.
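
The discrepancy can be illustrated for a single factor, writing z_i for v_{i,f} x_i. The sketch below compares the brute-force sum over triples i < j < l with the power-sum identity that gives the same value, and with one hypothetical naive generalization of the order-2 trick. The naive form is an assumption chosen for illustration and is not necessarily what the linked code did.

```python
from itertools import combinations


def order3_bruteforce(z):
    """Sum of z_i * z_j * z_l over all triples i < j < l."""
    return sum(a * b * c for a, b, c in combinations(z, 3))


def order3_power_sums(z):
    """Same quantity via power sums: (p1^3 - 3*p1*p2 + 2*p3) / 6."""
    p1 = sum(z)
    p2 = sum(v ** 2 for v in z)
    p3 = sum(v ** 3 for v in z)
    return (p1 ** 3 - 3 * p1 * p2 + 2 * p3) / 6.0


def order3_naive(z):
    """Hypothetical naive analogue of the order-2 trick: (p1^3 - p3) / 6.

    It still contains terms with a repeated index (z_i^2 * z_j), so it
    overcounts compared with the brute-force sum.
    """
    p1 = sum(z)
    p3 = sum(v ** 3 for v in z)
    return (p1 ** 3 - p3) / 6.0


z = [1, 2, 3, 4]
exact, identity, naive = order3_bruteforce(z), order3_power_sums(z), order3_naive(z)
```

On this input the brute-force sum and the power-sum identity agree, while the naive form does not, which matches the kind of mismatch described above.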

Issue with tensorflow 2.0: module 'tensorflow_core._api.v2.train' has no attribute 'AdamOptimizer'

pip install tensorflow==2.0
import numpy as np
import tensorflow as tf

from tffm import TFFMClassifier

gives error:

AttributeError Traceback (most recent call last)
in
----> 1 from tffm import TFFMClassifier

~/anaconda3/lib/python3.7/site-packages/tffm/__init__.py in
----> 1 from .models import TFFMClassifier, TFFMRegressor
2
3 __all__ = ['TFFMClassifier', 'TFFMRegressor']

~/anaconda3/lib/python3.7/site-packages/tffm/models.py in
2
3 import numpy as np
----> 4 from .base import TFFMBaseModel
5 from .utils import loss_logistic, loss_mse, sigmoid
6

~/anaconda3/lib/python3.7/site-packages/tffm/base.py in
1 import tensorflow as tf
----> 2 from .core import TFFMCore
3 from sklearn.base import BaseEstimator
4 from abc import ABCMeta, abstractmethod
5 import six

~/anaconda3/lib/python3.7/site-packages/tffm/core.py in
4
5
----> 6 class TFFMCore():
7 """This class implements underlying routines about creating computational graph.
8

~/anaconda3/lib/python3.7/site-packages/tffm/core.py in TFFMCore()
94 """
95 def __init__(self, order=2, rank=2, input_type='dense', loss_function=None,
---> 96 optimizer=tf.train.AdamOptimizer(learning_rate=0.01), reg=0,
97 init_std=0.01, use_diag=False, reweight_reg=False,
98 seed=None):

AttributeError: module 'tensorflow_core._api.v2.train' has no attribute 'AdamOptimizer'

libfm benchmark

Need to compare tffm and libfm in terms of speed and quality.

Automatic checkpointing

Hi,

Would it be possible to automatically save the "best" model run in terms of global loss?

Maybe in a way similar to how https://keras.io/callbacks/#modelcheckpoint works?
I had different runs with about 2,000 epochs where the loss would go down to about 5.5 after 1,200 epochs, only to skyrocket and settle at around 1E7 in the end, with no way to access the intermediate results.

Best regards
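
The "keep the best" logic itself is small. Below is a library-agnostic sketch: `state` stands for any copyable model state and `run_epoch` is a hypothetical callback that trains for one epoch in place and returns the loss. Neither is a tffm API, and how to hook this into tffm's training loop is left open.

```python
import copy


def train_keeping_best(state, run_epoch, n_epochs):
    """Run n_epochs and return (best_state, best_loss, best_epoch).

    A snapshot of `state` is taken whenever the loss improves, so the best
    intermediate result survives a later divergence.
    """
    best_state, best_loss, best_epoch = None, float("inf"), -1
    for epoch in range(n_epochs):
        loss = run_epoch(state)
        if loss < best_loss:
            best_state = copy.deepcopy(state)
            best_loss, best_epoch = loss, epoch
    return best_state, best_loss, best_epoch


# Toy run: the loss improves, then blows up in the last epoch.
losses = iter([9.0, 5.5, 1e7])
state = {"step": 0}


def fake_epoch(s):
    s["step"] += 1
    return next(losses)


best_state, best_loss, best_epoch = train_keeping_best(state, fake_epoch, 3)
```

For a TensorFlow model the snapshot step would more likely be a checkpoint written to disk than an in-memory copy.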

“upper_bound” or “i + batch_size”

In base.py, there is a function named batcher. You define a variable upper_bound, but why not use it to update ret_y and ret_w?
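
For illustration, here is a sketch of a batcher that uses upper_bound for all three slices. It is not the code from base.py, whose body is not shown here; the signature and the handling of batch_size are assumptions. Because Python and NumPy slices clamp at the end of the sequence, slicing with i + batch_size yields the same batches, so the difference is one of consistency rather than correctness.

```python
def batcher(X, y=None, w=None, batch_size=-1):
    """Yield (X, y, w) mini-batches; batch_size=-1 means one full batch."""
    n_samples = len(X)
    if batch_size == -1:
        batch_size = n_samples
    if batch_size < 1:
        raise ValueError("batch_size must be -1 or a positive integer")
    for i in range(0, n_samples, batch_size):
        upper_bound = min(i + batch_size, n_samples)
        ret_x = X[i:upper_bound]
        # The same bound is used for labels and weights so they stay aligned.
        ret_y = y[i:upper_bound] if y is not None else None
        ret_w = w[i:upper_bound] if w is not None else None
        yield ret_x, ret_y, ret_w


batches = list(batcher([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], batch_size=2))
```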

Running tffm on a single core

I need to run multiple TFFMRegressor objects in joblib Parallel. To do so, I passed the following parameter:

session_config=tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1,
                              allow_soft_placement=True,
                              device_count={'CPU': 1, 'GPU': 0})

However, my cores do not seem to be used whenever I set n_jobs=2 or higher in Parallel; my Python notebook cell just hangs and never completes, and my processors stay idle. With n_jobs=1, everything runs fine. What am I missing? Would I be better off using polylearn for this kind of task?

If log_dir is set to ".", all files are deleted without any warning

Hi, the log_dir parameter does something dangerous. The first time I tried this code, I set log_dir to ".", and without any warning it removed all the files in that directory.

I hope this can be fixed, so that no such thing happens to anybody else.
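
Until the library guards against this, a caller can check the path before passing it as log_dir. The sketch below treats a directory as unsafe to clear when the current working directory is that directory or lies inside it. `is_safe_to_clear` is a hypothetical helper, not a tffm function, and the rule is deliberately conservative.

```python
import os


def is_safe_to_clear(log_dir, cwd=None):
    """Return False if clearing log_dir would wipe the working directory.

    That is the case when log_dir is the working directory itself or one of
    its ancestors (including the filesystem root).
    """
    cwd = os.path.realpath(cwd or os.getcwd())
    target = os.path.realpath(log_dir)
    try:
        # If the common prefix is the target, cwd is inside (or equal to) it.
        return os.path.commonpath([cwd, target]) != target
    except ValueError:
        # Paths on different drives (Windows) cannot contain each other.
        return True


dot_is_safe = is_safe_to_clear(".")
subdir_is_safe = is_safe_to_clear(os.path.join(".", "tffm_logs"))
```

A dedicated subdirectory such as `./tffm_logs` passes the check, while `.` and `..` do not.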

Does tffm support multi-threading?

Does tffm support multi-threading? While running tffm, not all of the cores on my machine run at 100% CPU. Is there any way to add multi-threading at the TF level in tffm, as it is configured in Keras here: https://github.com/fchollet/keras/blob/master/keras/backend/tensorflow_backend.py#L105 ?
Thank you.