gc-mc's People

Contributors

riannevdberg

gc-mc's Issues

the W_user or W_movies

Hi Rianne,
Sorry, but I am a little confused by the data in the .mat file. What do W_user and W_movies mean? Thanks.

Are you using the test rating matrix during matrix completion?

I analyzed your code, and I think the test rating matrix is used while completing the test rating matrix.

Is that correct?

I am basing this on the following code.

test_support = support[np.array(test_u)]
test_support_t = support_t[np.array(test_v)]

This part stores the entries that we need to predict, right?

Also, when GCMC does the encoding,

self.layers.append(OrdinalMixtureGCN(input_dim=self.input_dim,
                                                 output_dim=self.hidden[0],
                                                 support=self.support,
                                                 support_t=self.support_t,
                                                 num_support=self.num_support,
                                                 u_features_nonzero=self.u_features_nonzero,
                                                 v_features_nonzero=self.v_features_nonzero,
                                                 sparse_inputs=True,
                                                 act=tf.nn.relu,
                                                 bias=False,
                                                 dropout=self.dropout,
                                                 logging=self.logging,
                                                 share_user_item_weights=True,
                                                 self_connections=False))

In this code, the GCN uses support to encode, and support is itself what is being predicted.

If I am mistaken, please correct me.

I think we should use only the training matrix to complete the matrix, and the test matrix should be used only for calculating RMSE or another evaluation metric.

When running your code, some error happened

Hello author:
My environment is Python 3.6 and TensorFlow 1.4, but when I run your code, an error occurs. Here is the specific error.

Settings:
{'dataset': 'flixster', 'data_seed': 1234, 'feat_hidden': 64, 'write_summary': False, 'norm_symmetric': False, 'dropout': 0.7, 'features': True, 'epochs': 200, 'summaries_dir': 'logs/2018-11-15_15:03:52.631584', 'accumulation': 'stack', 'num_basis_functions': 2, 'learning_rate': 0.01, 'testing': True, 'hidden': [500, 75]}

number of users = 2341
number of item = 2956
Traceback (most recent call last):
  File "train.py", line 150, in <module>
    test_u_indices, test_v_indices, class_values = load_data_monti(DATASET, TESTING)
  File "/home/zxj/PycharmProjects/gc-mc/gcmc/preprocessing.py", line 274, in load_data_monti
    np.random.shuffle(rand_idx)
  File "mtrand.pyx", line 4852, in mtrand.RandomState.shuffle
  File "mtrand.pyx", line 4855, in mtrand.RandomState.shuffle
TypeError: 'range' object does not support item assignment
Can you help me solve this problem? Thank you very much!
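
The traceback shows np.random.shuffle being called on a range object. In Python 3, range is an immutable sequence, so an in-place shuffle cannot assign to it; in Python 2, range returned a list. The following is a minimal sketch of one possible workaround, assuming rand_idx is built from range(...) (the surrounding code in preprocessing.py is not shown here). The helper name shuffled_indices is hypothetical.

```python
import numpy as np


def shuffled_indices(n, seed=None):
    """Return the integers 0..n-1 in random order as a list (hypothetical helper)."""
    # list(range(n)) is mutable, so it can be shuffled in place;
    # a bare range object cannot, which is what triggers the TypeError.
    idx = list(range(n))
    rng = np.random.RandomState(seed)
    rng.shuffle(idx)
    return idx


rand_idx = shuffled_indices(10, seed=1234)
print(rand_idx)
```

Building the indices with np.arange(n) instead of list(range(n)) should work equally well, since NumPy arrays also support item assignment.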

Data's problem

Hi,
I am trying to run your code, but I am confused about the data.
Are the douban data and the flixster data the same? I ask because they have the same size and the same project names inside.
Also, I tried to open W_movies in the flixster data using the following code
[image: question]
and the system shows "Unable to open object (object 'W_movies' doesn't exist)".
However, I can run your code with the flixster data from the command line.
Could you tell me why my code doesn't work? Thank you so much.

Best wishes
Lee

Adding side features to train_mini_batch.py

Hi,

Thank you for the code, I read the paper and I am interested in applying this model to a large scale bipartite graph with side features for each node. I was able to get "train_mini_batch.py" up and running on movielens 10 million. I am interested in modifying the code to incorporate side features.

Do you think this is feasible? I have been looking at "train.py" and as far as I can see, I should just be able to treat the side features as additional sparse matrices. In batch mode, I can look up the required rows of the side feature matrices analogous to train_u_indices_batch on line 284 of "train_mini_batch.py". Would I have to make any internal modifications to RecommenderSideInfoGAE?

Thank You,

Kuhan
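
The row lookup described in this question can be sketched as follows. This is only an illustration under stated assumptions: u_features is a hypothetical toy side-feature matrix, batch_rows is a hypothetical helper, and a dense NumPy array stands in for the sparse matrices used in the repository. Whether RecommenderSideInfoGAE needs internal changes is not addressed here.

```python
import numpy as np

# Hypothetical toy side-feature matrix: 5 users, 3 features each.
u_features = np.arange(15, dtype=np.float32).reshape(5, 3)


def batch_rows(features, batch_indices):
    """Select the feature rows of the nodes in the current mini-batch."""
    return features[np.asarray(batch_indices)]


# Indices may repeat, because a user can appear in several rated pairs.
train_u_indices_batch = [4, 0, 4]
u_features_batch = batch_rows(u_features, train_u_indices_batch)
print(u_features_batch.shape)  # (3, 3)
```

A scipy.sparse CSR matrix accepts the same features[indices] row indexing, so the lookup should carry over to sparse side features.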

test data generated from training data?

Hi Rianne,

I have a question regarding the support matrix data. From the code, it seems that the train rating matrix is treated as the full dataset and is also used to generate the test support matrix.

  1. The rating support matrix rating_mx_train is generated from the training rating data (in testing mode, it contains the training and validation data).

    gc-mc/gcmc/preprocessing.py

    Lines 191 to 193 in 722f37d

    rating_mx_train = np.zeros(num_users * num_items, dtype=np.float32)
    rating_mx_train[train_idx] = labels[train_idx].astype(np.float32) + 1.
    rating_mx_train = sp.csr_matrix(rating_mx_train.reshape(num_users, num_items))

  2. The support matrices are generated from adj_train, which is rating_mx_train.

    gc-mc/gcmc/train.py

    Lines 203 to 216 in 722f37d

    adj_train_int = sp.csr_matrix(adj_train, dtype=np.int32)
    for i in range(NUMCLASSES):
        # build individual binary rating matrices (supports) for each rating
        support_unnormalized = sp.csr_matrix(adj_train_int == i + 1, dtype=np.float32)
        if support_unnormalized.nnz == 0 and DATASET != 'yahoo_music':
            # yahoo music has dataset split with not all ratings types present in training set.
            # this produces empty adjacency matrices for these ratings.
            sys.exit('ERROR: normalized bipartite adjacency matrix has only zero entries!!!!!')
        support_unnormalized_transpose = support_unnormalized.T
        support.append(support_unnormalized)
        support_t.append(support_unnormalized_transpose)

  3. But then 'test_support' is extracted from 'support'.

    test_support = support[np.array(test_u)]

Shouldn't we change line 192 to
rating_mx_train[idx_nonzero] = labels[idx_nonzero].astype(np.float32) + 1.0
so that rating_mx_train contains all of the rating data?
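
As a toy illustration of the three steps above, the sketch below rebuilds rating_mx_train the way the quoted lines 191 to 193 do and then selects the rows of the test users. All data here is hypothetical, a dense NumPy array stands in for the sparse matrix, and train_idx and test_idx are assumed to be disjoint. Under those assumptions the held-out rating stays zero in the training matrix, and row selection returns only the training ratings of the test users.

```python
import numpy as np

num_users, num_items = 3, 4

# Hypothetical flat (row-major) indices of observed ratings, with labels 0..4.
idx_nonzero = np.array([0, 5, 6, 9, 11])
labels = np.zeros(num_users * num_items, dtype=np.int32)
labels[idx_nonzero] = [4, 2, 0, 1, 3]

# Assumed disjoint split: the last observed rating is held out for testing.
train_idx = idx_nonzero[:4]
test_idx = idx_nonzero[4:]

# Same construction as the quoted lines, kept dense for readability.
rating_mx_train = np.zeros(num_users * num_items, dtype=np.float32)
rating_mx_train[train_idx] = labels[train_idx].astype(np.float32) + 1.
rating_mx_train = rating_mx_train.reshape(num_users, num_items)

test_u = test_idx // num_items  # user index of each test pair
test_v = test_idx % num_items   # item index of each test pair

# The held-out entry was never written into the training matrix.
held_out = rating_mx_train[test_u, test_v]

# Row selection picks the test users' rows of the training matrix.
test_rows = rating_mx_train[test_u]

print(held_out)   # [0.]
print(test_rows)  # [[0. 2. 0. 0.]]
```

Replacing train_idx with idx_nonzero in this sketch would write the held-out rating into the matrix that the encoder sees.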

Where is the augmented adjacency matrix in `global_normalize_bipartite_adjacency`

In the implementation of global_normalize_bipartite_adjacency function,
there is a comment

# degree_u and degree_v are row and column sums of adj+I

But I don't see the augmented adjacency matrix (adj+I) in or before the function.

Where is the adj+I?

Why does it matter?
If you only use the adj with different ratings, then when a node aggregates features from its neighbors (for users, those are items, and vice versa), the new feature contains no information about the node itself.
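
The concern above can be made concrete with a small sketch. It assumes a symmetric normalization of the form D_u^{-1/2} A D_v^{-1/2}, where degree_u and degree_v are the row and column sums of a binary users x items matrix; the body of global_normalize_bipartite_adjacency is not shown here, so this may differ from the actual implementation. The aggregate helper and its optional self term are hypothetical and only illustrate where a node's own features would have to enter.

```python
import numpy as np

# Hypothetical binary bipartite adjacency: 2 users x 3 items.
adj = np.array([[1., 1., 0.],
                [0., 1., 1.]])

degree_u = adj.sum(axis=1)  # row sums, one per user
degree_v = adj.sum(axis=0)  # column sums, one per item

# Assumed symmetric normalization: D_u^{-1/2} A D_v^{-1/2}.
adj_norm = adj / np.sqrt(degree_u)[:, None] / np.sqrt(degree_v)[None, :]


def aggregate(adj_norm, neighbor_feats, self_feats=None):
    """Sum normalized neighbor features; optionally add the node's own features."""
    out = adj_norm @ neighbor_feats
    if self_feats is not None:
        # Hypothetical self term; without it the output ignores self_feats.
        out = out + self_feats
    return out


item_feats = np.array([[1., 0.], [0., 1.], [1., 1.]])
user_feats = np.array([[10., 10.], [20., 20.]])

without_self = aggregate(adj_norm, item_feats)
with_self = aggregate(adj_norm, item_feats, user_feats)
print(without_self)
print(with_self - without_self)
```

Because the users x items matrix is rectangular, an identity can only be added on the square block matrix over all users and items, which is one reason a separate self term is used in this sketch.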

normalize_features(feat)?

Your code in preprocessing.py, line 17:
degree = np.asarray(feat.sum(1)).flatten()
If I understand it correctly, this sums horizontally over the feature matrix (one sum per row) and then inverts the result to use it as the multiplier. To me, summing over axis=0 makes more sense?
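
The difference between the two choices can be seen on a small example. This is a sketch with a hypothetical dense feature matrix; it shows only what each axis does, not which one the model needs.

```python
import numpy as np

# Hypothetical feature matrix: 3 nodes (rows) x 2 features (columns).
feat = np.array([[1., 3.],
                 [2., 2.],
                 [0., 4.]])

# sum(1): one sum per row, so every node's feature vector sums to 1.
row_sum = feat.sum(1)
row_normalized = feat / row_sum[:, None]

# sum(0): one sum per column, so every feature sums to 1 across nodes.
col_sum = feat.sum(0)
col_normalized = feat / col_sum[None, :]

print(row_normalized.sum(1))  # each row sums to 1
print(col_normalized.sum(0))  # each column sums to 1
```

With a scipy.sparse matrix, feat.sum(1) returns a 2-D np.matrix, which is why the quoted line wraps it in np.asarray(...).flatten().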

How to apply multiple layers for the mini-batch version?

Hello,
I have a question about the mini-batch code. The first GCN layer outputs only batch-size embeddings, but the support matrix needs to be multiplied (matmul) with the full-size embeddings. That is, the output embeddings of the first GCN layer do not meet the requirements of the operations in the second GCN layer. How should we solve this problem? I hope to hear from you soon. Thank you.
