ysig / grakel
A scikit-learn compatible library for graph kernels
Home Page: https://ysig.github.io/GraKeL/
License: Other
Hi, nice library!
I am wondering, can the kernels, such as the WL kernel, along with the SVM classifier, take graph representations whose adjacency matrix is continuous/soft? I.e. not discrete integral matrices, an example is [[.9, .1], [.1, .9]].
I saw in the documentation that vertex and edge attributes can be continuous features, but can the adjacency matrix be continuous as well? I tested this out and didn't see an obvious error message yet, but wanted to double check to see if I can expect the results to always be right.
Thanks!
The link towards documentation provided by README is currently down. Any fix or backup?
Hi there!
First, nice job with GraKel ;) It took me very little time to get my graphs into a format that can be handled by GraKel (thanks to the networkx import method and the corresponding example)!
My graphs (computed from some brain imaging data) have multiple vector-valued attributes on each node... Can any of the kernels available in GraKel deal with this? (I of course have the solution of concatenating the different vectors into one, but I'm just wondering whether there's a cleaner solution...) Or any kernel in general?
Thanks,
Sylvain
from grakel import Graph
adj=[[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
node_label={0: '1', 1: '2', 2: '1', 3: '2'}
edge_label={(0, 1): '1', (1, 2): '1', (2, 3): '1'}
g1=Graph(initialization_object=adj, node_labels=node_label, edge_labels=edge_label)
from grakel.kernels import SubgraphMatching
sm_kernel=SubgraphMatching()
sm_kernel.fit_transform([g1])
Hello!
When I use the SubgraphMatching kernel with the code above, it raises an error (shown in an attached screenshot). Is there something wrong with the code I entered? Thanks a lot for your reply!
While running your library in a HPC (I have a singularity image with the required python packages) I am encountering the following error:
Traceback (most recent call last):
File "MI.py", line 163, in <module>
K_GH = gh_kernel.fit_transform(Gs)
File "/opt/conda/lib/python3.6/site-packages/grakel/kernels/kernel.py", line 194, in fit_transform
self.fit(X)
File "/opt/conda/lib/python3.6/site-packages/grakel/kernels/kernel.py", line 123, in fit
self.X = self.parse_input(X)
File "/opt/conda/lib/python3.6/site-packages/grakel/kernels/graph_hopper.py", line 206, in parse_input
occ_p, des_p = od_vectors_dag(A_cc, D_cc)
File "/opt/conda/lib/python3.6/site-packages/grakel/kernels/graph_hopper.py", line 407, in od_vectors_dag
np.matlib.repmat(np.hstack([0, occ[i, :-1]]), edges_starting_at_ith.shape[0],
AttributeError: module 'numpy' has no attribute 'matlib'
Doing a quick search in stack overflow I wonder if there is the need to explicitly import numpy.matlib? I haven't had this issue while running the file locally. Thanks in advance!
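A note on the likely cause: `numpy.matlib` is a submodule that a plain `import numpy` does not load, so `np.matlib` only resolves if some other package has already imported it. That would explain why the same script can work in one environment and fail in another. Besides adding an explicit `import numpy.matlib`, the `repmat` call can be expressed with `np.tile`, which needs no extra import. The sketch below uses made-up values and is not the library's actual patch.

```python
import numpy as np

# Illustrative row vector; in the traceback this is np.hstack([0, occ[i, :-1]]).
row = np.hstack([0, np.array([3, 5, 7])])
n_repeats = 2  # stands in for edges_starting_at_ith.shape[0]

# np.tile(row, (m, 1)) stacks m copies of the row, which matches what
# numpy.matlib.repmat(row, m, 1) returns for a 1-D input.
tiled = np.tile(row, (n_repeats, 1))
print(tiled)
```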
The documentation says that the fit() function extracts the features. However, when I print the result of fit() with print(wl_kernel.fit([G1,G2])), I just get "WeisfeilerLehman(n_iter=1, normalize=True)" on the screen. I am sure that the graphs G1 and G2 and the WeisfeilerLehman kernel are built correctly. I want to calculate the Jaccard similarity of two graphs from the features of the WL kernel, so printing the features is necessary. Please help me! Thanks a lot!
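The printed text is the estimator's own representation: as in scikit-learn, fit() returns the fitted object rather than a feature matrix. If the goal is a Jaccard similarity over Weisfeiler-Lehman features, one option is to compute the refined labels directly. The following is a minimal pure-Python sketch of that idea, not GraKeL's implementation; the function names and the adjacency-dict input format are assumptions made for illustration.

```python
from collections import Counter

def wl_features(adjacency, labels, n_iter=1):
    """Count the node labels seen over n_iter rounds of WL refinement."""
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(n_iter):
        refined = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            refined[node] = (current[node],
                             tuple(sorted(current[n] for n in neighbours)))
        current = refined
        features.update(current.values())
    return features

def jaccard(a, b):
    """Multiset Jaccard similarity between two Counters."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0

h2o = wl_features({0: [1, 2], 1: [0], 2: [0]}, {0: "O", 1: "H", 2: "H"})
h3o = wl_features({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]},
                  {0: "O", 1: "H", 2: "H", 3: "H"})
similarity = jaccard(h2o, h3o)
print(similarity)
```

The refined labels are kept as nested tuples instead of being compressed to integers, which keeps them comparable across graphs without a shared dictionary.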
Is there any way to use networkx graph format or do you have some function to transform .gexf formats to the appropriate format for the library?
gaussian_random_partition_graph constructor always returns a directed graph no matter how I set the parameter "directed"
I used the graphlet kernel for graph classification, but got the following error.
Would you please help me?
Traceback (most recent call last):
File "group_kernel.py", line 79, in <module>
K_test = gk.transform(X_test)
File "/home/yukuocen/py_env/lib/python2.7/site-packages/grakel-0.1a2-py2.7-linux-x86_64.egg/grakel/graph_kernels.py", line 344, in transform
K = self.kernel_.transform(X)
File "/home/yukuocen/py_env/lib/python2.7/site-packages/grakel-0.1a2-py2.7-linux-x86_64.egg/grakel/kernels/graphlet_sampling.py", line 247, in transform
Y = self.parse_input(X)
File "/home/yukuocen/py_env/lib/python2.7/site-packages/grakel-0.1a2-py2.7-linux-x86_64.egg/grakel/kernels/graphlet_sampling.py", line 439, in parse_input
self._Y_graph_bins[j], sg):
KeyError: 1
Hi,
Does someone know how to plot a newly created GraKeL graph?
Thanks
https://ysig.github.io/GraKeL/dev/_modules/grakel/utils.html#graph_from_pandas
Function: graph_from_pandas, in the third item of the explanation of node_df
When going to https://ysig.github.io/GraKeL/0.1a8, I get a "File not found" error.
First, thank you very much for this wonderful library. I am a beginner in graph kernels, and I have a question. Is there any way to use these graph kernels for regression tasks? Thank you very much again.
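A kernel matrix is not tied to classification: any positive semi-definite kernel matrix can be fed to a kernel-based regressor. The sketch below shows kernel ridge regression on a precomputed matrix. It uses a toy linear kernel on 1-D features as a stand-in for a graph kernel; the data and helper names are hypothetical.

```python
import numpy as np

def fit_kernel_ridge(K_train, y_train, alpha=1e-3):
    """Solve (K + alpha * I) c = y for the dual coefficients c."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + alpha * np.eye(n), y_train)

def predict_kernel_ridge(K_test, dual_coef):
    """K_test has shape (n_test, n_train), like the output of transform()."""
    return K_test @ dual_coef

# Toy stand-in for graph data: targets follow y = 2 * x.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 2.0, 4.0, 6.0])
X_test = np.array([[1.5]])

K_train = X_train @ X_train.T   # would be fit_transform(G_train)
K_test = X_test @ X_train.T     # would be transform(G_test)

dual_coef = fit_kernel_ridge(K_train, y_train)
prediction = predict_kernel_ridge(K_test, dual_coef)
print(prediction)
```

With a graph kernel, K_train and K_test would be replaced by the matrices returned by fit_transform and transform.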
Hi,
I am the author of pynauty. Due to some GitHub magic I discovered that at some stage you used pynauty. This is just to let you know that I have recently moved the package to GitHub and updated it. I also made it available from PyPI; many binary wheels are also provided for Linux and macOS platforms.
Code:
gk = WeisfeilerLehman(n_iter=10, base_graph_kernel=VertexHistogram, normalize=True)
K_train = gk.fit_transform(G_train)
For example, G_train is a list of 149 adjacency matrices of size 90x90.
fit_transform calls parse_input(), which then calls Graph().
Error:
line 45, in <module>
K_train = gk.fit_transform(G_train)
File "D:\Anaconda\lib\site-packages\grakel\kernels\weisfeiler_lehman.py", line 295, in fit_transform
km, self.X = self.parse_input(X)
File "D:\Anaconda\lib\site-packages\grakel\kernels\weisfeiler_lehman.py", line 176, in parse_input
'a graph like object and node labels ' +
TypeError: each element of X must be either a graph object or a list with at least a graph like object and node labels dict
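The error message asks for each element of X to carry a node-label dictionary next to the graph, in the same `[adjacency, labels]` form used elsewhere on this page. When the data has no natural labels, one common choice is to use node degrees. The sketch below makes that assumption and uses a hypothetical helper name; the tiny matrices stand in for the 90x90 ones.

```python
import numpy as np

def with_degree_labels(adjacency):
    """Pair an adjacency matrix with a {node index: degree} label dict."""
    adjacency = np.asarray(adjacency)
    degrees = (adjacency != 0).sum(axis=1)
    labels = {i: int(d) for i, d in enumerate(degrees)}
    return [adjacency.tolist(), labels]

# Two tiny stand-ins for the adjacency matrices in the report.
raw_matrices = [
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 1], [1, 0]],
]
G_train = [with_degree_labels(a) for a in raw_matrices]
print(G_train[0])
# G_train can then be passed to gk.fit_transform(G_train).
```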
Hello, I have a Networkx multigraph that I have created using the AIFB dataset. I want to use the kernels to perform classification tasks on nodes. Since these graphs have edge labels, do you have any suggestions as to what kernel might be good for this case?
I encountered another assertion error when using graphlet_sampling kernel.
File "/usr/local/lib/python2.7/site-packages/grakel/graph_kernels.py", line 386, in fit_transform
K = self.kernel_.fit_transform(X)
File "/usr/local/lib/python2.7/site-packages/grakel/kernels/graphlet_sampling.py", line 300, in fit_transform
self.fit(X)
File "/usr/local/lib/python2.7/site-packages/grakel/kernels/kernel.py", line 123, in fit
self.X = self.parse_input(X)
File "/usr/local/lib/python2.7/site-packages/grakel/kernels/graphlet_sampling.py", line 417, in parse_input
if self._graph_bins[k].isomorphic(sg):
File "grakel/kernels/_isomorphism/bliss.pyx", line 365, in grakel.kernels._isomorphism.bliss.Graph.isomorphic
File "grakel/kernels/_isomorphism/bliss.pyx", line 361, in grakel.kernels._isomorphism.bliss.Graph.get_isomorphism
AssertionError
It looks like the error was raised within the bliss library. For GraKeL, maybe an easy fix is to catch the error raised by the following conditional check: if self._graph_bins[k].isomorphic(sg):
Traceback (most recent call last):
File "D:\python\test\GraKeL-develop\test.py", line 9, in <module>
from grakel import GraphKernel
File "D:\python\test\GraKeL-develop\grakel\__init__.py", line 6, in <module>
from grakel.graph_kernels import GraphKernel
File "D:\python\test\GraKeL-develop\grakel\graph_kernels.py", line 13, in <module>
from grakel.kernels import GraphletSampling
File "D:\python\test\GraKeL-develop\grakel\kernels\__init__.py", line 4, in <module>
from grakel.kernels.kernel import Kernel
File "D:\python\test\GraKeL-develop\grakel\kernels\kernel.py", line 17, in <module>
from grakel.kernels._c_functions import k_to_ij_triangular
ImportError: cannot import name 'k_to_ij_triangular' from 'grakel.kernels._c_functions' (unknown location)
[Finished in 1.2s with exit code 1]
I just installed grakel according to the tutorial on github and ran the test. The above BUG appeared, and the package could not be installed.
Hello,
I'm using the grakel library (thank you for the very clear documentation and the gathering work!) to do classification. But I'm confused about some results I get with the Weisfeiler-Lehman kernel.
From what I understand, there is no "learning" process in the kernel: whether we fit the kernel on the entire dataset or on a subset, we should get the same pairwise similarities between the graphs.
However, when I run the following code I do not get the same kernel at the end. First I fit_transform on all the MUTAG data, getting a K_from_all kernel, and then I select a subset (with respect to train and test indices) of this kernel.
I then compare with the same kernel fitted on a small subset of the data (the train subset) and transformed on the test subset. I get a K_from_small kernel which is different from the K_from_all kernel:
Did I miss details about the fitting procedure of the kernel? (For the shortest path kernel I recover the same kernels.)
Thank you very much
Titouan
The example examples/optimizing_hyperparameters.py doesn't work: the line gk = WeisfeilerLehman(n_iter=i, base_kernel=VertexHistogram, normalize=True) throws an error, as base_kernel should be base_graph_kernel (see https://ysig.github.io/GraKeL/0.1a8/generated/grakel.WeisfeilerLehman.html#grakel.WeisfeilerLehman).
Hi,
I have two problems with the random walk kernel. The first is that with the normalize flag set to True, the similarity matrix is all 1's.
The second is that with the normalize flag set to False, the diagonal of the similarity matrix is not 0.
This is the code:
def computeKernelRW(graphs):
    print("-- computing kernel")
    rw_kernel = GraphKernel(kernel=[{"name": "random_walk","lamda":0.5}],normalize=False)
    return rw_kernel.fit_transform(graphs)
And this is the call on the main:
K = computeKernelSP(graphs)
Hi There,
I'm trying to understand how one would use one's own graph data with GraKel via networkx and I am having some issues.
For example the following example code will fail:
H2O = scipy.sparse.csr_matrix(([1, 1, 1, 1], ([0, 0, 1, 2], [1, 2, 0, 0])), shape=(3, 3))
G = nx.from_scipy_sparse_matrix(H2O)
graph = grakel.graph_from_networkx(G)
sp_kernel.fit_transform(graph)
With this rather strange error:
AttributeError: 'int' object has no attribute 'nodes'
Any help you could provide on how to use custom data with GraKel would be most helpful! Is it required that graphs must have labels on the vertices and edges to be used?
When following the documentation to conduct one to one graph comparison using the following code:
wl_kernel.fit(H2O).transform(H3O)
It raises an AttributeError exception:
AttributeError Traceback (most recent call last)
<ipython-input-267-4071ae7832b8> in <module>()
----> 1 wl_kernel.fit(H2O)
/usr/local/lib/python2.7/site-packages/grakel/graph_kernels.pyc in fit(self, X, y)
315 self.component_indices_ = inds
316 else:
--> 317 self.kernel.fit(X)
318
319 # Return the transformer
AttributeError: 'dict' object has no attribute 'fit'
Hi, I wonder if GraKeL can be used to calculate graph similarity only. I don't need to classify the graphs; I only want the similarity among them. I have generated the graphs with NetworkX. Since multiple graph kernels are implemented in GraKeL, I think it should be able to export the graph similarity, but sadly I failed to find such an API.
To perform one step of random walk in the random walk kernel, we need to apply the adjacency matrix of the product graph to the probability of the current node. The line P *= XY is equivalent to np.multiply (element-wise multiplication), but we actually need np.matmul (matrix multiplication) instead. So the correct line should be P = np.matmul(XY, P) to perform one step of random walk.
if self.p is not None:
    P = np.eye(XY.shape[0])
    S = self._mu[0] * P
    for k in self._mu[1:]:
        P *= XY
        S += k*P
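The difference between the two updates is easy to see on a small matrix. The sketch below uses a 3-node path graph as a stand-in for XY (an assumption for illustration) and compares the element-wise update from the quoted loop with the matrix-product update proposed above.

```python
import numpy as np

# Adjacency matrix of a 3-node path graph, standing in for XY.
XY = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=float)

# Element-wise update, as in the quoted loop: the identity masks every
# off-diagonal entry, and XY has a zero diagonal, so P collapses to zeros.
P_elementwise = np.eye(3)
P_elementwise *= XY

# Matrix-product update, as proposed: P becomes XY, then XY @ XY, and so on.
P_matmul = np.eye(3)
P_matmul = np.matmul(XY, P_matmul)
P_matmul = np.matmul(XY, P_matmul)

print(P_elementwise)
print(P_matmul)
```

After two matrix-product steps, entry (i, j) of P_matmul counts the walks of length two from i to j, which is the quantity a random walk kernel accumulates.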
DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
warnings.warn(msg, category=DeprecationWarning)
cannot import name 'joblib' from 'sklearn.externals'
Hi there, it seems train_test_split is not happy when you give it a "generator object graph_from_networkx" as input...
To reproduce this:
G_train, G_test, y_train, y_test = train_test_split(G, [-1,1], test_size=0.5)
I dunno whether it's a bug, but anyhow, do you have a workaround in the meanwhile? (or maybe I don't do things correctly...)
Thanks,
Sylvain
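A likely explanation is that splitting utilities need to know the number of samples and to index into them, and a generator supports neither. Materialising the generator into a list first is the usual workaround. The sketch below shows the underlying behaviour with a hypothetical stand-in generator, so it runs without GraKeL or scikit-learn.

```python
def graph_generator(n):
    """Hypothetical stand-in for a generator such as graph_from_networkx."""
    for i in range(n):
        yield {"id": i}

G = graph_generator(2)
try:
    len(G)              # a generator has no length
    has_length = True
except TypeError:
    has_length = False

# Materialise once; the list has a length and can be indexed and split.
G_list = list(graph_generator(2))
print(has_length, len(G_list))
```

Applied to the report above, this would mean passing list(G) instead of G to train_test_split.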
I was testing some directed unlabeled unweighted graphs. One of them triggered the "Singular matrix" error, which corresponds to the following code:
def invert(n, w, v):
    return (np.real(np.sum(v, axis=0))/n, np.real(w), np.real(np.sum(inv(v), axis=1))/n)
def add_input(x):
    return invert(x.shape[0], *eig(x))
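The quoted code inverts the eigenvector matrix returned by eig. That matrix is not invertible when the adjacency matrix is not diagonalizable, which can happen for directed graphs. One possible workaround, offered as a sketch and not as the library's fix, is to fall back to the Moore-Penrose pseudo-inverse when the plain inverse fails. The helper name is hypothetical.

```python
import numpy as np

def safe_inverse(matrix):
    """Invert a square matrix, falling back to the pseudo-inverse if singular."""
    try:
        return np.linalg.inv(matrix)
    except np.linalg.LinAlgError:
        return np.linalg.pinv(matrix)

regular = safe_inverse(np.array([[2.0, 0.0], [0.0, 4.0]]))
singular = safe_inverse(np.array([[1.0, 2.0], [2.0, 4.0]]))  # rank 1
print(regular)
print(singular)
```

Whether the pseudo-inverse yields a meaningful kernel value for such graphs is a separate question that this sketch does not settle.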
We can now use the transform function to measure the similarity between two graphs, but how do we get the matching graph?
Could we get the corresponding nodes and edges in the compared graphs?
Any suggestions?
Thanks for your implementation and detailed documentation!
The graph kernels produce a similarity matrix of graphs, which is a dot product of graph features. I want to know how to get these features.
I have studied the implementation of the Weisfeiler-Lehman kernel for half a day and haven't been able to figure it out so far :(
Thank you so much for your work on this library. I'm a graduate student just getting into this technique, and your framework and documentation has been quite helpful! 🙇♂️👏
I'm doing a project involving callgraphs of Android malware (~40k vertices), which are currently in gml format. I've been converting them into grakel graphs by way of networkx, as below:
gml_paths = [os.path.join(root, file) for root, directories, files in os.walk(path) for file in files if file.endswith(".gml")]
networkx_graphs = [nx.read_gml(path) for path in gml_paths]
grakel_graphs = graph_from_networkx(networkx_graphs, as_Graph=True)
The thing that surprises me is that the second I use grakel_graphs, for example by coercing it into a list with list(grakel_graphs), a string representation of the graph gets dumped to STDOUT.
When I work in Jupyter Notebook, this results in the following error:
Is this a logging feature or is there some way to disable this functionality to silently convert to lists?
Apologies in advance if there is a naive Python issue on my end. I'm an experienced developer, but new to the language ecosystem.
Thanks!
The implementation of parse_input doesn't handle the case where every input is a single Graph object. To be consistent with the other kernels, I have added
elif type(x) is Graph:
    pass
Running GraKel version 0.1b7
Trying to run the H2O tutorial from the GraKeL introduction, I noticed that I couldn't transform the same graph twice. Does fit_transform or transform do anything to the Graph object?
from grakel import Graph
from grakel.kernels import ShortestPath
H2O_adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
H2O = Graph(initialization_object=H2O_adjacency)
H3O_adjacency = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
H3O = Graph(initialization_object=H3O_adjacency)
sp_kernel = ShortestPath(normalize=True, with_labels=False)
sp_kernel.fit_transform([H2O]) #works
sp_kernel.transform([H3O]) #works
sp_kernel.transform([H2O]) #error
Error on the second transform of H2O:
AttributeError: 'tuple' object has no attribute 'shape'
Is there any way to include node attributes (like a feature vector for each node) for the graph in GraKel?
Hi,
I tried running the GH kernel, but got this error: data type must provide an itemsize:
anaconda3\lib\site-packages\grakel\graph.py:312: UserWarning: changing format from "adjacency" to "all"
warnings.warn('changing format from "adjacency" to "all"')
Traceback (most recent call last):
File "test.py", line 71, in <module>
t11 = sp_kernel2.fit_transform([g1])
File "anaconda3\lib\site-packages\grakel\kernels\kernel.py", line 197, in fit_transform
km = self._calculate_kernel_matrix()
File "anaconda3\lib\site-packages\grakel\kernels\kernel.py", line 231, in calculate_kernel_matrix
K[i, i] = self.pairwise_operation(x, x)
File "anaconda3\lib\site-packages\grakel\kernels\graph_hopper.py", line 261, in pairwise_operation
return self.metric((xp.reshape(xp.shape[0], m_sq),) + x[1:],
File "anaconda3\lib\site-packages\grakel\kernels\graph_hopper.py", line 282, in linear_kernel
NA_linear_kernel = np.dot(NA_i, NA_j.T)
File "<array_function internals>", line 5, in dot
ValueError: data type must provide an itemsize
Here is my code (which works fine with the shortest path kernel):
from grakel import Graph
from grakel import GraphKernel
#sp_kernel = GraphKernel(kernel="shortest_path")
from grakel.kernels import ShortestPath, GraphHopper
g1_edges = {(1, 2): 1, (1, 3): 1, (2, 1): 1, (3, 1): 1}
g1_edge_labels = {(1, 2): [1], (1, 3): [2], (2, 1): [1], (3, 1): [2]}
g1_node_labels = {1:'1',2:'2',3:'3'}
g1 = Graph(g1_edges, node_labels=g1_node_labels, edge_labels=g1_edge_labels)
g2_edges = {(1, 2): 1, (1, 3): 1, (2, 1): 1, (3, 1): 1}
g2_edge_labels = {(1, 2): [1], (1, 3): [2], (2, 1): [2], (3, 1): [1]}
g2_node_labels = {1:'1',2:'2',3:'3'}
g2 = Graph(g2_edges, node_labels=g2_node_labels, edge_labels=g2_edge_labels)
sp_kernel2 = GraphHopper(normalize=True)
t11 = sp_kernel2.fit_transform([g1])
t12 = sp_kernel2.transform([g2])
print(t11)
print(t12)
I also tried it with edge labels instead of edge attributes, but it didn't help. The issue happens in this line:
sp_kernel2 = GraphHopper(normalize=True)
Dear Grakel developers,
Today I was playing with the Shortest Path kernel (Floyd-Warshall algorithm for building the shortest path matrix) and I was curious to look at the source code.
Your floyd_warshall() algorithm has been nicely divided into an Initialization step and a Calculation step. The initialization step seems a bit inefficient to me (a double for-loop with three if/else branches). May I propose the following vectorized alternative?
dist = copy.deepcopy(adjacency_matrix)
dist[dist==0] = float("Inf")
np.fill_diagonal(dist, 0)
where, obviously, np stands for numpy and copy is included in the standard Python library.
I have been trying this new implementation on the DD dataset (a quite challenging one), measuring the wall-clock time of the original initialization step and of the vectorized one (the calculation block has obviously been omitted). On a Linux 19.04 machine with an Intel i7-3770K I obtained 183.60 seconds (original) against 1.03 seconds (vectorized).
Hope you find this little investigation useful.
All the best
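The proposal above can be checked on a small matrix. The sketch below compares a loop-based initialization with the vectorized one. The loop is a reconstruction from the description ("a double for-loop with three if/else branches"), not the library's source. One detail worth noting: the distance array has to be of float dtype, because np.inf cannot be stored in an integer array.

```python
import numpy as np

def init_dist_loop(adjacency):
    """Reference initialisation with an explicit double loop (reconstruction)."""
    n = len(adjacency)
    dist = np.empty((n, n), dtype=float)
    for i in range(n):
        for j in range(n):
            if i == j:
                dist[i, j] = 0.0
            elif adjacency[i][j] == 0:
                dist[i, j] = np.inf
            else:
                dist[i, j] = adjacency[i][j]
    return dist

def init_dist_vectorized(adjacency):
    """Vectorised initialisation following the three proposed lines."""
    dist = np.array(adjacency, dtype=float)  # float copy so np.inf fits
    dist[dist == 0] = np.inf
    np.fill_diagonal(dist, 0.0)
    return dist

A = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
print(init_dist_vectorized(A))
```

np.array(..., dtype=float) already returns a fresh copy, so copy.deepcopy is not needed in this variant.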
Hi,
I'm having some trouble with the ML kernel. Here's the code (very simple):
from __future__ import print_function
from grakel import kernels
from grakel.datasets import fetch_dataset
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
import numpy as np
# Loads the dataset
dataset = fetch_dataset("IMDB-BINARY", produce_labels_nodes=True)
G, y = dataset.data, dataset.target
# Splits the dataset into a training and a test set
G_train, G_test, y_train, y_test = train_test_split(G, y, test_size=0.1, random_state=42)
# Uses the Multiscale Laplacian kernel to generate the kernel matrices
gk = kernels.MultiscaleLaplacian()
gk.fit(G_train)
K_train = gk.fit_transform(G_train)
K_test = gk.transform(G_test)
# Uses the SVM classifier to perform classification
print("Starting training")
clf = SVC(kernel="precomputed", verbose=True)
clf.fit(K_train, y_train)
y_pred = clf.predict(K_test)
# Computes and prints the classification accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", str(round(acc * 100, 2)) + "%")
But it fails with this error:
Traceback (most recent call last):
File ".../gkernel/lib/python3.7/site-packages/grakel/kernels/multiscale_laplacian.py", line 465, in parse_input
phi = np.array([list(phi_d[i]) for i in range(A.shape[0])])
File ".../gkernel/lib/python3.7/site-packages/grakel/kernels/multiscale_laplacian.py", line 465, in <listcomp>
phi = np.array([list(phi_d[i]) for i in range(A.shape[0])])
TypeError: 'int' object is not iterable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/sdb-seagate/graph-kernels/train.py", line 43, in <module>
gk.fit(G_train)
File ".../gkernel/lib/python3.7/site-packages/grakel/kernels/kernel.py", line 123, in fit
self.X = self.parse_input(X)
File ".../gkernel/lib/python3.7/site-packages/grakel/kernels/multiscale_laplacian.py", line 467, in parse_input
raise TypeError('Features must be iterable and castable ' +
TypeError: Features must be iterable and castable in total to a numpy array.
Am I doing something wrong? Thank you for your time.
Hi,
I am using GraKeL as installed from the repository (grakel-dev, version 0.1a4). When playing around with RandomWalk, I found that the normalized kernel of a graph G with itself, k(G,G), is sometimes equal to -1. Here's a minimum working example to reproduce the result:
from grakel import RandomWalk
rw_kernel = RandomWalk(normalize=True)
A = [[0, 1, 1, 1],
[1, 0, 1, 1],
[1, 1, 0, 1],
[1, 1, 1, 0]]
labels_A = {0: 'A', 1: 'B', 2: 'C', 3: 'D'}
print(rw_kernel.fit_transform([[A, labels_A]])) # [[1.]]
B = [[0, 1, 1, 1, 1],
[1, 0, 1, 1, 1],
[1, 1, 0, 1, 1],
[1, 1, 1, 0, 1],
[1, 1, 1, 1, 0]]
labels_B = {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E'}
print(rw_kernel.fit_transform([[B, labels_B]])) # [[-1.]]
Why is this happening?
Thanks in advance for your support.
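One possible explanation, not confirmed in this thread: if the kernel is evaluated through the geometric closed form, the sum of all entries of (I - lam * A_x)^-1 over the product graph A_x, then the underlying series only converges when lam times the largest eigenvalue of A_x is below 1. Past that point the closed form can turn negative, and dividing a negative k(G,G) by sqrt(k(G,G) * k(G,G)) gives -1. The sketch below assumes a decay factor of 0.1 and unlabeled product graphs; both are assumptions made for illustration.

```python
import numpy as np

def geometric_rw_closed_form(adjacency, lam):
    """Sum of all entries of (I - lam * A_x)^-1 for a graph paired with itself."""
    A = np.asarray(adjacency, dtype=float)
    A_x = np.kron(A, A)                  # adjacency of the direct product graph
    n = A_x.shape[0]
    return np.linalg.inv(np.eye(n) - lam * A_x).sum()

def complete_graph(n):
    return np.ones((n, n)) - np.eye(n)

lam = 0.1  # assumed decay factor
k4 = geometric_rw_closed_form(complete_graph(4), lam)  # lam * 9  = 0.9 < 1
k5 = geometric_rw_closed_form(complete_graph(5), lam)  # lam * 16 = 1.6 > 1
normalized_k5 = k5 / np.sqrt(k5 * k5)
print(k4, k5, normalized_k5)
```

Under these assumptions, a smaller decay factor (below 1/16 for the five-node complete graph) would keep the series convergent.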
I am trying to work on graph classification and I keep getting this error when fitting different kernels. Are there any guesses as to what might be happening? I am using weighted edges and node features.
/usr/local/lib/python3.6/dist-packages/grakel/kernels/shortest_path.py in lhash_labels(S, u, v, *args)
497
498 def lhash_labels(S, u, v, *args):
--> 499 return (args[0][u], args[0][v], S[u, v])
500
501
KeyError: 0
Hi,
I'm trying to run my project on a server and it gives me this error (while on my computer it does not):
File "/home/nbaldan/.local/lib/python3.5/site-packages/grakel/kernels/graphlet_sampling.py", line 229, in initialize
self.n_samples_ = n_samples
UnboundLocalError: local variable 'n_samples' referenced before assignment
Do you know why?
Got the following RuntimeWarning when using graphlet_sampling kernel:
/usr/local/lib/python2.7/site-packages/grakel/kernels/graphlet_sampling.py:313: RuntimeWarning: invalid value encountered in divide
return np.divide(km, np.sqrt(np.outer(self._X_diag, self._X_diag)))
It is probably caused by an INF/NaN value. Not a major problem, but it may correspond to some edge cases.
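This warning is what NumPy emits for 0/0, which in the quoted normalization would occur whenever a graph has a zero self-similarity on the diagonal (for example, if no graphlets were counted for it; that cause is a guess). A guarded division avoids the NaN. The following is a sketch with a hypothetical helper name, not the library's code.

```python
import numpy as np

def normalize_kernel(K, diag):
    """Cosine-normalise K, writing 0 wherever the denominator is 0."""
    denom = np.sqrt(np.outer(diag, diag))
    out = np.zeros_like(K, dtype=float)
    np.divide(K, denom, out=out, where=denom > 0)
    return out

# The second graph has a zero self-similarity.
K = np.array([[4.0, 0.0], [0.0, 0.0]])
K_norm = normalize_kernel(K, np.diag(K))
print(K_norm)
```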
Hi, I want to use GraKeL to compare the similarity of programs' function call graphs. Each caller address is a node, and I use the caller address itself as the node's label. Besides, loops always exist in program execution, so one edge may be executed multiple times before the program exits; I therefore use the execution time as the edge label.
I want to know whether any algorithm in GraKeL can handle the situation above. I checked the example, but it only shows graphs with labeled nodes. I also tried the WeisfeilerLehman algorithm, but it seems that it does not take edge labels into consideration :(
Hi,
I'm having the same problem as mentioned in a previous issue: I am not able to produce the correct input format, neither from a networkX graph nor from the CSV file. Both functions themselves work fine, but as soon as I try to use fit_transform(X) it throws an error.
For the graph_from_csv object it throws:
ValueError Traceback (most recent call last)
in
----> 1 sp_kernel.fit_transform(new_G)
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/graph_kernels.py in fit_transform(self, X, y)
386 K = self.kernel_.transform(X).dot(self.nystroem_normalization_.T)
387 else:
--> 388 K = self.kernel_.fit_transform(X)
389
390 return K
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/kernels/shortest_path.py in fit_transform(self, X, y)
379 """
380 self._method_calling = 2
--> 381 self.fit(X)
382
383 # calculate feature matrices.
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/kernels/kernel.py in fit(self, X, y)
121 raise ValueError('fit input cannot be None')
122 else:
--> 123 self.X = self.parse_input(X)
124
125 # Return the transformer
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/kernels/shortest_path.py in parse_input(self, X)
428 elif self._method_calling == 3:
429 self._Y_enum = dict()
--> 430 for (idx, x) in enumerate(iter(X)):
431 is_iter = isinstance(x, collections.Iterable)
432 if is_iter:
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/utils.py in graph_from_csv(edge_files, node_files, index_type, directed, sep, as_Graph)
566 type(edge_files[1]) is not bool or
567 type(edge_files[2]) not in [bool, None]):
--> 568 edge_files_error()
569 else:
570 if edge_files[1]:
~/anaconda3/envs/py36/lib/python3.6/site-packages/grakel/utils.py in edge_files_error()
553 """
554 def edge_files_error():
--> 555 raise ValueError('edge_file argument must contain an iterable of strings of edge files, '
556 'a bool weight_flag and attributes_flag bool or None')
557
ValueError: edge_file argument must contain an iterable of strings of edge files, a bool weight_flag and attributes_flag bool or None
For my networkX input graph, I get the same missing edge/node labels error as in the other issue. Essentially I have a large set of unlabelled binary adjacency matrices as CSV files and want to measure their similarity using your kernels. I have already tried the PM kernel you developed in Matlab and it was working like a charm! Thanks for your help.
I used GraKeL on 3 graph classification datasets (MUTAG, ENZYMES, NCI1), and my results seem far from those in your paper. I don't know the reason. Could you explain this? Btw, PTC-MR is currently unsupported. My source code is here: https://github.com/xnuohz/graph-kernel. Thanks :)
I was reading this introduction:
https://ysig.github.io/GraKeL/dev/user_manual/longer_introduction.html
I executed the following lines:
from grakel import GraphKernel
H2O = [[[[0, 1, 1], [1, 0, 0], [1, 0, 0]], {0: 'O', 1: 'H', 2: 'H'}]]
H3O = [[[[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]], {0: 'O', 1: 'H', 2: 'H', 3:'H'}]]
gs_kernel = GraphKernel(kernel=dict(name="graphlet_sampling", n_samples=5))
gs_kernel.fit(H2O)
An error occurs on fit method:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/site-packages/grakel/graph_kernels.py", line 290, in fit
self.initialize_()
File "/usr/local/lib/python3.7/site-packages/grakel/graph_kernels.py", line 430, in initialize_
self.kernel_ = kernel(**params)
TypeError: __init__() got an unexpected keyword argument 'n_samples'
I'm using grakel 0.1a5 under python 3.7.1
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from grakel.datasets import fetch_dataset
from grakel.kernels import ShortestPath
MUTAG = fetch_dataset("MUTAG", verbose=False)
G, y = MUTAG.data, MUTAG.target
G_train, G_test, y_train, y_test = train_test_split(G, y, test_size=0.1, random_state=42)
gk = ShortestPath(normalize=True)
K_train = gk.fit_transform(G_train)
gk.pairwise_operation(G_test[0], G_test[1])
Traceback (most recent call last):
File "", line 1, in
File "/mnt/blossom/more/kgoyal/repos/graphsearch/venv/lib/python3.6/site-packages/grakel/kernels/kernel.py", line 384, in pairwise_operation
raise NotImplementedError('Pairwise operation is not implemented!')
NotImplementedError: Pairwise operation is not implemented!
I am using the Graphlet Sampling kernel with no sampling, i.e.
GK = grakel.GraphletSampling(n_jobs=None, normalize=False, verbose=False, random_state=None, k=this_k, sampling=None)
where this_k is user-defined.
However, when I do GK.fit_transform() on a dataset, it shows the following error
Traceback (most recent call last):
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/scipy/optimize/_differentialevolution.py", line 1265, in __call__
return self.f(x, *self.args)
File "main_benchmarkKernels.py", line 68, in myFitness
thisKernelMatrix = GK.fit_transform(thisDataset)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/grakel/kernels/graphlet_sampling.py", line 309, in fit_transform
self.fit(X)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/grakel/kernels/kernel.py", line 117, in fit
self.initialize()
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/grakel/kernels/graphlet_sampling.py", line 229, in initialize
self.n_samples_ = n_samples
UnboundLocalError: local variable 'n_samples' referenced before assignment
By looking at the code (graphlet_sampling.py), a variable n_samples is assigned to self.n_samples_ even when sampling is None (line 229).
Maybe it's just a minor if/else issue? The if branch at line 158 considers the case when sampling is None and ends at line 160. The else-if branch starts at 161 and indeed declares n_samples, which can be assigned to self.n_samples_ at line 229.
Setup: Python 3.6.9 with grakel-dev==0.1a6
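The control flow described above can be reduced to a few lines. The sketch below is a hypothetical reduction, not the code from graphlet_sampling.py: it reproduces the UnboundLocalError when a name is bound in only one branch, and shows one way to avoid it by binding the name on every path.

```python
def initialize_buggy(sampling):
    """Hypothetical reduction of the reported control flow."""
    if sampling is None:
        mode = "all"
    elif isinstance(sampling, dict):
        n_samples = sampling["n_samples"]
        mode = "sampled"
    return mode, n_samples   # n_samples was never bound when sampling is None

def initialize_fixed(sampling):
    """Same flow with n_samples bound on every path."""
    n_samples = None
    if sampling is None:
        mode = "all"
    elif isinstance(sampling, dict):
        n_samples = sampling["n_samples"]
        mode = "sampled"
    else:
        raise TypeError("sampling must be None or a dict")
    return mode, n_samples

try:
    initialize_buggy(None)
    raised = False
except UnboundLocalError:
    raised = True

print(raised, initialize_fixed(None))
```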
Hi, sometimes a division by zero error happens when using the Propagation kernel.
It seems that's because of normalizing the rows of the transition matrix in the following line:
transition_matrix[i] = (T.T / np.sum(T, axis=1)).T
When I replaced the above line with the sklearn normalization function, the problem was solved and the division by zero no longer happened:
from sklearn.preprocessing import normalize
transition_matrix[i] = normalize(T, axis=1, norm='l1')
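The division by zero arises when a row of T sums to zero, for instance an isolated node. For readers who prefer to stay within NumPy, a guarded row normalization gives the same effect by leaving such rows at zero. This is a sketch that assumes non-negative entries; the helper name is hypothetical.

```python
import numpy as np

def row_normalize(T):
    """L1-normalise each row; rows that sum to zero are left as zeros."""
    T = np.asarray(T, dtype=float)
    row_sums = T.sum(axis=1, keepdims=True)
    out = np.zeros_like(T)
    np.divide(T, row_sums, out=out, where=row_sums != 0)
    return out

# The second node has no outgoing weight.
T = np.array([[1.0, 3.0], [0.0, 0.0]])
P = row_normalize(T)
print(P)
```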
Hi, this graph kernel repo is perfect. Could you provide an implementation of the EMD kernel, which was proposed in "Matching Node Embeddings for Graph Similarity" (AAAI'17)?