marius-team / marius
Large scale graph learning on a single machine.
Home Page: https://marius-project.org
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
We have previously observed occasional issues when using Python 3.8 and 3.9, but that behavior wasn't documented.
A user recently reported that they had trouble using Python 3.8 and 3.9 with the system. #55 (comment)
Describe the solution you'd like
The tox test suites should be modified to run under python 3.6-3.9.
We should document and fix any compatibility issues with specific python versions.
Hi,
I am trying to preprocess ogb_mag240m with marius_preprocess --dataset ogb_mag240m --output_dir datasets/ogb_mag240m/
but the process was killed due to OOM.
The dataset.yaml was only partially generated:
dataset_dir: /marius/datasets/ogb_mag240m/
num_edges: 1297748926
num_nodes: 121751666
num_relations: 1
num_train: 1297748926
num_valid: -1
num_test: -1
node_feature_dim: -1
rel_feature_dim: -1
num_classes: -1
initialized: false
The machine has 312GB of CPU memory, which is the most I am able to get. Is there any workaround that would let me run ogb_mag240m on this machine? Thank you.
Describe the bug
The path "./output_dir/" is hardcoded as the location where configuration files are generated.
If the directory doesn't exist then the preprocessing will throw an error.
To Reproduce
Any call to tools/preprocessing.py will hit this issue.
Expected behavior
The configuration files should be output to the proper directory
Environment
Affects all environments
Describe the bug
Running marius_preprocess or importing preprocess triggers the following error.
free(): invalid pointer
Aborted
To Reproduce
Steps to reproduce the behavior:
Environment
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
Python 3.8.5
Is your feature request related to a problem? Please describe.
We currently provide no enforcement of python style in our python sources and our testing of these sources is incomplete.
This means some easy to catch bugs are missed during testing: #86
Describe the solution you'd like
Add a set of linting tools which:
Flake8 seems like a solid tool which can achieve all of the above: https://flake8.pycqa.org/en/latest/index.html
Examples of its usage can be found in many open source libraries. Here are two:
PyKeen: https://github.com/pykeen/pykeen/blob/d9a93b07f85c839169530a3f0a4c8845c306602a/.flake8
Dask: https://github.com/dask/dask/blob/9bdc32a896e35f69770ec291d83655a1fd1a0346/setup.cfg
Describe alternatives you've considered
N/A
Additional context
N/A
Right now, the source code for both Python and C++ is scattered among several folders. It would be most idiomatic to have all Python code come under /src/marius and perhaps have /src/cpp for the other code. Not sure if this will be a problem with the extensions though.
Describe the bug
The test test/python/preprocessing/test_csv_preprocessor.py::TestGeneralParser::test_remap_ids_true
occasionally fails with an assertion error when it should pass.
2021-06-05T17:17:06.1877640Z =================================== FAILURES ===================================
2021-06-05T17:17:06.1878720Z ____________________ TestGeneralParser.test_remap_ids_true _____________________
2021-06-05T17:17:06.1879520Z
2021-06-05T17:17:06.1880940Z self = <test.python.preprocessing.test_csv_preprocessor.TestGeneralParser testMethod=test_remap_ids_true>
2021-06-05T17:17:06.1882130Z
2021-06-05T17:17:06.1882890Z def test_remap_ids_true(self):
2021-06-05T17:17:06.1883710Z """
2021-06-05T17:17:06.1885570Z Check if processed data has non-sequential ids if remap_ids is set
2021-06-05T17:17:06.1886600Z to True
2021-06-05T17:17:06.1887270Z """
2021-06-05T17:17:06.1888190Z general_parser([str(Path(input_dir) / Path(train_file)),
2021-06-05T17:17:06.1889190Z str(Path(input_dir) / Path(valid_file)),
2021-06-05T17:17:06.1890150Z str(Path(input_dir) / Path(test_file))],
2021-06-05T17:17:06.1891080Z ["srd"], [output_dir], remap_ids=True)
2021-06-05T17:17:06.1891820Z
2021-06-05T17:17:06.1892810Z internal_node_ids = np.fromfile(str(Path(output_dir)) /
2021-06-05T17:17:06.1893870Z Path("node_mapping.bin"), dtype=int)
2021-06-05T17:17:06.1895720Z internal_rel_ids = np.fromfile(str(Path(output_dir)) /
2021-06-05T17:17:06.1896890Z Path("rel_mapping.bin"), dtype=int)
2021-06-05T17:17:06.1897690Z
2021-06-05T17:17:06.1898440Z delta_list = []
2021-06-05T17:17:06.1899820Z for i in range(len(internal_node_ids) - 1):
2021-06-05T17:17:06.1901380Z delta_list.append(internal_node_ids[i+1] - internal_node_ids[i])
2021-06-05T17:17:06.1902890Z delta_list_1 = [i - 1 for i in delta_list]
2021-06-05T17:17:06.1903910Z delta_list_2 = [i + 1 for i in delta_list]
2021-06-05T17:17:06.1904900Z self.assertNotEqual(sum(delta_list_1), 0)
2021-06-05T17:17:06.1905920Z self.assertNotEqual(sum(delta_list_2), 0)
2021-06-05T17:17:06.1906960Z self.assertNotEqual(sum(delta_list), 0)
2021-06-05T17:17:06.1907760Z
2021-06-05T17:17:06.1908510Z delta_list = []
2021-06-05T17:17:06.1909830Z for i in range(len(internal_rel_ids) - 1):
2021-06-05T17:17:06.1911480Z delta_list.append(internal_rel_ids[i+1] - internal_rel_ids[i])
2021-06-05T17:17:06.1913010Z delta_list_1 = [i - 1 for i in delta_list]
2021-06-05T17:17:06.1913950Z delta_list_2 = [i + 1 for i in delta_list]
2021-06-05T17:17:06.1914920Z > self.assertNotEqual(sum(delta_list_1), 0)
2021-06-05T17:17:06.1915860Z E AssertionError: 0 == 0
2021-06-05T17:17:06.1931790Z
2021-06-05T17:17:06.1932910Z test/python/preprocessing/test_csv_preprocessor.py:326: AssertionError
To Reproduce
Running the python test suite can cause this error to occur: https://github.com/marius-team/marius/pull/43/checks?check_run_id=2753751019
Expected behavior
The test should reliably verify that the ids have been remapped, without spurious failures.
Environment
All environments.
Additional context
The issue seems to stem from delta_list_1 and delta_list_2. My guess is that if the IDs have been remapped to a specific set of values, that will cause this method of checking to fail.
Is there a different/easier way to check that the ids have been remapped? I think the old id values and the new id values can be just compared directly and asserted to be different to pass the test.
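One possible explanation, offered as a sketch rather than a confirmed diagnosis: the deltas telescope, so sum(delta_list) equals the last id minus the first id, and sum(delta_list_1) equals that value minus the number of deltas. The assertion therefore fails whenever the last id exceeds the first by exactly len(ids) - 1, which a random remapping of a small id set can produce from time to time. The Python below reproduces that case and shows the direct comparison suggested above. It assumes the mapping can be read as a list of new ids indexed by original position; ids_were_remapped is a hypothetical helper, not part of the test suite.

```python
def ids_were_remapped(original_ids, remapped_ids):
    """Hypothetical helper: True if at least one id changed position."""
    return list(original_ids) != list(remapped_ids)


# A shuffled mapping that happens to start at 0 and end at n - 1.
remapped = [0, 2, 1, 3]

# Same construction as delta_list / delta_list_1 in the failing test.
deltas = [b - a for a, b in zip(remapped, remapped[1:])]
sum_delta_minus_one = sum(d - 1 for d in deltas)

# Direct comparison against the original sequential ids.
original = list(range(len(remapped)))
direct_check = ids_were_remapped(original, remapped)
```

Here the ids are clearly shuffled, yet the delta-based sum is zero, so the original assertion would fail while the direct comparison passes.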
What is the documentation lacking? Please describe.
The documentation is only populated for describing the configuration files. The rest of the documentation needs to be filled out.
Describe the improvement you'd like
Add documentation for:
Additional context
The documentation should also be built and hosted automatically on the marius-project.org website. This can be put in a separate pull request.
Describe the bug
Traceback (most recent call last):
File "/Users/cthoyt/dev/marius/test.py", line 20, in <module>
fb15k_example()
File "/Users/cthoyt/dev/marius/test.py", line 8, in fb15k_example
train_set, eval_set = m.initializeDatasets(config)
RuntimeError: filesystem error: in copy_file: No such file or directory [training_data/marius/edges/train/edges.bin] [output_dir/train_edges.pt]
To Reproduce
I took the example from the README verbatim besides fixing the config path
import marius as m

def fb15k_example():
    config_path = "/Users/cthoyt/dev/marius/examples/training/configs/kinships_cpu.ini"
    config = m.parseConfig(config_path)
    train_set, eval_set = m.initializeDatasets(config)
    model = m.initializeModel(config.model.encoder_model, config.model.decoder_model)
    trainer = m.SynchronousTrainer(train_set, model)
    evaluator = m.SynchronousEvaluator(eval_set, model)
    trainer.train(1)
    evaluator.evaluate(True)

if __name__ == "__main__":
    fb15k_example()
Expected behavior
A clear and concise description of what you expected to happen.
Environment
macOS 11.2.3 Big Sur, Python 3.9.2, pip installed from the latest code on marius
What is the documentation lacking? Please describe.
Documentation on how to use marius_preprocess to download and convert the 21 supported datasets into Marius-trainable versions, and how to convert custom datasets into Marius-trainable versions.
Describe the improvement you'd like
Add documentation on how to use marius_preprocess on supported datasets and custom datasets.
Hi,
I've been reading the documentation for node classification and edge prediction tasks. I have a set of custom graphs I'd like to use for graph classification or graph-level embeddings for additional downstream tasks. Is this possible with the current version of Marius?
Thank you.
Describe the bug
Line 179 of csv_converter.py strips the output_dir address. If the address starts with /, the leading / is also stripped.
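A minimal Python sketch of the likely failure mode and one possible fix. It assumes that line 179 normalizes the path with something like str.strip("/"), which the report does not confirm; normalize_output_dir is a hypothetical helper name.

```python
def normalize_output_dir(path):
    """Hypothetical helper: keep any leading "/" and end with exactly one "/"."""
    return path.rstrip("/") + "/"


absolute = "/data/output_dir/"

# strip("/") removes the separator from both ends, turning an absolute
# path into a relative one.
stripped_both_ends = absolute.strip("/")

# rstrip("/") only touches the trailing separator.
normalized = normalize_output_dir(absolute)
```

Stripping only the right-hand side keeps absolute paths absolute while still giving downstream code a predictable trailing separator.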
To Reproduce
Steps to reproduce the behavior:
Run preprocess.py and enter an output_directory whose address starts with /
Expected behavior
The output_directory option should take any valid input addresses.
Environment
Any environment would have this problem.
Is your feature request related to a problem? Please describe.
Currently configuration generation is performed for every call to preprocess.py and three configuration files are generated. One for CPU training, one for GPU training, and one for multi-GPU training.
Describe the solution you'd like
We should put this generation into a separate optional step where it can be included in the preprocessing by adding a flag to the preprocessor call.
E.g.
python3 preprocess.py fb15k output_dir/ // No config generated
python3 preprocess.py fb15k output_dir/ --generate_config // generates a single-GPU training configuration file by default
python3 preprocess.py fb15k output_dir/ --generate_config=GPU // generates a single-GPU training configuration file
python3 preprocess.py fb15k output_dir/ --generate_config=CPU // generates a CPU training configuration file
python3 preprocess.py fb15k output_dir/ --generate_config=Multi-GPU // generates a multi-GPU training configuration file
Adding config arguments should be supported too.
python3 preprocess.py fb15k output_dir/ --generate_config=GPU <args>
python3 preprocess.py fb15k output_dir/ --generate_config=GPU --model.embedding_size=400 // generates a single-GPU training configuration file for 400 dimensional embeddings
We should also allow for this configuration generation to be called separately. E.g.
python3 generate_config.py <files> <args> // args should include --embedding_dimension --num_partitions and --config_type (GPU, CPU, multi-GPU)
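The flag semantics proposed above can be sketched with the standard argparse module. This is only an illustration under stated assumptions: the parser layout, the positional argument names, and the idea of collecting overrides such as --model.embedding_size=400 via parse_known_args are choices made for this example, not the project's implementation.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="preprocess.py")
    parser.add_argument("dataset")
    parser.add_argument("output_directory")
    # nargs="?" makes the value optional: absent flag -> None (no config),
    # bare flag -> const ("GPU"), explicit value -> that value.
    parser.add_argument(
        "--generate_config",
        nargs="?",
        const="GPU",
        default=None,
        choices=["GPU", "CPU", "Multi-GPU"],
    )
    return parser


parser = build_parser()
no_config = parser.parse_args(["fb15k", "output_dir/"])
default_gpu = parser.parse_args(["fb15k", "output_dir/", "--generate_config"])
cpu = parser.parse_args(["fb15k", "output_dir/", "--generate_config=CPU"])

# Unrecognized options are returned separately so they can be forwarded
# to the configuration generator as overrides.
gpu_args, overrides = parser.parse_known_args(
    ["fb15k", "output_dir/", "--generate_config=GPU", "--model.embedding_size=400"]
)
```

Note that with an optional value the flag is safest placed after the positional arguments, as in the examples above; otherwise argparse would try to read the dataset name as the flag's value.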
Describe alternatives you've considered
We could also disable this feature, but providing configuration file generation makes it easier on the users to get up and running on built-in and custom datasets.
Additional context
Eventually we will want to support generating a config based on the user's system characteristics. E.g. the user's system has 64 GB of memory and they want to train 400-dimensional embeddings on Freebase86m. We can set the number of partitions and buffer capacity to make good use of the available memory.
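A rough back-of-the-envelope sketch of the arithmetic such a generator might perform. Everything here is an assumption made for illustration: the node count is rounded to 86 million from the dataset name, values are 4-byte floats, optimizer state and edges are ignored, and the 32 GB budget and 16 partitions are arbitrary example inputs. The function names are hypothetical.

```python
import math


def embedding_bytes(num_nodes, dim, bytes_per_value=4):
    """Size of the full embedding table in bytes."""
    return num_nodes * dim * bytes_per_value


def max_buffer_capacity(num_nodes, dim, num_partitions, memory_budget_bytes,
                        bytes_per_value=4):
    """Largest number of partitions that fit in the memory budget at once."""
    nodes_per_partition = math.ceil(num_nodes / num_partitions)
    partition_bytes = nodes_per_partition * dim * bytes_per_value
    return min(num_partitions, memory_budget_bytes // partition_bytes)


# Assumed inputs: ~86M nodes, 400-dimensional embeddings.
total = embedding_bytes(86_000_000, 400)

# Assumed inputs: 16 partitions, 32 GB of the 64 GB reserved for embeddings.
capacity = max_buffer_capacity(86_000_000, 400, 16, 32 * 10**9)
```

Under these assumptions the table needs about 137.6 GB, more than the 64 GB available, which is why partitioning and a bounded in-memory buffer are needed at all.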
What is the documentation lacking? Please describe.
The variable output_directory in the input arguments is overloaded. The directory is used to include both the input and output data for a data set. Imprecise naming leads to wrong use of the system.
Describe the improvement you'd like
Rename the variable to data_directory instead of output_directory
Is your feature request related to a problem? Please describe.
Currently we have a testing framework initialized for the C++ code with GTest (although the tests aren't written yet). We also need to initialize a testing framework for the Python part of the codebase.
Describe the solution you'd like
Tox and Pytest seem to be good candidates for handling python tests.
Describe alternatives you've considered
Hypothesis is an interesting testing framework used by PyTorch: https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md
Additional context
Further work needs to be done to populate the tests for the cpp code and the python code
Hello,
I installed marius on my local server, using the command 'pip3 install .'.
I successfully installed it, but I cannot import marius.
If I try to import marius or run 'marius_preprocess', then I get the error below:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/.conda/envs/marius/lib/python3.7/site-packages/marius/__init__.py", line 13, in <module>
from . import _config as config
ImportError: /home/.conda/envs/marius/lib/python3.7/site-packages/marius/libmarius.so: undefined symbol: _ZNSt12experimental10filesystem2v16statusERKNS1_4pathE
Could I fix it and run marius??
The following is my setting:
cmake:
  version: 3.13.2
cpu_info:
  num_cpus: 64
  total_memory: 754GB
cuda:
  version: '10.2'
gpu_info:
- memory: 24GB
  name: NVIDIA TITAN RTX
marius:
  bindings_installed: false
  install_path: N/A
  version: N/A
openmp:
  version: '201511'
operating_system:
  platform: Linux-4.15.0-162-generic-x86_64-with-debian-buster-sid
pybind:
  PYBIND11_BUILD_ABI: _cxxabi1011
  PYBIND11_COMPILER_TYPE: _gcc
  PYBIND11_STDLIB: _libstdcpp
python:
  compiler: GCC 7.5.0
  deps:
    breathe_version: 4.33.1
    numpy_version: 1.21.6
    omegaconf_version: 2.2.1
    pandas_version: 1.3.5
    pip_version: 21.2.2
    pyspark_version: 3.2.1
    pytest_version: 7.1.2
    sphinx_rtd_theme_version: 1.0.0
    torch_version: 1.8.1+cu102
    tox_version: 3.25.0
  version: 3.7.13
pytorch:
  install_path: /home/.conda/envs/marius/lib/python3.7/site-packages/torch
  version: 1.8.1+cu102
Thanks
What is the documentation lacking? Please describe.
A code example accompanying the Marius++ paper
Describe the improvement you'd like
A code example accompanying the Marius++ paper
Additional context
Thank you for releasing this amazing repo! Have you released the code/examples to accompany the Marius++ paper? It would be great to be able to run the Marius++ code to better understand the system. Thank you
Testing Integration with Jira
Is your feature request related to a problem? Please describe.
We currently use Boost for command line argument parsing and parsing .ini configuration files.
Boost is very heavyweight and complicates the build process. Additionally, the download links to the boost library may fail: See #16.
Describe the solution you'd like
We should remove the dependency on Boost by switching to a lightweight library which can parse .ini files and command line arguments with the same semantics.
The implementation with the new library should match functionality with the current implementation in Boost.
Modifications will be largely contained to src/config.cpp.
One minor dependency on Boost's lockfree queues can be removed in src/buffer.cpp and replaced with a traditional lock + queue data structure.
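For readers unfamiliar with the "lock + queue" pattern mentioned above, here is a minimal illustration of the data structure. It is written in Python purely to show the idea; the actual change would be made in C++ in src/buffer.cpp, and the class and method names below are hypothetical.

```python
import threading
from collections import deque


class LockedQueue:
    """Plain FIFO queue guarded by a single mutex."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def push(self, item):
        # Only one thread at a time may modify the underlying deque.
        with self._lock:
            self._items.append(item)

    def try_pop(self):
        """Return (True, item), or (False, None) if the queue is empty."""
        with self._lock:
            if self._items:
                return True, self._items.popleft()
            return False, None


queue = LockedQueue()
queue.push("batch-0")
queue.push("batch-1")
first = queue.try_pop()
```

The trade-off compared with a lock-free queue is some contention under heavy concurrent access, in exchange for dropping the external dependency.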
Describe alternatives you've considered
We can implement our own parsing functionality if no libraries fit our requirements.
Additional context
We might not be able to find a library which does both config parsing and the command line parsing. If we cannot, we should pick one which can do the config parsing and then implement our own command line parser.
Is your feature request related to a problem? Please describe.
The converter for delimited files does not have a set of tests associated with it.
Describe the solution you'd like
We should add tests for each function in csv_converter.py which cover reasonable inputs and possible failure modes.
For example, for the general_parser function https://github.com/marius-team/marius/blob/main/tools/csv_converter.py#L118 we should test:
Part of this testing effort should be to add validators for the input arguments of general_parser to ensure no unreasonable values are passed into it, e.g. a dataset split of (.8, .8) or a format of "sxrd".
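A sketch of what such validators could look like. The rules are assumptions made for illustration: the split is taken to be a pair of fractions that must leave some data for training, and a format string is taken to be valid when it uses each of the characters s, r, d at most once and contains both s and d. The function names are hypothetical and not part of csv_converter.py.

```python
def validate_dataset_split(split):
    """Hypothetical validator for a pair of dataset split fractions."""
    first, second = split
    for fraction in (first, second):
        if not 0 <= fraction < 1:
            raise ValueError(f"split fraction out of range: {fraction}")
    if first + second >= 1:
        raise ValueError(f"split fractions leave no training data: {split}")


def validate_format(fmt):
    """Hypothetical validator for a column format string such as "srd"."""
    if not set(fmt) <= set("srd"):
        raise ValueError(f"unknown column character in format: {fmt!r}")
    if len(set(fmt)) != len(fmt):
        raise ValueError(f"repeated column character in format: {fmt!r}")
    if "s" not in fmt or "d" not in fmt:
        raise ValueError(f"format must contain both 's' and 'd': {fmt!r}")


def is_rejected(validator, value):
    """Small helper for demonstration: True if the validator raises."""
    try:
        validator(value)
    except ValueError:
        return True
    return False
```

With these rules, the two unreasonable inputs named above, (.8, .8) and "sxrd", are both rejected, while "srd" and a split such as (.1, .1) pass.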
Describe alternatives you've considered
The alternative is to leave it untested. No thanks.
Additional context
While testing we should note the ways we can improve and simplify the design of the preprocessing code and create a list of changes we will want to make in a future pull request. For example, https://github.com/marius-team/marius/blob/main/tools/csv_converter.py#L213, https://github.com/marius-team/marius/blob/main/tools/csv_converter.py#L238, and https://github.com/marius-team/marius/blob/main/tools/csv_converter.py#L252 should be put into a function and called.
Is your feature request related to a problem? Please describe.
Currently, we require the user to specify the output_directory.
We want to remove this requirement and set the output directory to match the name of the dataset by default. For custom datasets, we should choose a reasonable default name such as “custom_dataset/”
Users should still be able to specify the output directory if they wish.
Describe the solution you'd like
We remove output_directory as a required argument. For built-in datasets, we set the default base directory name to "_dataset"; for custom datasets, we set it to "custom_dataset".
Users are given the option to specify the output directory name if they want.
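A small sketch of the defaulting logic. The naming rule for built-in datasets is an assumed reading of the proposal (dataset name followed by "_dataset"), and the function and parameter names are hypothetical.

```python
from pathlib import Path


def resolve_output_dir(dataset_name=None, output_directory=None):
    """Hypothetical helper: pick the output directory for preprocessing."""
    if output_directory is not None:
        # An explicit user choice always wins.
        return Path(output_directory)
    if dataset_name is None:
        # Custom dataset with no name supplied.
        return Path("custom_dataset")
    # Assumed default for built-in datasets: "<name>_dataset".
    return Path(f"{dataset_name}_dataset")


built_in = resolve_output_dir(dataset_name="fb15k")
custom = resolve_output_dir()
explicit = resolve_output_dir(dataset_name="fb15k", output_directory="my_dir/")
```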
Additional context
This issue corresponds to MAR-51 (https://marius-project.atlassian.net/browse/MAR-51).
Hi,
I was trying to run Marius on the ogbn_products dataset on a GCP VM with the following specifications:
CPU: 16 x Intel Haswell
Memory: 60 GB
Storage: 1 x SSD 100 GB
GPU: 1 x NVIDIA Tesla P100
OS: Ubuntu 20.04
I installed and ran marius using docker following this.
OOM is triggered if all edges, features and embeddings are placed in DEVICE_MEMORY. Then, I tried to use mixed CPU-GPU following this sample file (where I think there may be some inconsistencies between the sample code and the explanation below for mixed CPU-GPU section). However, this triggers a RuntimeError where the error message is
root@7f36de16f004:/marius# marius_train examples/configuration/ogbn_products.yaml
[2022-07-05 15:01:54.234] [info] [marius.cpp:43] Start initialization
[07/05/22 15:02:03.393] Initialization Complete: 9.158s
[07/05/22 15:02:18.924] ################ Starting training epoch 1 ################
Traceback (most recent call last):
File "/usr/local/bin/marius_train", line 11, in <module>
load_entry_point('marius==0.0.2', 'console_scripts', 'marius_train')()
File "/usr/local/lib/python3.6/dist-packages/marius/console_scripts/marius_train.py", line 18, in main
m.manager.marius_train(config)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument index in method wrapper_index_select)
The configuration file is as follows:
model:
  learning_task: NODE_CLASSIFICATION
  encoder:
    train_neighbor_sampling:
      - type: ALL
      - type: ALL
      - type: ALL
    layers:
      - - type: FEATURE
          output_dim: 100
          bias: true
      - - type: GNN
          options:
            type: GRAPH_SAGE
            aggregator: MEAN
          input_dim: 100
          output_dim: 100
          bias: true
      - - type: GNN
          options:
            type: GRAPH_SAGE
            aggregator: MEAN
          input_dim: 100
          output_dim: 100
          bias: true
      - - type: GNN
          options:
            type: GRAPH_SAGE
            aggregator: MEAN
          input_dim: 100
          output_dim: 47
          bias: true
  decoder:
    type: NODE
  loss:
    type: CROSS_ENTROPY
    options:
      reduction: SUM
  dense_optimizer:
    type: ADAM
    options:
      learning_rate: 0.01
storage:
  device_type: cuda
  dataset:
    dataset_dir: datasets/ogbn_products/
  edges:
    type: HOST_MEMORY
    options:
      dtype: int
  features:
    type: HOST_MEMORY
    options:
      dtype: float
  embeddings:
    type: HOST_MEMORY
    options:
      dtype: float
training:
  batch_size: 100
  num_epochs: 2
  pipeline:
    sync: true
evaluation:
  batch_size: 100
  pipeline:
    sync: true
I tried different combinations of setting the type of edges, features and embeddings, but all of them gave the same RuntimeError.
How can I resolve this error and get Marius to run on ogbn_products using the GPU? Thank you.
Hi,
I want to run marius with ogbn-products dataset.
I executed the following command:
marius_preprocess --dataset ogbn_products --output_dir datasets/ogbn_products
There was no problem running it, but only 'train_edges.bin' was created in ogbn_products/edges.
There is no 'test_edges.bin' and 'validation_edges.bin'.
How could I get them??
Thanks a lot.
Is your feature request related to a problem? Please describe.
The Python bindings need additional implementation to support custom models.
Describe the solution you'd like
We should be able to support defining a custom model in the python API by doing the following:
import pymarius as m

class customRelationOperator(m.RelationOperator):
    def forward(self, node_embs, rel_embs):
        return node_embs + rel_embs

class customComparator(m.Comparator):
    def forward(self, src_embs, dst_embs):
        return src_embs * dst_embs

class CustomModel(m.Model):
    def __init__(self):
        super().__init__()
        self.decoder = m.LinkPredictionDecoder(customComparator(), customRelationOperator())
We may need to make modifications of the c++ to support these semantics.
Tests should be written for the custom models here: https://github.com/marius-team/marius/tree/main/test/python
We should test:
Describe alternatives you've considered
Alternative designs for custom models might require large changes to the core c++ code.
Additional context
For the rest of the bindings, we will add their tests in a future pull request.
Hi,
I built the executables (marius_train and marius_eval) using CMakeLists.txt.
However, when I run an executable as shown in the example on GitHub, an error occurs.
$ ./marius_train examples/configuration/fb15k_237.yaml
Result:
Aborted (core dumped)
Are the executables created through CMake not working at the moment? Or should the input be different from what is used when running Marius through Python?
Thanks
Is your feature request related to a problem? Please describe.
Marius currently supports the following backends for storing parameters and training data:
Parquet files are commonly used for handling large amounts of data. Currently, if a user has a large amount of training data (edges) that is stored in a parquet file, they will have to convert the file into the flat file format. This conversion process is handled as a preprocessing step and will likely require the data to be copied.
Describe the solution you'd like
To avoid unnecessary copies of large amounts of data and expensive preprocessing, we should support a Parquet file backend directly, using https://github.com/apache/parquet-cpp or https://github.com/apache/arrow.
Describe alternatives you've considered
A preprocessor step can be written which converts the input Parquet file into the file format required by the FlatFile backend.
Additional context
This will add an additional dependency on spark to the system (This could be a heavy dependency). We should make this dependency optional as not all users will be operating with parquet files.
Parquet-cpp has merged with https://github.com/apache/arrow. So we can use that instead.
Is your feature request related to a problem? Please describe.
Users shouldn't have to build Marius with cmake to use it. We should provide pip install capabilities to simplify usage.
Describe the solution you'd like
Building and installing Marius from source:
git clone https://github.com/marius-team/marius.git
cd marius
python3 -m pip install .
Installing Marius from PyPi:
python3 -m pip install pymarius
// or
python3 -m pip install marius
Describe alternatives you've considered
The alternative is to make people build it themselves from source with the instructions in the docs. Not great...
Additional context
Should we separate out installing the Python API vs. the config based executable?
Describe the bug
The kinships dataset only has 100 edges, yet uses a batch size of 1000. Other parameters are also far too large for this dataset.
https://github.com/marius-team/marius/blob/main/examples/training/configs/kinships_cpu.ini
https://github.com/marius-team/marius/blob/main/examples/training/configs/kinships_gpu.ini
To Reproduce
See #23
Expected behavior
We should be providing reasonable hyperparameters for each dataset.
Environment
All environments
Additional context
Other datasets might also have this issue. We should check each one to make sure the values are at least reasonable. For future work we should tune them to optimal values.
Describe the bug
The example training scripts in ./examples/training/scripts/ have not been updated to match the latest documentation.
To Reproduce
The scripts in ./examples/training/scripts/ do not work with the current version of marius_preprocess.
Expected behavior
The scripts in ./examples/training/scripts/ work with the current version of marius_preprocess.
Environment
This issue should be present in any environment.
What is the documentation lacking? Please describe.
Developers need to know how to contribute to Marius.
Describe the improvement you'd like
Add a CONTRIBUTING.md file to the repo which describes development instructions and the workflow.
Describe the bug
The paths of the node_ids.bin and rel_ids.bin files are still generated by the config generator. However, these two files are no longer required by Marius and are not generated during preprocessing.
To Reproduce
Steps to reproduce the behavior:
path.node_ids and path.rel_ids are present. This is unwanted.
Expected behavior
There should not be path.node_ids and path.rel_ids in the configuration file generated by the config generator.
Environment
Any environment would have this bug.
What is the documentation lacking? Please describe.
Please add a clear description of the output of pre-processing; specifically, describe all files, their format, schema, and encoding requirements that are output by pre-processing.
Describe the improvement you'd like
Add this description in the comments of the general_parser function
Additional context
The above enhancement will enable writing custom (scalable) pre-processors that can emit Marius input files and won't require one starting from a raw CSV file.
Our current functionality is limited. We only support DistMult, ComplEx, and TransE, with double-sided relation embeddings.
We should expand our functionality by adding more models to Marius. The first thing to do is to scope out which models are out there and which can be implemented easily in our current abstractions.
A starting point is to look into the models supported by PyKeen:
List:
https://github.com/pykeen/pykeen#models-26
Implementation:
https://github.com/pykeen/pykeen/blob/master/src/pykeen/nn/functional.py
Documentation:
https://pykeen.readthedocs.io/en/stable/api/pykeen.nn.functional.convkb_interaction.html
Once we get a better handle on which models are out there, we can see in what ways our current abstractions are lacking and how we can improve them.
For each decoder model we should ask the following questions:
Describe the bug
I am trying to build marius on a google cloud vm. The build went smoothly until the final step at which point I got an error at the linking step:
[100%] Built target marius
Scanning dependencies of target marius_train
[100%] Building CXX object CMakeFiles/marius_train.dir/src/marius.cpp.o
[100%] Linking CXX executable marius_train
/usr/bin/ld: libmarius.so: undefined reference to `std::filesystem::copy_file(std::filesystem::path const&, std::filesystem::path const&, std::filesystem::copy_options)'
/usr/bin/ld: libmarius.so: undefined reference to `std::filesystem::path::_M_split_cmpts()'
/usr/bin/ld: libmarius.so: undefined reference to `std::filesystem::status(std::filesystem::path const&)'
/usr/bin/ld: libmarius.so: undefined reference to `std::filesystem::rename(std::filesystem::path const&, std::filesystem::path const&)'
collect2: error: ld returned 1 exit status
make[3]: *** [CMakeFiles/marius_train.dir/build.make:100: marius_train] Error 1
make[2]: *** [CMakeFiles/Makefile2:113: CMakeFiles/marius_train.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:125: CMakeFiles/marius_train.dir/rule] Error 2
make: *** [Makefile:188: marius_train] Error 2
To Reproduce
Steps to reproduce the behavior:
Follow the installation instruction from github README:
git clone https://github.com/marius-team/marius.git
cd marius
python3 -m pip install -r requirements.txt
mkdir build
cd build
cmake ../ -DUSE_CUDA=1
make marius_train -j
Expected behavior
The final step of the build should create marius executables, presumably in the build/bin directory.
Environment
This build was attempted on a google cloud vm with these parameters:
boot disk = tensorflow-2-4-20210414-140511-boot
Environment version = M65
I can provide more environment details after sshing into the instance, but not sure what is relevant for the above.
Describe the bug
The content of node_mapping.txt and rel_mapping.txt is not up to date. The mapping between original ids and remapped ids is still represented by both *_mapping.txt and *.bin files. According to the documentation, only *_mapping.txt files are required. The original ids and remapped ids should be the first and second columns of the *_mapping.txt files.
To Reproduce
Run marius_preprocess. This issue can be found by checking the files included in the output directory and the contents of the *_mapping.txt files.
Expected behavior
According to the documentation, only *_mapping.txt files are required. The original ids and remapped ids should be the first and second columns of the *_mapping.txt files.
Environment
All versions would have this problem.
Additional context
marius_preprocess and marius_postprocess should be updated according to the documentation.
What is the documentation lacking? Please describe.
The documentation does not clearly state how to use config_generator to generate configs for custom and supported datasets.
Describe the improvement you'd like
Explicitly mention that, when generating configs for custom datasets, the --stats option should be used to manually input dataset stats, while for supported datasets the --datasets option should be used.
I was trying to install with pip and the following error keeps popping up:
ERROR: Could not find a version that satisfies the requirement torch (from marius==0.0.2) (from versions: none)
ERROR: No matching distribution found for torch (from marius==0.0.2)
I tried running pip install torchvision==0.1.8 on the command line and it showed Successfully installed torch-1.11.0 torchvision-0.1.8. Then, when I ran pip3 install . again, the same error appeared. How can I solve this so I can proceed? Thank you.
Is your feature request related to a problem? Please describe.
As mentioned in PR#45, we need a script which can perform conversion for example configuration files to new versions when changes to the configurations options are made.
Describe the solution you'd like
We need a script that can perform the following basic operations:
Describe alternatives you've considered
We can add additional functions in the future.
Is your feature request related to a problem? Please describe.
The current version of marius_predict doesn't support prediction for datasets with only 1 relation type (no relation column).
Describe the solution you'd like
Enable marius_predict to perform link prediction for datasets with only 1 relation type.
Describe the bug
Currently, marius_predict and marius_postprocess assume "marius" as the value of general.experiment_name. If other values are used, these two tools do not work.
To Reproduce
Steps to reproduce the behavior:
Use a value of general.experiment_name other than "marius"
Run marius_predict or marius_postprocess
Expected behavior
Multiple experiments with different general.experiment_name values can have separate directories under the base directory (defaults to data/) for trained data. marius_predict and marius_postprocess should be able to handle directories created by different experiments.
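A minimal sketch of deriving the experiment directory from the configuration instead of hardcoding "marius". The layout (base directory, then experiment name, then subdirectories such as embeddings/) is inferred from the data/marius/embeddings/ paths that appear in Marius debug logs; the function name and parameters are hypothetical.

```python
from pathlib import Path


def experiment_dir(experiment_name="marius", base_directory="data/"):
    """Hypothetical helper: directory holding one experiment's trained data."""
    return Path(base_directory) / experiment_name


default_embeddings = experiment_dir() / "embeddings"
other_embeddings = experiment_dir("my_experiment") / "embeddings"
```

Both tools could then resolve their input paths through one such helper, so that a non-default general.experiment_name is picked up consistently.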
Environment
Any environment will have this issue.
Is your feature request related to a problem? Please describe.
Debugging information from the system is currently limited, and when Marius is installed with pip, the full c++ stack trace is hidden from users and only the error message will be output, as in #55.
We currently have a debug log level, which needs to be better utilized to print out useful debug information at reasonable spots in the code.
Also, we should make it easier for users to report their environment, like PyTorch does with this script: https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
Describe the solution you'd like
Detailed debug logs should be created for major checkpoints in the code: reading the config, creating storage objects, initializing the model, ...
A Python script based on the above torch script should be added as a separate Python tool, marius_environment.
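A minimal sketch of what such an environment report could collect, using only the standard library. The function name `collect_environment` is hypothetical, the distribution names "torch" and "marius" are assumptions, and `importlib.metadata` requires Python 3.8 or later. Versions are read from package metadata so that nothing is imported while collecting the report.

```python
import platform
import sys
from importlib import metadata


def collect_environment(packages=("torch", "marius")):
    """Gather platform, Python, and installed package versions for a bug report."""
    info = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            # Reads installed metadata only; the package itself is not imported.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info
```

Printing each key and value of the returned dictionary gives a short block that a user could paste into the Environment section of an issue.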
Describe the bug
I successfully installed the program and it passed test/cpp/end_to_end. Then, when I tried to execute examples/training/scripts/fb15k_gpu.sh (and also some other configs with GPU enabled), it triggered a nll_loss_backward_reduce_cuda_kernel_2d assertion failure.
To Reproduce
Steps to reproduce the behavior:
Run bash examples/training/scripts/fb15k_gpu.sh
The marius_preprocess step executes without any problems
When marius_train proceeds to backward for the first batch of the first epoch, the following error occurs:
nfp@node19:~/marius$ bash examples/training/scripts/fb15k_gpu.sh
fb15k
Downloading fb15k.tgz to output_dir/fb15k.tgz
Extracting
Extraction completed
Detected delimiter: ~ ~
Reading in output_dir/freebase_mtr100_mte100-train.txt 1/3
Reading in output_dir/freebase_mtr100_mte100-valid.txt 2/3
Reading in output_dir/freebase_mtr100_mte100-test.txt 3/3
Number of instance per file:[483142, 50000, 59071]
Number of nodes: 14951
Number of edges: 592213
Number of relations: 1345
Delimiter: ~ ~
['/home/nfp/.local/bin/marius_train', 'examples/training/configs/fb15k_gpu.ini']
[info] [10/28/21 22:12:59.865] Start preprocessing
[debug] [10/28/21 22:12:59.866] Initializing Model
[debug] [10/28/21 22:12:59.866] Empty Encoder
[debug] [10/28/21 22:12:59.866] DistMult Decoder
[debug] [10/28/21 22:12:59.867] data/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/embeddings/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/relations/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/edges/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/edges/train/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/edges/evaluation/ directory already exists
[debug] [10/28/21 22:12:59.867] data/marius/edges/test/ directory already exists
[debug] [10/28/21 22:12:59.880] Edges: DeviceMemory storage initialized
[debug] [10/28/21 22:12:59.894] Edges shuffled
[debug] [10/28/21 22:12:59.894] Edge storage initialized. Train: 483142, Valid: 50000, Test: 59071
[debug] [10/28/21 22:13:00.004] Node embeddings: DeviceMemory storage initialized
[debug] [10/28/21 22:13:00.004] Node embeddings state: DeviceMemory storage initialized
[debug] [10/28/21 22:13:00.004] Node embeddings initialized: 14951
[debug] [10/28/21 22:13:00.014] Relation embeddings: DeviceMemory storage initialized
[debug] [10/28/21 22:13:00.014] Relation embeddings state: DeviceMemory storage initialized
[debug] [10/28/21 22:13:00.014] Relation embeddings initialized: 1345
[debug] [10/28/21 22:13:00.014] Getting batches from edge list
[info] [10/28/21 22:13:00.014] Training set initialized
[debug] [10/28/21 22:13:00.014] Getting batches from edge list
[debug] [10/28/21 22:13:00.014] Batches initialized
[info] [10/28/21 22:13:00.015] Evaluation set initialized
[info] [10/28/21 22:13:00.015] Preprocessing Complete: 0.149s
[debug] [10/28/21 22:13:00.032] Loaded training set
[info] [10/28/21 22:13:00.032] ################ Starting training epoch 1 ################
[trace] [10/28/21 22:13:00.032] Starting Batch. ID 0, Starting Index 0, Batch Size 10000
[trace] [10/28/21 22:13:00.034] Batch: 0 Accumulated 11109 unique embeddings
[trace] [10/28/21 22:13:00.034] Batch: 0 Accumulated 640 unique relations
[trace] [10/28/21 22:13:00.034] Batch: 0 Indices sent to device
[trace] [10/28/21 22:13:00.034] Batch: 0 Node Embeddings read
[trace] [10/28/21 22:13:00.034] Batch: 0 Node State read
[trace] [10/28/21 22:13:00.034] Batch: 0 Relation Embeddings read
[trace] [10/28/21 22:13:00.034] Batch: 0 Relation State read
[trace] [10/28/21 22:13:00.035] Batch: 0 prepared for compute
[debug] [10/28/21 22:13:00.040] Loss: 124804.266, Regularization loss: 0.012812799
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [12,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [13,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [14,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [18,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [19,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [20,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [21,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [22,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [24,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [25,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [27,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [28,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [29,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [31,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "/home/nfp/.local/bin/marius_train", line 8, in <module>
sys.exit(main())
File "/home/nfp/.local/lib/python3.6/site-packages/marius/console_scripts/marius_train.py", line 8, in main
m.marius_train(len(sys.argv), sys.argv)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from launch_unrolled_kernel at /pytorch/aten/src/ATen/native/cuda/CUDALoops.cuh:132 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f95645bcd62 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_kernel_impl<at::native::BinaryFunctor<float, float, float, at::native::AddFunctor<float> > >(at::TensorIteratorBase&, at::native::BinaryFunctor<float, float, float, at::native::AddFunctor<float> > const&) + 0xb37 (0x7f95665b2f27 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #2: void at::native::gpu_kernel<at::native::BinaryFunctor<float, float, float, at::native::AddFunctor<float> > >(at::TensorIteratorBase&, at::native::BinaryFunctor<float, float, float, at::native::AddFunctor<float> > const&) + 0x113 (0x7f95665bf333 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #3: void at::native::opmath_gpu_kernel_with_scalars<float, float, float, at::native::AddFunctor<float> >(at::TensorIteratorBase&, at::native::AddFunctor<float> const&) + 0xa9 (0x7f95665bf4c9 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #4: <unknown function> + 0xe5d953 (0x7f9566592953 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #5: at::native::add_kernel_cuda(at::TensorIteratorBase&, c10::Scalar const&) + 0x15 (0x7f95665930a5 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #6: <unknown function> + 0xe5e0cf (0x7f95665930cf in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #7: at::native::structured_sub_out::impl(at::Tensor const&, at::Tensor const&, c10::Scalar const&, at::Tensor const&) + 0x40 (0x7f95a9f1ef00 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x25e52ab (0x7f9567d1a2ab in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #9: <unknown function> + 0x25e5372 (0x7f9567d1a372 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #10: at::_ops::sub_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0xb9 (0x7f95aa55d3f9 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x34be046 (0x7f95ac03c046 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #12: <unknown function> + 0x34be655 (0x7f95ac03c655 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::_ops::sub_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x13f (0x7f95aa5b5b2f in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #14: <unknown function> + 0x3f299b0 (0x7f95acaa79b0 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #15: torch::autograd::generated::LogsumexpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x1dc (0x7f95abd1447c in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #16: <unknown function> + 0x3896817 (0x7f95ac414817 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #17: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&) + 0x145b (0x7f95ac40fa7b in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #18: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&) + 0x57a (0x7f95ac4107aa in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #19: torch::autograd::Engine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) + 0x89 (0x7f95ac4081c9 in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #20: <unknown function> + 0xc71f (0x7f962b3ad71f in /home/nfp/.local/lib/python3.6/site-packages/torch/lib/libtorch_cuda.so)
frame #21: <unknown function> + 0x76db (0x7f962d01f6db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #22: clone + 0x3f (0x7f962d35871f in /lib/x86_64-linux-gnu/libc.so.6)
Expected behavior
The program works well for CPU configs:
nfp@node19:~/marius$ bash examples/training/scripts/fb15k_cpu.sh
fb15k
Downloading fb15k.tgz to output_dir/fb15k.tgz
Extracting
Extraction completed
Detected delimiter: ~ ~
Reading in output_dir/freebase_mtr100_mte100-train.txt 1/3
Reading in output_dir/freebase_mtr100_mte100-valid.txt 2/3
Reading in output_dir/freebase_mtr100_mte100-test.txt 3/3
Number of instance per file:[483142, 50000, 59071]
Number of nodes: 14951
Number of edges: 592213
Number of relations: 1345
Delimiter: ~ ~
['/home/nfp/.local/bin/marius_train', 'examples/training/configs/fb15k_cpu.ini']
[info] [10/28/21 22:19:07.259] Start preprocessing
[info] [10/28/21 22:19:08.397] Training set initialized
[info] [10/28/21 22:19:08.397] Evaluation set initialized
[info] [10/28/21 22:19:08.397] Preprocessing Complete: 1.137s
[info] [10/28/21 22:19:08.410] ################ Starting training epoch 1 ################
[info] [10/28/21 22:19:08.904] Total Edges Processed: 50000, Percent Complete: 0.099
[info] [10/28/21 22:19:09.252] Total Edges Processed: 95000, Percent Complete: 0.198
[info] [10/28/21 22:19:09.700] Total Edges Processed: 152000, Percent Complete: 0.298
[info] [10/28/21 22:19:09.998] Total Edges Processed: 190000, Percent Complete: 0.397
[info] [10/28/21 22:19:10.418] Total Edges Processed: 237000, Percent Complete: 0.496
[info] [10/28/21 22:19:10.809] Total Edges Processed: 286000, Percent Complete: 0.595
[info] [10/28/21 22:19:11.211] Total Edges Processed: 336000, Percent Complete: 0.694
[info] [10/28/21 22:19:11.567] Total Edges Processed: 383000, Percent Complete: 0.793
[info] [10/28/21 22:19:11.958] Total Edges Processed: 432000, Percent Complete: 0.893
[info] [10/28/21 22:19:12.320] Total Edges Processed: 478000, Percent Complete: 0.992
[info] [10/28/21 22:19:12.357] ################ Finished training epoch 1 ################
[info] [10/28/21 22:19:12.357] Epoch Runtime (Before shuffle/sync): 3946ms
[info] [10/28/21 22:19:12.357] Edges per Second (Before shuffle/sync): 122438.414
[info] [10/28/21 22:19:12.358] Pipeline flush complete
[info] [10/28/21 22:19:12.374] Edges Shuffled
[info] [10/28/21 22:19:12.374] Epoch Runtime (Including shuffle/sync): 3963ms
[info] [10/28/21 22:19:12.374] Edges per Second (Including shuffle/sync): 121913.195
[info] [10/28/21 22:19:12.389] Starting evaluating
[info] [10/28/21 22:19:12.709] Pipeline flush complete
[info] [10/28/21 22:19:15.909] Num Eval Edges: 50000
[info] [10/28/21 22:19:15.909] Num Eval Batches: 50
[info] [10/28/21 22:19:15.909] Auc: 0.941, Avg Ranks: 40.139, MRR: 0.336, Hits@1: 0.212, Hits@5: 0.476, Hits@10: 0.600, Hits@20: 0.707, Hits@50: 0.827, Hits@100: 0.895
[info] [10/28/21 22:19:15.920] Evaluation complete: 3531ms
[info] [10/28/21 22:19:15.931] ################ Starting training epoch 2 ################
[info] [10/28/21 22:19:16.361] Total Edges Processed: 46000, Percent Complete: 0.099
[info] [10/28/21 22:19:16.900] Total Edges Processed: 97000, Percent Complete: 0.198
[info] [10/28/21 22:19:17.424] Total Edges Processed: 156000, Percent Complete: 0.298
[info] [10/28/21 22:19:17.697] Total Edges Processed: 189000, Percent Complete: 0.397
[info] [10/28/21 22:19:18.078] Total Edges Processed: 238000, Percent Complete: 0.496
[info] [10/28/21 22:19:18.466] Total Edges Processed: 288000, Percent Complete: 0.595
[info] [10/28/21 22:19:18.825] Total Edges Processed: 336000, Percent Complete: 0.694
[info] [10/28/21 22:19:19.160] Total Edges Processed: 381000, Percent Complete: 0.793
[info] [10/28/21 22:19:19.584] Total Edges Processed: 436000, Percent Complete: 0.893
[info] [10/28/21 22:19:19.909] Total Edges Processed: 481000, Percent Complete: 0.992
[info] [10/28/21 22:19:19.928] ################ Finished training epoch 2 ################
[info] [10/28/21 22:19:19.928] Epoch Runtime (Before shuffle/sync): 3997ms
[info] [10/28/21 22:19:19.928] Edges per Second (Before shuffle/sync): 120876.16
[info] [10/28/21 22:19:19.929] Pipeline flush complete
[info] [10/28/21 22:19:19.947] Edges Shuffled
[info] [10/28/21 22:19:19.948] Epoch Runtime (Including shuffle/sync): 4016ms
[info] [10/28/21 22:19:19.948] Edges per Second (Including shuffle/sync): 120304.29
[info] [10/28/21 22:19:19.961] Starting evaluating
[info] [10/28/21 22:19:20.246] Pipeline flush complete
[info] [10/28/21 22:19:20.255] Num Eval Edges: 50000
[info] [10/28/21 22:19:20.255] Num Eval Batches: 50
[info] [10/28/21 22:19:20.255] Auc: 0.972, Avg Ranks: 21.458, MRR: 0.431, Hits@1: 0.294, Hits@5: 0.595, Hits@10: 0.719, Hits@20: 0.812, Hits@50: 0.906, Hits@100: 0.949
[info] [10/28/21 22:19:20.271] Evaluation complete: 309ms
[info] [10/28/21 22:19:20.282] ################ Starting training epoch 3 ################
[info] [10/28/21 22:19:20.694] Total Edges Processed: 47000, Percent Complete: 0.099
[info] [10/28/21 22:19:21.042] Total Edges Processed: 95000, Percent Complete: 0.198
[info] [10/28/21 22:19:21.425] Total Edges Processed: 143000, Percent Complete: 0.298
[info] [10/28/21 22:19:21.872] Total Edges Processed: 203000, Percent Complete: 0.397
^C[info] [10/28/21 22:19:22.195] Total Edges Processed: 244000, Percent Complete: 0.496
[info] [10/28/21 22:19:22.561] Total Edges Processed: 288000, Percent Complete: 0.595
[info] [10/28/21 22:19:22.971] Total Edges Processed: 342000, Percent Complete: 0.694
[info] [10/28/21 22:19:23.266] Total Edges Processed: 380000, Percent Complete: 0.793
[info] [10/28/21 22:19:23.747] Total Edges Processed: 438000, Percent Complete: 0.893
[info] [10/28/21 22:19:24.101] Total Edges Processed: 479142, Percent Complete: 0.992
...
Environment
I tried on 2 machines and got the same error.
Platform: linux (Ubuntu 18.04 LTS)
Python version: 3.6.9
Pytorch version: 1.10.0+cu102; 1.10.0+cu113
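For reference, the assertion `t >= 0 && t < n_classes` in the GPU log checks that every target index passed to the NLL loss lies in the range [0, n_classes). A minimal sketch of the same check on the host, using plain Python lists rather than tensors; the function name is hypothetical and this is not part of Marius. It only shows what the assertion tests, not why the targets would be out of range in this run.

```python
def out_of_range_targets(targets, n_classes):
    """Return (position, value) pairs for targets outside [0, n_classes)."""
    return [(i, t) for i, t in enumerate(targets) if not 0 <= t < n_classes]
```

An empty result means every target would pass the device-side assertion.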
Describe the bug
The training edge list file output by the preprocessor is incomplete: only a single chunk of the unprocessed edges is written.
To Reproduce
Any call to tools/preprocessor.py will hit this issue
Expected behavior
The full input training file should be output into the binary format.
Environment
All environments will hit this issue
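A minimal sketch of the expected behavior, in which every chunk is written to the same open file. This is not the preprocessor's code: the function name, the chunk size, and the on-disk layout (little-endian int32 triples of source, relation, destination) are assumptions made for illustration.

```python
import struct


def write_edges_in_chunks(edges, path, chunk_size=2):
    """Write (src, rel, dst) int32 triples to `path`, one chunk at a time."""
    written = 0
    # The file is opened once, so later chunks are appended after earlier ones
    # instead of replacing them.
    with open(path, "wb") as f:
        for start in range(0, len(edges), chunk_size):
            chunk = edges[start:start + chunk_size]
            for src, rel, dst in chunk:
                f.write(struct.pack("<3i", src, rel, dst))
            written += len(chunk)
    return written
```

Comparing the returned count, or the file size divided by 12 bytes per edge, with the number of input edges is a quick way to confirm that no chunk was dropped.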
I'm trying to create embeddings for Wikidata, using this conf file
[general]
device=CPU
num_train=611058458
num_nodes=91580024
num_valid=612283
num_test=612283
experiment_name=wikidata
num_relations=1390
...
However, I am getting the error:
ValueError: cannot create std::vector larger than max_size()
Looking for any workaround, thanks
Is your feature request related to a problem? Please describe.
Users currently need to input custom dataset statistics manually when using config_generator.
Describe the solution you'd like
It would be good to have the preprocessor store a JSON file containing all the required custom dataset statistics, and to add an option to config_generator so that users can generate configuration files from the stored statistics.
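A minimal sketch of the proposed round trip, assuming the statistics are the counts that already appear in the configuration files (num_nodes, num_relations, num_train, num_valid, num_test). The function names and the file layout are hypothetical.

```python
import json


def save_dataset_stats(path, num_nodes, num_relations, num_train, num_valid, num_test):
    """Store dataset statistics as JSON so they can be reused later."""
    stats = {
        "num_nodes": num_nodes,
        "num_relations": num_relations,
        "num_train": num_train,
        "num_valid": num_valid,
        "num_test": num_test,
    }
    with open(path, "w") as f:
        json.dump(stats, f, indent=2)
    return stats


def load_dataset_stats(path):
    """Read the statistics back, e.g. when generating a configuration file."""
    with open(path) as f:
        return json.load(f)
```

The preprocessing step would call the first function once, and the config generator would call the second instead of asking the user for each value.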
An issue to test if things are properly propagated to JIRA.
Describe the bug
The macOS pip install test throws a GIL error even though all tests pass: https://github.com/marius-team/marius/runs/2401968116
Could be an issue with Python 3.9 since the linux workflow passes but uses Python 3.8. Possibly related to pytorch/pytorch#49370
Output:
2021-04-21T16:01:51.7260960Z ##[group]Run python3 -c "import marius as m"
2021-04-21T16:01:51.7261640Z python3 -c "import marius as m"
2021-04-21T16:01:51.7262230Z python3 -c "from marius.tools import preprocess"
2021-04-21T16:01:51.7262850Z marius_preprocess fb15k output_dir/
2021-04-21T16:01:51.7263760Z pytest test
2021-04-21T16:01:51.8917040Z shell: /bin/bash --noprofile --norc -e -o pipefail {0}
2021-04-21T16:01:51.8917570Z env:
2021-04-21T16:01:51.8918010Z BUILD_TYPE: Release
2021-04-21T16:01:51.8918430Z ##[endgroup]
2021-04-21T16:02:03.4541320Z fb15k
2021-04-21T16:02:03.4642510Z Downloading fb15k.tgz to output_dir/fb15k.tgz
2021-04-21T16:02:03.4658930Z Extracting
2021-04-21T16:02:03.4659870Z Extraction completed
2021-04-21T16:02:03.4660660Z Detected delimiter:
2021-04-21T16:02:03.4662650Z Reading in output_dir/freebase_mtr100_mte100-train.txt 1/3
2021-04-21T16:02:03.4664160Z Reading in output_dir/freebase_mtr100_mte100-valid.txt 2/3
2021-04-21T16:02:03.4665790Z Reading in output_dir/freebase_mtr100_mte100-test.txt 3/3
2021-04-21T16:02:03.4666760Z Number of instance per file:[483142, 50000, 59071]
2021-04-21T16:02:03.4667560Z Number of nodes: 14951
2021-04-21T16:02:03.4668370Z Number of edges: 592213
2021-04-21T16:02:03.4669180Z Number of relations: 1345
2021-04-21T16:02:03.4670000Z Delimiter: ~ ~
2021-04-21T16:02:05.0357020Z ============================= test session starts ==============================
2021-04-21T16:02:05.0358980Z platform darwin -- Python 3.9.4, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
2021-04-21T16:02:05.0360090Z rootdir: /Users/runner/work/marius/marius
2021-04-21T16:02:05.0360930Z collected 29 items
2021-04-21T16:02:05.0361460Z
2021-04-21T16:04:46.3756720Z test/python/bindings/test_fb15k.py . [ 3%]
2021-04-21T16:04:46.4321450Z test/python/preprocessing/test_config_generator_cmd_opt_parsing.py ..... [ 20%]
2021-04-21T16:04:47.7820700Z ......... [ 51%]
2021-04-21T16:04:47.8108760Z test/python/preprocessing/test_csv_preprocessor.py . [ 55%]
2021-04-21T16:04:59.0886020Z test/python/preprocessing/test_preprocess_cmd_opt_parsing.py ........... [ 93%]
2021-04-21T16:04:59.1086690Z .. [100%]
2021-04-21T16:04:59.1171760Z
2021-04-21T16:04:59.1204890Z ======================== 29 passed in 175.06s (0:02:55) ========================
2021-04-21T16:04:59.2552200Z Fatal Python error: PyEval_SaveThread: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)
2021-04-21T16:04:59.2652700Z Python runtime state: finalizing (tstate=0x7fe41c409b50)
2021-04-21T16:04:59.2754080Z
2021-04-21T16:04:59.2856250Z /Users/runner/work/_temp/511be060-bb2e-418a-ac5e-2e0f5d09f4d7.sh: line 4: 5232 Abort trap: 6 pytest test
To Reproduce
Run the macOS pip install test workflow
Expected behavior
The pip install works fine on linux:
2021-04-21T15:50:21.1538556Z python3 -c "import marius as m"
2021-04-21T15:50:21.1539213Z python3 -c "from marius.tools import preprocess"
2021-04-21T15:50:21.1539916Z marius_preprocess fb15k output_dir/
2021-04-21T15:50:21.1540448Z pytest test
2021-04-21T15:50:21.1584496Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2021-04-21T15:50:21.1585040Z env:
2021-04-21T15:50:21.1585484Z BUILD_TYPE: Release
2021-04-21T15:50:21.1586287Z ##[endgroup]
2021-04-21T15:50:26.6729316Z fb15k
2021-04-21T15:50:26.6730578Z Downloading fb15k.tgz to output_dir/fb15k.tgz
2021-04-21T15:50:26.6731334Z Extracting
2021-04-21T15:50:26.6731982Z Extraction completed
2021-04-21T15:50:26.6732836Z Detected delimiter:
2021-04-21T15:50:26.6734284Z Reading in output_dir/freebase_mtr100_mte100-train.txt 1/3
2021-04-21T15:50:26.6735973Z Reading in output_dir/freebase_mtr100_mte100-valid.txt 2/3
2021-04-21T15:50:26.6738109Z Reading in output_dir/freebase_mtr100_mte100-test.txt 3/3
2021-04-21T15:50:26.6739043Z Number of instance per file:[483142, 50000, 59071]
2021-04-21T15:50:26.6739918Z Number of nodes: 14951
2021-04-21T15:50:26.6740497Z Number of edges: 592213
2021-04-21T15:50:26.6741087Z Number of relations: 1345
2021-04-21T15:50:26.6741661Z Delimiter: ~ ~
2021-04-21T15:50:27.8808863Z ============================= test session starts ==============================
2021-04-21T15:50:27.8811125Z platform linux -- Python 3.8.5, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
2021-04-21T15:50:27.8812170Z rootdir: /home/runner/work/marius/marius
2021-04-21T15:50:27.8812954Z collected 29 items
2021-04-21T15:50:27.8813617Z
2021-04-21T15:50:50.9462537Z test/python/bindings/test_fb15k.py . [ 3%]
2021-04-21T15:50:50.9827642Z test/python/preprocessing/test_config_generator_cmd_opt_parsing.py ..... [ 20%]
2021-04-21T15:50:51.6762691Z ......... [ 51%]
2021-04-21T15:50:51.6988451Z test/python/preprocessing/test_csv_preprocessor.py . [ 55%]
2021-04-21T15:50:57.6109717Z test/python/preprocessing/test_preprocess_cmd_opt_parsing.py ........... [ 93%]
2021-04-21T15:50:57.6234674Z .. [100%]
2021-04-21T15:50:57.6235430Z
2021-04-21T15:50:57.6236116Z ============================= 29 passed in 30.61s ==============================
Environment
MacOS: platform darwin -- Python 3.9.4, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
Linux: platform linux -- Python 3.8.5, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
Additional context
test/python/bindings/test_fb15k.py is the likely culprit for throwing errors since it's the only one which runs the bindings. Unclear why it marks the test as passed.
Describe the bug
The boost download link is failing. See the excerpt below.
This issue pops up from time to time with boost:
boostorg/boost#299
Orphis/boost-cmake#88
[ 22%] Performing download step (download, verify and extract) for 'boost-populate'
-- Downloading...
dst='/tmp/pip-k4qygwbv-build/build/temp.linux-x86_64-3.6/_deps/boost-subbuild/boost-populate-prefix/src/boost_1_71_0.tar.bz2'
timeout='none'
inactivity timeout='none'
-- Using src='https://dl.bintray.com/boostorg/release/1.71.0/source/boost_1_71_0.tar.bz2'
-- [download 0% complete]
CMake Error at boost-subbuild/boost-populate-prefix/src/boost-populate-stamp/download-boost-populate.cmake:170 (message):
Each download failed!
error: downloading 'https://dl.bintray.com/boostorg/release/1.71.0/source/boost_1_71_0.tar.bz2' failed
status_code: 22
status_string: "HTTP response code said error"
log:
--- LOG BEGIN ---
Trying 34.214.135.19:443...
Connected to dl.bintray.com (34.214.135.19) port 443 (#0)
ALPN, offering h2
ALPN, offering http/1.1
successfully set certificate verify locations:
CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
[5 bytes data]
TLSv1.3 (OUT), TLS handshake, Client hello (1):
[512 bytes data]
[5 bytes data]
TLSv1.3 (IN), TLS handshake, Server hello (2):
[102 bytes data]
NPN, negotiated HTTP1.1
[5 bytes data]
TLSv1.2 (IN), TLS handshake, Certificate (11):
[2765 bytes data]
[5 bytes data]
TLSv1.2 (IN), TLS handshake, Server key exchange (12):
[333 bytes data]
[5 bytes data]
TLSv1.2 (IN), TLS handshake, Server finished (14):
[4 bytes data]
[5 bytes data]
TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
[70 bytes data]
[5 bytes data]
TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
[1 bytes data]
[5 bytes data]
TLSv1.2 (OUT), TLS handshake, Next protocol (67):
[36 bytes data]
[5 bytes data]
TLSv1.2 (OUT), TLS handshake, Finished (20):
[16 bytes data]
[5 bytes data]
[5 bytes data]
TLSv1.2 (IN), TLS handshake, Finished (20):
[16 bytes data]
SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
ALPN, server did not agree to a protocol
Server certificate:
subject: CN=*.bintray.com
start date: Sep 26 00:00:00 2019 GMT
expire date: Nov 9 12:00:00 2021 GMT
subjectAltName: host "dl.bintray.com" matched cert's "*.bintray.com"
issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=GeoTrust RSA CA 2018
SSL certificate verify ok.
[5 bytes data]
GET /boostorg/release/1.71.0/source/boost_1_71_0.tar.bz2 HTTP/1.1
Host: dl.bintray.com
User-Agent: curl/7.75.0
Accept: */*
[5 bytes data]
Mark bundle as not supporting multiuse
HTTP/1.1 403 Forbidden
Server: nginx
Date: Mon, 12 Apr 2021 15:10:54 GMT
Content-Type: text/plain
Content-Length: 10
Connection: keep-alive
ETag: "5c3b2e0c-a"
The requested URL returned error: 403
Closing connection 0
--- LOG END ---
CMakeFiles/boost-populate.dir/build.make:98: recipe for target 'boost-populate-prefix/src/boost-populate-stamp/boost-populate-download' failed
make[2]: *** [boost-populate-prefix/src/boost-populate-stamp/boost-populate-download] Error 1
CMakeFiles/Makefile2:82: recipe for target 'CMakeFiles/boost-populate.dir/all' failed
make[1]: *** [CMakeFiles/boost-populate.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2
CMake Error at /opt/cmake/share/cmake-3.20/Modules/FetchContent.cmake:1012 (message):
Build step for boost failed: 2
Call Stack (most recent call first):
/opt/cmake/share/cmake-3.20/Modules/FetchContent.cmake:1141:EVAL:2 (__FetchContent_directPopulate)
/opt/cmake/share/cmake-3.20/Modules/FetchContent.cmake:1141 (cmake_language)
third_party/boost-cmake/CMakeLists.txt:19 (FetchContent_Populate)
To Reproduce
Building Marius will encounter this issue if the boost servers are acting up.
Expected behavior
The download of boost should succeed.
Environment
Affects all environments
Additional context
We should remove the dependency on Boost. We only use it to parse .ini configuration files and for parsing command line options.
Describe the bug
I got really weird evaluation results on the ogbl-ppa dataset with CPU and with GPU. I had to change the memory to HostDevice for the GPU version because of its overwhelming GPU memory consumption (I thought the code could run with 16 GB, but it eventually exceeded 24 GB).
To Reproduce
Steps to reproduce the behavior:
Run the marius script with the configs ogbl_ppa_cpu.ini and ogbl_ppa_gpu.ini, which gives the following results:
[2021-12-12 02:47:01.554] [info] [trainer.cpp:68] ################ Starting training epoch 3 ################
[2021-12-12 02:49:36.904] [info] [trainer.cpp:94] Total Edges Processed: 44586862, Percent Complete: 0.100
[2021-12-12 02:52:19.113] [info] [trainer.cpp:94] Total Edges Processed: 46709862, Percent Complete: 0.200
[2021-12-12 02:55:00.754] [info] [trainer.cpp:94] Total Edges Processed: 48832862, Percent Complete: 0.300
[2021-12-12 02:57:44.074] [info] [trainer.cpp:94] Total Edges Processed: 50955862, Percent Complete: 0.400
[2021-12-12 03:00:25.467] [info] [trainer.cpp:94] Total Edges Processed: 53078862, Percent Complete: 0.500
[2021-12-12 03:03:09.531] [info] [trainer.cpp:94] Total Edges Processed: 55201862, Percent Complete: 0.600
[2021-12-12 03:06:03.269] [info] [trainer.cpp:94] Total Edges Processed: 57324862, Percent Complete: 0.700
[2021-12-12 03:08:51.169] [info] [trainer.cpp:94] Total Edges Processed: 59447862, Percent Complete: 0.800
[2021-12-12 03:11:32.560] [info] [trainer.cpp:94] Total Edges Processed: 61570862, Percent Complete: 0.900
[2021-12-12 03:14:13.438] [info] [trainer.cpp:94] Total Edges Processed: 63693862, Percent Complete: 1.000
[2021-12-12 03:14:13.558] [info] [trainer.cpp:99] ################ Finished training epoch 3 ################
[2021-12-12 03:14:13.558] [info] [trainer.cpp:104] Epoch Runtime (Before shuffle/sync): 1632004ms
[2021-12-12 03:14:13.558] [info] [trainer.cpp:105] Edges per Second (Before shuffle/sync): 13009.73
[2021-12-12 03:14:14.870] [info] [dataset.cpp:761] Edges Shuffled
[2021-12-12 03:14:14.870] [info] [trainer.cpp:113] Epoch Runtime (Including shuffle/sync): 1633315ms
[2021-12-12 03:14:14.870] [info] [trainer.cpp:114] Edges per Second (Including shuffle/sync): 12999.288
[2021-12-12 03:14:37.284] [info] [evaluator.cpp:95] Num Eval Edges: 6062562
[2021-12-12 03:14:37.284] [info] [evaluator.cpp:96] Num Eval Batches: 0
[2021-12-12 03:14:37.284] [info] [evaluator.cpp:97] Auc: 0.508, Avg Ranks: 490.966, MRR: 0.008, Hits@1: 0.006, Hits@5: 0.007, Hits@10: 0.007, Hits@20: 0.008, Hits@50: 0.008, Hits@100: 0.009
[2021-12-13 01:53:58.848] [info] [trainer.cpp:68] ################ Starting training epoch 3 ################
[2021-12-13 01:54:03.413] [info] [trainer.cpp:94] Total Edges Processed: 44583862, Percent Complete: 0.100
[2021-12-13 01:54:07.270] [info] [trainer.cpp:94] Total Edges Processed: 46703862, Percent Complete: 0.200
[2021-12-13 01:54:11.005] [info] [trainer.cpp:94] Total Edges Processed: 48823862, Percent Complete: 0.299
[2021-12-13 01:54:15.259] [info] [trainer.cpp:94] Total Edges Processed: 50943862, Percent Complete: 0.399
[2021-12-13 01:54:19.315] [info] [trainer.cpp:94] Total Edges Processed: 53063862, Percent Complete: 0.499
[2021-12-13 01:54:23.355] [info] [trainer.cpp:94] Total Edges Processed: 55183862, Percent Complete: 0.599
[2021-12-13 01:54:27.633] [info] [trainer.cpp:94] Total Edges Processed: 57303862, Percent Complete: 0.699
[2021-12-13 01:54:31.465] [info] [trainer.cpp:94] Total Edges Processed: 59423862, Percent Complete: 0.798
[2021-12-13 01:54:35.505] [info] [trainer.cpp:94] Total Edges Processed: 61543862, Percent Complete: 0.898
[2021-12-13 01:54:39.482] [info] [trainer.cpp:94] Total Edges Processed: 63663862, Percent Complete: 0.998
[2021-12-13 01:54:39.547] [info] [trainer.cpp:99] ################ Finished training epoch 3 ################
[2021-12-13 01:54:39.547] [info] [trainer.cpp:104] Epoch Runtime (Before shuffle/sync): 40698ms
[2021-12-13 01:54:39.547] [info] [trainer.cpp:105] Edges per Second (Before shuffle/sync): 521694.72
[2021-12-13 01:54:40.847] [info] [dataset.cpp:761] Edges Shuffled
[2021-12-13 01:54:40.847] [info] [trainer.cpp:113] Epoch Runtime (Including shuffle/sync): 41998ms
[2021-12-13 01:54:40.847] [info] [trainer.cpp:114] Edges per Second (Including shuffle/sync): 505546.25
[2021-12-13 01:54:58.952] [info] [evaluator.cpp:95] Num Eval Edges: 6062562
[2021-12-13 01:54:58.952] [info] [evaluator.cpp:96] Num Eval Batches: 0
[2021-12-13 01:54:58.952] [info] [evaluator.cpp:97] Auc: 0.992, Avg Ranks: 2.925, MRR: 0.991, Hits@1: 0.990, Hits@5: 0.991, Hits@10: 0.991, Hits@20: 0.992, Hits@50: 0.993, Hits@100: 0.995
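The gap between the two runs can be quantified directly from the logs. The sketch below is a hypothetical helper, not part of Marius: it extracts the epoch runtime and the evaluation metrics from the relevant log lines (copied verbatim from above) and compares them. It labels the runs only as "slow" and "fast", since the logs themselves do not say which configuration produced which output.

```python
import re

# Relevant lines copied from the first (slower) and second (faster) logs above.
SLOW_RUN = """\
[2021-12-12 03:14:13.558] [info] [trainer.cpp:104] Epoch Runtime (Before shuffle/sync): 1632004ms
[2021-12-12 03:14:37.284] [info] [evaluator.cpp:97] Auc: 0.508, Avg Ranks: 490.966, MRR: 0.008, Hits@1: 0.006, Hits@5: 0.007, Hits@10: 0.007, Hits@20: 0.008, Hits@50: 0.008, Hits@100: 0.009
"""
FAST_RUN = """\
[2021-12-13 01:54:39.547] [info] [trainer.cpp:104] Epoch Runtime (Before shuffle/sync): 40698ms
[2021-12-13 01:54:58.952] [info] [evaluator.cpp:97] Auc: 0.992, Avg Ranks: 2.925, MRR: 0.991, Hits@1: 0.990, Hits@5: 0.991, Hits@10: 0.991, Hits@20: 0.992, Hits@50: 0.993, Hits@100: 0.995
"""


def epoch_runtime_ms(log):
    """Return the pre-shuffle epoch runtime in milliseconds."""
    match = re.search(r"Epoch Runtime \(Before shuffle/sync\): (\d+)ms", log)
    return int(match.group(1))


def eval_metrics(log):
    """Return the evaluator metrics as a {name: float} dict."""
    line = next(l for l in log.splitlines() if "Auc:" in l)
    body = line[line.index("Auc:"):]
    pairs = (item.split(": ") for item in body.split(", "))
    return {name: float(value) for name, value in pairs}


slow, fast = eval_metrics(SLOW_RUN), eval_metrics(FAST_RUN)
speedup = epoch_runtime_ms(SLOW_RUN) / epoch_runtime_ms(FAST_RUN)

print(f"runtime ratio (slow / fast): {speedup:.1f}x")
for name in slow:
    print(f"{name:>9}: slow={slow[name]:<8} fast={fast[name]}")
```

The slow run's AUC of 0.508 is close to the 0.5 expected from random scoring, while the fast run's metrics are near perfect, so the discrepancy is in model quality as well as speed.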
Environment
List your operating system, and dependency versions
Python 3.7.10
pytorch 1.7.1 (py3.7_cuda10.1.243_cudnn7.6.3_0)
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
cmake version 3.16.3
GNU Make 4.2.1
Is your feature request related to a problem? Please describe.
Currently, the only loss functions Marius supports are SoftMax and RankingLoss.
Describe the solution you'd like
We can expand this set by implementing additional loss functions in Marius.
We can implement the losses in two new source files: loss.cpp and loss.h.
We can also add a new section in the configuration for loss options.
We can use the loss functions implemented by PyKeen as a reference:
List:
https://github.com/pykeen/pykeen#losses-7
Implementation:
https://github.com/pykeen/pykeen/blob/master/src/pykeen/losses.py
Documentation:
https://pykeen.readthedocs.io/en/stable/reference/losses.html
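As a rough illustration of the proposal, the sketch below shows textbook forms of a softmax loss and a margin ranking loss for one positive edge score and a list of negative scores, selected by name the way a loss section in the configuration might do it. These are not Marius's actual definitions, and the function names, the `margin` parameter, and the registry are assumptions made for the example. The proposed implementation would live in C++ (loss.cpp and loss.h).

```python
import math


def softmax_loss(pos_score, neg_scores):
    """Cross-entropy of the positive score against itself plus the negatives."""
    scores = [pos_score] + list(neg_scores)
    top = max(scores)  # subtract the max for numerical stability
    log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_sum_exp - pos_score


def ranking_loss(pos_score, neg_scores, margin=1.0):
    """Mean hinge penalty for negatives scored within `margin` of the positive."""
    penalties = [max(0.0, margin - pos_score + n) for n in neg_scores]
    return sum(penalties) / len(penalties)


# Hypothetical registry: a config option would pick the loss by name.
LOSSES = {"softmax": softmax_loss, "ranking": ranking_loss}


def compute_loss(name, pos_score, neg_scores, **options):
    """Look up a loss by name and apply it with any loss-specific options."""
    return LOSSES[name](pos_score, neg_scores, **options)


print(compute_loss("softmax", 2.0, [0.0, 1.5]))
print(compute_loss("ranking", 2.0, [0.0, 1.5], margin=1.0))
```

With this layout, adding a new loss means writing one function and registering it under a new name, which keeps the configuration change small.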