Comments (14)
My solution:
Set vocab_size in the config to one more than your original value.
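One plausible reading of why this helps: if the largest id in the dictionary equals the configured vocab_size, the matrix needs one extra column to hold it. Below is a minimal sketch for computing the smallest safe value. It assumes each dictionary line ends with an integer id (for example `hello 42`); that format and the helper name `required_vocab_size` are assumptions for illustration, not part of MatchZoo.

```python
def required_vocab_size(dict_lines):
    """Return the smallest vocab_size that covers every id in dict_lines.

    Assumption: each non-empty line ends with an integer id, e.g. "hello 42".
    """
    max_id = -1
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        max_id = max(max_id, int(parts[-1]))
    # Ids are used as 0-based column indices, so the size must exceed the max id.
    return max_id + 1


sample = ["the 1", "cat 2", "sat 3"]
print(required_vocab_size(sample))  # 4: ids 1..3 need columns 0..3
```

If the value printed for your own dictionary file is larger than the vocab_size in your config, raising the config value to match should be enough.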
from matchzoo.
What's your operating system, and how much memory do you have?
from matchzoo.
macOS
8 GB RAM
from matchzoo.
Can you consistently reproduce the problem? Also, check your memory usage while running the script; you may be running out of memory.
from matchzoo.
Memory usage rises by less than 1 GB while the script runs (from 3 GB to under 4 GB).
The segmentation fault can be reproduced consistently.
from matchzoo.
The generated dataset is attached below. The ranking config is the same as toy_example/config/dssm_ranking.config (except for the data path).
corpus.txt
corpus_preprocessed.txt
relation_test.txt
relation_train.txt
relation_valid.txt
triletter_dict.txt
word_dict.txt
word_stats.txt
word_triletter_map.txt
from matchzoo.
Try this, see if you can hunt down the exact line that's causing the problem?
from matchzoo.
Process 5098 stopped
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x11a5ab35c)
frame #0: 0x000000010ee628da _sparsetools.so`csr_todense_thunk(int, int, void**) + 2282
_sparsetools.so`csr_todense_thunk:
-> 0x10ee628da <+2282>: addss (%rdx,%rsi,4), %xmm0
0x10ee628df <+2287>: movss %xmm0, (%rdx,%rsi,4)
0x10ee628e4 <+2292>: addq $0x8, %rcx
0x10ee628e8 <+2296>: addq $0x8, %rbx
Target 0: (python) stopped.
(lldb) bt
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x11a5ab35c)
* frame #0: 0x000000010ee628da _sparsetools.so`csr_todense_thunk(int, int, void**) + 2282
frame #1: 0x000000010ee5c418 _sparsetools.so`call_thunk(char, char const*, long (*)(int, int, void**), _object*) + 2456
frame #2: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
frame #3: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
frame #4: 0x00007fff4f73b2b4 Python`___lldb_unnamed_symbol1476$$Python + 290
frame #5: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
frame #6: 0x00007fff4f6d40df Python`___lldb_unnamed_symbol419$$Python + 182
frame #7: 0x00007fff4f732b00 Python`___lldb_unnamed_symbol1446$$Python + 140
frame #8: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
frame #9: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
frame #10: 0x00007fff4f73b2b4 Python`___lldb_unnamed_symbol1476$$Python + 290
frame #11: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
frame #12: 0x00007fff4f6d40df Python`___lldb_unnamed_symbol419$$Python + 182
frame #13: 0x00007fff4f732b00 Python`___lldb_unnamed_symbol1446$$Python + 140
frame #14: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
frame #15: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
frame #16: 0x00007fff4f6dc935 Python`___lldb_unnamed_symbol510$$Python + 327
frame #17: 0x00007fff4f6bf581 Python`PyObject_Call + 97
frame #18: 0x00007fff4f738f2a Python`PyEval_EvalFrameEx + 15110
frame #19: 0x00007fff4f73b256 Python`___lldb_unnamed_symbol1476$$Python + 196
frame #20: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
frame #21: 0x00007fff4f73b256 Python`___lldb_unnamed_symbol1476$$Python + 196
frame #22: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
frame #23: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
frame #24: 0x00007fff4f6dc935 Python`___lldb_unnamed_symbol510$$Python + 327
frame #25: 0x00007fff4f6bf581 Python`PyObject_Call + 97
frame #26: 0x00007fff4f6c9c9e Python`___lldb_unnamed_symbol192$$Python + 163
frame #27: 0x00007fff4f6bf581 Python`PyObject_Call + 97
frame #28: 0x00007fff4f73abfe Python`PyEval_CallObjectWithKeywords + 159
frame #29: 0x00007fff4f766afb Python`___lldb_unnamed_symbol1725$$Python + 70
frame #30: 0x00007fff6cd596c1 libsystem_pthread.dylib`_pthread_body + 340
frame #31: 0x00007fff6cd5956d libsystem_pthread.dylib`_pthread_start + 377
frame #32: 0x00007fff6cd58c5d libsystem_pthread.dylib`thread_start + 13
from matchzoo.
It crashes at main.py, line 146: history = model.fit_generator(...)
from matchzoo.
I ran the code on a Linux machine, and the exception is:
[Model] Model Compile Done.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/site-packages/keras/utils/data_utils.py", line 568, in data_generator_task
generator_output = next(self._generator)
File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 283, in get_batch_generator
X1, X1_len, X2, X2_len, Y = self.get_batch()
File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 83, in get_batch
return next(self.batch_iter)
File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 276, in get_batch_iter
yield self.transfer_feat2sparse(X1).toarray(), X1_len, self.transfer_feat2sparse(X2).toarray(), X2_len, Y
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 964, in toarray
return self.tocoo(copy=False).toarray(order=order, out=out)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 958, in tocoo
dtype=self.dtype)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 184, in __init__
self._check()
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 232, in _check
raise ValueError('column index exceeds matrix dimensions')
ValueError: column index exceeds matrix dimensions
[01-09-2018 19:47:26] [Train:train] Traceback (most recent call last):
File "matchzoo/main.py", line 328, in <module>
main(sys.argv)
File "matchzoo/main.py", line 320, in main
train(config)
File "matchzoo/main.py", line 151, in train
verbose = 0
File "/usr/local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 2015, in fit_generator
generator_output = next(output_generator)
StopIteration
from matchzoo.
@levyfan This looks like an out-of-bounds problem when indexing matrices, but I'm not familiar with the MatchZoo iterators. Maybe @faneshion can help.
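To confirm an out-of-bounds id before the sparse matrix is converted to a dense array, a check along these lines could be run on a batch of feature ids. This is only a sketch: `find_out_of_range_ids` is a hypothetical helper, and the batch is assumed to be a list of lists of integer ids.

```python
def find_out_of_range_ids(batch, vocab_size):
    """Return the sorted ids in batch that fall outside [0, vocab_size)."""
    bad = set()
    for row in batch:
        for feat_id in row:
            # A valid column index must satisfy 0 <= id < vocab_size.
            if not 0 <= feat_id < vocab_size:
                bad.add(feat_id)
    return sorted(bad)


batch = [[0, 2, 5], [1, 6]]
print(find_out_of_range_ids(batch, vocab_size=6))  # [6]
```

An id equal to vocab_size, as in the example, is exactly the case that a matrix with vocab_size columns cannot hold.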
from matchzoo.
Hello,
I'm getting the same error when running with the AP88 TREC dataset. I'm trying to run MZ models on TREC datasets on a SLURM server, and I get the same problem with ARC_I, ARC_II, CDSSM, and DRMM_TKS, even though the allocated memory is not fully used.
Here is what was printed on the screen:
{
"model": {
"model_py": "arci.ARCI",
"model_path": "matchzoo/models/",
"setting": {
"dropout_rate": 0.5,
"kernel_count": 8,
"kernel_size": 3,
"d_pool_size": 2,
"q_pool_size": 2
}
},
"losses": [
{
"object_params": {
"margin": 0.5
},
"object_name": "rank_hinge_loss"
}
],
"global": {
"model_type": "PY",
"learning_rate": 0.0001,
"optimizer": "adam",
"weights_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/weights/arci_ranking.weights",
"num_iters": 10,
"save_weights_iters": 10,
"display_interval": 10,
"test_weights_iters": 10
},
"metrics": [
"precision@10",
"ndcg@10",
"ndcg@20",
"map"
],
"net_name": "ARCI",
"outputs": {
"predict": {
"save_format": "TREC",
"save_path": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/predictions/from_qrels/predict.test.arci_ranking.txt"
}
},
"inputs": {
"share": {
"train_embed": false,
"text1_corpus": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt",
"use_dpool": false,
"embed_size": 300,
"text2_maxlen": 1000,
"text2_corpus": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt",
"vocab_size": 129897,
"target_mode": "ranking",
"text1_maxlen": 20
},
"train": {
"use_iter": false,
"batch_size": 100,
"query_per_iter": 50,
"batch_per_iter": 5,
"input_type": "PairGenerator",
"relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt",
"phase": "TRAIN"
},
"predict": {
"batch_list": 10,
"relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt",
"input_type": "ListGenerator",
"phase": "PREDICT"
},
"test": {
"batch_list": 10,
"relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt",
"input_type": "ListGenerator",
"phase": "EVAL"
},
"valid": {
"batch_list": 10,
"relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt",
"input_type": "ListGenerator",
"phase": "EVAL"
}
}
}
[Embedding] Embedding Load Done.
[Input] Process Input Tags. odict_keys(['train']) in TRAIN, odict_keys(['test', 'valid']) in EVAL.
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt]
Data size: 79969
[Dataset] 1 Dataset Load Done.
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'use_dpool': False, 'batch_size': 100, 'embed_size': 300, 'text2_maxlen': 1000, 'input_type': 'PairGenerator', 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., 0.04232861,
0.16873358, -0.1632563 ],
[ 0.04360746, 0.02268181, 0.13736159, ..., -0.04956975,
-0.18725845, -0.19015439],
[-0.07373005, -0.04657853, 0.0677646 , ..., 0.00168478,
0.03469655, 0.12419996],
...,
[-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
0.11010233, 0.15707028],
[-0.169353 , -0.07957499, -0.00709578, ..., -0.07572405,
0.06080896, 0.19945614],
[ 0.16906822, -0.16493008, 0.07978389, ..., 0.00874102,
0.05448175, 0.10033885]], dtype=float32), 'train_embed': False, 'use_iter': False, 'text1_maxlen': 20, 'batch_per_iter': 5, 'query_per_iter': 50, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'TRAIN'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt]
Instance size: 3196760
Pair Instance Count: 85144090
[PairGenerator] init done
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'input_type': 'ListGenerator', 'use_dpool': False, 'embed_size': 300, 'text2_maxlen': 1000, 'batch_list': 10, 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., 0.04232861,
0.16873358, -0.1632563 ],
[ 0.04360746, 0.02268181, 0.13736159, ..., -0.04956975,
-0.18725845, -0.19015439],
[-0.07373005, -0.04657853, 0.0677646 , ..., 0.00168478,
0.03469655, 0.12419996],
...,
[-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
0.11010233, 0.15707028],
[-0.169353 , -0.07957499, -0.00709578, ..., -0.07572405,
0.06080896, 0.19945614],
[ 0.16906822, -0.16493008, 0.07978389, ..., 0.00874102,
0.05448175, 0.10033885]], dtype=float32), 'train_embed': False, 'text1_maxlen': 20, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'EVAL'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt]
Instance size: 399595
List Instance Count: 50
[ListGenerator] init done
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'input_type': 'ListGenerator', 'use_dpool': False, 'embed_size': 300, 'text2_maxlen': 1000, 'batch_list': 10, 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., 0.04232861,
0.16873358, -0.1632563 ],
[ 0.04360746, 0.02268181, 0.13736159, ..., -0.04956975,
-0.18725845, -0.19015439],
[-0.07373005, -0.04657853, 0.0677646 , ..., 0.00168478,
0.03469655, 0.12419996],
...,
[-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
0.11010233, 0.15707028],
[-0.169353 , -0.07957499, -0.00709578, ..., -0.07572405,
0.06080896, 0.19945614],
[ 0.16906822, -0.16493008, 0.07978389, ..., 0.00874102,
0.05448175, 0.10033885]], dtype=float32), 'train_embed': False, 'text1_maxlen': 20, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'EVAL'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt]
Instance size: 3196760
List Instance Count: 50
[ListGenerator] init done
[ARCI] init done
[layer]: Input [shape]: [None, 20]
[Memory] Total Memory Use: 9417.4688 MB Resident: 9643488 Shared: 0 UnshareData: 0 UnshareStack: 0
[layer]: Input [shape]: [None, 1000]
[Memory] Total Memory Use: 9417.4688 MB Resident: 9643488 Shared: 0 UnshareData: 0 UnshareStack: 0
[layer]: Embedding [shape]: [None, 20, 300]
[Memory] Total Memory Use: 10011.7578 MB Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0
[layer]: Embedding [shape]: [None, 1000, 300]
[Memory] Total Memory Use: 10011.7578 MB Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0
[layer]: Conv1D [shape]: [None, 20, 8]
[Memory] Total Memory Use: 10011.7578 MB Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0
[layer]: Conv1D [shape]: [None, 1000, 8]
[Memory] Total Memory Use: 10011.7578 MB Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0
srun: error: 64cpu-nc01: task 0: Segmentation fault
from matchzoo.
@lovejasmine it works!
from matchzoo.
If you're working on DSSM, please try out our newly released MatchZoo 2.0.
from matchzoo.