yourh / AttentionXML
Implementation for "AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification"
I am using scipy==1.11.2 and trying to train on the Amazon-670K dataset, using the Colab notebook.
Can you please help me find a way around this error?
[I 230818 11:55:13 main:37] Model Name: AttentionXML
[I 230818 11:55:13 main:40] Loading Training and Validation Set
[I 230818 11:55:13 main:52] Number of Labels: 34399
[I 230818 11:55:13 main:53] Size of Training Set: 880
[I 230818 11:55:13 main:54] Size of Validation Set: 120
[I 230818 11:55:13 main:56] Training
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/optimizers.py:108: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1485.)
exp_avg.mul_(beta1).add_(1 - beta1, grad)
[I 230818 11:58:03 models:114] SWA Initializing
Traceback (most recent call last):
File "/content/drive/MyDrive/xmlc_research/attention_xml/main.py", line 95, in <module>
main()
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/content/drive/MyDrive/xmlc_research/attention_xml/main.py", line 64, in main
model.train(train_loader, valid_loader, **model_cnf['train'])
File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/models.py", line 76, in train
p5, n5 = get_p_5(labels, targets), get_n_5(labels, targets)
File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/evaluation.py", line 42, in get_precision
mlb = get_mlb(classes, mlb, targets)
File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/evaluation.py", line 33, in get_mlb
mlb = MultiLabelBinarizer(range(targets.shape[1]), sparse_output=True)
TypeError: MultiLabelBinarizer.__init__() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given
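In recent scikit-learn releases the classes argument of MultiLabelBinarizer became keyword-only, which is exactly what this TypeError reports. A minimal sketch of a fix for the call shown in the traceback (my own suggestion, not an official patch); passing classes by keyword works on both old and new scikit-learn versions:

mlb = MultiLabelBinarizer(classes=range(targets.shape[1]), sparse_output=True)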
Hello!
Could you provide the code for the PSP@K evaluation metric? I would like to run experiments with this metric, but I don't know how to reproduce that part of the code.
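For what it's worth, here is a minimal PSP@K sketch of my own (not from this repo), following the empirical propensity model of Jain et al. (2016); the function names and the default A/B values are assumptions to adapt per dataset:

import numpy as np

def inverse_propensity(Y, A=0.55, B=1.5):
    # Jain et al. (2016) propensity model; A and B are dataset-dependent.
    n = Y.shape[0]
    freqs = np.asarray(Y.sum(axis=0)).ravel()  # per-label positive counts
    c = (np.log(n) - 1) * (B + 1) ** A
    return 1.0 + c * (freqs + B) ** (-A)       # 1 / propensity per label

def psp_at_k(pred_topk, Y, inv_w, k=5):
    # pred_topk: (n, >=k) array of predicted label indices, best first
    # Y: (n, L) dense binary ground-truth matrix
    total = 0.0
    for i in range(pred_topk.shape[0]):
        top = pred_topk[i, :k]
        total += (Y[i, top] * inv_w[top]).sum() / k
    return total / pred_topk.shape[0]

Note that published PSP@K numbers are usually normalized by the same score of the best possible prediction, so check the convention of the results you compare against.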
Hi!
I want to run clustering on my own dataset, but I got:
ValueError: could not convert string to float: b'vckbee'
This is probably because my dataset only contains raw_text. How can I generate an svmlight_file from raw text?
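The "could not convert string to float" error suggests the clustering code expects numeric features rather than raw strings. A hedged sketch (my own, not from this repo) of building an svmlight-style file from raw text with TF-IDF features; train_texts and train_labels are placeholders:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.datasets import dump_svmlight_file

train_texts = ["first raw document", "second raw document"]  # placeholder texts
train_labels = [[0, 3], [1]]                                 # placeholder label indices

X = TfidfVectorizer(max_features=100000).fit_transform(train_texts)
dump_svmlight_file(X, train_labels, 'train_v1.txt', multilabel=True)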
The K of the output top-K is currently fixed.
Do you think training a classifier to predict the value of K for every input would be a good solution?
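For comparison, a simpler baseline than a learned K is thresholding the per-label scores; a minimal sketch of my own (assuming scores holds per-label probabilities for one input):

import numpy as np

def dynamic_topk(scores, threshold=0.5, k_max=100):
    # keep the highest-scoring labels above `threshold`, capped at k_max
    order = np.argsort(-scores)[:k_max]
    return order[scores[order] >= threshold]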
args=(self.data_cnf['train']['sparse'], self.data_cnf['train']['labels'], mlb),
KeyError: 'sparse'
There is no information about the train 'sparse' and 'labels' entries in the /FastAttentionXML-Wiki-500K.yaml file.
Thanks for any help.
Thank you.
Hello,
I am trying to run preprocessing on the provided Wiki10-31K dataset. However, I am facing the following error:
Traceback (most recent call last):
File "/content/AttentionXML/preprocess.py", line 17, in <module>
from deepxml.data_utils import *
File "/content/AttentionXML/deepxml/data_utils.py", line 15, in <module>
from gensim.models import KeyedVectors
File "/usr/local/lib/python3.7/dist-packages/gensim/__init__.py", line 5, in <module>
from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils # noqa:F401
File "/usr/local/lib/python3.7/dist-packages/gensim/corpora/__init__.py", line 6, in <module>
from .indexedcorpus import IndexedCorpus # noqa:F401 must appear before the other classes
File "/usr/local/lib/python3.7/dist-packages/gensim/corpora/indexedcorpus.py", line 15, in <module>
from gensim import interfaces, utils
File "/usr/local/lib/python3.7/dist-packages/gensim/interfaces.py", line 19, in <module>
from gensim import utils, matutils
File "/usr/local/lib/python3.7/dist-packages/gensim/matutils.py", line 1054, in <module>
from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
File "__init__.pxd", line 198, in init gensim._matutils
ValueError: numpy.ndarray has the wrong size, try recompiling. Expected 80, got 88
You can reproduce the error in this Colab Notebook.
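For what it's worth, this ValueError usually means gensim's compiled extensions were built against a different numpy ABI than the numpy that is installed. A commonly suggested workaround (an assumption, not verified on this exact Colab image) is to force-reinstall the pair so compatible binary wheels are resolved together:

pip install --force-reinstall numpy gensim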
Even with the required environment already created, I am facing multiple errors like:
[I 230221 09:56:33 main:37] Model Name: AttentionXML
[I 230221 09:56:33 main:40] Loading Training and Validation Set
[I 230221 09:56:33 main:52] Number of Labels: 29801
[I 230221 09:56:33 main:53] Size of Training Set: 14748
[I 230221 09:56:33 main:54] Size of Validation Set: 200
[I 230221 09:56:33 main:56] Training
Traceback (most recent call last):
File "main.py", line 95, in <module>
main()
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "main.py", line 64, in main
model.train(train_loader, valid_loader, **model_cnf['train'])
File "/home/celso/projects/AttentionXML/deepxml/models.py", line 67, in train
loss = self.train_step(train_x, train_y.cuda())
File "/home/celso/projects/AttentionXML/deepxml/models.py", line 42, in train_step
scores = self.model(train_x)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/celso/projects/AttentionXML/deepxml/networks.py", line 42, in forward
rnn_out = self.lstm(emb_out, lengths) # N, L, hidden_size * 2
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/celso/projects/AttentionXML/deepxml/modules.py", line 60, in forward
self.lstm(packed_inputs, (hidden_init, cell_init))[0], batch_first=True)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 561, in forward
result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
Are you sure that all tensor operations happen on the same device?
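One hedged debugging step (my own suggestion, not from the repo): cuDNN execution failures are raised asynchronously, so the line in the traceback can be misleading. Forcing synchronous kernel launches at the very top of main.py, before any CUDA call, surfaces the real failure point:

import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # must be set before CUDA is initialized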
I trained AttentionXML on my Chinese dataset, with an average sentence length of 20 and 150 classes.
The results are the same as TextCNN's.
Thank you!
@yourh
I find that if I directly run the scripts on EUR-Lex or Amazon-670K, they raise segmentation-fault errors. I suspect the multiprocessing part might be related to this issue, so I slightly changed run_xml.sh.
It now works on the EUR-Lex data, but I still get errors on Amazon-670K when clustering labels. The errors were as follows:
My server is properly set up with 2080 Ti GPUs. Is this error merely caused by GPU memory limits, or can I adapt the code to fix it further?
Any advice?
Hello,
After hours of training on Amazon-670K, I am getting the following error:
[I 230302 08:33:14 tree:145] Finish Training Level-1
[I 230302 08:33:14 tree:149] Generating Candidates for Level-2, Number of Labels: 16384, Top: 160
Candidates: 1%| | 2517/459301 [00:00<00:18, 25168.82it/s] (carriage-return progress frames truncated)
Parents: 9409it [00:00, 31375.56it/s] (carriage-return progress frames truncated)
Traceback (most recent call last):
File "main.py", line 98, in <module>
main()
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/celso/projects/venvs/AttentionXML/lib/python3.8/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "main.py", line 70, in main
model.train(train_x, train_y, valid_x, valid_y, mlb)
File "/home/celso/projects/AttentionXML/deepxml/tree.py", line 200, in train
self.train_level(self.level - 1, train_x, train_y, valid_x, valid_y)
File "/home/celso/projects/AttentionXML/deepxml/tree.py", line 86, in train_level
train_group_y, train_group, valid_group = self.train_level(level - 1, train_x, train_y, valid_x, valid_y)
File "/home/celso/projects/AttentionXML/deepxml/tree.py", line 132, in train_level
model = XMLModel(network=FastAttentionRNN, labels_num=labels_num, emb_init=self.emb_init,
File "/home/celso/projects/AttentionXML/deepxml/models.py", line 145, in __init__
self.attn_weights = AttentionWeights(labels_num, hidden_size*2, attn_device_ids)
File "/home/celso/projects/AttentionXML/deepxml/modules.py", line 88, in __init__
group_size, plus_num = labels_num // len(device_ids), labels_num % len(device_ids)
ZeroDivisionError: integer division or modulo by zero
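The ZeroDivisionError means len(device_ids) is zero, i.e. no GPUs were handed to AttentionWeights, which shards the label-attention weights across devices. A minimal guard sketch of my own around the constructor shown in the traceback:

import torch

device_ids = list(range(torch.cuda.device_count()))
# FastAttentionXML splits labels_num across GPUs, so at least one is required
assert device_ids, "AttentionWeights needs at least one visible CUDA device"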
Hi yourh!
I find two models in this repo, AttentionXML and FastAttentionXML, the latter of which does not seem to appear in your paper. Also, I can't find where the PLT module is used in AttentionXML. Could you tell me the difference between these two models, and where the PLT module is in AttentionXML?
Thank you very much!
Hello!
Could you provide (at a high level) the time complexity of AttentionXML for training and prediction?
You can use
Indeed, it would be excellent if the final formula depended mostly on terms of
I appreciate any help you can provide.
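A rough back-of-envelope of my own (an estimate from the architecture, not a figure from the paper), with T the sequence length, d the hidden size, L the number of labels, and k the candidates kept per PLT level:

O(T*d^2 + T*L*d + L*d)    per example for AttentionXML (BiLSTM + label attention + output)
O(T*d^2 + T*k*d + k*d)    per example per level for FastAttentionXML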
I'm glad to see this work. When I tried to run your preprocess.py, it aborted directly with "Aborted". After debugging, I found it may be caused by a failure to load the glove.840B.300d.gensim model. Is this still being maintained?
Hello!
We are planning to re-use the code as baselines for our research. Is it possible for you to add a license (e.g., MIT or BSD) to the codebase?
Thanks!
Hello! When I run "python main.py --data-cnf configure/datasets/EUR-Lex.yaml --model-cnf configure/models/AttentionXML-EUR-Lex.yaml", I get the following error:
/home/jupyter-chanchiuhung/AttentionXML/deepxml/optimizers.py:108: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1050.)
exp_avg.mul_(beta1).add_(1 - beta1, grad)
Traceback (most recent call last):
File "/home/jupyter-chanchiuhung/AttentionXML/main.py", line 95, in
main()
File "/home/jupyter-chanchiuhung/.local/lib/python3.9/site-packages/click/core.py", line 1128, in call
return self.main(*args, **kwargs)
File "/home/jupyter-chanchiuhung/.local/lib/python3.9/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/home/jupyter-chanchiuhung/.local/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/jupyter-chanchiuhung/.local/lib/python3.9/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/home/jupyter-chanchiuhung/AttentionXML/main.py", line 64, in main
model.train(train_loader, valid_loader, **model_cnf['train'])
File "/home/jupyter-chanchiuhung/AttentionXML/deepxml/models.py", line 73, in train
p5, n5 = get_p_5(labels, targets), get_n_5(labels, targets)
File "/home/jupyter-chanchiuhung/AttentionXML/deepxml/evaluation.py", line 42, in get_precision
mlb = get_mlb(classes, mlb, targets)
File "/home/jupyter-chanchiuhung/AttentionXML/deepxml/evaluation.py", line 33, in get_mlb
mlb = MultiLabelBinarizer(range(targets.shape[1]), sparse_output=True)
TypeError: init() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given
Why does this happen?
Is there an assumption that the max_leaf hyperparameter in the configuration file will never be set to 1? When I try to run with max_leaf = 1, an assert statement in cluster.py fails:
assert sum(len(labels) for labels in labels_list) == labels_f.shape[0]
This assert is inside the build_tree_by_level method. Can you also explain the above assert statement and why it is necessary?
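A hedged reading of that assert (from the snippet alone, not the full file): labels_list holds the label-index groups produced at the current tree level, and labels_f has one feature row per label, so the check verifies the groups form an exact partition of all labels, with none dropped or duplicated during a split. A toy illustration with hypothetical values:

labels_list = [[0, 2], [1, 3, 4]]  # hypothetical clusters at one level
labels_f_rows = 5                  # one feature row per label
assert sum(len(labels) for labels in labels_list) == labels_f_rows  # partition check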
Hi,
I have a question about using the AttentionXML model in production. Do we have to use the same tokenizer that was used for training in the POC, or can we create another tokenizer and embedding matrix in production?
Thank you in advance
Hi,
I got an error, RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED, when trying to train level 1, caused by:
loss = self.train_step(train_x, train_y.cuda()) (line 70 of models.py)
When I change this line to loss = self.train_step(train_x.cuda(), train_y.cuda()), I still get other issues!
without tree for the other three datasets.
Thank you very much.
Excellent work!
Hi, will you release the code where BERT is the replacement encoder? It was mentioned in a previous issue. Thanks!
What is the content of the train_v1.txt file? How can I get a train_v1.txt file for my own dataset?
It seems that the train_v1.txt file contains sparse matrices X and Y, where X holds the bag-of-words features of each instance. But why is the vocabulary size different from vocab.npy?
Thank you.
The Amazon-670k dataset config has an additional parameter, sparse: data/Amazon-670k/train_v1.txt, which is not generated by the run_preprocess.sh script.
What is train_v1.txt, and how can it be generated?
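A hedged way to inspect such a file (assuming it is in the common svmlight/libsvm format, as the scikit-learn loader below expects):

from sklearn.datasets import load_svmlight_file

# X: sparse feature matrix, y: one tuple of label indices per row
X, y = load_svmlight_file('data/Amazon-670k/train_v1.txt', multilabel=True)
print(X.shape, len(y))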
After the "Finish Clustering" log message, the process seems to be doing nothing. Only RAM is allocated, the processor's cores usage is near zero, and GPU is not allocated yet.
What do I missing?
Sir,
During preprocessing, how do
--vocab-path data/EUR-Lex/vocab.npy
--emb-path data/EUR-Lex/emb_init.npy
exist? When we download the dataset, it does not have these two files.
Please guide.
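For what it's worth, vocab.npy and emb_init.npy are outputs of preprocess.py rather than files shipped with the dataset; the EUR-Lex example, quoted from memory of the README (verify the exact flags against your copy of the repo), builds them roughly like this:

python preprocess.py \
    --text-path data/EUR-Lex/train_texts.txt \
    --label-path data/EUR-Lex/train_labels.txt \
    --vocab-path data/EUR-Lex/vocab.npy \
    --emb-path data/EUR-Lex/emb_init.npy \
    --w2v-model data/glove.840B.300d.gensim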
Hi, thanks for your work.
I just have a question: what are the batch size and number of epochs for Amazon-670K and Wiki-500K, for the AttentionXML network without PLT?
Thanks for your answer :)
Have you tried replacing the Bi-LSTM with BERT as the encoder?
I've taken a close look at the code while running experiments on Amazon-670k.
When FastAttentionXML is being trained, I see nowhere that the PLT compression operation is involved in the pipeline. Maybe I missed something?