allenai / allennlp-as-a-library-example
A simple example for how to build your own model using AllenNLP as a dependency.
Hi all, I followed the example and finally got the trained model saved in model.tar.gz.
I want to load the model to make a predictor, like the following:
from allennlp.models.archival import load_archive
from allennlp.service.predictors import Predictor
archive = load_archive('model.tar.gz')
# predictor = Predictor.from_archive(archive, 'paper-classifier')
However, I get the following error when I try to load the model:
ConfigurationError Traceback (most recent call last)
<ipython-input-2-741f7f19114e> in <module>()
----> 1 archive = load_archive('output/model.tar.gz')
~/anaconda3/lib/python3.6/site-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
147 weights_file=weights_path,
148 serialization_dir=serialization_dir,
--> 149 cuda_device=cuda_device)
150
151 if tempdir:
~/anaconda3/lib/python3.6/site-packages/allennlp/models/model.py in load(cls, config, serialization_dir, weights_file, cuda_device)
293 # This allows subclasses of Model to override _load.
294 # pylint: disable=protected-access
--> 295 return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
296
297
~/anaconda3/lib/python3.6/site-packages/allennlp/common/registrable.py in by_name(cls, name)
54 def by_name(cls: Type[T], name: str) -> Type[T]:
55 if name not in Registrable._registry[cls]:
---> 56 raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
57 return Registrable._registry[cls].get(name)
58
ConfigurationError: 'paper-classifier is not a registered name for Model'
Is there a way to register paper-classifier
so that it can be loaded and used later?
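For anyone hitting the same error, here is a minimal sketch of the registry pattern that the traceback points at. It is not AllenNLP's actual code: the `Model` class below is a simplified stand-in, and it raises `KeyError` where the library raises `ConfigurationError`. The point it illustrates is that a name only becomes known once the module containing the `register` decorator has been imported.

```python
from typing import Callable, Dict, Type


class Model:
    """Simplified stand-in for a registrable base class (not the AllenNLP class)."""

    _registry: Dict[str, Type["Model"]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type["Model"]], Type["Model"]]:
        # Returns a class decorator that records the subclass under `name`.
        def add_to_registry(subclass: Type["Model"]) -> Type["Model"]:
            cls._registry[name] = subclass
            return subclass

        return add_to_registry

    @classmethod
    def by_name(cls, name: str) -> Type["Model"]:
        if name not in cls._registry:
            raise KeyError(f"{name} is not a registered name for {cls.__name__}")
        return cls._registry[name]


def lookup_fails(name: str) -> bool:
    """Return True if `name` cannot be resolved yet."""
    try:
        Model.by_name(name)
    except KeyError:
        return True
    return False


# Before the decorated class has been defined (i.e. before its module is
# imported), the lookup fails just like in the traceback above.
failed_before = lookup_fails("paper-classifier")


@Model.register("paper-classifier")
class AcademicPaperClassifier(Model):
    pass


# Once the decorator has run, the same lookup succeeds.
found_after = Model.by_name("paper-classifier") is AcademicPaperClassifier
```

If the example repository registers its model the same way inside my_library (an assumption on my part), then importing my_library before calling load_archive should be enough to make the name resolvable.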
When I sync down the project and run python -m pytest, there is a failing test.
The output is included below.
`
======================================================================================================= test session starts =======================================================================================================
platform darwin -- Python 3.6.3, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /Users/paul.murphy/PycharmProjects/second-attempt/allennlp-as-a-library-example, inifile: pytest.ini
plugins: pythonpath-0.7.2, cov-2.5.1, flaky-3.4.0
collected 3 items
tests/dataset_readers/semantic_scholar_dataset_reader_test.py . [ 33%]
tests/models/academic_paper_classifier_test.py F [ 66%]
tests/predictors/predictor_test.py . [100%]
============================================================================================================ FAILURES =============================================================================================================
_________________________________________________________________________________ AcademicPaperClassifierTest.test_model_can_train_save_and_load __________________________________________________________________________________
self = <models.academic_paper_classifier_test.AcademicPaperClassifierTest testMethod=test_model_can_train_save_and_load>
def test_model_can_train_save_and_load(self):
self.ensure_model_can_train_save_and_load(self.param_file)
tests/models/academic_paper_classifier_test.py:12:
../../untitled/venv/lib/python3.6/site-packages/allennlp/common/testing/model_test_case.py:81: in ensure_model_can_train_save_and_load
self.check_model_computes_gradients_correctly(model, model_batch)
model = AcademicPaperClassifier(
(text_field_embedder): BasicTextFieldEmbedder(
(token_embedder_tokens): Embedding(
...(_dropout): ModuleList(
(0): Dropout(p=0.2)
(1): Dropout(p=0.0)
)
)
(loss): CrossEntropyLoss(
)
)
model_batch = {'abstract': {'tokens': Variable containing:
18 80 6 ... 0 0 0
18 80 6 ... 0 ... 0 0
237 612 238 4 613 614 14 239 615 616 0 0
[torch.LongTensor of size 10x12]
}}
@staticmethod
def check_model_computes_gradients_correctly(model, model_batch):
model.zero_grad()
result = model(**model_batch)
result["loss"].backward()
has_zero_or_none_grads = {}
for name, parameter in model.named_parameters():
zeros = torch.zeros(parameter.size())
if parameter.requires_grad:
if parameter.grad is None:
has_zero_or_none_grads[name] = "No gradient computed (i.e parameter.grad is None)"
# Some parameters will only be partially updated,
# like embeddings, so we just check that any gradient is non-zero.
if (parameter.grad.data.cpu() == zeros).all():
has_zero_or_none_grads[name] = f"zeros with shape ({tuple(parameter.grad.size())})"
else:
assert parameter.grad is None
if has_zero_or_none_grads:
for name, grad in has_zero_or_none_grads.items():
print(f"Parameter: {name} had incorrect gradient: {grad}")
raise Exception("Incorrect gradients found. See stdout for more info.")
E Exception: Incorrect gradients found. See stdout for more info.
../../untitled/venv/lib/python3.6/site-packages/allennlp/common/testing/model_test_case.py:161: Exception
------------------------------------------------------------------------------------------------------ Captured stdout call -------------------------------------------------------------------------------------------------------
Parameter: classifier_feedforward._linear_layers.0.weight had incorrect gradient: zeros with shape ((2, 4))
Parameter: classifier_feedforward._linear_layers.0.bias had incorrect gradient: zeros with shape ((2,))
Parameter: classifier_feedforward._linear_layers.1.weight had incorrect gradient: zeros with shape ((3, 2))
------------------------------------------------------------------------------------------------------ Captured stderr call -------------------------------------------------------------------------------------------------------
10it [00:00, 388.59it/s]
100%|██████████| 10/10 [00:00<00:00, 2317.55it/s]
10it [00:00, 557.01it/s]
10it [00:00, 551.77it/s]
20it [00:00, 2460.00it/s]
accuracy: 0.4000, accuracy3: 1.0000, loss: 1.0902 ||: 100%|##########| 1/1 [00:00<00:00, 50.55it/s]
accuracy: 0.4000, accuracy3: 1.0000, loss: 1.0898 ||: 100%|##########| 1/1 [00:00<00:00, 88.64it/s]
10it [00:00, 530.95it/s]
10it [00:00, 561.45it/s]
-------------------------------------------------------------------------------------------------------- Captured log call --------------------------------------------------------------------------------------------------------
bucket_iterator.py 92 WARNING shuffle parameter is set to False, while bucket iterators by definition change the order of your data.
bucket_iterator.py 92 WARNING shuffle parameter is set to False, while bucket iterators by definition change the order of your data.
===Flaky Test Report===
===End Flaky Test Report===
=============================================================================================== 1 failed, 2 passed in 3.06 seconds ================================================================================================
`
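One possible reading of the output above, offered as a guess rather than a diagnosis: zero gradients on both parameters of the first linear layer and on the weight (but not the bias) of the second is the pattern you would see if every hidden ReLU unit output zero for the batch. The loss of about 1.09 is also close to ln(3), which is what uniform predictions over three classes give. The sketch below reproduces that pattern in plain Python with made-up weights, using the same shapes as in the report, (2, 4), (2,) and (3, 2). It assumes a ReLU hidden layer and does not use the repository's model.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def affine(weights: Matrix, bias: Vector, inputs: Vector) -> Vector:
    """Compute weights @ inputs + bias."""
    return [sum(w * a for w, a in zip(row, inputs)) + b for row, b in zip(weights, bias)]


def softmax(logits: Vector) -> Vector:
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def gradients(x: Vector, label: int, w0: Matrix, b0: Vector, w1: Matrix, b1: Vector) -> Dict[str, object]:
    """Manual backward pass for linear -> ReLU -> linear -> softmax cross-entropy."""
    pre = affine(w0, b0, x)
    hidden = [max(0.0, value) for value in pre]
    probs = softmax(affine(w1, b1, hidden))
    d_logits = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    d_hidden = [sum(w1[i][j] * d_logits[i] for i in range(len(w1))) for j in range(len(hidden))]
    # ReLU passes gradient only where its input was positive.
    d_pre = [d if value > 0 else 0.0 for d, value in zip(d_hidden, pre)]
    return {
        "layer0.weight": [[d * a for a in x] for d in d_pre],
        "layer0.bias": d_pre,
        "layer1.weight": [[d * h for h in hidden] for d in d_logits],
        "layer1.bias": d_logits,
        "loss": -math.log(probs[label]),
    }


def all_zero(values) -> bool:
    """True if every number in a vector or matrix equals zero."""
    flat = [v for row in values for v in (row if isinstance(row, list) else [row])]
    return all(v == 0 for v in flat)


# Made-up weights chosen so that both hidden units are negative before ReLU.
result = gradients(
    x=[1.0, 2.0, 3.0, 4.0],
    label=0,
    w0=[[-0.1] * 4, [-0.2] * 4],
    b0=[0.0, 0.0],
    w1=[[0.5, -0.5], [0.3, 0.1], [-0.2, 0.4]],
    b1=[0.0, 0.0, 0.0],
)
zero_grad_names = sorted(
    name for name, grad in result.items() if name != "loss" and all_zero(grad)
)
```

With these values, `zero_grad_names` contains the same three parameters that the test complains about, and only the final bias receives a gradient. Whether this is what actually happens in the failing test would need checking against the fixture config and the installed AllenNLP version.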
I am using allennlp 0.8.4 and following this example. It gives the error: allennlp.common.checks.ConfigurationError: 'key "type" is required at location "model.text_field_embedder."'
Any idea how to fix it?
As of May 14, the tutorial no longer works out of the box and must be updated for the current AllenNLP version.
git clone https://github.com/allenai/allennlp-as-a-library-example.git
cd allennlp-as-a-library-example
allennlp train experiments/venue_classifier.json -s /tmp/your_output_dir_here --include-package my_library
Output:
AssertionError: No super class method found for "decode"
Removing the @overrides decorator from decode in the model class leads to other errors (a key error for the data loader).
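To illustrate why the assertion fires, here is a small stand-in for an "override check". It is not the `overrides` package and not AllenNLP code: `expects_override`, `CheckedBase` and the simplified `Model` are hypothetical names for this sketch. As far as I know, AllenNLP 1.0 renamed `Model.decode` to `make_output_human_readable`, so a subclass that still marks `decode` as an override no longer has a base-class method to match.

```python
def expects_override(method):
    """Hypothetical marker: flag a method as intended to override a base method."""
    method._expects_override = True
    return method


class CheckedBase:
    """Verifies at class-creation time that flagged methods exist on a base class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name, attr in vars(cls).items():
            if getattr(attr, "_expects_override", False):
                if not any(hasattr(base, name) for base in cls.__mro__[1:]):
                    raise AssertionError(f'No super class method found for "{name}"')


class Model(CheckedBase):
    """Simplified base class that only offers the newer method name."""

    def make_output_human_readable(self, output_dict):
        return output_dict


def define_old_style_subclass() -> str:
    """Try to define a subclass that still overrides the old name `decode`."""
    try:
        class OldClassifier(Model):
            @expects_override
            def decode(self, output_dict):
                return output_dict
    except AssertionError as error:
        return str(error)
    return "ok"


def define_new_style_subclass() -> str:
    """Define a subclass that overrides the method the base class actually has."""
    try:
        class NewClassifier(Model):
            @expects_override
            def make_output_human_readable(self, output_dict):
                return output_dict
    except AssertionError as error:
        return str(error)
    return "ok"


old_result = define_old_style_subclass()
new_result = define_new_style_subclass()
```

In this sketch, renaming the method rather than deleting the decorator is what makes the check pass. The data loader key error mentioned above is a separate problem and is not covered here.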
When I set "cuda_device": 0, training fails with the following error:
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'tensor1'
Do I need to set the cuda flag somewhere else too?
The changes for 0.6.1 removed the usages of all_labels,
but its initialization is still here. This confused me when I tried to understand how the predictor found all the labels, and it turned out to be dead code.
I am able to train the model with default models
/Desktop/MentionDetector/allennlp-master/allennlp/allennlp-as-a-library-example-master$ allennlp train ../../tutorials/getting_started/simple_tagger.json --serialization-dir /tmp/tutorials/getting_started
This goes through successfully.
However, I am unable to train when I include my package:
~/Desktop/MentionDetector/allennlp-master/allennlp/allennlp-as-a-library-example-master$ allennlp train experiments/venue_classifier.json -s /tmp/venue_output_dir --include-package my_library
/home/kenome/anaconda3/envs/allennlp/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
/home/kenome/anaconda3/envs/allennlp/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
Traceback (most recent call last):
File "/home/kenome/anaconda3/envs/allennlp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/kenome/anaconda3/envs/allennlp/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp/run.py", line 18, in <module>
main(prog="python -m allennlp.run")
File "/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp/commands/__init__.py", line 62, in main
import_submodules(package_name)
File "/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp/common/util.py", line 256, in import_submodules
importlib.import_module(package_name + '.' + name)
File "/home/kenome/anaconda3/envs/allennlp/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'my_library.models.archival'
Here the import seems to look for 'my_library.models.archival', whereas it should be looking for 'allennlp.models.archival'. The failing call is in allennlp/common/util.py, line 256, in import_submodules:
importlib.import_module(package_name + '.' + name)
PYTHONPATH is as follows
:/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp:/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp/allennlp-as-a-library-example-master:/home/kenome/Desktop/MentionDetector/allennlp-master:/home/kenome/Desktop/MentionDetector/allennlp-master/allennlp/allennlp
Please advise.
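For context, here is a minimal sketch of how a recursive "import every submodule" helper can build fully qualified names. It is not AllenNLP's implementation: `import_all_submodules` and the throwaway `demo_library` package are names I made up for the example. It uses `pkgutil.iter_modules` so that only names found directly inside the package's own directories get the package prefix.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path
from typing import List


def import_all_submodules(package_name: str) -> List[str]:
    """Import `package_name` and everything below it; return the imported names."""
    importlib.invalidate_caches()
    module = importlib.import_module(package_name)
    imported = [package_name]
    # Plain modules have no __path__, so the loop body is skipped for them.
    for info in pkgutil.iter_modules(getattr(module, "__path__", [])):
        imported.extend(import_all_submodules(f"{package_name}.{info.name}"))
    return imported


def build_demo_package(root: Path) -> None:
    """Create demo_library/models/classifier.py under `root` (hypothetical layout)."""
    models_dir = root / "demo_library" / "models"
    models_dir.mkdir(parents=True)
    (root / "demo_library" / "__init__.py").write_text("")
    (models_dir / "__init__.py").write_text("")
    (models_dir / "classifier.py").write_text("REGISTERED = True\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    # Only the directory that CONTAINS the package goes on the path.
    sys.path.insert(0, tmp)
    try:
        imported_names = import_all_submodules("demo_library")
    finally:
        sys.path.remove(tmp)
```

Note that only the parent directory of the package is put on `sys.path`. In the PYTHONPATH above, some entries point inside the allennlp package itself, which may let a bare name such as `models` resolve to allennlp's own directory during the walk. That is a guess about the cause, not something I have verified.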
The requirements.txt file has the allennlp version set to 0.8.1,
python3 academic_paper_classifier_test.py
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
2019-03-11 21:14:11,580 - INFO - allennlp.common.checks - Pytorch version: 1.0.1.post2
Traceback (most recent call last):
File "academic_paper_classifier_test.py", line 12, in setUp
self.set_up_model('/home/kindler/Projects/2019/M03/re_all/dataset/allennlp_test/allennlp-as-a-library-example/tests/fixtures/academic_paper_classifier.json', '/home/kindler/Projects/2019/M03/re_all/dataset/allennlp_test/allennlp-as-a-library-example/tests/fixtures/s2_papers.jsonl'),
File "/home/kindler/.local/lib/python3.6/site-packages/allennlp/common/testing/model_test_case.py", line 25, in set_up_model
reader = DatasetReader.from_params(params['dataset_reader'])
File "/home/kindler/.local/lib/python3.6/site-packages/allennlp/common/from_params.py", line 275, in from_params
default_to_first_choice=default_to_first_choice)
File "/home/kindler/.local/lib/python3.6/site-packages/allennlp/common/params.py", line 317, in pop_choice
raise ConfigurationError(message)
allennlp.common.checks.ConfigurationError: "s2_papers not in acceptable choices for dataset_reader.type: ['ccgbank', 'conll2003', 'conll2000', 'ontonotes_ner', 'coref', 'winobias', 'event2mind', 'interleaving', 'language_modeling', 'multiprocess', 'ptb_trees', 'squad', 'quac', 'triviaqa', 'qangaroo', 'srl', 'semantic_dependencies', 'seq2seq', 'sequence_tagging', 'snli', 'universal_dependencies', 'sst_tokens', 'quora_paraphrase', 'atis', 'nlvr', 'wikitables', 'template_text2sql', 'grammar_based_text2sql', 'quarel', 'simple_language_modeling', 'babi', 'copynet_seq2seq', 'text_classification_json']"
Hi,
We have a use case where we need to predict a label as well as the classes within that label. It is like predicting two columns (multi-label target prediction). I am not sure whether I can use the existing library example for this requirement.
Please advise whether this is possible using AllenNLP.
Thanks in advance.
Looks like this tutorial has been updated for 0.6.1, but the releases / tags only go up to 0.4.2. It would be good to get that updated.