flairnlp / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Home Page: https://flairnlp.github.io/flair/
License: Other
There is a lot of redundant code in the data fetcher helper routines. Simplify by creating a generic 'CoNLL-column' data reader that can be passed the column definition, so it is applicable to CoNLL03, CoNLL2000 and any other sequence labeling data that is formatted in a similar column style.
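A minimal sketch of what such a generic reader could look like. The function name `read_column_data` and the `column_format` mapping are hypothetical, not part of Flair's API; the sketch assumes whitespace-separated columns, blank lines between sentences, and `-DOCSTART-` document markers as in CoNLL03.

```python
from typing import Dict, Iterable, List


def read_column_data(
    lines: Iterable[str], column_format: Dict[int, str]
) -> List[List[Dict[str, str]]]:
    """Parse column-formatted text into sentences of token dictionaries.

    column_format maps a column index to a field name,
    e.g. {0: "text", 3: "ner"}. Blank lines separate sentences.
    """
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in lines:
        line = line.strip()
        # A blank line or a document marker closes the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append(
            {name: fields[idx] for idx, name in column_format.items() if idx < len(fields)}
        )
    if current:
        sentences.append(current)
    return sentences


# The same reader handles different layouts by changing only the column map.
conll03_sample = [
    "-DOCSTART- -X- -X- O",
    "",
    "EU NNP B-NP B-ORG",
    "rejects VBZ B-VP O",
    "",
    "Peter NNP B-NP B-PER",
]
sentences = read_column_data(conll03_sample, {0: "text", 1: "pos", 3: "ner"})
chunk_sentences = read_column_data(["He PRP B-NP"], {0: "text", 1: "pos", 2: "chunk"})
```

Passing the column definition as data rather than hard-coding it per corpus is what removes the duplicated helper routines.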
It would be great if there was a way to obtain a nested dictionary as output of the NER instead of a string with <...> tags. The string is quite tedious to work with.
I imagine an output like:
"sentence": {
  "text": "Facebook, Inc. is a company, and Google is one as well.",
  "named_entities": [
    {
      "mention_text": "Facebook, Inc.",
      "start_pos": 0,
      "end_pos": ...
      "type": "ORG",
      "confidence": 0.9
    },
    {
      ...
    }
  ]
}
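A rough sketch of how such a dictionary could be assembled from BIO-tagged tokens. Everything here is illustrative: `entities_to_dict` is a hypothetical helper, the per-token scores are made up, `end_pos` is assumed to be an exclusive character offset, and the entity confidence is assumed to be the mean of its token scores.

```python
from typing import Dict, List, Tuple


def entities_to_dict(text: str, tokens: List[Tuple[str, int, str, float]]) -> Dict:
    """Group BIO-tagged tokens into entity spans with character offsets.

    Each token is (token_text, start_char, bio_tag, score).
    """
    spans = []
    current = None
    for token_text, start, tag, score in tokens:
        end = start + len(token_text)
        # An I- tag of the same type extends the open span.
        if current is not None and tag == "I-" + current["type"]:
            current["end_pos"] = end
            current["scores"].append(score)
            continue
        if current is not None:
            spans.append(current)
            current = None
        if tag != "O":
            current = {"start_pos": start, "end_pos": end, "type": tag[2:], "scores": [score]}
    if current is not None:
        spans.append(current)

    named_entities = [
        {
            "mention_text": text[s["start_pos"]:s["end_pos"]],
            "start_pos": s["start_pos"],
            "end_pos": s["end_pos"],  # exclusive offset (assumption)
            "type": s["type"],
            "confidence": sum(s["scores"]) / len(s["scores"]),  # mean token score (assumption)
        }
        for s in spans
    ]
    return {"sentence": {"text": text, "named_entities": named_entities}}


text = "Facebook, Inc. is a company, and Google is one as well."
tagged = [
    ("Facebook", "B-ORG", 0.9), (",", "I-ORG", 0.9), ("Inc.", "I-ORG", 0.9),
    ("is", "O", 1.0), ("a", "O", 1.0), ("company", "O", 1.0), (",", "O", 1.0),
    ("and", "O", 1.0), ("Google", "B-ORG", 0.8), ("is", "O", 1.0),
    ("one", "O", 1.0), ("as", "O", 1.0), ("well", "O", 1.0), (".", "O", 1.0),
]

# Recover character offsets by scanning the text left to right.
tokens, cursor = [], 0
for word, tag, score in tagged:
    start = text.index(word, cursor)
    tokens.append((word, start, tag, score))
    cursor = start + len(word)

result = entities_to_dict(text, tokens)
```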
Release version 0.2 of Flair and make available via pip install
Add documentation on new release
2 issues here:
__getitem__ should return something for sentence[0] (in Python this is usually the first thing)
The flair.data.Token object should display the string in its repr method
Clarify naming of classes and interfaces:
Trying to use the ner-ontonotes model on a Mac. Installed flair via pip.
from flair.tagging_model import SequenceTagger
tagger = SequenceTagger.load('ner-ontonotes')
Here's the traceback:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-4-4d86fef1ef40> in <module>()
----> 1 tagger = SequenceTagger.load('ner-ontonotes')
~/.virtualenvs/py36/lib/python3.6/site-packages/flair/tagging_model.py in load(model)
459
460 if model_file is not None:
--> 461 tagger: SequenceTagger = SequenceTagger.load_from_file(model_file)
462 return tagger
463
~/.virtualenvs/py36/lib/python3.6/site-packages/flair/tagging_model.py in load_from_file(cls, model_file)
123 # serialization of torch objects
124 warnings.filterwarnings("ignore")
--> 125 state = torch.load(model_file, map_location={'cuda:0': 'cpu'})
126 warnings.filterwarnings("default")
127
~/.virtualenvs/py36/lib/python3.6/site-packages/torch/serialization.py in load(f, map_location, pickle_module)
301 f = open(f, 'rb')
302 try:
--> 303 return _load(f, map_location, pickle_module)
304 finally:
305 if new_fd:
~/.virtualenvs/py36/lib/python3.6/site-packages/torch/serialization.py in _load(f, map_location, pickle_module)
467 unpickler = pickle_module.Unpickler(f)
468 unpickler.persistent_load = persistent_load
--> 469 result = unpickler.load()
470
471 deserialized_storage_keys = pickle_module.load(f)
OSError: [Errno 22] Invalid argument
Note: tagger = SequenceTagger.load('ner')
works fine
possible simplifications to the SequenceTagger:
In a recent guest talk at Zalando Research, @hanxiao made a strong case for using word dropout in text classification. So, let's add it to Flair and evaluate it vis-a-vis standard and locked dropout.
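For orientation, a small framework-free sketch of the idea: standard dropout zeroes individual dimensions, locked dropout reuses one dimension mask across all time steps, and word dropout removes entire token vectors. The function below is a hypothetical illustration on plain Python lists, not Flair code.

```python
import random
from typing import List, Optional


def word_dropout(
    embedded_sentence: List[List[float]], p: float, rng: Optional[random.Random] = None
) -> List[List[float]]:
    """Replace each token vector by zeros with probability p (training time only)."""
    rng = rng or random.Random()
    return [
        [0.0] * len(vector) if rng.random() < p else list(vector)
        for vector in embedded_sentence
    ]


sentence = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
kept = word_dropout(sentence, p=0.0)      # nothing is dropped
dropped = word_dropout(sentence, p=1.0)   # every token is dropped
```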
I have started playing with the embeddings tutorials and noticed that with only GloVe vectors the embedding is very quick to compute (it is just a lookup table without context), so it is usable in applications. However, when we use charlm_embedding_forward and/or charlm_embedding_backward (which use context), it is much more time-consuming. This might be a bottleneck when dealing with long texts containing many sentences.
Example:
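from time import time
from flair.data import Sentence
from flair.embeddings import StackedEmbeddings
# glove_embedding, charlm_embedding_forward and charlm_embedding_backward
# are assumed to be created beforehand, as in the embeddings tutorial
start = time()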
sentence = Sentence('The grass is green .')
stacked_embeddings = StackedEmbeddings(embeddings=[glove_embedding])
stacked_embeddings.embed(sentence)
print(time() - start)
start = time()
sentence = Sentence('The grass is green .')
stacked_embeddings = StackedEmbeddings(embeddings=[glove_embedding, charlm_embedding_forward])
stacked_embeddings.embed(sentence)
print(time() - start)
start = time()
sentence = Sentence('The grass is green .')
stacked_embeddings = StackedEmbeddings(embeddings=[glove_embedding, charlm_embedding_forward,
charlm_embedding_backward])
stacked_embeddings.embed(sentence)
print(time() - start)
This prints out on my machine:
0.000461578369140625
0.017933368682861328
0.03269362449645996
Besides, sentences in long texts are typically much longer than this example.
Hi,
When running an experiment with NER (run_ner.py), I got the following error:
File "./run_ner.py", line 79, in <module>
main()
File "./run_ner.py", line 76, in main
train(train_file_path, dev_file_path, test_file_path)
File "./run_ner.py", line 63, in train
train_with_dev=True, anneal_mode=True)
File "/home/user/projects/ner/flair/flair/trainer.py", line 80, in train
loss = self.model.neg_log_likelihood(batch, self.model.tag_type)
File "/home/user/projects/ner/flair/flair/tagging_model.py", line 285, in neg_log_likelihood
feats, tags = self.forward(sentences)
File "/home/user/projects/ner/flair/flair/tagging_model.py", line 213, in forward
packed = torch.nn.utils.rnn.pack_padded_sequence(tagger_states, lengths)
File "/home/user/miniconda3/lib/python3.6/site-packages/torch/onnx/__init__.py", line 57, in wrapper
return fn(*args, **kwargs)
File "/home/user/miniconda3/lib/python3.6/site-packages/torch/nn/utils/rnn.py", line 124, in pack_padded_sequence
data, batch_sizes = PackPadded.apply(input, lengths, batch_first)
File "/home/user/miniconda3/lib/python3.6/site-packages/torch/nn/_functions/packing.py", line 12, in forward
raise ValueError("Length of all samples has to be greater than 0, "
ValueError: Length of all samples has to be greater than 0, but found an element in 'lengths' that is <= 0
Do you have any suggestion why this error happens?
Thanks.
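One plausible cause (an assumption, not confirmed in the report) is a sentence with zero tokens in the training data, for example produced by consecutive blank lines in the input file. A hedged sketch of a guard that removes such sentences before lengths are computed; `drop_empty_sentences` is a hypothetical helper:

```python
from typing import List, Sequence


def drop_empty_sentences(sentences: Sequence[Sequence[str]]) -> List[Sequence[str]]:
    """Keep only sentences that contain at least one token."""
    return [tokens for tokens in sentences if len(tokens) > 0]


batch = [["EU", "rejects"], [], ["Peter"]]
clean = drop_empty_sentences(batch)

# pack_padded_sequence needs every length to be greater than 0.
lengths = sorted((len(tokens) for tokens in clean), reverse=True)
```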
When the code calls eval.pl for evaluation, it causes this bug:
Traceback (most recent call last):
File "char_lm.py", line 68, in <module>
max_epochs=150)
File "E:\users\v-tizhao\vs\flair\trainers\sequence_tagger_trainer.py", line 114, in train
embeddings_in_memory=embeddings_in_memory)
File "E:\users\v-tizhao\vs\flair\trainers\sequence_tagger_trainer.py", line 235, in evaluate
p = run(eval_script, stdout=PIPE, input=eval_data, encoding='utf-8')
File "E:\users\v-tizhao\anaconda\lib\subprocess.py", line 403, in run
with Popen(*popenargs, **kwargs) as process:
File "E:\users\v-tizhao\anaconda\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "E:\users\v-tizhao\anaconda\lib\subprocess.py", line 997, in _execute_child
startupinfo)
OSError: [WinError 193] %1 is not a valid Win32 application
By the way, I am running the code on a 64-bit Windows Server machine.
Could you help me? Thank you !
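WinError 193 usually means Windows was asked to execute a file that is not a Windows executable; unlike Unix shells, Windows does not use the script's shebang line to find an interpreter. A possible workaround, sketched under the assumption that Perl is installed and on PATH, is to call the interpreter explicitly. The helper names and the script path are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def build_eval_command(eval_script: str, interpreter: str = "perl") -> List[str]:
    """Prefix the script with its interpreter so the OS never executes the .pl file directly."""
    return [interpreter, eval_script]


def run_eval(eval_script: str, eval_data: str) -> Optional[str]:
    """Run the evaluation script on eval_data; return None if Perl is unavailable."""
    if shutil.which("perl") is None:
        return None
    process = subprocess.run(
        build_eval_command(eval_script),
        input=eval_data,
        stdout=subprocess.PIPE,
        encoding="utf-8",
    )
    return process.stdout


command = build_eval_command("eval.pl")
```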
Currently, we cannot use flair to train a text-classifier.
We want to add a
Hi,
I tried to replicate the GermEval results, but there's one problem with the examples in resources/docs/EXPERIMENTS.md: ft-german is used to call the WordEmbeddings constructor.
But for ft-german no case is checked in here - I guess de-fasttext should be used instead.
If you want I could add a new case for ft-german or I could modify the examples in the experiment section :)
We should add tests for
We could also add tests for training, to make sure training works without issues.
We need some routines to do visualisation.
Implement these in flair.visual
Useful features:
I encountered a "weird"/advanced Python statement which has a colon after the variable name, as follows:
corpus: TaggedCorpus = NLPTaskDataFetcher.fetch_data(NLPTask.CONLL_03).downsample(0.1)
print(corpus)
Could you please explain what this Python statement (which uses a colon after a name at the beginning of the statement) means? I tried to search for this usage of the colon in Python but could not find the answer.
corpus: TaggedCorpus = NLPTaskDataFetcher.fetch_data(NLPTask.CONLL_03)
Thanks.
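For readers with the same question: this is a variable annotation (PEP 526, available since Python 3.6). The part after the colon is a type hint for tools and readers; the interpreter does not enforce it, and the assignment behaves exactly like one without the hint. A small self-contained illustration with made-up variable names:

```python
from typing import List

# "name: Type = value" assigns value to name and records Type as a hint.
count: int = 3
names: List[str] = ["CoNLL03", "CoNLL2000"]

# The same assignment without a hint behaves identically at runtime.
count_plain = 3

# Hints are not checked by the interpreter; this runs without error,
# although a static type checker such as mypy would flag it.
label: int = "not an int"
```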
Train a simple NER tagger for Swedish, for instance on this dataset.
For this task, we need to adapt the NLPTaskDataFetcher for the appropriate Swedish dataset and train a simple model using Swedish word embeddings. How to train a model is illustrated here.
Swedish word embeddings can now be loaded with
embeddings = WordEmbeddings('sv-fasttext')
For issue #2
Currently, the predict() method of a trained text classification model just returns the label name, but not the confidence. The same holds true for the sequence labeling model.
However, depending on the use case, you may only want to use labels with a confidence value higher than 0.9. Thus, we should add the confidence to the return value of the predict() method.
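A minimal sketch of the idea, assuming the model produces one raw score per label and that confidence is defined as the softmax probability of the best label. `predict_with_confidence` is a hypothetical helper, not the signature of Flair's predict().

```python
import math
from typing import Dict, Tuple


def predict_with_confidence(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the best label together with its softmax probability."""
    max_score = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {label: math.exp(score - max_score) for label, score in scores.items()}
    total = sum(exps.values())
    best = max(exps, key=exps.get)
    return best, exps[best] / total


label, confidence = predict_with_confidence({"POSITIVE": 4.0, "NEGATIVE": 0.0})

# Callers can then keep only confident predictions.
accepted = label if confidence > 0.9 else None
```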
Issue #37 pointed to lower efficiency.
Solve speed issues by:
Right now the code path is to download some pretrained embedding from a remote source. However I have some domain specific embeddings that are in the standard gensim format. Would it be possible to support loading word embeddings simply from disk?
Hi,
I'm currently running flair in an nvidia-docker flavoured container using pyenv and Python 3.7. When installing flair via pip (or from source via pip install -e .) the following error message appears:
root@599cf5a2dde2:~/flair# pip install flair
Collecting flair
Downloading https://files.pythonhosted.org/packages/1f/08/b2bdb5ef305227a7dacfc69b0c16689de21e70b9d76e3d07eefa1690311b/flair-0.1.1.tar.gz
Collecting torch==0.4.0 (from flair)
Could not find a version that satisfies the requirement torch==0.4.0 (from flair) (from versions: 0.1.2, 0.1.2.post1, 0.4.1, 0.4.1.post2)
No matching distribution found for torch==0.4.0 (from flair)
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
With 0.4.1.post2 training runs perfectly, so I would like to ask if we can bump the Torch version?
Notice: using 0.4.1 has a strange numpy bug, which is discussed here.
It seems that
https://github.com/zalandoresearch/flair/blob/master/flair/embeddings.py#L139
does not make use of the OoV functionality present in FastText. It would be a nice addition!
Consider me running the following code (test.py):
import numpy as np
import tensorflow as tf
It issues very few warnings when launched:
$ python3 -i test.py
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
Now I put the import of flair on top and get a whole lot of them:
print("Loading flair NER")
from flair.models import SequenceTagger
from flair.data import Sentence
tagger = SequenceTagger.load('ner')
print("NER Loaded")
import numpy as np
import tensorflow as tf
Output:
Loading flair NER
NER Loaded
/usr/lib/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /home/hcl/.local/lib/python3.6/site-packages/google: missing __init__
_warnings.warn(msg.format(portions[0]), ImportWarning)
/usr/lib/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /home/hcl/.local/lib/python3.6/site-packages/mpl_toolkits: missing __init__
_warnings.warn(msg.format(portions[0]), ImportWarning)
/usr/lib/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /home/hcl/.local/lib/python3.6/site-packages/jaraco: missing __init__
_warnings.warn(msg.format(portions[0]), ImportWarning)
/home/hcl/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:936: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/home/hcl/.local/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
return _inspect.getargspec(target)
/home/hcl/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:4712: ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/hcl/.keras/keras.json' mode='r' encoding='UTF-8'>
_config = json.load(open(_config_path))
/usr/lib/python3.6/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/usr/lib/python3.6/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
The warnings are of different types and are shown not only during import, but also when calling imported functions, which creates a lot of garbage in the output. I tried to return warning settings to the default level by doing
import warnings
warnings.resetwarnings()
after loading Flair, but it didn't help. All that is possible now for me is to completely silence warnings after importing Flair:
import warnings
warnings.filterwarnings("ignore")
but this is not a good solution, because imports seem to show some important warnings (two in the first example).
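A possible middle ground, sketched with the standard library only: scope the filters with `warnings.catch_warnings()` and silence only selected categories, so that warnings such as the RuntimeWarning from the first example remain visible. The `noisy_import` function is a stand-in for the real imports, and the list of categories to ignore is an assumption based on the output above. The tagging_model.py excerpt in an earlier traceback shows `warnings.filterwarnings("ignore")` followed by `warnings.filterwarnings("default")` during model loading, which may be what re-enables categories that Python hides by default.

```python
import warnings


def noisy_import() -> None:
    """Stand-in for imports that emit both unimportant and important warnings."""
    warnings.warn("Not importing directory: missing __init__", ImportWarning)
    warnings.warn("numpy.dtype size changed", RuntimeWarning)


# catch_warnings restores the previous filter state on exit;
# record=True collects the warnings instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # start from a known state
    for category in (ImportWarning, DeprecationWarning, ResourceWarning):
        # Later filters take precedence, so these override "always".
        warnings.filterwarnings("ignore", category=category)
    noisy_import()

shown = [w.category for w in caught]
```

In real code, `record=True` would be dropped so that the remaining warnings are printed as usual.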
Hello,
I would like to use custom embeddings, since the language that I want to work with is not yet among the languages supported by the current flair embeddings (link).
How can I add a custom embedding?
I would like to use the one from fastText.
https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
Hi,
recently I received the training, dev and test data from the "Fine-grained POS Tagging of German Tweets" paper (which can be found here). The highest accuracy mentioned in that paper is 89.42%.
I trained a model with German fasttext embeddings and forward/backward LM embeddings, and after 40 epochs (on CPU) an accuracy of ~92% could be achieved. I double-checked that result with my own evaluation script that loads the trained model, predicts the tags and compares the predicted tags with the gold standard, and this also achieves 92%.
So I would like to ask if the core team is interested in such a POS tagging model for German Twitter data? I would really like to share the model, but then I have to ask the paper author for permission to share a trained model.
https://github.com/zalandoresearch/flair/blob/3635220bdc05c19e4fe47555071db6c39c009bae/flair/models/language_model.py#L41
What is nout?
The tutorial for training your own character LM embeddings also doesn't mention the parameter nout.
Can you also give the original configuration used to train on 1-billion words?
I repeatedly run into the following error on one of my machines:
Traceback (most recent call last):
File "/var/www/scminer/live_extractor/views.py", line 138, in process_text
tagger = FlairSequenceTagger.load('ner')
File "/root/anaconda3/envs/scminer_live/lib/python3.6/site-packages/flair/models/sequence_tagger_model.py", line 488, in load
tagger: SequenceTagger = SequenceTagger.load_from_file(model_file)
File "/root/anaconda3/envs/scminer_live/lib/python3.6/site-packages/flair/models/sequence_tagger_model.py", line 131, in load_from_file
state = torch.load(model_file, map_location={'cuda:0': 'cpu'})
File "/root/anaconda3/envs/scminer_live/lib/python3.6/site-packages/torch/serialization.py", line 303, in load
return _load(f, map_location, pickle_module)
File "/root/anaconda3/envs/scminer_live/lib/python3.6/site-packages/torch/serialization.py", line 476, in _load
deserialized_objects[key]._set_from_file(f, offset, f_is_real_file)
RuntimeError: unexpected EOF. The file might be corrupted.
Environment:
• Ubuntu 16.04
• Anaconda 5.2.0
• Python 3.6
• Latest version of flair (Version: 0.2.1 according to pip show)
I have tried reinstalling flair multiple times and in different virtual environments.
I know it should generally work since it runs on another machine of mine.
I have reached my wits' end.
Any idea what might cause this error?
I would like to train my own contextual string embeddings from scratch, since my target corpus is very idiosyncratic and likely won't work very well with the pre-trained ones (historically diverse OCRed scanned books/newspapers).
Unfortunately I could not find a way to train these in the code. Are there plans to add something like the TagTrainer for the CharLmEmbeddings?
Hello. Thanks for your wonderful code. I see that you also calculate the F1 score using the perl script. I think there is a problem if this script is used to calculate the F1 score for the BIOS tag scheme, since the perl script cannot deal with tags that start with 'S-'. So I wonder whether you corrected the script, or whether you convert the tag scheme of the tokens before using the script? If I am wrong, please let me know. Thanks again.
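If conversion turns out to be needed, one option is to map the extended tags back to plain BIO before handing them to the script. This is only a sketch of that option; whether Flair does this or patches the script is not stated here, and `to_bio` is a hypothetical helper that assumes 'S-' marks single-token entities and 'E-' marks entity ends.

```python
from typing import List


def to_bio(tags: List[str]) -> List[str]:
    """Map S- (single) to B- and E- (end) to I-, leaving other tags untouched."""
    converted = []
    for tag in tags:
        if tag.startswith("S-"):
            converted.append("B-" + tag[2:])
        elif tag.startswith("E-"):
            converted.append("I-" + tag[2:])
        else:
            converted.append(tag)
    return converted


bio_tags = to_bio(["S-PER", "O", "B-ORG", "I-ORG", "E-ORG"])
```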
When running an experiment with NER (run_ner.py), I found that the model does not fully utilize my GPUs. I have 2 Nvidia Titan Xp GPUs on my machine, but only 1 of them is used, with utilization varying between 10% and 60%.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.45 Driver Version: 396.45 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN Xp Off | 00000000:03:00.0 Off | N/A |
| 40% 61C P2 65W / 250W | 5367MiB / 12196MiB | 15% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN Xp Off | 00000000:04:00.0 Off | N/A |
| 23% 35C P8 10W / 250W | 10MiB / 12188MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 30384 C python 5357MiB |
+-----------------------------------------------------------------------------+
Training on my data with NeuroNLP2 yields utilization of both GPUs at more than 90%.
Hi,
While stepping through the code, the following function appears in language_model_trainer.py:
@staticmethod
def _get_batch(source, i, sequence_length):
    seq_len = min(sequence_length, len(source) - 1 - i)
    data = Variable(source[i:i + seq_len])
    target = Variable(source[i + 1:i + 1 + seq_len].view(-1))
    return data, target
The dump of data and target shows these values for instance:
Data Tensor:
tensor([[ 60, 2, 5, 9, 3, 4, 5, 6, 1, 12],
[ 2, 8, 1, 7, 6, 77, 6, 15, 8, 12],
[ 77, 77, 22, 1, 1, 13, 8, 1, 2, 13],
[ 13, 13, 7, 2, 14, 2, 1, 5, 11, 8],
[ 2, 2, 6, 13, 9, 5, 8, 2, 1, 1],
[ 1, 1, 2, 1, 3, 1, 2, 15, 7, 6],
[ 27, 6, 8, 6, 11, 7, 11, 22, 4, 5],
[ 4, 11, 2, 13, 4, 4, 1, 9, 8, 1],
[ 5, 4, 3, 16, 15, 8, 8, 7, 13, 13],
[ 6, 22, 5, 13, 2, 13, 2, 1, 8, 7]], device='cuda:0')
Shape: (10, 10)
Target Tensor:
tensor([ 2, 8, 1, 7, 6, 77, 6, 15, 8, 12, 77, 77,
22, 1, 1, 13, 8, 1, 2, 13, 13, 13, 7, 2,
14, 2, 1, 5, 11, 8, 2, 2, 6, 13, 9, 5,
8, 2, 1, 1, 1, 1, 2, 1, 3, 1, 2, 15,
7, 6, 27, 6, 8, 6, 11, 7, 11, 22, 4, 5,
4, 11, 2, 13, 4, 4, 1, 9, 8, 1, 5, 4,
3, 16, 15, 8, 8, 7, 13, 13, 6, 22, 5, 13,
2, 13, 2, 1, 8, 7, 2, 4, 1, 2, 3, 8,
11, 9, 18, 3], device='cuda:0')
Shape: (100)
The output predictions from language_model_trainer have shape (10, 10, 275) and are reshaped to (100, 275), which means we have 275 scores for each character. (The dimensions refer to batch_size, sequence_length, output_embedding, and the numbers are those obtained from test_language_model_trainer.py.)
Question:
We see that the targets are shifted by sequence_length, which is 10 in this case. This means that the character at the i-th position is trained to predict the character at position i+seq_length (position i+10 in the test code provided). Shouldn't the character LM predict the next character (position i+1)? Or am I missing something?
Thanks!
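One way to read the dump above: the first ten entries of the target tensor equal the second row of the data tensor. That is what a shift of one time step looks like if the first dimension is time and the second is the batch, so that each column is an independent character stream. The toy below uses plain Python lists and assumes this (sequence_length, batch_size) layout; `batchify` and `get_batch` are illustrative re-implementations, not the Flair functions.

```python
from typing import List, Tuple


def batchify(ids: List[int], batch_size: int) -> List[List[int]]:
    """Split ids into batch_size contiguous streams; rows are time steps."""
    n = len(ids) // batch_size
    return [[ids[b * n + t] for b in range(batch_size)] for t in range(n)]


def get_batch(source: List[List[int]], i: int, sequence_length: int) -> Tuple[List[List[int]], List[int]]:
    seq_len = min(sequence_length, len(source) - 1 - i)
    data = source[i:i + seq_len]
    # Rows i+1 .. i+seq_len, flattened row by row like .view(-1).
    target = [x for row in source[i + 1:i + 1 + seq_len] for x in row]
    return data, target


# Character ids 0..11 in three streams: [0,1,2,3], [4,5,6,7], [8,9,10,11].
source = batchify(list(range(12)), batch_size=3)
data, target = get_batch(source, 0, 2)

# Flattening data the same way aligns every input with its target.
flat_data = [x for row in data for x in row]
```

Here every target id is its input id plus one, that is, the next character in the same stream. The offset of batch_size positions in the flattened view comes from the row-major flattening, not from predicting a character several steps ahead.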
Hi Alan,
Feel free to deprioritize it but currently, the inference is slow on CPUs. In a separate ticket, you did implement batching to improve the inference for long text but it still cannot be used in production settings.
Hi :)
this is not directly related to flair, but I have a question/suggestion regarding a customer review corpus (comparable to the Amazon Review corpus) based on German customer reviews from the Zalando site.
Such a review corpus could be a great resource for sentiment analysis/text classification in German and would encourage researchers to develop models, because it would be the first large corpus for German in this domain. A kind of public leaderboard (like the SQuAD one) could also be introduced :)
What do you think about this idea and do you think it would be possible to build such a review corpus?
Thanks,
Stefan
In testing I found that the current implementation, which wraps the language model, takes 5.07 seconds per sentence.
I.e.:
charlm_embedding_forward = CharLMEmbeddings('news-forward')
charlm_embedding_backward = CharLMEmbeddings('news-backward')
embeddings = StackedEmbeddings(
[charlm_embedding_backward, charlm_embedding_forward]
)
embeddings.embed(sentence)
However, by calling the forward and backward language models directly I measure 0.29 seconds per sentence.
I.e.:
embeddings_f = CharLMEmbeddings('news-forward')
f = embeddings_f.lm.get_representation(sentence)
embeddings_b = CharLMEmbeddings('news-backward')
b = embeddings_b.lm.get_representation(sentence)
There must be some slowdown in the stacking operations or the preparing operations.
Compared to other deep learning based NER models, tagger.predict() appears to be slow. It took around 70 seconds to parse a string with 455 tokens.
Upon running line profiler, it seems all the time is spent in creating the embeddings
self.embeddings.embed(sentences)
Any ideas why this would be so slow?
Hi! Flair looks amazing. Clean code, easy to use. Thanks for making it open source!
I was wondering if you plan to add support for more languages? Maybe all the languages where Zalando operates? :) I'm working for a company that needs NLP code that works across pretty much the same set of countries.
Looking at the different libraries available, pre-trained models for more than just English (and German in this case!) are lacking in all of them.
Hello,
I just trained a Spanish LM.
I wonder if it is a good enough one.
What are the ways for you to test if it is a good enough LM?
For example, what do you get for loss in the English model? What does ppl stand for?
This is what I got for the very last split.
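On the terminology: ppl is the usual abbreviation for perplexity. Assuming the reported loss is the average cross-entropy per character in nats (as produced by a standard cross-entropy loss with natural logarithms), perplexity is simply its exponential, so lower is better for both. A tiny sketch with a hypothetical helper:

```python
import math


def perplexity(average_loss: float) -> float:
    """Perplexity for an average cross-entropy loss measured in nats."""
    return math.exp(average_loss)


# A model that is always certain and correct has loss 0 and perplexity 1.
best_case = perplexity(0.0)

# A model that guesses uniformly among 4 characters has loss ln(4), perplexity 4.
uniform_over_four = perplexity(math.log(4))
```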
Add pre-trained classification models for
The models should be loadable by calling TextClassifier.load('ag-news') and TextClassifier.load('imdb')
Hi,
a strange bug occurred during training of an NER model:
....................................................................................................16600 of 36276 (0.457603)
..Traceback (most recent call last):
File "train.py", line 83, in <module>
train_with_dev=True, anneal_mode=True)
File "/root/flair/flair/trainer.py", line 80, in train
loss = self.model.neg_log_likelihood(batch, self.model.tag_type)
File "/root/flair/flair/tagging_model.py", line 295, in neg_log_likelihood
forward_score = self._forward_alg(sentence_feats)
File "/root/flair/flair/tagging_model.py", line 336, in _forward_alg
alpha = log_sum_exp(terminal_var)
File "/root/flair/flair/tagging_model.py", line 30, in log_sum_exp
max_score = vec[0, argmax(vec)]
IndexError: index 1090821856 is out of bounds for dimension 0 with size 14
As you can see, that error occurred in the middle of an epoch. Here's the loss.txt output of previous epochs:
0 (14:47:06) 2.707507 0 0.100000 DEV 0 _ TEST 497 acc: 97.17% p: 74.52% r: 61.25% FB1: 67.24
1 (15:17:10) 2.242203 0 0.100000 DEV 0 _ TEST 567 acc: 96.77% p: 67.50% r: 63.59% FB1: 65.49
2 (15:47:09) 2.186211 0 0.100000 DEV 0 _ TEST 681 acc: 96.12% p: 72.25% r: 62.66% FB1: 67.11
3 (16:17:10) 2.141665 0 0.100000 DEV 0 _ TEST 525 acc: 97.01% p: 74.62% r: 62.03% FB1: 67.75
4 (16:47:51) 2.150997 1 0.100000 DEV 0 _ TEST 549 acc: 96.88% p: 75.09% r: 64.53% FB1: 69.41
So the IndexError occurred during an epoch, and previous epochs could be trained successfully.
I'm using Torch in version 0.4.1.post2 with 326b383.
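An index as large as 1090821856 for a dimension of size 14 suggests the argmax result was garbage, which could happen if the scores contained NaN (this is a guess, not a diagnosis). A framework-free sketch of a log-sum-exp that fails loudly in that case and is otherwise numerically stable; it is an illustration, not the function from tagging_model.py.

```python
import math
from typing import Sequence


def log_sum_exp(scores: Sequence[float]) -> float:
    """Compute log(sum(exp(s))) stably, rejecting NaN inputs explicitly."""
    if any(math.isnan(s) for s in scores):
        raise ValueError("scores contain NaN; inspect the loss and the learning rate")
    max_score = max(scores)
    if math.isinf(max_score):
        # All scores are -inf, or at least one is +inf.
        return max_score
    # Subtracting the maximum keeps exp() from overflowing.
    return max_score + math.log(sum(math.exp(s - max_score) for s in scores))


small = log_sum_exp([0.0, 0.0])
large = log_sum_exp([1000.0, 1000.0])
```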
Hi,
I trained my own model on a dataset (using pretty much the same instructions as in train.py). For loading the trained model I used:
from flair.data import Sentence
from flair.tagging_model import SequenceTagger
tagger: SequenceTagger = SequenceTagger.load_from_file('resources/taggers/example-ner/final-model.pt')
But then the following error message occurs:
Traceback (most recent call last):
File "predict.py", line 4, in <module>
tagger: SequenceTagger = SequenceTagger.load_from_file('resources/taggers/example-ner/final-model.pt')
File "/flair/flair/tagging_model.py", line 129, in load_from_file
hidden_size=state['hidden_size'],
TypeError: 'SequenceTagger' object is not subscriptable
I hope the flair team could help me :) Thanks!
Hello
I was following the steps given in the documentation here.
Situation:
I am trying to train a Spanish NER model. The data I am using for training is in a CoNLL-like format.
The error happened when I was testing a few options to load the input files.
I have installed flair through git clone, from the master branch.
I first wanted to install via pip install; however, I wanted to train my own embeddings, and according to this it seems to be a feature available only through git clone, not through pip install.
When the error occurred:
When I ran the function to load the file
data = NLPTaskDataFetcher.read_conll_sequence_labeling_data('./data/esp2.train')
I got an error that said:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 291: ordinal not in range(128)
What I tried:
I have tried (1) different functions implemented in the file data_fetcher and (2) passing the encoding option 'utf-8' in open(path_to_conll_file, encoding='utf-8'), but neither solved the issue.
I have also tried a couple of options to change the system encoding, but without success.
When we are not in train_with_dev mode, scheduler.best stores the highest F1 (dev_score, not the current_loss).
https://github.com/zalandoresearch/flair/blob/3635220bdc05c19e4fe47555071db6c39c009bae/flair/trainers/sequence_tagger_trainer.py#L138
add pre-trained models for NER, chunking, semantic role labeling and part-of-speech tagging for upcoming 0.2 release
Hi,
what do you think about adding a Travis CI configuration for executing all unit tests?
This will e.g. ensure that flair is always cloned and installed in a fresh environment for executing the unit tests :)
Flair currently supports training of custom sequence taggers, but not custom language models. Many people want to try out contextual string embeddings for new languages (#2) or domains.
Task: add a trainer class to facilitate training of new language model embeddings.
Hi,
I've discovered the flair framework recently and the experience so far is great!
Following what has been done by Howard and Ruder with ULMFiT, and by others, I would be interested in fine-tuning the language models on custom datasets and then plugging in a custom layer to do some tasks.
I think I can work out the language model fine-tuning by downloading one of your pre-trained models and then use it as initialization of the language model training.
However, for the downstream tasks, I wish I could first train only the task layer (e.g. a classification layer), and then gradually fine-tune the language model layers.
Thank you very much for your help!
I recently read the generative pretraining paper from OpenAI.
According to the benchmarks, fine-tuning the OpenAI model on a custom dataset takes much less time than an LSTM-based approach.
The model has also been shown to improve on the SOTA in a lot of tasks.
So I was wondering if it is possible to replace the pipeline with a transformer-based model such as the one implemented by OpenAI.