
rasa_lookup_demo's Introduction

Phrase Matcher Demo

This is a simple demo of the new lookup table feature in rasa_nlu. See the blog post accompanying this repository.

The goal is to show how lookup tables can improve entity extraction under certain conditions and to offer some advice on using this feature effectively.

This repo contains two demos:

  1. A simple restaurant example with very few training examples and only one entity.
  2. A medium-sized company name extraction example with a few thousand examples and several entities.

Running the demo

The demo itself needs no installation, but you must have rasa_nlu installed at version 0.13.3 or above.

To run one or both of the demos:

python run_lookup.py <demo_key>

where <demo_key> is one of {food, company}. If <demo_key> is omitted, the script runs both demos.

Code Structure

data/ holds the training data and lookup tables for each of the demos.

models/ is where the models are persisted.

configs/ holds the rasa_nlu configs to do the baseline evaluation and the lookup table evaluation.

img/ stores plots and outputs from the runs.
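
For orientation, here is a minimal sketch of how a lookup table sits next to training examples in rasa_nlu's JSON training data format. The helper `make_training_data` and the example sentence are hypothetical and are not taken from this repo's data/ files; only the `rasa_nlu_data`, `common_examples`, and `lookup_tables` keys follow the rasa_nlu format.

```python
import json


def make_training_data(examples, lookup_name, lookup_elements):
    """Build a rasa_nlu-style training data dict with a single lookup table.

    `examples` is a list of (text, intent, entity_value) tuples. This helper
    is a hypothetical illustration and is not part of the repo.
    """
    common_examples = []
    for text, intent, value in examples:
        start = text.index(value)  # raises ValueError if the value is absent
        common_examples.append({
            "text": text,
            "intent": intent,
            "entities": [{
                "start": start,
                "end": start + len(value),
                "value": value,
                "entity": lookup_name,
            }],
        })
    return {
        "rasa_nlu_data": {
            "common_examples": common_examples,
            # "elements" may be an inline list, as here, or a path to a
            # newline-separated text file.
            "lookup_tables": [
                {"name": lookup_name, "elements": lookup_elements}
            ],
        }
    }


training_data = make_training_data(
    [("show me a place that serves sushi", "restaurant_search", "sushi")],
    "food",
    ["sushi", "tacos", "ramen"],
)
print(json.dumps(training_data, indent=2))
```

The lookup table name matches the entity name so that the lookup matches serve as a feature for that entity.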

Cleaning lookup tables

The script filter_lookup.py may be used to clean up lookup tables by removing any elements that match with a cross-list.

You can call this script by running

python filter_lookup.py <lookup_in> <cross_list> <lookup_out>

<lookup_in> is a lookup table with newline-separated elements.

<cross_list> is either a comma-separated or newline-separated list of elements that you'd like to remove from <lookup_in>.

<lookup_out> is the name of the file that you'd like to write the filtered list to.
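
The filtering step can be sketched as follows. This is not the code in filter_lookup.py; it is a minimal illustration that assumes "match" means a case-insensitive exact match, and the function names `parse_cross_list` and `filter_lookup` are hypothetical.

```python
import re


def parse_cross_list(raw):
    """Split a comma- or newline-separated string into cleaned elements."""
    return [item.strip() for item in re.split(r"[,\n]", raw) if item.strip()]


def filter_lookup(lookup_elements, cross_list):
    """Drop lookup elements that match a cross-list entry.

    Assumption: a match is a case-insensitive exact match after stripping
    surrounding whitespace. The real script may match differently.
    """
    blocked = {item.lower() for item in cross_list}
    return [el for el in lookup_elements if el.strip().lower() not in blocked]


lookup = ["Apple", "Target", "Rasa", "Gap"]
cross = parse_cross_list("apple, gap\ntarget")
filtered = filter_lookup(lookup, cross)
print(filtered)
```

Filtering like this is useful when a lookup table of names contains common words, since those would otherwise be flagged as matches in ordinary sentences.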

Speed Testing

We include the directory speed_test/ for testing the speed of training as a function of the lookup table size.

This generates random lookup tables and times each component of the training and evaluation process. We use the company dataset data/company/company_train_lookup.py.

cd speed_test
python time_lookups.py

See speed_test/README.md for more details.
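
As a rough, standalone illustration of the idea (not the code in speed_test/time_lookups.py), the sketch below generates random lookup tables of increasing size and times how long it takes to build and apply a single regular expression over them. Treating a lookup table as one case-insensitive alternation with word boundaries is an assumption made here for illustration, and all function names are hypothetical. It does not time rasa_nlu training itself.

```python
import random
import re
import string
import time


def random_lookup(size, rng, word_len=8):
    """Generate `size` random lowercase strings to act as lookup elements."""
    return [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(word_len))
        for _ in range(size)
    ]


def lookup_pattern(elements):
    """Compile one alternation over all elements (an assumed stand-in)."""
    body = "|".join(r"\b" + re.escape(e) + r"\b" for e in elements)
    return re.compile("(" + body + ")", flags=re.IGNORECASE)


def time_lookup(size, sentence, rng):
    """Return (build_seconds, search_seconds) for a random table of `size`."""
    elements = random_lookup(size, rng)
    start = time.perf_counter()
    pattern = lookup_pattern(elements)
    build_seconds = time.perf_counter() - start
    start = time.perf_counter()
    pattern.search(sentence)
    search_seconds = time.perf_counter() - start
    return build_seconds, search_seconds


rng = random.Random(0)
sentence = "i would like to invest in some company next year"
results = []
for size in (10, 100, 1000):
    build_seconds, search_seconds = time_lookup(size, sentence, rng)
    results.append((size, build_seconds, search_seconds))
    print(size, build_seconds, search_seconds)
```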

Ngrams

A simple ngrams tester is included and can be run with

python run_ngrams.py

This loads two lookup tables, data/company/pos_ngrams.txt and data/company/neg_ngrams.txt, each containing ngrams that were found to be influential in classifying phrases as company names. We then compute the F1 score as a function of random noise injected into the entities. The 'noise' value is the probability of a character flip for each character of each company entity in the test set.

This gives the following plot
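
The noise injection described above can be sketched as follows. This is not the code from run_ngrams.py: the function name `flip_characters` is hypothetical, and interpreting a "character flip" as replacing the character with a different random lowercase letter is an assumption.

```python
import random
import string


def flip_characters(text, noise, rng):
    """Flip each character of `text` independently with probability `noise`.

    Assumption: a flip replaces the character with a different, randomly
    chosen lowercase ASCII letter.
    """
    out = []
    for ch in text:
        if rng.random() < noise:
            choices = [c for c in string.ascii_lowercase if c != ch]
            out.append(rng.choice(choices))
        else:
            out.append(ch)
    return "".join(out)


rng = random.Random(0)
for noise in (0.0, 0.2, 1.0):
    print(noise, flip_characters("acme corp", noise, rng))
```

With noise set to 0.0 the entity is unchanged, and with 1.0 every character is replaced, so sweeping the value between the two shows how quickly exact lookup matches break down compared with ngram features.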

rasa_lookup_demo's People

Contributors

akelad, akshay2000, imsurinder90, llermaly, metcalfetom, twhughes

rasa_lookup_demo's Issues

lookuptable error

Traceback (most recent call last):
File "run_lookup.py", line 1, in <module>
from rasa_nlu.training_data import load_data
ImportError: cannot import name 'load_data'

I get the error above. I have installed rasa_nlu correctly, so that is not the issue.

Error training data on examples provided in repo.

I've been following this tutorial. I cloned the repo, and when I run python run_lookup.py everything works fine (i.e., it gives me precision and recall). When I try to train a model on the provided food data using the command
python -m rasa_nlu.train -c /home/chaitanya/rasa_lookup_demo/configs/config.yaml --data food -o models --project current --verbose
it loads the spaCy model and throws an error:

2019-04-01 11:33:54 INFO     rasa_nlu.training_data.training_data  - Training data stats: 
	- intent examples: 36 (1 distinct intents)
	- Found intents: 'restaurant_search'
	- entity examples: 19 (1 distinct entities)
	- found entities: 'food'

Traceback (most recent call last):
  File "/home/chaitanya/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/chaitanya/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/chaitanya/Desktop/rasanlu_final/venv/lib/python3.6/site-packages/rasa_nlu/train.py", line 184, in <module>
    num_threads=cmdline_args.num_threads)
  File "/home/chaitanya/Desktop/rasanlu_final/venv/lib/python3.6/site-packages/rasa_nlu/train.py", line 153, in do_train
    training_data = load_data(data, cfg.language)
  File "/home/chaitanya/Desktop/rasanlu_final/venv/lib/python3.6/site-packages/rasa_nlu/training_data/loading.py", line 55, in load_data
    data_sets = [_load(f, language) for f in files]
  File "/home/chaitanya/Desktop/rasanlu_final/venv/lib/python3.6/site-packages/rasa_nlu/training_data/loading.py", line 55, in <listcomp>
    data_sets = [_load(f, language) for f in files]
  File "/home/chaitanya/Desktop/rasanlu_final/venv/lib/python3.6/site-packages/rasa_nlu/training_data/loading.py", line 109, in _load
    raise ValueError("Unknown data format for file {}".format(filename))
ValueError: Unknown data format for file food/lookup/food.txt

I'm using a virtual environment and my versions are:

rasa-core==0.13.6
rasa-core-sdk==0.12.2
rasa-nlu==0.14.6

ValueError: Unknown data format for file ./data/data.json

Hi,

I'm learning Rasa and downloaded a sample project. When I train the model, I get the 'Unknown data format' error. I reviewed the JSON file and validated its syntax on jsonlint, but could not find any errors. Can you please help resolve the issue? I've attached the JSON and the .py file.

$ python nlu_model.py
Traceback (most recent call last):
File "nlu_model.py", line 26, in <module>
model_directory = train_nlu('./data/data.json', 'config_spacy.yml', './models/nlu')
File "nlu_model.py", line 14, in train_nlu
training_data = load_data(data)
File "c:\cygwin64\home\svenugopal\rasa_nlu-master\rasa\nlu\training_data\loading.py", line 52, in load_data
data_sets = [_load(f, language) for f in files]
File "c:\cygwin64\home\svenugopal\rasa_nlu-master\rasa\nlu\training_data\loading.py", line 52, in <listcomp>
data_sets = [_load(f, language) for f in files]
File "c:\cygwin64\home\svenugopal\rasa_nlu-master\rasa\nlu\training_data\loading.py", line 110, in _load
raise ValueError("Unknown data format for file {}".format(filename))
ValueError: Unknown data format for file ./data/data.json

Thanks,
nlu_model.zip
