
nball4tree's Introduction

NBalls Visualization:

Wiki

Install the package

  • on Ubuntu, first install python3-tk:
$ sudo apt-get install python3-tk
  • on Ubuntu or Mac, run:
$ git clone https://github.com/gnodisnait/nball4tree.git
$ cd nball4tree
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Experiment 1: Training and evaluating nball embeddings

Experiment 1.1: Training nball embeddings

% you need to create an empty file nball.txt for output
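
The empty output file can be created from the shell or, as a minimal sketch, from Python. The helper name `create_empty_output` and the example path are illustrative assumptions, not part of nball4tree.

```python
from pathlib import Path


def create_empty_output(path):
    """Create an empty file at `path` if it is missing, including parent directories."""
    out = Path(path).expanduser()
    out.parent.mkdir(parents=True, exist_ok=True)
    # touch() leaves an existing file untouched, so earlier results are not erased.
    out.touch(exist_ok=True)
    return out
```

For example, `create_empty_output("~/data/glove/nball.txt")` returns the expanded path, which can then be passed to `--train_nball`.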

$ python nball.py --train_nball /Users/<user-name>/data/glove/nball.txt --w2v /Users/<user-name>/data/glove/glove.6B.50d.txt  --ws_child /Users/<user-name>/data/glove/wordSenseChildren47634.txt  --ws_catcode /Users/<user-name>/data/glove/glove.6B.catcode.txt  --log log.txt
% --train_nball: output file of nball embeddings
% --w2v: file of pre-trained word embeddings
% --ws_child: file of parent-children relations among word-senses
% --ws_catcode: file of the parent location code of a word-sense in the tree structure
% --log: log file, shall be located in the same directory as the file of nball embeddings

The training process can take around 6.5 hours.

Experiment 1.2: Checking whether tree structures are perfectly embedded into word-embeddings

  • main input is the output directory of nballs created in Experiment 1.1
  • shell command for running the checking process
$ python nball.py --zero_energy <output-path> --ball <output-file> --ws_child /Users/<user-name>/data/glove/wordSenseChildren.txt
% --zero_energy <output-path> : output path of the nballs of Experiment 1.1, e.g. /Users/<user-name>/data/glove/data_out
% --ball <output-file> : the name of the output nball-embedding file
% --ws_child /Users/<user-name>/data/glove/wordSenseChildren.txt: file of parent-children relations among word-senses

The checking process can take around 2 hours.

  • result

If zero energy is achieved, a large nball-embedding file is created at <output-path>/<output-file>; otherwise, the failed relations and word-senses are printed.
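
As a minimal sketch of what such a check could look like, assume (following the cited paper) that zero energy means every child ball lies inside its parent's ball and sibling balls do not overlap. The function names and the in-memory layout of `balls` and `children` are illustrative assumptions; they are not the repository's implementation or file format.

```python
import math


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contains(parent, child):
    """True if the child ball (center, radius) lies inside the parent ball."""
    (p_center, p_radius), (c_center, c_radius) = parent, child
    return distance(p_center, c_center) + c_radius <= p_radius


def disjoint(a, b):
    """True if two balls (center, radius) do not overlap."""
    (a_center, a_radius), (b_center, b_radius) = a, b
    return distance(a_center, b_center) >= a_radius + b_radius


def failed_relations(balls, children):
    """Collect violated relations.

    balls:    word-sense -> (center, radius)   (assumed layout)
    children: parent word-sense -> list of child word-senses
    """
    failures = []
    for parent, kids in children.items():
        for kid in kids:
            if not contains(balls[parent], balls[kid]):
                failures.append(("contains", parent, kid))
        for i, first in enumerate(kids):
            for second in kids[i + 1:]:
                if not disjoint(balls[first], balls[second]):
                    failures.append(("disjoint", first, second))
    return failures
```

An empty list from `failed_relations` corresponds to the zero-energy case in this sketch; any entry names the kind of relation and the two word-senses that violate it.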

Test result on the Mac platform: [image]

Test result on the Ubuntu platform:

Experiment 2: Observing neighbors of word-senses using nball embeddings

$ python nball.py --neighbors beijing.n.01 berlin.n.01  --ball /Users/<user-name>/data/glove/glove.6B.50Xball.V10.txt  --num 6
% --neighbors: list of word-senses
% --ball: file location of the nball embeddings
% --num: number of neighbors
  • Results of nearest neighbors look like below:

Experiment 3: Consistency analysis

deviation of word-stems

$ python nball.py  --std_stem /Users/<user-name>/data/glove/wordstem.std --dim 50 --w2v /Users/<user-name>/data/glove/glove.6B.50d.txt --ballStemFile /Users/<user-name>/data/glove/glove.6B.50Xball.words --ball /Users/<user-name>/data/glove/glove.6B.50Xball.V10.txt
  • Result of consistency analysis
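
The exact computation behind `--std_stem` is not described here. As a rough illustration only, assume the deviation of a word-stem is measured per dimension across the vectors of all word-senses that share that stem; the function name and input layout are hypothetical.

```python
import statistics


def stem_deviation(vectors):
    """Population standard deviation per dimension across equally sized vectors."""
    return [statistics.pstdev(dimension) for dimension in zip(*vectors)]
```

Under this assumption, values close to zero in every dimension would indicate that the senses of a word keep nearly the same word-stem vector.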

Experiment 4: Validating unknown word-senses or words

$ python nball.py  --validate_member /Users/<user-name>/data/glove/memberValidation/membershipPredictionResult.txt \
                    --numOfChild 10  --percentages 5 10 20 30 40 50 60 70 80 90  \
                    --taskFiles /Users/<user-name>/data/glove/memberValidation/membershipPredictionTask.txt \
                    --w2v /Users/<user-name>/data/glove/glove.6B.50d.txt \
                    --ws_child /Users/<user-name>/data/glove/wordSenseChildren.txt  \
                    --ws_path /Users/<user-name>/data/glove/wordSensePath.txt \
                    --ws_catcode /Users/<user-name>/data/glove/glove.6B.catcode.txt \
                    --logPath /Users/<user-name>/data/glove/logMemberValidate
  • command for viewing the result of validating unknown word-sense or word
$ python nball.py  --plot_validate_member /Users/<user-name>/data/glove/memberValidation/membershipPredictionResult.txt      --numOfChild 10       --percentages 5 10 20 30 40 50 60 70 80 90
  • Precision of validating the category of unknown words

  • Recall of validating the category of unknown words
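
For reference, precision and recall for one membership-prediction task can be computed as in the sketch below. It uses the standard definitions; the function name and the representation of predictions as collections of word strings are assumptions, not the format of membershipPredictionResult.txt.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted versus true category members."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

Precision is the share of predicted members that truly belong to the category; recall is the share of true members that were predicted.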

NBalls for other languages

Cite

If you use the code, please cite the following paper:

Tiansi Dong, Christian Bauckhage, Hailong Jin, Juanzi Li, Olaf Cremers, Daniel Speicher, Armin B. Cremers, Joerg Zimmermann (2019). Imposing Category Trees Onto Word-Embeddings Using A Geometric Construction. ICLR-19 The Seventh International Conference on Learning Representations, May 6–9, New Orleans, Louisiana, USA.
