wordpos's Introduction

wordpos

wordpos is a set of fast part-of-speech (POS) utilities for Node.js and the browser, built on fast lookups in the WordNet database.

Version 1.x is a major update with no direct dependence on natural's WordNet module, with support for Promises, and a roughly 5x speed improvement over the previous version.

CAUTION The WordNet database wordnet-db comprises 155,287 words (3.0 numbers) which uncompress to over 30 MB of data in several unbrowserify-able files. It is not meant for the browser environment.

🔥 Version 2.x is fully refactored and also works in browsers; see wordpos-web.

Installation

 npm install -g wordpos

To run the tests (or just: npm test):

npm install -g mocha
mocha test

Quick usage

Node.js:

var WordPOS = require('wordpos'),
    wordpos = new WordPOS();

wordpos.getAdjectives('The angry bear chased the frightened little squirrel.', function(result){
    console.log(result);
});
// [ 'little', 'angry', 'frightened' ]

wordpos.isAdjective('awesome', function(result){
    console.log(result);
});
// true 'awesome'

Command-line: (see CLI for full command list)

$ wordpos def git
git
  n: a person who is deemed to be despicable or contemptible; "only a rotter would do that"; "kill the rat"; "throw the bum out"; "you cowardly little pukes!"; "the British call a contemptible person a 'git'"  

$ wordpos def git | wordpos get --adj
# Adjective 6:
despicable
contemptible
bum
cowardly
little
British

Options

WordPOS.defaults = {
  /**
   * enable profiling, time in msec returned as last argument in callback
   */
  profile: false,

  /**
   * if true, exclude standard stopwords.
   * if array, stopwords to exclude, eg, ['all','of','this',...]
   * if false, do not filter any stopwords.
   */
  stopwords: true,

  /**
   * preload files (in browser only)
   *    true - preload all POS
   *    false - do not preload any POS
   *    'a' - preload adj
   *    ['a','v'] - preload adj & verb
   * @type {boolean|string|Array}
   */
  preload: false,

  /**
   * include data files in preload
   * @type {boolean}
   */
  includeData: false,

  /**
   * set to true to enable debug logging
   * @type {boolean}
   */
  debug: false

};

To override, pass an options hash to the constructor. With the profile option, most callbacks receive a last argument that is the execution time in msec of the call.

    wordpos = new WordPOS({profile: true});
    wordpos.isAdjective('fast', console.log);
    // true 'fast' 29

API

Please note: all API methods are async since the underlying WordNet library is async.

getPOS(text, callback)

getNouns(text, callback)

getVerbs(text, callback)

getAdjectives(text, callback)

getAdverbs(text, callback)

Get part-of-speech from text. callback(results) receives an array of words for specified POS, or a hash for getPOS():

wordpos.getPOS(text, callback) -- callback receives a result object:
    {
      nouns:[],       Array of words that are nouns
      verbs:[],       Array of words that are verbs
      adjectives:[],  Array of words that are adjectives
      adverbs:[],     Array of words that are adverbs
      rest:[]         Array of words that are not in dict or could not be categorized as a POS
    }
    Note: a word may appear in multiple POS (eg, 'great' is both a noun and an adjective)

If you're only interested in a certain POS (say, adjectives), using the particular getX() is faster than getPOS(), which looks up the word in all index files. Stopwords are stripped out from text before lookup.

If text is an array, all words are looked-up -- no deduplication, stopword filtering or tokenization is applied.
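Because array input is looked up as-is, callers may want to tidy the list themselves first. The following is a minimal sketch, assuming the entries are plain strings; cleanWordList is a hypothetical helper and not part of wordpos.

```javascript
// Hypothetical helper (not part of wordpos): trim entries, drop empty
// strings, and remove duplicates while keeping first-seen order.
function cleanWordList(words) {
  const seen = new Set();
  const cleaned = [];
  for (const raw of words) {
    const word = String(raw).trim();
    if (word === '' || seen.has(word)) continue;
    seen.add(word);
    cleaned.push(word);
  }
  return cleaned;
}

console.log(cleanWordList(['bear', ' bear', '', 'squirrel']));
// [ 'bear', 'squirrel' ]
```

The cleaned array can then be passed to any getX() call. This sketch does not filter stopwords; that would need a separate step.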

getX() functions return a Promise.

Example:

wordpos.getNouns('The angry bear chased the frightened little squirrel.', console.log)
// [ 'bear', 'squirrel', 'little', 'chased' ]

wordpos.getPOS('The angry bear chased the frightened little squirrel.', console.log)
// output:
  {
    nouns: [ 'bear', 'squirrel', 'little', 'chased' ],
    verbs: [ 'bear' ],
    adjectives: [ 'little', 'angry', 'frightened' ],
    adverbs: [ 'little' ],
    rest: [ 'the' ]
  }

Note that these results ignore the grammar of the given sentence: each word is listed under every POS it can take in WordNet. Grammatically, only 'bear' and 'squirrel' are nouns here.
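Since a word can appear under several POS, it can help to invert the getPOS() result and see which words are ambiguous. A minimal sketch, using the result shape documented above; posByWord is a hypothetical helper, not part of wordpos.

```javascript
// Hypothetical helper (not part of wordpos): invert a getPOS() result
// into a Map of word -> list of POS categories it was found under.
function posByWord(result) {
  const categories = ['nouns', 'verbs', 'adjectives', 'adverbs'];
  const byWord = new Map();
  for (const category of categories) {
    for (const word of result[category] || []) {
      if (!byWord.has(word)) byWord.set(word, []);
      byWord.get(word).push(category);
    }
  }
  return byWord;
}

// The getPOS() output for the example sentence.
const sample = {
  nouns: ['bear', 'squirrel', 'little', 'chased'],
  verbs: ['bear'],
  adjectives: ['little', 'angry', 'frightened'],
  adverbs: ['little'],
  rest: ['the']
};

// Print only the words that were found under more than one POS.
for (const [word, pos] of posByWord(sample)) {
  if (pos.length > 1) console.log(word + ': ' + pos.join(', '));
}
// bear: nouns, verbs
// little: nouns, adjectives, adverbs
```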

isNoun(word, callback)

isVerb(word, callback)

isAdjective(word, callback)

isAdverb(word, callback)

Determine if word is a particular POS. callback(result, word) receives true/false as first argument and the looked-up word as the second argument. The resolved Promise receives true/false.

Examples:

wordpos.isVerb('fish', console.log);
// true 'fish'

wordpos.isNoun('fish', console.log);
// true 'fish'

wordpos.isAdjective('fishy', console.log);
// true 'fishy'

wordpos.isAdverb('fishly', console.log);
// false 'fishly'
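To check a word against all four POS at once, the Promise form of the isX() methods can be combined with Promise.all. This is a sketch only, assuming version 1.0 or later (where these methods return Promises); classifyWord is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of wordpos): run the four isX() checks
// in parallel and collect the boolean results into one object.
// `wordpos` is expected to be a WordPOS instance.
function classifyWord(wordpos, word) {
  return Promise.all([
    wordpos.isNoun(word),
    wordpos.isVerb(word),
    wordpos.isAdjective(word),
    wordpos.isAdverb(word)
  ]).then(function (flags) {
    return {
      word: word,
      noun: flags[0],
      verb: flags[1],
      adjective: flags[2],
      adverb: flags[3]
    };
  });
}
```

It would be called as classifyWord(wordpos, 'fish').then(console.log). For whole sentences, getPOS() is likely the better fit.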

lookup(word, callback)

lookupNoun(word, callback)

lookupVerb(word, callback)

lookupAdjective(word, callback)

lookupAdverb(word, callback)

Get complete definition object for word. The lookupX() variants can be faster if you already know the POS of the word. Signature of the callback is callback(result, word) where result is an array of lookup object(s).

Example:

wordpos.lookupAdjective('awesome', console.log);
// output:
[ { synsetOffset: 1285602,
    lexFilenum: 0,
    lexName: 'adj.all',
    pos: 's',
    wCnt: 5,
    lemma: 'amazing',
    synonyms: [ 'amazing', 'awe-inspiring', 'awesome', 'awful', 'awing' ],
    lexId: '0',
    ptrs: [],
    gloss: 'inspiring awe or admiration or wonder; [...] awing majesty, so vast, so high, so silent"  ',
    def: 'inspiring awe or admiration or wonder',     
    ...
} ], 'awesome'

In this case only one lookup was found, but there could be several.
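When several lookup objects come back, grouping them by POS makes the result easier to scan. A minimal sketch that relies only on the pos, lexName, def and synonyms fields shown in the output above; definitionsByPos is a hypothetical helper.

```javascript
// Hypothetical helper (not part of wordpos): group lookup results by
// their `pos` field, keeping only a few fields from each entry.
function definitionsByPos(results) {
  const grouped = {};
  for (const entry of results) {
    if (!grouped[entry.pos]) grouped[entry.pos] = [];
    grouped[entry.pos].push({
      lexName: entry.lexName,
      def: entry.def,
      synonyms: entry.synonyms
    });
  }
  return grouped;
}

// Trimmed-down copy of the lookupAdjective('awesome') result.
const lookupSample = [{
  pos: 's',
  lexName: 'adj.all',
  lemma: 'amazing',
  synonyms: ['amazing', 'awe-inspiring', 'awesome', 'awful', 'awing'],
  def: 'inspiring awe or admiration or wonder'
}];

console.log(definitionsByPos(lookupSample).s[0].def);
// inspiring awe or admiration or wonder
```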

Version 1.1 adds the lexName parameter, which maps the lexFilenum to one of 45 lexicographer domains.

seek(offset, pos, callback)

Version 1.1 introduces the seek method to look up a record directly from the synsetOffset for a given POS. Unlike other methods, callback (if provided) receives (err, result) arguments.

Examples:

wordpos.seek(1285602, 'a').then(console.log)
// same result as wordpos.lookupAdjective('awesome', console.log);

rand(options, callback)

randNoun(options, callback)

randVerb(options, callback)

randAdjective(options, callback)

randAdverb(options, callback)

Get random word(s). (Introduced in version 0.1.10) callback(results, startsWith) receives array of random words and the startsWith option, if one was given. options, if given, is:

{
  startsWith : <string> -- get random words starting with this
  count : <number> -- number of words to return (default = 1)
}

Examples:

wordpos.rand(console.log)
// ['wulfila'] ''

wordpos.randNoun(console.log)
// ['bamboo_palm'] ''

wordpos.rand({startsWith: 'foo'}, console.log)
// ['foot'] 'foo'

wordpos.randVerb({startsWith: 'bar', count: 3}, console.log)
// ['barge', 'barf', 'barter_away'] 'bar'

wordpos.rand({startsWith: 'zzz'}, console.log)
// [] 'zzz'

Note on performance: (node only) random lookups could involve heavy disk reads. It is better to use the count option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files.

Getting random POS (randNoun(), etc.) is generally faster than rand(), which may look at multiple POS files until count requirement is met.

parse(text)

Returns tokenized array of words in text, less duplicates and stopwords. This method is called on all getX() calls internally.

WordPOS.WNdb

Access to the wordnet-db object containing the dictionary & index files.

WordPOS.stopwords

Access the array of stopwords.

Promises

As of v1.0, all get, is, rand, and lookup methods return a standard ES6 Promise.

wordpos.isVerb('fish').then(console.log);
// true

Compound, with error handler:

wordpos.isVerb('fish')
  .then(console.log)
  .then(doSomethingElse)
  .catch(console.error);

Callbacks, if given, are executed before the Promise is resolved.

wordpos.isVerb('fish', console.log)
  .then(console.log)
  .catch(console.error);
// true 'fish' 13
// true

Note that callback receives full arguments (including profile, if enabled), while the Promise receives only the result of the call. Also, beware that exceptions in the callback will result in the Promise being rejected and caught by catch(), if provided.

Running inside the browser?

See wordpos-web.

Fast Index (node)

Version 0.1.4 introduces the fastIndex option. This uses a secondary index on the index files and is much faster. It is on by default. Secondary index files are generated at install time and placed in the same directory as WNdb.path. Details can be found in tools/stat.js.

Fast index improves performance 30x over Natural's native methods. See blog article Optimizing WordPos.

As of version 1.0, fast index is always on and cannot be turned off.

Command-line (CLI) usage

For CLI usage and examples, see bin/README.

Benchmark

See bench/README.

Changes

See CHANGELOG.

License

https://github.com/moos/wordpos Copyright (c) 2012-2020 [email protected] (The MIT License)

wordpos's Issues

Numbers recognized as adjectives

wordpos.isAdjective("17", function(bool) {
    console.log(bool); // true
});
wordpos.isAdjective("two", function(bool) {
    console.log(bool); // true
});

Why are numbers recognized as adjectives?

A lot of verbs are also classified as adjectives.

yarn v2 found a bug: line.substring is not a function

TypeError: line.substring is not a function
at BufferedReader.<anonymous> (/home/vic/Documents/GitHub/mbCLIx/src/.yarn/unplugged/wordpos-npm-1.2.0-031b04010c/node_modules/wordpos/tools/stat.js:92:22)
at BufferedReader.emit (events.js:219:5)
at PassThrough.<anonymous> (/home/vic/Documents/GitHub/mbCLIx/src/.yarn/unplugged/wordpos-npm-1.2.0-031b04010c/node_modules/wordpos/tools/buffered-reader.js:148:10)
at PassThrough.emit (events.js:219:5)
at endReadableNT (_stream_readable.js:1206:12)
at processTicksAndRejections (internal/process/task_queues.js:84:21)
at runNextTicks (internal/process/task_queues.js:66:3)
at processImmediate (internal/timers.js:417:9)

This file contains the result of Yarn building a package (wordpos@npm:1.2.0)

Script name: postinstall

DB folder: /home/vic/Documents/GitHub/mbCLIx/src/.yarn/cache/wordnet-db-npm-3.1.14-62a7238db5-1.zip/node_modules/wordnet-db/dict
/home/vic/Documents/GitHub/mbCLIx/src/.yarn/unplugged/wordpos-npm-1.2.0-031b04010c/node_modules/wordpos/tools/stat.js:92
var key = line.substring(0, Math.min(line.indexOf(' '), KEY_LENGTH));
^

Cannot find module 'lapack' when bundling with Browserify

When including wordpos in my project and bundling via browserify, running the browserify command, I'm receiving this error:

Error: Cannot find module 'lapack' from '[PROJECT_ROOT]/node_modules/wordpos/node_modules/
natural/node_modules/sylvester/lib/node-sylvester'

and here's the browserify command I'm running from the command line:

browserify [PROJECT_ROOT]/js/main.js -o src/content/content.js

My main.js file is probably as simple as you can get; all I have in there right now is literally this:

var WordPOS = require('wordpos');

I'm assuming that I'm setting my paths correctly, since the error message is resolving to one of the node_modules directories down the dependency chain. I'd be happy to provide more code samples if needed; I'm pretty new to the whole browserify/node-packages-in-the-browser thing, so it's entirely possible that I'm just missing a step.

how to get proper noun

I want to get the POS for words such as people's names (NNP).
They fall into the proper noun category.

Unable to open c:\...\data.verb

Unable to open c:\dev\...\node_modules\WNdb\dict\data.verb
Unable to open c:\dev\...\node_modules\WNdb\dict\data.verb
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.noun
Unable to open c:\dev\....\node_modules\WNdb\dict\data.verb
Unable to open c:\dev\...\node_modules\WNdb\dict\data.verb

Any suggestions on how to wait before doing the next word in the array?
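One way to wait for each word before starting the next is to chain the lookups with async/await. This is a sketch under assumptions: it needs wordpos 1.0 or later (where lookup() returns a Promise), and whether serialising the calls avoids the "Unable to open" errors in this particular setup is not confirmed. lookupSequentially is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of wordpos): look up words one at a
// time, waiting for each lookup to finish before starting the next.
// `wordpos` is expected to be a WordPOS instance (v1.0+).
async function lookupSequentially(wordpos, words) {
  const results = [];
  for (const word of words) {
    // `await` pauses the loop until this word's lookup has resolved.
    results.push(await wordpos.lookup(word));
  }
  return results;
}
```

It would be called as lookupSequentially(wordpos, ['fish', 'bear']).then(console.log). The result array has one entry per input word, in the same order.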

yarn install issue

I have noticed the following behaviour trying to install wordpos with yarn.

During the first install

  • the post install script is triggered, populating wordnet-db fast index

During a subsequent install, things get messy

  • yarn sees that the wordnet-db folder has changed and resets it to its original state. However, as wordpos is still the same, its postinstall script isn't called.
  • At that point, wordpos can no longer be used because the fast index is missing.

ReferenceError: Promise is not defined

On an Ubuntu VM running on Azure, I am getting this error:

ReferenceError: Promise is not defined at null.getNouns (/home/azureuser/node/XMLParser/node_modules/wordpos/src/wordpos.js:131:12)

When I debug with the node debugger, I can see that wordPOS and getNouns() are clearly there when instantiating like this:

var WordPOS = require('wordpos'); var wordpos = new WordPOS();

I have tried this code form:

wordpos.getNouns(_content, function(result, err){
  if(err){
    console.log(err.message);
    return;
  }

  _item.tags = result;
});

and this code form:
wordpos.getNouns(_content)
  .then(function(result){ _item.tags = result; })
  .catch(function(err){ console.log('Error in backdoorsurvival: ' + err.message); });

with identical results.

This seems like it's broken but I could be missing some key info - in which case I would appreciate your feedback.

retrieving nouns etc. from nondelimited text

I'd like to detect nouns and verbs etc. from domain names. For example:

americanexpress.com should return nouns such as "american", "express", "america", "can", and "press".

Is this possible?

Moreover, the obvious two main words in the above domain are "american" and "express", and not "america", "can", or "press". Do you think that it is possible to distinguish which words are the most important words?
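One possible approach is to split the string against a word list and prefer the split with the fewest words, which favours "american" + "express" over shorter fragments. The sketch below is an illustration only: it uses a small hand-written Set as the dictionary (an assumption; in practice the membership test might be backed by wordpos lookups), and segment is a hypothetical helper.

```javascript
// Hypothetical helper (not part of wordpos): split `text` into
// dictionary words using dynamic programming, preferring the split
// with the fewest words. Returns null when no full split exists.
function segment(text, dictionary) {
  // best[i] holds the shortest word list covering text.slice(0, i).
  const best = new Array(text.length + 1).fill(null);
  best[0] = [];
  for (let end = 1; end <= text.length; end++) {
    for (let start = 0; start < end; start++) {
      if (best[start] === null) continue;
      const word = text.slice(start, end);
      if (!dictionary.has(word)) continue;
      if (best[end] === null || best[start].length + 1 < best[end].length) {
        best[end] = best[start].concat(word);
      }
    }
  }
  return best[text.length];
}

// Toy dictionary for illustration only.
const toyDictionary = new Set(['america', 'american', 'can', 'ex', 'express', 'press']);
console.log(segment('americanexpress', toyDictionary));
// [ 'american', 'express' ]
```

Fewest-words is only a rough proxy for "most important"; a frequency-weighted score would be a natural refinement.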

Broken npm install in nodejs v4.4.0

On node v4.4.0 on Ubuntu 14.04, if a project specifies wordpos as dependency, on running npm install it will wrongfully not install wordnet-db first, causing the error below. This does not happen on node v5.x. Tested the issue on both travis and server Ubuntu VMs.

The travis log is here. The VM log (same error) is below.

root@raw-test:~/aiva# npm i

> [email protected] postinstall /root/aiva/node_modules/wordpos
> node tools/stat.js --no-stats index.adv index.adj index.verb index.noun

fs.js:808
  return binding.readdir(pathModule._makeLong(path));
                 ^

Error: ENOENT: no such file or directory, scandir '/root/aiva/node_modules/wordnet-db/dict'
    at Error (native)
    at Object.fs.readdirSync (fs.js:808:18)
    at Object.<anonymous> (/root/aiva/node_modules/wordnet-db/index.js:4:31)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)
    at Function.Module._load (module.js:300:12)
    at Module.require (module.js:353:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/root/aiva/node_modules/wordpos/src/wordpos.js:16:10) 

add lexical domain to result

as detailed here [https://github.com/Planeshifter/node-wordnet-magic#lexdomain],

The lexical domain of the synset. Each domain category is composed of the word type followed by a dot and then the category name. WordNet has implemented the following domain categories:

For example 'dance' gives lexdomain 'noun.act'

Postinstall fails

Didn't dig into details, here is the log:

DB folder: /usr/local/lib/node_modules/wordpos/node_modules/wordnet-db/dict
index.adv buckets 1172, max 125 at in_, sum 4475, avg 3.82, median 2
fs.js:646
  return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
                 ^

Error: EACCES: permission denied, open '/usr/local/lib/node_modules/wordpos/node_modules/wordnet-db/dict/fast-index.adv.json'
    at Object.fs.openSync (fs.js:646:18)
    at Object.fs.writeFileSync (fs.js:1299:33)
    at BufferedReader.<anonymous> (/usr/local/lib/node_modules/wordpos/tools/stat.js:151:10)
    at emitNone (events.js:106:13)
    at BufferedReader.emit (events.js:208:7)
    at ReadStream.<anonymous> (/usr/local/lib/node_modules/wordpos/tools/buffered-reader.js:150:8)
    at emitNone (events.js:111:20)
    at ReadStream.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1064:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)

Error thrown when `rand().then(console.log)`

In the process of writing the types (for Typescript, so people don't need to read a readme to use this) for this awesome library, I had to test every function and their combinations.

Node version:
8.9.4 (latest stable)

When running the following code:

wordpos.rand().then(console.log);

I get..

(node:99162) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): TypeError: Cannot set property 'count' of undefined
(node:99162) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

It does work when I run it with:
rand(console.log) or rand({}).then(console.log).

random word methods appear to be inaccurate

Just discovered this module today. I was playing with the random word methods and found that running randNoun() would often give me words that were not nouns, such as objectification or featheriness. No issues have been opened regarding it, so I figured I should bring it up.

Deno support?

Will there be a Deno version of this coming soon?

fastIndex causes getPOS to not call the callback sometimes

Compare:

var
    WordPOS       = require('wordpos'),
    wordpos       = new WordPOS({fastIndex: true});

  getAllPOS('se', function(res) {
    getAllPOS('sea', function(res) {
      getAllPOS('sear', function(res) {
        console.log('all done');
      });
    });
  });

RESULT: last callback is never called!

With:

var
    WordPOS       = require('wordpos'),
    wordpos       = new WordPOS({fastIndex: false});

  getAllPOS('se', function(res) {
    getAllPOS('sea', function(res) {
      getAllPOS('sear', function(res) {
        console.log('all done');
      });
    });
  });

RESULT: works as expected, but is very slow.

postinstall script fails with npm 5

> [email protected] postinstall /Users/user/devel/project/node_modules/wordpos
> node tools/stat.js --no-stats index.adv index.adj index.verb index.noun

fs.js:896
  return binding.readdir(pathModule._makeLong(path), options.encoding);
                 ^

Error: ENOENT: no such file or directory, scandir '/Users/user/devel/project/node_modules/wordpos/node_modules/wordnet-db/dict'
    at Object.fs.readdirSync (fs.js:896:18)
    at Object.<anonymous> (/Users/user/devel/project/node_modules/wordpos/node_modules/wordnet-db/index.js:4:31)
    at Module._compile (module.js:569:30)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:503:32)
    at tryModuleLoad (module.js:466:12)
    at Function.Module._load (module.js:458:3)
    at Module.require (module.js:513:17)
    at require (internal/module.js:11:18)

French version?

Hi,
Is there a version able to parse french or a way to inject an equivalent of wordnet-db such as WoNeF ?
Thanks :)

Confusion on Output -- 'is' isn't a verb?

I know you don't have a ton of control over how the results are generated (i.e. they come from Natural and/or the WordNet db), so I don't know if this is the right place to talk about this. Basically, using a very simple input sentence "the ball is red", I get the following output:

{ nouns: [ 'ball', 'red' ],
  verbs: [ 'ball' ],
  adjectives: [ 'red' ],
  adverbs: [],
  rest: [ 'is', 'the' ] }

I did a search on WordNet's site, and it does show "is" as a verb (after a few other entries that seem to be based on acronyms or whatever). Again, not sure if this is on wordpos or Natural or whatever, but figured I'd throw it out for you.

Random word with length N

First of all, thank you for the fantastic effort in developing and maintaining this library 👍
I was wondering how I can query a random word of length 5. Is that possible?
More generally, I would like to query four random words of lengths a, b, c and d respectively. Any suggestions?
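The documented rand() options are only startsWith and count, so one workaround is to request words in batches and filter by length on the caller's side. A minimal sketch, assuming wordpos 1.0 or later (Promise-returning rand()); randomWordsOfLength, the batch size of 50 and the maxBatches cap are all illustrative choices, not part of wordpos.

```javascript
// Hypothetical helper (not part of wordpos): keep requesting batches of
// random words until `wanted` distinct words of the given length are
// found, or until `maxBatches` batches have been tried.
async function randomWordsOfLength(wordpos, length, wanted, maxBatches = 20) {
  const found = new Set();
  for (let i = 0; i < maxBatches && found.size < wanted; i++) {
    // An options object is always passed; batching with `count` follows
    // the performance note in the README.
    const batch = await wordpos.rand({ count: 50 });
    for (const word of batch) {
      if (word.length === length && found.size < wanted) found.add(word);
    }
  }
  return [...found];
}
```

It would be called as randomWordsOfLength(wordpos, 5, 1).then(console.log), once per required length. Fewer words than requested may come back if the batch limit is reached first.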

getNouns() is not accurate

Thanks for this useful library! I noticed the getNouns() function is not accurate.

For example:

Please give this note to the man in the blue hat.

Returns the following nouns:

["give","note","man","blue","hat"]

It's true the word 'give' can sometimes be a noun, such as when a material has 'some give', but in this sentence, and 99% of the time, it's a verb.

Here's another simple example:

Can you ask her what time it is?

WordPos returns the following as nouns:

["Can","time"]

But of course a sentence starting with 'Can' does not refer to a metal container. And it skips the pronoun.

Would it be possible to improve the accuracy of getNouns()? I realize the answer might be no, as NLP like this is very hard.

Thanks!

DeprecationWarning: Calling an asynchronous function without callback is deprecated.

Using Node 7.2.1
The following results in the warning:
DeprecationWarning: Calling an asynchronous function without callback is deprecated.

const WordPOS = require('wordpos');
const wordpos = new WordPOS();
wordpos.getNouns(text).then((nouns) => {
    console.log(nouns)
})

Still works but it's annoying going through the logs and seeing this.

Local package.json exists, but node_modules missing, did you mean to install?

I can't install wordpos. I tried reinstalling npm, which doesn't help. I can install every other npm module, just not wordpos. I have node modules, and everything else is working.

npm install -g wordpos

[email protected] postinstall C:\Users\stulejka\Desktop\programowanie\testApp\node_modules\wordpos
npm run postinstall-web && npm run postinstall-node

[email protected] postinstall-web C:\Users\stulejka\Desktop\programowanie\testApp\node_modules\wordpos
node scripts/makeJsonDict.js index data

internal/modules/cjs/loader.js:584
throw err;
^

Error: Cannot find module 'C:\Users\stulejka\Desktop\programowanie\testApp\node_modules\wordpos\scripts\makeJsonDict.js'
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:582:15)
at Function.Module._load (internal/modules/cjs/loader.js:508:25)
at Function.Module.runMain (internal/modules/cjs/loader.js:754:12)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] postinstall-web: node scripts/makeJsonDict.js index data
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] postinstall-web script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm WARN Local package.json exists, but node_modules missing, did you mean to install?

npm ERR! A complete log of this run can be found in:
npm ERR! C:\Users\stulejka\AppData\Roaming\npm-cache_logs\2019-05-29T16_05_21_877Z-debug.log
npm WARN [email protected] No description

npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] postinstall: npm run postinstall-web && npm run postinstall-node
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] postinstall script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! C:\Users\stulejka\AppData\Roaming\npm-cache_logs\2019-05-29T16_05_22_853Z-debug.log

throw error when rand

An error is thrown when running the code below (v1.1.5):

wordpos.rand({count: 3}, console.log)

The error is below:

/Users/pro/github/wordbot/node_modules/[email protected]@wordpos/lib/natural/trie/trie.js:62
	return next.addString(string.substring(1));
	                             ^

TypeError: string.substring is not a function
    at Trie.addString (/Users/pro/github/wordbot/node_modules/[email protected]@wordpos/lib/natural/trie/trie.js:62:31)
    at Trie.addStrings (/Users/pro/github/wordbot/node_modules/[email protected]@wordpos/lib/natural/trie/trie.js:70:8)
    at collector (/Users/pro/github/wordbot/node_modules/[email protected]@wordpos/src/rand.js:145:14)
    at piper.wrapper (/Users/pro/github/wordbot/node_modules/[email protected]@wordpos/src/piper.js:67:14)
    at executeBound (/Users/pro/github/wordbot/node_modules/[email protected]@underscore/underscore.js:701:67)
    at bound (/Users/pro/github/wordbot/node_modules/[email protected]@underscore/underscore.js:733:14)
    at /Users/pro/github/wordbot/node_modules/[email protected]@wordpos/src/indexFile.js:88:6
    at FSReqWrap.wrapper [as oncomplete] (fs.js:629:17)

Words like "get" or "take" are throwing errors

Hello! I am trying to use this library to build a text adventure. Verbs like "get" and "take" are very integral to the game, but when I run them in a sentence through getPOS, they don't show up in any of the returned arrays. If I try to send them alone as a string:
wordpos.getPOS("take")
I get the following error:
TypeError: Cannot read property 'toLowerCase' of undefined at normalize (/Users/joshuadowns/Code/Personal/mern-text-adventure/node_modules/wordpos/src/util.js:32:15)

I've also tried running them through getVerbs() with the same errors. When I visit wordnet and run searches for the words, they come up as expected. Any ideas?

Mixing promises and callback code in wordpos

Firstly, thanks for the great library. The ~5x speed improvement of v1 is really helping us out a lot!

I wondered if I could get your advice on using straightforward callbacks in wordpos. I would like to just be able to do something like this:

const myFunction = (callback) => {
  wordpos.lookup(..., (results) => {
    callback(results);
  });
};

But sometimes my callback will throw an error, causing a Promise within wordpos to be rejected, and my callback is called again from wordpos, throwing another error, and then finally node hangs with a UnhandledPromiseRejectionWarning, which breaks our test suite as it was expecting a normal throw, but got nothing.

I would like the throw somewhere down the callback chain to bubble up and throw at the top-level. Is this just not possible with this library, given the use of Promises in it? I won't be able to convert all our code to use Promises as it is a monumental effort.

At the moment, I'm getting around this by doing something like:

let lookupDone = false;
let lookupResults = [];

wordpos.lookup(word).then((results) => {
  lookupDone = true;
  lookupResults = results;
});

const wait = function wait() {
  if (!lookupDone) {
    setTimeout(wait, 10);
  } else {
    // do stuff with results
  }
}
wait();

But obviously this is far from ideal. I may simply not be understanding Promises correctly. Thanks in advance!
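One way to avoid the polling loop is to use the Promise form and hand the result to the callback on a later tick, outside the Promise chain, so that an exception thrown by the callback surfaces as an ordinary uncaught exception instead of a rejection. This is a sketch, not the library's own mechanism; lookupWithCallback and its Node-style (err, results) callback signature are assumptions made for illustration.

```javascript
// Hypothetical wrapper (not part of wordpos): call wordpos.lookup() in
// Promise form and deliver the outcome to a Node-style callback.
// `wordpos` is expected to be a WordPOS instance (v1.0+).
function lookupWithCallback(wordpos, word, callback) {
  wordpos.lookup(word).then(
    function (results) {
      // setImmediate runs the callback after the Promise handlers have
      // finished, so a throw inside it cannot reject this chain.
      setImmediate(callback, null, results);
    },
    function (err) {
      setImmediate(callback, err);
    }
  );
}
```

With this wrapper the callback is invoked exactly once, and a throw inside it behaves like a throw in any other timer callback.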

Getting synset details

Is it possible to get all the details for a word in one function call?
By details I mean what the WordNet portal shows for a synset, such as antonyms, troponyms, hypernyms, sister terms etc. (even frequency counts).

One way to do it, if I am not wrong, is to iterate over the ptrs node and get the details of each synset by ID.
In this case the @ symbol is the direct hypernym, I guess (and others: +, ~, $ ...).

http://wordnetweb.princeton.edu/perl/webwn?o2=1&o0=1&o8=1&o1=1&o7=1&o5=1&o9=&o6=1&o3=1&o4=1&r=2&s=accept&i=15&h=00011010001101011223022220000000#c
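The approach described above (iterate over ptrs, then fetch each synset) could be sketched with seek(). This rests on an assumption that is not confirmed by the README: that each ptrs entry exposes pointerSymbol, synsetOffset and pos fields. followPointers is a hypothetical helper.

```javascript
// Hypothetical helper (not part of wordpos): resolve every pointer of a
// lookup entry that has the given symbol (for example '@') via seek().
// ASSUMPTION: ptrs entries look like { pointerSymbol, synsetOffset, pos }.
function followPointers(wordpos, entry, symbol) {
  const matching = (entry.ptrs || []).filter(function (ptr) {
    return ptr.pointerSymbol === symbol;
  });
  // seek(offset, pos) returns a Promise when no callback is given.
  return Promise.all(matching.map(function (ptr) {
    return wordpos.seek(ptr.synsetOffset, ptr.pos);
  }));
}
```

For each entry returned by lookup(), followPointers(wordpos, entry, '@').then(console.log) would then print the linked synsets. Frequency counts are not covered by this sketch.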

incorrect results

I am running wordpos (v2.0.0) in a Node.js (v12.16.1 LTS) environment.

It seems like this library is returning unexpected results. The code below ...

const WordPOS = require('wordpos')
const wordpos = new WordPOS()

const text = 'The quick brown fox jumped over the lazy dog.'
const results = await wordpos.getPOS(text)
console.log(results)

returns the following (incorrect) result.

{
  nouns: [ 'quick', 'brown', 'fox', 'dog' ],
  verbs: [ 'brown', 'fox', 'dog' ],
  adjectives: [ 'quick', 'brown', 'lazy' ],
  adverbs: [ 'quick' ],
  rest: [ 'The', 'jumped', '' ]
}

I mean, obviously 'quick' and 'brown' are adjectives - not nouns. The verbs array [ 'brown', 'fox', 'dog' ] is filled with one adjective and two nouns and is missing the only verb 'jumped' in the sentence.

Am I missing something, or is there a big problem here?

EDIT: Do i need to tokenize and lemmatize the sentence first?
