fastcontext's Introduction

FastContext

FastContext is an optimized Java implementation of the ConText algorithm (https://www.ncbi.nlm.nih.gov/pubmed/23920642). It runs two orders of magnitude faster and is more accurate than the two previous popular implementations, JavaConText and GeneralConText.

Maven dependency set up

<dependency>
  <groupId>edu.utah.bmi.nlp</groupId>
  <artifactId>fastcontext</artifactId>
  <version>1.3.1.9</version>
</dependency>

Note: the Maven distribution doesn't include the context rule file; you can download it here if needed.

Quick start

// Initialize FastContext with the context rule file
FastContext fc = new FastContext("conf/context.csv");
String inputString = "The patient denied any fever , although he complained some headache .";
// Tokenize on whitespace; "fever" is the token at index 4 (0-based)
ArrayList<Span> sent = SimpleParser.tokenizeOnWhitespaces(inputString);
// To find the context information of "fever" (token 4 through token 4)
LinkedHashMap<String, ConTextSpan> matches = fc.getFullContextFeatures("Concept", sent, 4, 4, inputString);

For more detailed API usage, please refer to TestFastContextAPIs.java.

Acknowledgement

Special thanks to Olga Patterson and Guy Divita for contributing rules as part of the context rule set.

Citation

If you are using FastContext within your research work, please cite the following publication:

Shi, Jianlin, and John F. Hurdle. “Trie-Based Rule Processing for Clinical NLP: A Use-Case Study of n-Trie, Making the ConText Algorithm More Efficient and Scalable.” Journal of Biomedical Informatics, August 6, 2018. https://doi.org/10.1016/j.jbi.2018.08.002.

Full text is available at: https://www.sciencedirect.com/science/article/pii/S1532046418301576

Preprint: https://arxiv.org/abs/1905.00079

fastcontext's People

Contributors

jianlins

fastcontext's Issues

Command line interface

I would like to use this software, but I don't know how to create a standalone package that can be used from a command-line environment.

How could this be achieved?

Thanks!

[Feature request] FastContext constructor taking an InputStream instead of a path as a String

Currently one can either provide a string with the path of the rule file (.tsv by default) to load, or an existing list of rules that were already parsed from a file.
This does not allow the .tsv to be loaded from a resource packaged in a jar file (e.g. in a servlet deployed on an application server), short of reading the file through a stream and creating a temporary file with the contents, which is clearly sub-optimal.

Could the loading mechanism be refactored to also take an InputStream rather than just a path?
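
For reference, the temporary-file workaround described above can be sketched as follows. This is only an illustration of the workaround, not part of FastContext: the class name RuleFileExtractor, the method toTempFile, and the resource name in the usage comment are all hypothetical. The only FastContext call assumed is the existing constructor that takes a path as a String.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of FastContext): copies a rule stream to a temp file.
class RuleFileExtractor {

    /**
     * Copies the given stream into a new temporary file and returns its path.
     * The stream is closed afterwards; the file is marked for deletion on JVM exit.
     */
    static Path toTempFile(InputStream rules, String suffix) throws IOException {
        Path tmp = Files.createTempFile("context-rules-", suffix);
        tmp.toFile().deleteOnExit();
        try (InputStream in = rules) {
            // createTempFile already created the file, so it must be overwritten
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }

    // Usage sketch (resource name "/context.tsv" is an assumption):
    //   InputStream in = RuleFileExtractor.class.getResourceAsStream("/context.tsv");
    //   FastContext fc = new FastContext(toTempFile(in, ".tsv").toString());
}
```

A constructor that accepts an InputStream directly would make this extra copy unnecessary, which is the point of the request.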

NullPointerException when numbers separated by "-"

When there are two numbers separated by "-", the code throws a NullPointerException:
Exception in thread "main" java.lang.NullPointerException
at edu.utah.bmi.nlp.fastcontext.ContextRuleProcessor.processDigits(ContextRuleProcessor.java:290)
at edu.utah.bmi.nlp.fastcontext.ContextRuleProcessor.processRules(ContextRuleProcessor.java:237)
at edu.utah.bmi.nlp.fastcontext.ContextRuleProcessor.processRules(ContextRuleProcessor.java:202)
at edu.utah.bmi.nlp.fastcontext.FastContext.processContextWEvidence(FastContext.java:147)
at edu.utah.bmi.nlp.fastcontext.FastContext.processContextWEvidence(FastContext.java:138)
at edu.utah.bmi.nlp.fastcontext.TestFastContextAPIs.test6(TestFastContextAPIs.java:119)
at edu.utah.bmi.nlp.fastcontext.TestFastContextAPIs.main(TestFastContextAPIs.java:147)

I debugged the code and found that the problem is at line 290 of ContextRuleProcessor.

You might need to change
"HashMap ruletmp = (HashMap) rule.get(ruleDigit + "");"
to
"HashMap ruletmp = (HashMap) rule.get(num);"

That solves the problem.

Here is an MWE:

FastContext fc = new FastContext("./conf/context.txt");
String inputString = "Vitals - Tm=Tc:98.2 (range 97.0-98.2 o/n) HR:64-86 myocardial infarction";
String concept = "myocardial infarction";
int conceptBeginOffset = inputString.indexOf(concept);
int conceptEndOffset = conceptBeginOffset + concept.length();
LinkedHashMap<String, ConTextSpan> matches = fc.processContextWEvidence(inputString, conceptBeginOffset, conceptEndOffset, 30);

Syntax specification for the textual rule file

We are trying to adapt existing trigger terms in another language from the format of the original Java implementation of ConText to the format of FastContext. While most of the syntax can be extrapolated trivially, some elements require clarification:

  • Does \w+ mean one or more words?
  • What is the syntax for temporal triggers? In particular: > and -
    Does the > always go before the numerical value? Could I have a trigger term where > is in the middle, e.g. "il y a > 1 semaine|backward|trigger|historical|30"? Am I inferring the syntax incorrectly?
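
To make the example concrete: the line in question has five pipe-delimited fields, and the > and the number both sit inside the first one. The sketch below only splits the line on the literal | character. The class and method names are hypothetical, and it makes no claim about how FastContext itself parses or interprets the > symbol, which is exactly the open question.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for inspecting a rule line; not FastContext's own parser.
class RuleLineSplitter {

    /** Splits one rule line on the literal '|' character, keeping empty trailing fields. */
    static List<String> splitFields(String line) {
        // "\\|" escapes the pipe, which is otherwise the regex alternation operator
        return Arrays.asList(line.split("\\|", -1));
    }

    public static void main(String[] args) {
        List<String> fields = splitFields("il y a > 1 semaine|backward|trigger|historical|30");
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
    }
}
```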
