certainty-estimator

Intro

certainty-estimator is a package for estimating the certainty of scientific findings. It was released with the EMNLP 2021 paper "Measuring Sentence-level and Aspect-level (Un)Certainty in Science Communications" by Jiaxin Pei and David Jurgens.

Install

Use pip

If pip is installed, certainty-estimator can be installed directly with it:

pip3 install certainty-estimator

Dependencies

python >= 3.6.0
torch >= 1.6.0
transformers >= 3.1.0
numpy
math (part of the Python standard library; no separate install needed)
tqdm

Example:

Clone this repo to your machine: git clone https://github.com/Jiaxin-Pei/certainty-estimator.git

Run the example code: python3 play.py

Estimating sentence-level certainty

Note: The first time you use the package, it automatically downloads a model file of about 500MB.

Construct the Predictor Object

>>> from certainty_estimator.predict_certainty import CertaintyEstimator
>>> estimator = CertaintyEstimator('sentence-level')

CUDA is disabled by default. To run the model on a GPU, use

>>> from certainty_estimator.predict_certainty import CertaintyEstimator
>>> estimator = CertaintyEstimator('sentence-level', cuda=True)
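
If the same script has to run on machines with and without a GPU, the cuda flag can be chosen at runtime instead of being hard-coded. The following is a minimal sketch, assuming PyTorch's torch.cuda.is_available() is an adequate check; the helper name cuda_available is hypothetical and not part of certainty-estimator.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a CUDA device."""
    # Hypothetical helper for illustration; not part of certainty-estimator.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


use_cuda = cuda_available()
# The flag can then be passed to the constructor shown above, e.g.:
# estimator = CertaintyEstimator('sentence-level', cuda=use_cuda)
```

On a CPU-only machine use_cuda is False, which matches the package default.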

predict

predict is the core method of this package. It takes a single text or a list of texts and returns a list of raw scores in the range [1, 6]; higher scores mean more certain, lower scores less certain.

# Predict certainty for a single scientific finding
>>> text = 'The reason for this might be that the fetal central nervous system, which controls movements in general and facial movements in particular did not develop at the same rate and in the same manner as in fetuses of mothers who did not smoke during pregnancy.'
>>> estimator.predict(text)
[2.6891987]

# Predict certainty for a list of scientific findings
>>> text = ['The reason for this might be that the fetal central nervous system, which controls movements in general and facial movements in particular did not develop at the same rate and in the same manner as in fetuses of mothers who did not smoke during pregnancy.', 'Mice lacking the tet1 gene were able to learn to navigate a water maze, but were unable to extinguish the memory.']
>>> estimator.predict(text)
[2.6891987, 5.01066]

# When calculating certainty for a long list of findings, pass tqdm to display progress
>>> from tqdm import tqdm
>>> text = findings  # placeholder: a long list of finding strings
>>> estimator.predict(text, tqdm=tqdm)
[2.6891987, 5.01066, ..., 4.28066, 5.77066]
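
Because predict returns scores in the same order as the input texts, the two lists can be zipped together for downstream use. The sketch below ranks findings from most to least certain. It does not call the package: the scores are the two values from the example output above, the text strings are shortened placeholders, and rank_by_certainty is a hypothetical helper, not part of certainty-estimator.

```python
from typing import List, Sequence, Tuple


def rank_by_certainty(
    texts: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each text with its score and sort from most to least certain."""
    # Hypothetical helper for illustration; not part of certainty-estimator.
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    pairs = [(text, float(score)) for text, score in zip(texts, scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder texts standing in for the two findings used above.
texts = ["fetal nervous system finding", "tet1 water maze finding"]
# Scores copied from the example output above.
scores = [2.6891987, 5.01066]

ranked = rank_by_certainty(texts, scores)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

In real use, scores would be the return value of estimator.predict(texts).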

Estimating aspect-level certainty

Note: The first time you use the package, it automatically downloads a model file of about 500MB.

Construct the Predictor Object

>>> from certainty_estimator.predict_certainty import CertaintyEstimator
>>> estimator = CertaintyEstimator('aspect-level')

CUDA is disabled by default. To run the model on a GPU, use

>>> from certainty_estimator.predict_certainty import CertaintyEstimator
>>> estimator = CertaintyEstimator('aspect-level', cuda=True)

predict

predict is the core method of this package. It takes a single text or a list of texts and returns, for each text, a list of (aspect, certainty) tuples.

# Predict certainty for a single scientific finding
>>> text = 'Mice lacking tet1 had much lower levels of hydroxymethylation -- an intermediate step in the removal of methylation -- in the hippocampus and the cortex, which are both key to learning and memory.'
>>> result = estimator.predict(text)
>>> print(result)
[[('Extent', 'Uncertain'), ('Probability', 'Certain')]]
# By default, only the aspects present in the text are returned, along with their certainty.

# Predict certainty for a list of scientific findings
>>> text = ['Mice lacking tet1 had much lower levels of hydroxymethylation -- an intermediate step in the removal of methylation -- in the hippocampus and the cortex, which are both key to learning and memory.', 'Dopamine and serotonin are important for different forms of flexibility associated with receiving reward and punishment.']
>>> result = estimator.predict(text)
>>> print(result)
[[('Extent', 'Uncertain'), ('Probability', 'Certain')], [('Probability', 'Certain')]]

# When calculating certainty for a long list of findings, pass tqdm to display progress
>>> from tqdm import tqdm
>>> text = findings  # placeholder: a long list of finding strings
>>> estimator.predict(text, tqdm=tqdm)
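
The nested list of tuples returned for aspect-level predictions can be reshaped with the standard library. The sketch below does not call the package: results is copied from the two-finding example output above, and the variable names are illustrative only. It turns each finding's tuples into a dict and counts how often each (aspect, certainty) pair occurs across findings.

```python
from collections import Counter

# Copied from the example output above: one list of (aspect, certainty)
# tuples per input finding.
results = [
    [('Extent', 'Uncertain'), ('Probability', 'Certain')],
    [('Probability', 'Certain')],
]

# One dict per finding, so a single aspect can be looked up by name.
as_dicts = [dict(pairs) for pairs in results]

# How often each (aspect, certainty) pair occurs across all findings.
pair_counts = Counter(pair for pairs in results for pair in pairs)

print(as_dicts[0].get('Extent'))
print(pair_counts.most_common())
```

Using .get() rather than indexing avoids a KeyError for aspects that are absent from a finding, since only the aspects present in each text are returned.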

Contact

Jiaxin Pei ([email protected])
