matching-comparative-review

Binder

This repository lets you reproduce an experimental study comparing the effectiveness of 7 matching algorithms drawn from the most widely used families of matching techniques: deterministic, probabilistic, and machine learning. To conduct the experiment, we first generated synthetic data from real-world names using the Freely Extensible Biomedical Record Linkage (FEBRL) software. We then ran multiple deduplication algorithms on the synthetic data using the Python Record Linkage Toolkit (PRLT). Finally, we evaluated the effectiveness of the deduplication with matching quality metrics such as recall, precision, and F-score, also computed with PRLT.
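To make the evaluation step concrete, here is a minimal plain-Python sketch of how a deterministic matcher can be scored with precision, recall, and F-score. It is an illustration only: it does not use PRLT or the notebook's code, and the toy records, the ground-truth pairs, and the helper names `deterministic_match` and `evaluate` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy records: (record_id, first_name, last_name, birth_year)
records = [
    ("r1", "marie", "joseph", 1980),
    ("r2", "marie", "joseph", 1980),   # exact duplicate of r1
    ("r3", "mari", "joseph", 1980),    # duplicate of r1 with a typo
    ("r4", "jean", "pierre", 1975),
    ("r5", "jean", "pierre", 1975),    # exact duplicate of r4
    ("r6", "jean", "pierre", 1990),    # different person, same name
]

# Hypothetical ground truth: pairs that really refer to the same person.
true_pairs = {("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r4", "r5")}


def deterministic_match(a, b):
    """Declare a match only when every compared field is identical."""
    return a[1:] == b[1:]


def evaluate(predicted, truth):
    """Return (precision, recall, f_score) for two sets of record pairs."""
    tp = len(predicted & truth)  # true positives: predicted pairs that are real
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Compare every pair of records once and keep the ones declared a match.
predicted_pairs = {
    (a[0], b[0]) for a, b in combinations(records, 2) if deterministic_match(a, b)
}
precision, recall, f_score = evaluate(predicted_pairs, true_pairs)
print(f"precision={precision:.2f} recall={recall:.2f} f_score={f_score:.2f}")
```

On this toy data the exact-match rule finds no false matches but misses the pairs involving the misspelled record, which is the kind of trade-off the precision and recall metrics are meant to expose.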

To use or test this code, you do not need to install or set up Python: just click the "launch binder" button above.

Using BinderHub

After clicking the "launch binder" link above, wait a few minutes for BinderHub to build the Docker container.

Run Locally

To run locally on your computer:

  • Install Anaconda and Jupyter Notebook or JupyterLab on your computer
  • Clone or download the repository
  • Install the dependencies: pip install -r requirements.txt
  • Open the Jupyter notebook A comparative review of patient matching approaches.ipynb and run it

Folder contents

This folder contains:

  • The main Jupyter notebook: A comparative review of patient matching approaches.ipynb
  • A Python module, patientlinkr.py, required to run the notebook
  • The result datasets produced when you run the notebook
  • A dataset folder with the 3 datasets to deduplicate
  • A docs folder with images, a README file, and a requirements file

Credits and Acknowledgements

The original first names and last names used to generate the synthetic datasets were scraped from :

Contributors

mayerantoine, dependabot[bot]
