
fhiry's Introduction

🔥 fhiry - FHIR to pandas dataframe for data analytics, AI and ML

Virtual flattened view of FHIR Bundle / ndjson / FHIR server / BigQuery!


🔥 FHIRy is a Python package that facilitates health data analytics and machine learning by converting a folder of FHIR bundles/ndjson from a bulk data export into a pandas dataframe for analysis. You can import the dataframe into ML packages such as TensorFlow and PyTorch. FHIRy also supports FHIR server search and FHIR tables on BigQuery.

Test this with the Synthea sample or the downloaded ndjson from the SMART Bulk Data Server. Use the 'Discussions' tab for feature requests.

✨ Check out this template for multimodal machine learning in healthcare!

🔥 Check out MedPrompt for medical LLM prompts, including FHIR-related prompts such as a text-to-FHIRQuery mapper!

Installation

Stable

pip install fhiry

Latest dev version

pip install git+https://github.com/dermatologist/fhiry.git

Usage

1. Import FHIR bundles (JSON) from folder to pandas dataframe

import fhiry.parallel as fp
df = fp.process('/path/to/fhir/resources')
print(df.info())

Example source data set: Synthea

Jupyter notebook example: notebooks/synthea.ipynb
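
To illustrate what a "flattened view" of a bundle looks like, here is a minimal sketch using only the standard library. It is not fhiry's implementation: the flatten helper and the two-entry bundle are hypothetical, and the sketch only shows how nested FHIR JSON can map to dotted column names such as resource.id.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict with dotted keys; lists are kept as values."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical, heavily trimmed bundle used only for illustration.
bundle = json.loads("""
{
  "resourceType": "Bundle",
  "entry": [
    {"fullUrl": "urn:uuid:1",
     "resource": {"resourceType": "Patient", "id": "1", "gender": "female"}},
    {"fullUrl": "urn:uuid:2",
     "resource": {"resourceType": "Condition", "id": "2", "code": {"text": "Asthma"}}}
  ]
}
""")

# One flat row per bundle entry; each row could become one dataframe row.
rows = [flatten(entry) for entry in bundle["entry"]]
for row in rows:
    print(row)
```

Each row can then be passed to pandas.DataFrame(rows) to obtain a table with one column per dotted key.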

2. Import NDJSON from folder to pandas dataframe

import fhiry.parallel as fp
df = fp.ndjson('/path/to/fhir/ndjson/files')
print(df.info())

Example source data set: SMART Bulk Data Server Export

Jupyter notebook example: notebooks/ndjson.ipynb
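
For readers unfamiliar with NDJSON, the format stores one JSON resource per line. The sketch below shows how such a file can be parsed line by line; the read_ndjson helper and the sample patients are hypothetical and do not reflect how fhiry reads files internally.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed resource per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical in-memory stand-in for a bulk-export .ndjson file.
sample = io.StringIO(
    '{"resourceType": "Patient", "id": "p1", "gender": "male"}\n'
    "\n"
    '{"resourceType": "Patient", "id": "p2", "gender": "female"}\n'
)

resources = list(read_ndjson(sample))
print(len(resources), [r["id"] for r in resources])
```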

3. Import FHIR Search results to pandas dataframe

Fetch and import resources from FHIR Search API results to pandas dataframe.

Documentation: fhir-search.md

Example: Import all conditions with a certain code from FHIR Server

Fetch all Condition resources with the SNOMED CT code 39065001 (CodeSystem http://snomed.info/sct) in the FHIR element Condition.code, using the resource-type-specific FHIR search parameter code, and import them into a pandas dataframe:

from fhiry.fhirsearch import Fhirsearch

fs = Fhirsearch(fhir_base_url = "http://fhir-server:8080/fhir")

my_fhir_search_parameters = {
    "code": "http://snomed.info/sct|39065001",
}

df = fs.search(resource_type = "Condition", search_parameters = my_fhir_search_parameters)

print(df.info())
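
The search parameters above follow the standard FHIR token syntax system|code. As a rough sketch of the request such a search corresponds to, the snippet below builds the URL by hand with the standard library. The build_search_url helper is hypothetical and is not part of fhiry; it only shows how the parameter dictionary maps to a FHIR search query string.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_type, search_parameters):
    """Compose a FHIR search URL: [base]/[type]?[url-encoded parameters]."""
    query = urlencode(search_parameters)
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


url = build_search_url(
    "http://fhir-server:8080/fhir",
    "Condition",
    {"code": "http://snomed.info/sct|39065001"},
)
print(url)
```

Note that the system URL and the | separator are percent-encoded in the final query string.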

4. Import FHIR tables from BigQuery to pandas dataframe

from fhiry.bqsearch import BQsearch
bqs = BQsearch()

df = bqs.search("SELECT * FROM `bigquery-public-data.fhir_synthea.patient` LIMIT 20") # can be a path to .sql file

Filters

Pass a config JSON to any of the constructors or processing functions:

  • config_json can be a JSON string or a path to a JSON file.
df = fp.process('/path/to/fhir/resources', config_json='{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')

fs = Fhirsearch(fhir_base_url = "http://fhir-server:8080/fhir", config_json = '{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')

bqs = BQsearch('{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')
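
The effect of the REMOVE and RENAME keys can be pictured with plain pandas operations. The sketch below is an assumption about the resulting dataframe, not fhiry's internal code: the apply_config helper and the sample data are hypothetical, and it only handles a JSON string (not a file path).

```python
import json

import pandas as pd


def apply_config(df, config_json):
    """Drop the columns listed under REMOVE, then rename columns per RENAME."""
    config = json.loads(config_json)
    df = df.drop(columns=config.get("REMOVE", []), errors="ignore")
    return df.rename(columns=config.get("RENAME", {}))


# Hypothetical flattened resources.
df = pd.DataFrame(
    {
        "resource.id": ["1", "2"],
        "resource.text.div": ["<div>a</div>", "<div>b</div>"],
        "resource.gender": ["female", "male"],
    }
)

filtered = apply_config(
    df, '{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" } }'
)
print(list(filtered.columns))
```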

Columns

  • see df.columns
patientId
fullUrl
resource.resourceType
resource.id
resource.name
resource.telecom
resource.gender
...
...
...

Give us a star ⭐️

If you find this project useful, give us a star. It helps others discover the project.


fhiry's Issues

'charmap' codec can't decode byte 0x81 in position 1603

I encountered this error while running fhiry against synthea:

File ".....Programs\Python\Python310\lib\site-packages\fhiry\fhiry.py", line 54, in read_bundle_from_file

json_in = f.read()

File "...\Python\Python310\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1603: character maps to <undefined>
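
A plausible cause, assuming the Synthea files are UTF-8 encoded, is that the file is opened with the Windows default encoding (cp1252), in which byte 0x81 is undefined. The sketch below reproduces the error with a hypothetical patient record and shows that passing encoding="utf-8" to open() avoids it. It does not show fhiry's actual code.

```python
import json
import os
import tempfile

# "Á" is encoded in UTF-8 as the bytes C3 81; 0x81 has no mapping in cp1252.
payload = {"resourceType": "Patient", "name": [{"family": "Álvarez"}]}

with tempfile.NamedTemporaryFile(
    "w", suffix=".json", encoding="utf-8", delete=False
) as f:
    json.dump(payload, f, ensure_ascii=False)
    path = f.name

# Reading with cp1252 (the usual Windows default) fails on byte 0x81.
try:
    with open(path, "r", encoding="cp1252") as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Reading with an explicit UTF-8 encoding succeeds.
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

os.remove(path)
print(cp1252_failed, loaded["name"][0]["family"])
```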

Upload sdist to PyPI to enable building conda-forge package

Hi,
I've been trying to get fhiry onto conda-forge using grayskull, but the source distribution (sdist) for fhiry seems to be missing on PyPI. Running grayskull pypi fhiry to generate the conda-forge recipe fails with the error: AttributeError: There is no sdist package on pypi for fhiry. Here is the link for the PR.

I have also tried generating the conda recipe with grayskull pypi https://github.com/dermatologist/fhiry, but the build fails with the error:

LookupError: setuptools-scm was unable to detect version for /opt/homebrew/Caskroom/miniforge/base/envs/nfcore/conda-bld/fhiry_1716032605268/work.
Make sure you're either building from a fully intact git repository or PyPI tarballs. 
Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.

I think the reason is that GitHub's tarball is used when the recipe is generated from the git URL.
Would it be possible to upload the sdist (tar.gz file) to PyPI as well? Big thx!

import from BigQuery FHIR tables

The Google Cloud Healthcare API supports the SQL on FHIR schema in BigQuery. This analytics schema is the default schema for the ExportResources() method and is supported by the FHIR community. Feature request: import data from this nested schema into a pandas dataframe.

Performance warning: DataFrame is highly fragmented

base_fhiry.py:96: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()
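
The pattern the warning recommends can be sketched as follows: collect the new columns first and join them in a single pd.concat(axis=1) call rather than inserting them one at a time. The column names and data here are hypothetical and unrelated to base_fhiry.py.

```python
import pandas as pd

base = pd.DataFrame({"id": ["1", "2", "3"]})

# Collect all new columns in a dict instead of assigning them one by one.
new_columns = {f"feature_{i}": [i, i + 1, i + 2] for i in range(200)}

# Join everything in a single operation to avoid a fragmented frame.
wide = pd.concat([base, pd.DataFrame(new_columns, index=base.index)], axis=1)
print(wide.shape)
```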

FHIR version

Hello,

We are using fhiry in our package ehrapy to read FHIR data, and now I am trying to add some dataloaders with Synthea data.

However, Synthea provides FHIR data in three versions: R4, STU3 and DSTU2. Which version does fhiry support, so that we can choose the correct one?

Thank you very much for your efforts!

Flattening FHIR resources / bundle for LLMs

Add support for representing a single FHIR resource or a FHIR bundle as text that can be injected into an LLM prompt for combining with unstructured notes.

Plan:

  • Create a Jinja template for relevant FHIR resources.
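
The idea behind the plan can be sketched with the standard library's string.Template standing in for Jinja, so the example runs without extra dependencies. The template wording, the observation_to_text helper, and the sample Observation are all hypothetical; a Jinja template would follow the same shape.

```python
from string import Template

# Stand-in for a Jinja template: one sentence per Observation resource.
OBSERVATION_TEMPLATE = Template("On $date, $name was $value $unit.")


def observation_to_text(resource):
    """Render a FHIR Observation as a short sentence suitable for a prompt."""
    quantity = resource.get("valueQuantity", {})
    return OBSERVATION_TEMPLATE.substitute(
        date=resource.get("effectiveDateTime", "an unknown date"),
        name=resource.get("code", {}).get("text", "an observation"),
        value=quantity.get("value", "unknown"),
        unit=quantity.get("unit", ""),
    )


# Hypothetical, trimmed Observation resource.
obs = {
    "resourceType": "Observation",
    "effectiveDateTime": "2020-01-15",
    "code": {"text": "Serum creatinine"},
    "valueQuantity": {"value": 1.1, "unit": "mg/dL"},
}

text = observation_to_text(obs)
print(text)
```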

",

fhiry/docs/conf.py

Lines 73 to 78 in 578960b

"sphinx.ext.todo",
"sphinx.ext.autosummary",
"sphinx.ext.viewcode",
"sphinx.ext.coverage",
"sphinx.ext.doctest",
"sphinx.ext.ifconfig",


This issue was generated by todo based on a todo comment in 578960b when #8 was merged. cc @dermatologist.

Support for fuzzy search on FHIR bundle

Currently, FHIR server specifications do not include fuzzy search capabilities. Request support for fuzzy search within a FHIR bundle. Example: Search for serum creatinine in a bundle of FHIR observations.
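
One possible approach to the request, sketched with the standard library's difflib: score each observation by how closely its tokens match the query tokens, so that word order and small typos do not matter. The helper names, the 0.8 threshold, and the sample observations are assumptions for illustration and are not part of fhiry.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_score(query, candidate):
    """Average, over query tokens, of the best similarity to any candidate token."""
    query_tokens, cand_tokens = tokens(query), tokens(candidate)
    if not query_tokens or not cand_tokens:
        return 0.0
    best = [
        max(SequenceMatcher(None, q, c).ratio() for c in cand_tokens)
        for q in query_tokens
    ]
    return sum(best) / len(best)


def fuzzy_search(query, observations, threshold=0.8):
    """Return observations whose code text matches the query, best match first."""
    scored = [(fuzzy_score(query, o["code"]["text"]), o) for o in observations]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [o for score, o in scored if score >= threshold]


# Hypothetical observations from a bundle.
observations = [
    {"id": "obs-glucose", "code": {"text": "Glucose [Mass/volume] in Blood"}},
    {"id": "obs-creatinine", "code": {"text": "Creatinine [Mass/volume] in Serum or Plasma"}},
    {"id": "obs-weight", "code": {"text": "Body Weight"}},
]

# The query contains a typo and a different word order than the stored text.
hits = fuzzy_search("serum creatinin", observations)
print([o["id"] for o in hits])
```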
