Comments (10)
To add more context: currently the batch pipeline relies on the FHIR search API and the specific way HAPI implements paging. This means that for very large DBs, the initial query that creates the list of IDs of the resources to return can take a very long time. The idea is to avoid such long DB queries and instead read the list of IDs in segments directly from the DB.
That said, one benefit of the current implementation is that it supports general FHIR search URLs, e.g., with filters as mentioned here, so it would be great to keep the current functionality while adding a direct DB-based implementation too.
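To make the "read IDs in segments" idea concrete, here is a minimal sketch. It is not the pipeline's actual implementation: the function names (`id_segments`, `fetch_uuids`), the segment size, and the use of an in-memory SQLite table standing in for the real `person` table are all assumptions for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def id_segments(min_id: int, max_id: int, segment_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) ID ranges that cover min_id..max_id inclusive."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + segment_size, max_id + 1)
        yield start, end
        start = end


def fetch_uuids(conn: sqlite3.Connection, start: int, end: int) -> List[str]:
    """Fetch the UUIDs for one segment; each query only touches a bounded ID range."""
    rows = conn.execute(
        "SELECT uuid FROM person WHERE person_id >= ? AND person_id < ? ORDER BY person_id",
        (start, end),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical demo data: ten rows in a stand-in `person` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, uuid TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)", [(i, f"uuid-{i}") for i in range(1, 11)]
)

min_id, max_id = conn.execute(
    "SELECT MIN(person_id), MAX(person_id) FROM person"
).fetchone()

all_uuids: List[str] = []
for start, end in id_segments(min_id, max_id, 4):
    all_uuids.extend(fetch_uuids(conn, start, end))
```

Because each segment is independent, the ranges could in principle be handed to parallel workers instead of being walked in a loop.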
from fhir-data-pipes.
@kimaina That seems like a sensible way to implement this to me!
Thanks @bashir2 for adding more context!
I am almost done working on this; there are just a few design issues and a possible solution that need to be discussed.
Some tables do not have the UUID column necessary for FHIR extraction (e.g., the patient table). To resolve this, I have added a new field (uuidTable) to the JSON config schema. What do you think? @bashir2 @ibacher
"patient": {
  "enabled": "true",
  "title": "Patient",
  "uuidTable": "person",
  "linkTemplates": {
    "rest": "/ws/rest/v1/patient/{uuid}?v=full",
    "fhir": "/Patient/{uuid}"
  }
},
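As a rough illustration of how such an entry might be consumed, the sketch below resolves which table holds the UUID and expands the link templates. The helper names (`uuid_source_table`, `expand_links`) and the fallback to the table itself when `uuidTable` is absent are assumptions, not behaviour confirmed by the pipeline.

```python
import json
from typing import Dict

# The config fragment from the comment above, wrapped in braces so it parses as JSON.
CONFIG = json.loads("""
{
  "patient": {
    "enabled": "true",
    "title": "Patient",
    "uuidTable": "person",
    "linkTemplates": {
      "rest": "/ws/rest/v1/patient/{uuid}?v=full",
      "fhir": "/Patient/{uuid}"
    }
  }
}
""")


def uuid_source_table(table: str, config: Dict) -> str:
    """Return the table that carries the UUID column.

    Assumption: when no "uuidTable" is configured, the table itself has the column.
    """
    return config[table].get("uuidTable", table)


def expand_links(table: str, uuid: str, config: Dict) -> Dict[str, str]:
    """Substitute a concrete UUID into every link template of the entry."""
    templates = config[table]["linkTemplates"]
    return {name: template.replace("{uuid}", uuid) for name, template in templates.items()}


links = expand_links("patient", "abc-123", CONFIG)
```

With the entry above, `uuid_source_table("patient", CONFIG)` points at `person`, and `links["fhir"]` is the relative FHIR URL for that UUID.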
@kimaina My question here would be: how do we know how to join to the `person` table in that instance? For instance, to find the UUID for a patient, you need to match the `patient_id` field to the corresponding `person` record where `person.person_id = patient_id`; however, to do the same for, e.g., drug orders, we need to match the `order` record where `order.order_id = drug_order.order_id`.
Thanks @ibacher! For the batch case, we do not need to do any joins since we can directly fetch UUIDs from parent tables. However, I see how this approach can be vital for streaming mode. I guess we need another field to indicate how we do the join for the streaming bit:
"patient": {
  "enabled": "true",
  "title": "Patient",
  "parentTable": "person",
  "joinClause": "person.person_id = patient_id",
  "linkTemplates": {
    "rest": "/ws/rest/v1/patient/{uuid}?v=full",
    "fhir": "/Patient/{uuid}"
  }
},
WDYT?
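A minimal sketch of how `parentTable` and `joinClause` could be turned into a UUID lookup query follows. The function name `build_uuid_query`, the no-parent fallback, and the SQLite stand-in tables are assumptions for illustration; table names and the join clause are interpolated directly, which is only acceptable because they come from a trusted config file rather than user input.

```python
import sqlite3
from typing import Dict

# Hypothetical config entry mirroring the proposal above.
ENTRY = {
    "parentTable": "person",
    "joinClause": "person.person_id = patient_id",
}


def build_uuid_query(table: str, entry: Dict[str, str]) -> str:
    """Build the SQL that fetches UUIDs for `table`, joining to its parent if configured."""
    parent = entry.get("parentTable")
    if parent is None:
        # Assumption: tables without a parent carry their own uuid column.
        return f"SELECT uuid FROM {table}"
    return f"SELECT {parent}.uuid FROM {table} JOIN {parent} ON {entry['joinClause']}"


# Stand-in schema: three persons, two of whom are patients.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, uuid TEXT)")
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(i, f"uuid-{i}") for i in (1, 2, 3)])
conn.executemany("INSERT INTO patient VALUES (?)", [(1,), (3,)])

query = build_uuid_query("patient", ENTRY)
patient_uuids = sorted(row[0] for row in conn.execute(query))
```

The same function would cover the drug order case by configuring a different parent table and join clause, though a table named `order` would need quoting in most SQL dialects because it is a reserved word.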
@ibacher thanks for the suggestion! Thinking about it carefully, we still need to do a join even for the batch case, so the same suggestion applies. We will need to create a ticket for this!
I see we already have a ticket for this: #46
@kimaina can we close this issue now that PR #72 is submitted or are there any pieces that are still left?
We can mark this as "done" and track specific issues/bugs in separate tickets.