Comments (15)
Update: I just heard back from the DrugBank people, and they said that including the license and citation info in the header of the file is fine. Full speed ahead!
@forzavitale can you update us on your progress from the hackathon (if you ended up working on this)?
from drug-spending.
I don't have the coding ability to do this, but I am knowledgeable about the domain as an informatics pharmacist and willing to offer some help from that aspect. Pretty sure the answer to this problem is the Structured Product Labeling (SPL). It is a document markup standard approved by Health Level Seven (HL7) and adopted by FDA as a mechanism for exchanging product and facility information.
Different datasets use different drug identifiers (brand name, generic name, NDA, NDC, etc.), which makes it hard to find the same drug across datasets. OpenFDA harmonizes drug identifiers, and fields for various pharmacological uses are part of the dataset. Take a look: https://open.fda.gov/drug/label/reference/
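For anyone who wants to try this, here is a minimal sketch of looking up one brand name against the openFDA drug label endpoint and pulling out the harmonized identifiers. The helper names (`build_label_query`, `harmonized_ids`) and the sample record are made up for illustration; the field names come from the `openfda` section of the label reference linked above, so please double-check them there before relying on this.

```python
from urllib.parse import urlencode

OPENFDA_LABEL_ENDPOINT = "https://api.fda.gov/drug/label.json"


def build_label_query(brand_name, limit=1):
    """Build an openFDA drug label URL for an exact brand-name search."""
    params = {"search": f'openfda.brand_name:"{brand_name}"', "limit": limit}
    return f"{OPENFDA_LABEL_ENDPOINT}?{urlencode(params)}"


def harmonized_ids(label_record):
    """Collect the harmonized identifier lists from one label record.

    Fields that are missing from the record come back as empty lists.
    """
    openfda = label_record.get("openfda", {})
    fields = ("brand_name", "generic_name", "product_ndc", "application_number")
    return {field: openfda.get(field, []) for field in fields}


# Made-up record, shaped like one element of the API's "results" list.
sample_record = {
    "openfda": {
        "brand_name": ["EXAMPLEBRAND"],
        "generic_name": ["EXAMPLE GENERIC"],
        "product_ndc": ["0000-0000"],
    }
}

query_url = build_label_query("EXAMPLEBRAND")
ids = harmonized_ids(sample_record)
```

Fetching `query_url` and passing each item of the response's `results` list to `harmonized_ids` would give one row of identifiers per label, which could then serve as a join key between datasets.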
Thanks for making this issue @jenniferthompson!
- re: 1. I'll email them today!
- re: 2. It would be great to have help from someone who's been more involved in the data analysis projects happening here to work with me to figure out what the most interesting/useful part of the data will be. This can probably wait until after we've started poking around and seeing what's available in DrugBank.
- re: 3. Happy to take the lead on this, looks like it's easy to download and will probably be relatively straightforward to parse.
They got back to me pretty quickly, and had some questions about data.world that I'm not sure I know the answer to:
> Looks like an interesting project, thanks for reaching out!
> I checked out your site and noticed a couple of issues:
> - Data.world looks like a commercial project that requires people have accounts to download data. It doesn't look like they have a good way to post the licenses for datasets? Maybe I am not understanding what data.world is.
> - I don't see a clear indication of the license for the datasets available through your website, or clear citations to the datasets there?
>
> Your use case looks like a non-commercial use case, so that should be fine but, when our data is shared it has to be shared both with a citation and the license we share our data under.
> We also have 2 datasets that are public domain and you can do whatever you want with them, on this page: https://www.drugbank.ca/releases/latest#open-data
> They include DrugBank identifiers, names, and synonyms to permit easy linking and integration into any type of project.
Is there any way we can include their license and citation on data.world? I'm pretty sure it will be more characters than are allowed in the "description" on data.world, and I'm not sure where else dataset metadata can be put on data.world (which is pretty surprising...)
Alternatively, should we just stick with the public domain data?
hi all-- first time jumping in here! at the NYC hackathon rn, seems like this issue is pretty recent and would like to start munging something.... guidance?
@forzavitale I pinged the DrugBank people again to ask if we could just include the license and citation info in the header of the file, since we can't assign it to the file directly via data.world. They haven't gotten back to me about that though. That said, in my opinion it should be fine so you can probably start working on the data. Let's just make sure to check back in with them before we post the data to data.world.
Alternatively, you can poke around the public domain data and see if that's enough to get us what we want!
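In case it helps, here is a rough sketch of what "license and citation in the header of the file" could look like for a CSV: the notice goes in `#`-prefixed lines above the column row, and readers skip those lines. The function names, the column names, and the placeholder notice text are all made up for illustration; the real license and citation wording has to come from DrugBank.

```python
import csv
import io


def write_with_notice(handle, notice_lines, fieldnames, rows):
    """Write '#'-prefixed notice lines, then the CSV column row and data."""
    for line in notice_lines:
        handle.write(f"# {line}\n")
    writer = csv.DictWriter(handle, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def read_skipping_notice(handle):
    """Read the CSV back, ignoring every line that starts with '#'."""
    data_lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(data_lines))


# Placeholder notice text; the real wording must be supplied by DrugBank.
notice = [
    "License: <license text goes here>",
    "Citation: <citation text goes here>",
]

buffer = io.StringIO()  # stands in for a real file opened with open(...)
write_with_notice(
    buffer,
    notice,
    ["drug_id", "name"],
    [{"drug_id": "ID-1", "name": "Drug Alpha"}],
)
buffer.seek(0)
rows = read_skipping_notice(buffer)
```

One caveat: any data line that itself begins with `#` would also be skipped on read. For people loading the file with pandas, `read_csv` has a `comment="#"` option that should handle the notice lines in a similar way.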
I think that's a good plan @cduvallet - and at the speed the data.world folks move (read: blazing fast), it's entirely plausible that we might be able to assign a file-specific license by the time we're ready to post it.
Fantastic! Thanks so much for following up, @cduvallet! 🎉
Is this still a project that needs help? I see the label but comments are fairly old.
Been lurking on D4D for a while but interested in working on something.
Hello! The project has been dormant for a while (hence the old comments); I'm one of the people trying to get it going again. Any issue with the label `status-under-review` can be ignored for now: it either can't be tackled yet or may need to be trimmed/reformatted. This is one of the older issues that I thought would be good to try to get through, because the drugbank.ca materials seem very useful for our current goal of matching drugs to therapeutic uses.
In PR #83 @proof-by-accident investigated how many of the Medicare drugs can be found in the drugbank.ca data. The results seem similar to matching attempts from other sources: a good number of drugs can be matched easily on the first pass, but about twice as many were not matched, and matching those properly will probably require a non-trivial amount of research.
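For readers who have not looked at the PR, a first pass of this kind can be sketched as exact matching on normalized names. This is not the code from PR #83, just an illustration of the general idea; the function names, the identifiers, and the drug names below are all invented.

```python
import re


def normalize(name):
    """Lowercase, replace punctuation with spaces, and trim whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_lookup(vocabulary):
    """Map every normalized name or synonym to its identifier.

    If two identifiers share a normalized name, the first one wins.
    """
    lookup = {}
    for drug_id, names in vocabulary.items():
        for name in names:
            lookup.setdefault(normalize(name), drug_id)
    return lookup


def first_pass_match(drug_names, lookup):
    """Split names into exact matches and leftovers needing manual research."""
    matched, unmatched = {}, []
    for name in drug_names:
        drug_id = lookup.get(normalize(name))
        if drug_id is None:
            unmatched.append(name)
        else:
            matched[name] = drug_id
    return matched, unmatched


# Invented identifiers and names; real ones would come from the
# DrugBank vocabulary file and the Medicare spending data.
vocabulary = {
    "ID-1": ["Drug Alpha", "alphazine"],
    "ID-2": ["Drug-Beta"],
}
medicare_names = ["DRUG ALPHA", "drug beta", "Gammacin 10 MG"]

matched, unmatched = first_pass_match(medicare_names, build_lookup(vocabulary))
```

Names that carry extra tokens such as strength or dosage form fall through to `unmatched`, which is roughly the group that would need the follow-up research described above.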
Oh, this seems great! OpenFDA might be just what we need because, as you say, we have been running into the issue where not everything is in one dataset and the names can be inconsistent between datasets. Thanks for this suggestion.
Can I help?
Is this still active? Can I start this, or is this throwaway work?