hospital-chargemaster's Issues

issue with advent-health central-texas-medical-center-CTMC.xml

Parsing /home/vanessa/Documents/Dropbox/Code/database/hospital-chargemaster/data/advent-health/latest/central-texas-medical-center-CTMC.xml
---------------------------------------------------------------------------
ExpatError                                Traceback (most recent call last)
<ipython-input-249-5cab8ddfeb7b> in <module>()
     14     if filename.endswith('xml'):
     15         with open(filename, 'r') as filey:
---> 16             content = xmltodict.parse(filey.read())
     17 
     18         if "dataroot" in content:

~/anaconda3/lib/python3.6/site-packages/xmltodict.py in parse(xml_input, encoding, expat, process_namespaces, namespace_separator, disable_entities, **kwargs)
    328         parser.ParseFile(xml_input)
    329     else:
--> 330         parser.Parse(xml_input, True)
    331     return handler.item
    332 

ExpatError: no element found: line 1, column 0

Skipping this file for now; the error could be related to xmltodict or to the file itself.
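For what it's worth, expat usually reports "no element found: line 1, column 0" when it is handed empty input, so the file itself is worth checking before blaming xmltodict. A minimal sketch of such a check, using only the standard library's expat bindings (the same parser xmltodict wraps); `check_xml_file` is a hypothetical helper name, not something from this repository:

```python
import os
from xml.parsers.expat import ExpatError, ParserCreate


def check_xml_file(filename):
    """Return a short diagnosis string for an XML file instead of raising.

    Hypothetical helper for triage only; it does not build a dict the way
    xmltodict.parse does.
    """
    # A zero-byte download is the simplest explanation for
    # "no element found: line 1, column 0".
    if os.path.getsize(filename) == 0:
        return "empty file"

    with open(filename, "rb") as filey:
        data = filey.read()

    # Files that contain only whitespace also have no root element.
    if not data.strip():
        return "whitespace only"

    parser = ParserCreate()
    try:
        # Second argument True marks this as the final chunk of input.
        parser.Parse(data, True)
    except ExpatError as error:
        return "malformed: %s" % error
    return "ok"
```

If this returns "empty file" for central-texas-medical-center-CTMC.xml, the problem is likely in the download step rather than in the parsing code; anything else points at the file's contents.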

List of Hospitals that I couldn't Parse

Some of these have embedded frames or a "click to confirm you aren't a robot" check, or use the price index that blocks IP addresses, etc., so I was unable to parse these hospitals. I don't think this is in compliance with what CMS mandated, but there is not much to do about that.

  • barnes-jewish-hospital
  • cleveland-clinic-hospital
  • florida-hospital
  • loma-linda-university-medical-center
  • massachusetts-general-hospital
  • memorial-hermann-texas-medical-center
  • memorial-hospital-for-cancer-and-allied-diseases
  • new-york-university-hospitals-center
  • norton-hospitals-inc.
  • robert-wood-johnson-university-hospital
  • st.-francis-medical-center
  • university-of-alabama-hospital
  • university-of-chicago-hospitals
  • university-of-kansas-hospitals
  • university-of-oklahoma-medical-center

oshpd-ca hospitals missing

Interesting project! I'm looking at the OSHPD CA data. The records appear to list 354 distinct hospital_id values, but there are only 130 in the combined TSVs of the latest data (data-latest-1.tsv and data-latest-2.tsv).

I'll look into the parser and submit PRs if I find anything, but do you have any thoughts?
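One way to narrow this down is to count the distinct IDs that actually reach the combined TSVs and compare them with the IDs in the raw records. The sketch below assumes the TSVs have a header row containing a `hospital_id` column, which I have not verified; `distinct_values` and `missing_ids` are hypothetical helper names.

```python
import csv


def distinct_values(tsv_paths, column="hospital_id"):
    """Collect the distinct non-empty values of one column across TSV files.

    Assumes each file has a header row that includes `column`.
    """
    seen = set()
    for path in tsv_paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle, delimiter="\t"):
                value = row.get(column)
                if value:
                    seen.add(value)
    return seen


def missing_ids(expected_ids, tsv_paths, column="hospital_id"):
    """Return the expected IDs that never appear in the TSV files, sorted."""
    return sorted(set(expected_ids) - distinct_values(tsv_paths, column))
```

Calling `missing_ids(raw_ids, ["data-latest-1.tsv", "data-latest-2.tsv"])`, where `raw_ids` is whatever list of IDs you extract from the source records, would show which hospitals are dropped and might hint at whether the parser or the input is at fault.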
