This repo houses my intermediate and final work product for the OpenStreetMap (OSM) project, part of Udacity's Data Analyst Nanodegree program. In this project, I downloaded OSM data, audited several fields, iteratively cleaned the data while parsing the OSM XML into five CSVs, imported the CSVs into SQL tables, and used SQL to query the data.
This repo includes the following files:
- README.md: This file, which contains a summary of files used for this project
- project_file.md: The final write-up documenting my data wrangling process and findings
- street_names.py: The script used to clean the street names
- state.py: The script used to clean state values
- zip_code.py: The script used to clean zip codes
- convert_xml_to_sql.py: The script that calls the cleaning functions from the scripts above to parse and clean the data and write it out as CSVs
- schema.py: The schema used for my SQL database
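
To give a sense of what the cleaning scripts do, here is a minimal sketch of a street-name and zip-code cleaner. It is not copied from street_names.py or zip_code.py: the function names `clean_street_name` and `clean_zip_code`, the `STREET_TYPE_MAPPING` dictionary, and the assumption of five-digit US zip codes are all hypothetical and only illustrate the general approach.

```python
import re

# Hypothetical mapping of abbreviated street types to their full forms.
# The real mapping in street_names.py may differ.
STREET_TYPE_MAPPING = {
    "St": "Street",
    "St.": "Street",
    "Ave": "Avenue",
    "Ave.": "Avenue",
    "Rd": "Road",
    "Rd.": "Road",
}

# Matches the last whitespace-delimited token, e.g. "St." in "Main St."
STREET_TYPE_RE = re.compile(r"\S+$")


def clean_street_name(name, mapping=STREET_TYPE_MAPPING):
    """Expand an abbreviated street type at the end of a street name."""
    name = name.strip()
    match = STREET_TYPE_RE.search(name)
    if match is None:
        return name
    street_type = match.group()
    if street_type in mapping:
        # Keep everything before the street type and swap in the full form.
        return name[:match.start()] + mapping[street_type]
    return name


def clean_zip_code(value):
    """Return the first five-digit run in value, or None if there is none.

    Assumes US-style five-digit zip codes.
    """
    match = re.search(r"\d{5}", value)
    return match.group() if match else None
```

For example, `clean_street_name("Main St.")` gives `"Main Street"`, and `clean_zip_code("CA 94107-1234")` gives `"94107"`. Names that do not end in a known abbreviation are returned unchanged.
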
While completing this project, I consulted the following sources:
- The Udacity Data Wrangling course, especially the case study on working with SQL
- Stack Overflow