
atx-buildings's Introduction

Austin, TX Buildings and addresses import project

This repository is a central place to keep code and scripts for our project to import the building footprint and address point datasets from the City of Austin into OpenStreetMap.

Our planning and import documentation is at the OSM wiki page.

If you just want to help with the import, see the wiki for details on the import workflow. This repo and the instructions below are only for setting up and running the data processing scripts that prepare the data for import.

Getting set up

Dependencies

First, install these dependencies:

  • Make
  • curl
  • gdal/ogr
  • nodejs and npm

OSX

Steps for installing everything on OSX. Assuming homebrew is installed and configured:

  • brew install gdal
  • brew install node
  • npm install

Making data

There is a Makefile in this repository that manages the steps of downloading and processing the original source data for import into OSM.

To prep all the data, just run make from the root directory of this repo.

atx-buildings's People

Contributors

jseppi, wilsaj


atx-buildings's Issues

schedule mapping party

Schedule mapping party focused on the import. Need to find:

  • space
  • food
  • drinks

The OSM US board has provided some funds that should cover us, but additional sponsorship is definitely welcome.

Looking at one of the following Saturdays:

  • 10/24
  • 11/7
  • 11/14

Determine how to split up the data for QA

We need to pick a set of boundaries to use to split the data into smaller sets that can be QA'd and validated by individuals. Whatever we choose should cover the entire extent of the CoA buildings dataset and be mutually exclusive (no overlapping boundaries). Some proposed methods (a splitting sketch follows this list):

  • Zip codes
  • Census Tracts
  • Census Block Groups
  • Census Blocks
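
Whichever boundary set we pick, the mechanical splitting could be a simple point-in-polygon pass over the address points (building footprints could be assigned by centroid the same way). A minimal sketch, assuming the addresses and boundaries are available as GeoJSON, that Turf.js is pulled in as a dependency, and that each boundary polygon carries a GEOID property (none of which this repo guarantees):

    // split-addresses.js - hypothetical sketch, not part of this repo's scripts.
    // Assigns each address point to the boundary polygon that contains it and
    // writes one GeoJSON FeatureCollection per boundary.
    const fs = require('fs');
    const turf = require('@turf/turf'); // assumed dependency

    const addresses = JSON.parse(fs.readFileSync('addresses.geojson', 'utf8'));
    const boundaries = JSON.parse(fs.readFileSync('boundaries.geojson', 'utf8'));

    const grouped = {};
    for (const address of addresses.features) {
      const boundary = boundaries.features.find((poly) =>
        turf.booleanPointInPolygon(address, poly));
      if (!boundary) continue; // outside every boundary; handle separately
      const id = boundary.properties.GEOID; // assumed id property
      (grouped[id] = grouped[id] || []).push(address);
    }

    for (const [id, features] of Object.entries(grouped)) {
      fs.writeFileSync(`addresses-${id}.geojson`,
        JSON.stringify({ type: 'FeatureCollection', features }));
    }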

drop coa:place_id

Based on the discussion on the imports list, we should probably drop the coa:place_id tag if we can't rely on it.

We should grab the current (pre-import) address data from OSM before the import so we can validate/merge any duplicate addresses afterwards. Once the import addresses are merged in, they will be harder to identify and extract without the coa:place_id tag distinguishing import and non-import addresses.

export preexisting address nodes

There are 1520 pre-existing freestanding address nodes in our area. These will almost always end up as duplicates of imported nodes, so we should make sure to merge or delete them. Since there aren't a ton of these, they can be handled as a single, final cleanup step, but we should be sure to export them before the bulk import since querying for them will be more complicated afterwards.
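
A minimal export sketch, assuming Node 18+ (for the global fetch) and the public Overpass API endpoint; the area filter and any further filtering of nodes that belong to ways would still need checking against the actual data:

    // export-preexisting-addresses.js - hypothetical sketch, not a script in this repo.
    // Queries Overpass for nodes carrying addr:housenumber inside the Austin city
    // boundary and saves the raw JSON so duplicates can be merged after the import.
    const fs = require('fs');

    const query = `
    [out:json][timeout:180];
    area["name"="Austin"]["boundary"="administrative"]["admin_level"="8"]->.austin;
    node(area.austin)["addr:housenumber"];
    out meta;`;

    fetch('https://overpass-api.de/api/interpreter', {
      method: 'POST',
      body: 'data=' + encodeURIComponent(query),
    })
      .then((res) => res.json())
      .then((data) => {
        fs.writeFileSync('preexisting-addresses.json', JSON.stringify(data, null, 2));
        console.log(`saved ${data.elements.length} address nodes`);
      });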

fix mistakes in import

I jumped the gun on the import and some bad data made its way into the import datasets and ultimately into OSM, so now we need to fix it. Logistically, we can mark these 33 tasks as invalid in the task manager, run some cleanup on them, and then mark them as done again.

We'll need to run through a small checklist for each of those:

  • make sure building height values are rounded to a proper number of decimal places (not super-long float values)
  • make sure there are no addr:street values that look like <address>,1000 (see #25)

final task comment changes

  • Remove the /load_and_zoom remote API call - this already happens when Edit with JOSM is selected in the dropdown, and it is confusing to have it there
  • include an easy copy/paste version of the changeset comment

task manager setup(?)

Recently got access to tasks.openstreetmap.us - this is how a couple of other cities have coordinated their mapping efforts. Looking into using the task manager to set up a project with tasks per census block.

add postcode tags

Please hold off on the import until you have reviewed it with the community.

Thanks,
Clifford

building height precision

Trim building height precision - this should have been handled in the data processing scripts already, but sneaky JavaScript floating-point numbers are sneaky.
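
In the processing scripts, the fix is just to round before writing the tag; a small sketch (the two-decimal precision is an assumption, not a documented convention here):

    // Round a computed height before it is written as an OSM height tag.
    // toFixed() clamps binary floating-point artifacts like 3.43999999999999,
    // and parseFloat() drops any trailing zeros ("3.40" -> "3.4").
    function roundHeight(meters, decimals = 2) {
      return String(parseFloat(Number(meters).toFixed(decimals)));
    }

    // roundHeight(3.43999999999999) === "3.44"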

task validation guidelines

Some guidelines and things to look for when reviewing a task and marking it as valid.

Did everything import?

Kind of obvious, but buildings and addresses for the task should have made it into OSM.

Validation errors and warnings to look out for

crossing ways OR crossing buildings

meaning: This happens whenever two ways intersect. With imported buildings, this can happen when buildings were imported on top of pre-existing ways or when the import data was funky for whatever reason. If one of the ways involved is not a building we imported, then don't worry about it (you can definitely fix it if you want, but it's not a blocker).

how to fix:

  • Check imagery to see what's going on.
  • Sometimes the features are just off and can be fixed by moving them as appropriate.
  • If a way actually passes underneath a building then set layer or covered tags as needed.

examples:

  • A toll booth building over a highway. Split the highway into segments with nodes joined to the building way where it passes under the building. The segment under the building should have the covered=yes tag set (see the documentation for the covered tag). The building should also have the building=roof tag set.
  • A creek crossing a building. This happens if a culvert passes underneath the building. The creek should be split into segments where it passes underground, and both layer=-1 and tunnel=culvert tags should be set on that segment.

building node within a building way

meaning: There are many variants on this, but it happens when a node already existed for a building before the building was imported, so now there are a building way and a building node in the same area.
how to fix: Either merge the tags from the node into the building way, or just leave the node and remove its building tag. If the node's building tag is more specific than building=yes, set that value on the building way.

duplicate nodes

meaning: This might indicate that address nodes have been imported twice by accident.
how to fix: You can just run auto-fix on this.

duplicate ways

meaning: This might indicate that buildings were imported twice by accident. If this happened from a duplicate import, there will usually be a ton of them (as many as there are buildings, so at least a few hundred). If the duplicate ways are not buildings, then you can ignore this.
how to fix: Use the JOSM reverter plugin to roll back all but one of the duplicate changesets.

abbreviated street name

meaning: This is a pre-existing error in OSM, but we should fix it now since it affects the addresses we are importing and is important for routing. This occurs when an OSM street name is abbreviated (for example: Lighthouse Landing Dr should be Lighthouse Landing Drive).
how to fix:

  1. Expand the abbreviation in the name= value on the street way.
  2. Also update any address node or building that has the abbreviated street name.
    1. Open the Edit -> Search tool: search for any object with the incorrect addr:street tag (example search string: "addr:street":"Lighthouse Landing Dr") with the replace selection radio button selected. This will select all the objects with the former name.
    2. Then use the Tags window to update all of these tags at once by setting the correct expanded value for the addr:street tag.

match addr:street to OSM street names

OSM tag addr:street should match an OSM street name.

I think this should work:

  1. download OSM streets within census block and extract street names
  2. try to find a match with a few permutations (with/without the direction prefix, without the generic suffix, with abbreviations expanded) - a rough sketch follows this list
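
A rough sketch of that permutation matching, with an illustrative (and deliberately incomplete) abbreviation table; variants like the 1/2 vs. Half streets described in the next issue would need to be added to it:

    // Hypothetical sketch of matching an imported addr:street value against the
    // street names downloaded from OSM for the same area.
    const ABBREVIATIONS = { Dr: 'Drive', St: 'Street', Ave: 'Avenue', Blvd: 'Boulevard', Ln: 'Lane', Rd: 'Road' };
    const DIRECTION_PREFIX = /^(N|S|E|W|North|South|East|West)\s+/i;

    function permutations(name) {
      const expanded = name.replace(/\b(Dr|St|Ave|Blvd|Ln|Rd)\b\.?/g,
        (match, abbr) => ABBREVIATIONS[abbr]);
      const withoutDirection = expanded.replace(DIRECTION_PREFIX, '');
      const withoutGeneric = expanded.replace(/\s+\w+$/, ''); // drop trailing "Street", "Drive", ...
      return new Set([name, expanded, withoutDirection, withoutGeneric]);
    }

    function matchStreet(addrStreet, osmStreetNames) {
      const candidates = permutations(addrStreet);
      return osmStreetNames.find((osmName) =>
        [...permutations(osmName)].some((p) => candidates.has(p))) || null;
    }

    // matchStreet('Lighthouse Landing Dr', ['Lighthouse Landing Drive'])
    //   -> 'Lighthouse Landing Drive'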

"Half" streets are missing addresses

Greetings ATX OSM team. I was visiting your city this weekend and noticed some peculiarities on the map. There are a number of signed "half" streets, e.g. 38th 1/2 Street or 45th 1/2 Street, in Austin that appear to be introducing some error into your import (or possibly the other way around).

The main issue is that the street name signs (SNS in traffic engineering parlance) usually say E 38th 1/2 St, or something like that. In OpenStreetMap, users have spelled out the 1/2 ordinal as the word "Half". I am not sure if this is due to TIGER expansion or some local editor interpretation, but the use of "Half" doesn't reflect the sign.

That aside, it appears your address matching code has ignored addresses along "half" streets throughout Austin. In the example below, you can see that 45th 1/2 Street does not have any addresses.

Screenshot: http://www.openstreetmap.org/#map=19/30.30620/-97.72033

38th 1/2 Street (recently edited by mapbox) also has no addresses along it

Note: The favored OSM naming for half addresses and streets is to write out the "1/2", and not to use the Unicode character (see osmlab/nycbuildings#67).

cleanup task: merge in preexisting address nodes

The address nodes that existed before the import need to be merged into our imported buildings and addresses. Most often these include additional information (usually these are POIs like historical buildings, businesses, and places of worship).

This should be done after the bulk import so the features to merge are available.

review and choose import tools

Decide on which import tool/tools and workflows to use. Main options look to be:

JOSM

This looks to be the way almost every other city has gone about things. The JOSM conflation plugin looks great. The only downside is that JOSM takes some learning and might be exclusionary due to technical and computational requirements (it needs a semi-beefy laptop).

Workflow

If we use JOSM, we need to nail down the specifics of a workflow for import and QA. We have the data split up by census block groups, and that looks reasonable for individuals to import or review for QA. There are about 775 census block groups, so we'll need a way of tracking progress. Maybe a Google spreadsheet? Maybe a GitHub issue per census block? Maybe a slackbot to make things interesting? The world is our OpenStreetOyster.

Web-based import tools

Early on, there was some talk of using something like osmly or to-fix. Unfortunately, neither of them will work as-is: osmly only does one polygon at a time, and we have 475k buildings to get through, while to-fix is a QA tool that works with data already in OSM, so it doesn't really solve our problem. But there is value in the general idea of a newbie-friendly web-based import tool. If one can be found or built relatively quickly, that would be great.

find good resources for on-boarding importers

It's looking like we have a lot of people who want to join in on the import effort but don't know where to start. It would help to sift through the various available guides and learning materials and put together a mandatory reading/learning list that is specifically geared towards our import effort and the tools we'll be using. A lot of what is on learnosm.org is great, and there are some pretty video tutorials that the NYC team produced for their import. We can assume we'll be using JOSM for the actual import and QA, but the tasking/coordination situation is still TBD.

cleanup task: add or fix building heights

We have pre-existing OSM buildings that will be missing their heights, and we also need to fix wonky heights from the initial import (example: 3.43999999999999)

Doing both should be easy by conflating with intersecting nodes that carry the correct height values; see the sketch below.
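
A minimal conflation sketch along those lines, reusing the same point-in-polygon idea as the splitting sketch earlier; the file names, the height property, and the Turf.js dependency are all assumptions:

    // fix-heights.js - hypothetical sketch, not part of this repo's scripts.
    // For each building missing a sane height, copy (and round) the height from
    // whichever height-bearing point falls inside the footprint.
    const fs = require('fs');
    const turf = require('@turf/turf'); // assumed dependency

    const buildings = JSON.parse(fs.readFileSync('osm-buildings.geojson', 'utf8'));
    const heightPoints = JSON.parse(fs.readFileSync('height-points.geojson', 'utf8'));

    for (const building of buildings.features) {
      const match = heightPoints.features.find((pt) =>
        turf.booleanPointInPolygon(pt, building));
      if (match && match.properties.height != null) {
        // round to avoid 3.43999999999999-style float noise
        building.properties.height =
          String(parseFloat(Number(match.properties.height).toFixed(2)));
      }
    }

    fs.writeFileSync('osm-buildings-fixed.geojson', JSON.stringify(buildings));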

vertical datums are hard

CoA elevations are NAVD88, but OSM uses WGS84/EGM96, so we need to make that conversion somehow.
