spandex's Issues

A love letter to dependencies: describing an environment and set of commits in which spandex runs

The TableLoader class is sensitive to the version of Anaconda and SQLAlchemy that one is using.

For example, this commit 5356cf9 was necessary to get these scripts to work: https://github.com/synthicity/bayarea_urbansim/blob/master/data_regeneration/run.py#L35-L38

However, that same commit breaks those same scripts (https://github.com/synthicity/bayarea_urbansim/blob/master/data_regeneration/run.py#L35-L38) on Anaconda 2.1.0 on this machine: https://github.com/MetropolitanTransportationCommission/bayarea_urbansim_setup/tree/vagrant-ubuntu14-bloomberg

Perhaps this would be a good argument for CI? Or very up-front documentation of current requirements?

Create from point / impute from point functions

Function to add/impute values in the disaggregate data (e.g. buildings, establishments) from spatial points representing observed values, such as from commercial data sources (e.g. InfoUSA, Metrostudy, Costar, Exceligent, REIS, Axiometric, Real Facts, D&B).

Examples of create from point:


create_from_point(buildings, commercial_data_source, how='replace')
create_from_point(buildings, commercial_data_source, how='add')

Examples of impute from point:


impute_from_point(buildings, commercial_data_source, {'unit_price':'price', 'non_residential_sqft':'sqft'})
impute_from_point(buildings, commercial_data_source, criteria='unit_price<50000')
impute_from_point(buildings, commercial_data_source, within=50)
impute_from_point(buildings, commercial_data_source, aggregation=('zone_id','mean'))
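
One possible shape for impute_from_point, sketched with geopandas (an assumption; the function name, the within radius, and the column mapping mirror the examples above, but everything else is illustrative):

# A minimal sketch, assuming GeoDataFrame inputs in a common projected CRS and
# geopandas >= 0.10 (for sjoin_nearest); all names are illustrative.
import geopandas as gpd

def impute_from_point(targets, points, columns, within=None):
    """Fill missing values in `targets` from the nearest observed point.

    columns: mapping of {target_column: point_column}, e.g.
             {'unit_price': 'price', 'non_residential_sqft': 'sqft'}.
    within:  optional maximum search radius in CRS units.
    """
    joined = gpd.sjoin_nearest(targets, points, how='left', max_distance=within)
    joined = joined[~joined.index.duplicated()]  # keep one nearest match per target row
    out = targets.copy()
    for target_col, point_col in columns.items():
        missing = out[target_col].isna()
        out.loc[missing, target_col] = joined.loc[missing, point_col]
    return out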

test_tableframe test fails

In this environment, which is as close to the Travis config in the repo as I could get (https://github.com/MetropolitanTransportationCommission/bayarea_urbansim_setup/tree/vagrant-ubuntu14-giuliani), test_tableframe fails.

vagrant@vagrant-ubuntu-trusty-64:/vm_project_dir/spandex$ py.test --cov "/home/vagrant/miniconda/lib/python2.7/site-packages/spandex" --cov-report term-missing
=============================================== test session starts ===============================================
platform linux2 -- Python 2.7.9 -- py-1.4.27 -- pytest-2.7.1
rootdir: /home/vagrant/miniconda/lib/python2.7/site-packages, inifile:
plugins: cov
collected 74 items

../../home/vagrant/miniconda/lib/python2.7/site-packages ...................................................................Fs.....Coverage.py warning: Module /home/vagrant/miniconda/lib/python2.7/site-packages/spandex was never imported.
Coverage.py warning: No data was collected.


==================================================== FAILURES =====================================================
_________________________________________________ test_tableframe _________________________________________________

loader = <spandex.io.TableLoader object at 0x7fe4f1088390>

    def test_tableframe(loader):
        table = loader.tables.sample.hf_bg
        for cache in [False, True]:
            tf = TableFrame(table, index_col='gid', cache=cache)
>           assert isinstance(tf.index, pd.Index)

spandex/tests/test_io.py:14:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
spandex/io.py:483: in index
    index_col=self._index_col).index
spandex/io.py:741: in db_to_df
    q = db_to_query(query)
spandex/io.py:672: in db_to_query
    return sess.query(*orm)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x7fe4f07ee110>

>   ???
E   NotImplementedError: Class <class 'sqlalchemy.orm.attributes.InstrumentedAttribute'> is not iterable

build/bdist.linux-x86_64/egg/sqlalchemy/sql/operators.py:316: NotImplementedError
--------------------------------- coverage: platform linux2, python 2.7.9-final-0 ---------------------------------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
================================= 1 failed, 72 passed, 1 skipped in 32.30 seconds =================================

Reshape (clip) function

Function to reshape (clip) one geometry by another geometry.

[image: reshape/clip illustration]

A reshape (or clip) operation is used to adjust parcel boundaries in cases where this is appropriate. The most common application of this operation is when part of a parcel is underwater. We are interested in the land associated with parcels, so the underwater portion of parcels should be clipped away.

The main idea is that st_area(parcel.geom) should yield the land area. An example of where this operation would be applied: land parcels in San Mateo County that sit along the bay or ocean but extend far out into the water.
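
A hedged sketch of the clip step using geopandas (the layer names and paths are assumptions; a production implementation might instead use PostGIS ST_Difference):

import geopandas as gpd

parcels = gpd.read_file('parcels.shp')   # illustrative inputs
water = gpd.read_file('water.shp')

# Remove the underwater portion of each parcel so that the parcel area reflects land area.
clipped = gpd.overlay(parcels, water[['geometry']], how='difference')
clipped['land_area'] = clipped.geometry.area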

Calculate distance function

Calculate distance between one geometry and another.

Examples:


calc_dist(parcels, canyon_edge, how='network')
calc_dist(parcels, transit, how='network')
calc_dist(parcels, water_bodies, how='straight_line')
calc_dist(parcels, highway, how='straight_line')
calc_dist(parcels, air_pollutant_source, how='straight_line')
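
A rough sketch of the straight-line case, assuming GeoDataFrames in a projected CRS; the network case would require a routing/accessibility library (e.g. pandana) and is omitted here:

def calc_dist(parcels, features, how='straight_line'):
    if how != 'straight_line':
        raise NotImplementedError('network distances need a routing engine')
    # distance from each parcel geometry to the nearest piece of the feature layer
    nearest = features.unary_union
    return parcels.geometry.distance(nearest)

# e.g. parcels['dist_to_water'] = calc_dist(parcels, water_bodies)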

Slice function

Function to slice geometry along the boundaries of another geometry.

[image: slice illustration]

Example SQL (using the small spandex test data to slice parcels along block group boundaries):


WITH a AS (
    SELECT
        heather_farms.parcel_id,
        heather_farms.calc_area,
        heather_farms.shape_area,
        heather_farms.parcel_acr,
        heather_farms.shape_leng,
        st_intersection(heather_farms.geom, hf_bg.geom) AS geom
    FROM heather_farms, hf_bg
    WHERE st_intersects(heather_farms.geom, hf_bg.geom)
), b AS (
    SELECT *, st_area(geom) AS icalc_area FROM a
)
SELECT
    parcel_id,
    geom,
    icalc_area / calc_area * shape_area AS shape_area,
    icalc_area / calc_area * parcel_acr AS parcel_acr,
    icalc_area / calc_area * shape_leng AS shape_leng
FROM b;

This function will typically be applied in the context of parcels. Post-slice child parcels will be assigned field values from the parent parcel, with the user specifying which fields take the parent value as-is and which fields allocate the parent value to the children weighted by area. In the example above, parcel_id is taken from the parent parcel as-is, and shape_area/parcel_acr/shape_leng are allocated by area.

Options for preventing slivers will be provided. For parcel slices that result in slivers, the parent parcel is left intact. Slivers can be defined by area (e.g. area < 500) and/or by shape (e.g. ((perimeter/4.0)/sqrt(area)) > 2), as sketched below.
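
For illustration, the sliver test described above might look like this on a GeoSeries of candidate slices (the thresholds are just the example values given, and the helper name is illustrative):

def is_sliver(geoms, min_area=500.0, max_shape_ratio=2.0):
    # flag slices that are too small or too elongated to keep as child parcels
    area = geoms.area
    perimeter = geoms.length
    return (area < min_area) | ((perimeter / 4.0) / area.pow(0.5) > max_shape_ratio)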

Parcels will be sliced so that boundaries align with those of key summary geographies and control geographies. Summary geographies are any geography that simulation results will be summarized at (i.e. disaggregate agents will be aggregated to this level). Slicing parcels along key boundaries ensures a clean accounting of land: parcels will nest cleanly within higher-level geographies. This is also useful during imputation: when small-area control totals must be met but parcel boundaries don't align with the control boundaries, unintended side effects can result.

The key idea is: geography.aggregate(parcel.land_area) should equal geography.land_area. Examples of situations encountered in the past where parcel slicing is useful:

  • Block-level controls are desired, but block boundaries are frequently inconsistent with parcel boundaries.

  • Zonal boundaries bisect a set of parcels, so the parcels are sliced to ensure a clean accounting of land up to the zone level (a key summary geography).

  • Zonal boundaries are smaller than parcels in an area with a very large parcel (so that some zones contain zero parcels and are thus undevelopable), and the large parcel is sliced into smaller parcels corresponding to zone boundaries.

  • A control geography contains a few parcels in their entirety, plus a number of other parcels where a significant proportion of their land area overlaps with the control geography but their centroids do not fall within it. When replicating agents/buildings to match the geography's control total, the resulting agents/buildings are allocated to sliced parcels so they do not get artificially stuffed into the few parcels that are entirely within the boundary.

Slicing and reshaping parcels appropriately has implications for the correct calculation of other spatial operations like proportion_overlap.

Apply regression equation function

Function to impute missing/invalid/outlying values using predictions from a regression model or other statistical model (e.g. Poisson). Applies a regression equation, which may have been estimated in UrbanSim or statsmodels.

Examples:


apply_regression(buildings, variable_to_impute, regression_equation, replacement_criteria)
apply_regression(buildings, 'residential_rent', regression_equation)
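
A hedged sketch of apply_regression, assuming the model is a fitted statsmodels (formula-based) results object whose .predict() accepts a DataFrame; the criteria string and column names are illustrative:

def apply_regression(df, column, model, criteria=None):
    # rows to impute: either explicitly flagged by `criteria`, or simply missing
    to_fix = df.eval(criteria) if criteria else df[column].isna()
    out = df.copy()
    out.loc[to_fix, column] = model.predict(df.loc[to_fix])
    return out

# e.g. buildings = apply_regression(buildings, 'residential_rent', rent_model,
#                                   criteria='residential_rent <= 0')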

Condo detection/merging function

Function used to detect condo ownership records that have been stored as tiny parcels or stacked parcels. Ownership records are merged into a single building/parcel record with one geometry.
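
One simple way the stacked-parcel case could start, assuming a parcels GeoDataFrame in which condo ownership records share an identical footprint (column names and the input path are illustrative; the tiny-parcel case is not covered here):

import geopandas as gpd

parcels = gpd.read_file('parcels.shp')                          # illustrative input
parcels['footprint'] = parcels.geometry.apply(lambda g: g.wkb)  # hashable geometry key
stacked = parcels[parcels.duplicated('footprint', keep=False)]  # condo candidates
# collapse each stack of ownership records into one parcel/building record
merged = stacked.dissolve(by='footprint', aggfunc='first')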

targets/synthesis .pop() error

In targets/synthesis, when a count column is passed, adding or removing rows is handled by two functions: _add_rows_by_count or _remove_rows_by_count.

Inside these functions, there is a case that is not handled. From the df, after applying the corresponding filters, only the rows whose value in the count column is less than or equal to the amount (to add or remove) are saved into the sort_count array, which is then used to pick the to_add or to_remove indexes. The unhandled case is when the sort_count array is empty, meaning that all of the available rows in the filtered df have count values bigger than the amount value.
In the adding case, an error is raised at line 213
https://github.com/UDST/spandex/blob/master/spandex/targets/synthesis.py#L213
when .pop() is called on to_add while it is empty.

In the removing case, no error is raised, but the process iterates over the empty sort_count and the function ends up returning the same df.
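
A hypothetical, self-contained illustration of the failure mode (not the actual spandex code): when every candidate row's count exceeds the remaining amount, nothing is queued and .pop() raises.

counts = {10: 7, 11: 9}   # candidate row index -> value in the count column
amount = 5                # rows/units still needed
to_add = [idx for idx, cnt in counts.items() if cnt <= amount]   # -> [] in this case
while amount > 0 and to_add:   # the `and to_add` guard is what is currently missing
    idx = to_add.pop()         # without the guard this raises IndexError on an empty list
    amount -= counts[idx]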

Scale/synthesize to match aggregate totals

A common step in preparing UrbanSim base year data is ensuring that disaggregate data (e.g. buildings, households, jobs) matches aggregate targets at specified geographic levels. For example, in the case of disaggregate building data, we might want to match residential unit counts by block, match median home values by tract, match building year built by zipcode, or match non-residential-sqft totals by zone. To match totals for a given geography, we either synthesize new agents to match the total by sampling/copying/allocating existing agents within the geography, or we select agents within the geography for deletion. When matching an aggregate mean/median/total of some agent attribute in a certain geography, a scale-to-match approach can also be taken.

We want to be able to:

  • Synthesize new things in the disaggregate data to match an aggregate target value. Disaggregate things are replicated to match the target, and new things are then assigned a location_id within the aggregate geography (i.e. an allocation step).
  • Scale values in the disaggregate data to match an aggregate target value.

A control_to_target function is envisioned that takes the following arguments: agent_df, controls_df, and optionally allocation_geography_df. There are similarities between what this function would do and what the existing transition model does, as examples below will show. There are also similarities between this function and an UrbanSim refinement model. If we want the code to live in UrbanSim instead of Spandex, that is fine.

The operation of this function can be illustrated by looking at fake control table examples. See below. A couple of points: The agent_type column may not be needed as the agent type is implied by the agent_df argument. The allocate_to column may not be needed because the allocate_to geography is implied by the allocation_geography_df argument.

Example 1

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | household | | | 1000 | synthesize | building | residential_units | residential_units |
| zone | 2 | household | | | 3000 | synthesize | building | residential_units | residential_units |
| zone | 3 | household | | | 2000 | synthesize | building | residential_units | residential_units |

In Example 1, we match household totals by zone and allocate to buildings within the zone according to the distribution of residential units, respecting a capacity constraint. If zone 1 contains fewer than 1,000 households, we randomly sample the needed number of new households from the existing households in zone 1, copy them, then allocate the new households to buildings in the zone (i.e. assign a building_id). If zone 1 contains more than 1,000 households, we randomly sample existing households for deletion. The agents_df argument in this case would be a DataFrame of households with a zone_id column. The controls_df argument would be the table shown above. The allocation_geography_df would be a DataFrame of buildings with a zone_id column. In the allocation step, we would respect the capacity constraint identified in the allocation_capacity_attribute column of the controls_df table (the number of households assigned to a building should not exceed the number of residential units in the building).
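
A rough pandas sketch of that flow for a single zone (column names are illustrative, the residential_units capacity cap is noted but not enforced, and this is not the proposed control_to_target implementation):

import numpy as np
import pandas as pd

def match_household_total(households, buildings, zone_id, target, seed=0):
    rng = np.random.default_rng(seed)
    in_zone = households[households.zone_id == zone_id]
    gap = target - len(in_zone)
    if gap > 0:
        # synthesize: copy randomly sampled existing households in the zone...
        new = in_zone.sample(gap, replace=True, random_state=seed).copy()
        new.index = np.arange(gap) + households.index.max() + 1  # fresh household ids
        # ...and allocate them to buildings weighted by residential_units
        # (a full implementation would also cap assignments at each building's units)
        zone_bldgs = buildings[buildings.zone_id == zone_id]
        weights = zone_bldgs.residential_units / zone_bldgs.residential_units.sum()
        new['building_id'] = rng.choice(zone_bldgs.index, size=gap, p=weights)
        return pd.concat([households, new])
    if gap < 0:
        # too many households: randomly sample existing ones for deletion
        drop = in_zone.sample(-gap, random_state=seed).index
        return households.drop(drop)
    return households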

Example 2

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | household | persons | income < 40000 | 5600 | synthesize | building | residential_units | residential_units |
| zone | 2 | household | persons | income < 40000 | 5600 | synthesize | building | residential_units | residential_units |
| zone | 3 | household | persons | income >= 50000 | 2000 | synthesize | building | residential_units | residential_units |

In Example 2, we populate both the agent_accounting_attribute and agent_filter columns in the control table. This means that the target value now refers to persons, not households, and the households we sample to meet this target must pass the agent_filter. Summing household.persons for households in zone 1 where household income is less than 40,000 should give 5,600. In other words, there are 5,600 people in zone 1 in households with income below 40,000.

Example 3

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | job | | | 500 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |
| zone | 2 | job | | | 1200 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |
| zone | 3 | job | | | 700 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |

In Example 3, we want to match zonal job targets (agent_type == 'job') and allocate new jobs to buildings weighted by non_residential_sqft. The allocation_capacity_attribute reflects the assumption that each job spot takes up 250 sq ft, and we don't want to exceed the number of job spots in the buildings being allocated to. After running control_to_target, there will be 500 jobs with zone_id 1.

Example 4

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | job | | sector_id == 11 | 150 | synthesize | building | non_residential_sqft + 50*residential_units | non_residential_sqft/250 |
| zone | 2 | job | | sector_id == 11 | 500 | synthesize | building | non_residential_sqft + 50*residential_units | non_residential_sqft/250 |
| zone | 3 | job | | sector_id == 32 | 200 | synthesize | building | non_residential_sqft + 50*residential_units | non_residential_sqft/250 |

In Example 4, we control to job targets again, but now the agent_filter column is populated. In zone_id 1, the target of 150 applies only to jobs in sector 11. There should be 150 jobs in zone_id 1 with sector_id 11. Existing jobs in sector 11 and zone 1 are either copied or deleted to match this target. New jobs get a new, unique job_id.

Example 5

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | building | residential_units | | 800 | synthesize | parcel | parcel_sqft | parcel_sqft/500 |
| zone | 2 | building | residential_units | | 200 | synthesize | parcel | parcel_sqft | parcel_sqft/500 |
| zone | 3 | building | residential_units | | 350 | synthesize | parcel | parcel_sqft | parcel_sqft/500 |

In Example 5, we want to match residential_units targets. Summing building.residential_units for buildings in zone_id 1 should give 800. Existing buildings with residential units are sampled, copied, and allocated if the existing zonal residential unit count is too low; otherwise, residential buildings are sampled for deletion if the existing count is too high. We allocate new synthetic buildings to parcels, weighting the allocation by parcel_sqft and respecting the parcel_sqft/500 capacity constraint.

Example 6

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | building | non_residential_sqft | building_type_id == 5 | 30000 | synthesize | parcel | parcel_sqft | parcel_sqft/2 |
| zone | 2 | building | non_residential_sqft | building_type_id == 5 | 85000 | synthesize | parcel | parcel_sqft | parcel_sqft/2 |
| zone | 3 | building | non_residential_sqft | building_type_id == 5 | 72000 | synthesize | parcel | parcel_sqft | parcel_sqft/2 |

In Example 6, we match non_residential_sqft totals by zone for building_type_id 5. Note the agent_filter column.

Example 7

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| parcel | 111 | job | | | 50 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |
| parcel | 112 | job | | | 120 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |
| parcel | 113 | job | | | 70 | synthesize | building | non_residential_sqft | non_residential_sqft/250 |

In Example 7, notice that the location_type is 'parcel' instead of 'zone'. We are matching parcel-level employment targets: there should be 50 jobs attached to buildings on the parcel with parcel_id 111.

Example 8

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tract | 7 | household | income | | 70000 | scale_to_mean | | | |
| tract | 8 | household | income | | 84000 | scale_to_mean | | | |
| tract | 9 | household | income | | 39000 | scale_to_mean | | | |

In Example 8, we are scaling to match the target instead of synthesizing to match (see the how_match column). Here we scale households by tract to match the observed household mean income by tract. The average household income in tract 7 is 70,000 and we want the disaggregate data to reflect this. Notice that when scaling to match, new agents are not synthesized, so the "allocate_" columns are left blank.

Example 9

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tract | 7 | building | year_built | building_type_id < 3 | 1995 | scale_to_median | | | |
| tract | 8 | building | year_built | building_type_id < 3 | 1978 | scale_to_median | | | |
| tract | 9 | building | year_built | building_type_id < 3 | 1925 | scale_to_median | | | |

In Example 9, we scale building year_built to match the observed tract median year built. In tract 7, we want the median year_built of buildings with building_type_id less than 3 to be '1995'. Note 'scale_to_median' in the how_match column.

Example 10

| location_type | location_id | agent_type | agent_accounting_attribute | agent_filter | target | how_match | allocate_to | allocation_weight | allocation_capacity_attribute |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zone | 1 | building | non_residential_sqft | | 100000 | scale_to_sum | | | |
| zone | 2 | building | non_residential_sqft | | 40000 | scale_to_sum | | | |
| zone | 3 | building | non_residential_sqft | | 75000 | scale_to_sum | | | |

In Example 10, we scale building non_residential_sqft to match a zonal target for non_residential_sqft. We want there to be 100,000 square feet of non-residential space in zone 1, and we want to match this target by scaling existing building records with non-residential sqft instead of synthesizing new building records. We scale existing values downwards or upwards depending on whether the zonal target is currently exceeded or short. Note 'scale_to_sum' in the how_match column.
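
The scale_to_sum case in Example 10 reduces to a per-zone multiplier; a minimal sketch with assumed (toy) DataFrame and column names:

import pandas as pd

buildings = pd.DataFrame({'zone_id': [1, 1, 2, 3],               # toy disaggregate data
                          'non_residential_sqft': [60_000, 20_000, 50_000, 100_000]})
targets = pd.Series({1: 100_000, 2: 40_000, 3: 75_000})          # zonal control totals

zonal_sums = buildings.groupby('zone_id').non_residential_sqft.sum()
factor = buildings.zone_id.map(targets / zonal_sums)             # per-zone scale factor
buildings['non_residential_sqft'] *= factor                      # zonal sums now match targets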

Support custom projections

Discussed with Eddie. Agreed to implement after working on higher priorities.

---------- Forwarded message ----------
From: Dara Adib
Date: Mon, Sep 29, 2014 at 4:50 PM
Subject: Re: North Carolina sample shp
To: Conor Henley
Cc: Eddie Janowicz, Fletcher Foti

[...]

We want to figure out how to deal with non-standard ESRI projections (SRIDs). Current versions of PostGIS only ship with EPSG codes in the spatial_ref_sys table. The prj2epsg API that we use to find SRIDs (if GDAL fails to find an SRID, which it usually does for anything other than WGS84) only matches to EPSG codes.

Here are some non-standard ESRI projections:
http://spatialreference.org/ref/esri/
http://svn.osgeo.org/gdal/trunk/gdal/data/esri_extra.wkt

There are some ways to populate spatial_ref_sys with these extra projections:
http://community.actian.com/wiki/Spatial_ref_sys
https://gis.stackexchange.com/questions/95831/how-can-i-get-proj4text-from-srtext
http://suite.opengeo.org/opengeo-docs/dataadmin/pgBasics/projections.html

spatial_ref_sys defines projections with two columns, srtext (well-known text) and proj4text (PROJ.4), which is used for reprojections. The prj file shipped with the shapefile includes srtext, but not proj4text.

OGR/GDAL can apparently determine the proj4text from the srtext in the prj file.

$ gdalsrsinfo pittzone.prj

PROJ.4 : '+proj=lcc +lat_1=34.33333333333334 +lat_2=36.16666666666666
+lat_0=33.75 +lon_0=-79 +x_0=609601.2199999997 +y_0=0 +datum=NAD83
+units=us-ft +no_defs '

OGC WKT :
PROJCS["NAD_1983_StatePlane_North_Carolina_FIPS_3200_Feet",
    GEOGCS["GCS_North_American_1983",
        DATUM["North_American_Datum_1983",
            SPHEROID["GRS_1980",6378137.0,298.257222101]],
        PRIMEM["Greenwich",0.0],
        UNIT["Degree",0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["False_Easting",2000000.002616666],
    PARAMETER["False_Northing",0.0],
    PARAMETER["Central_Meridian",-79.0],
    PARAMETER["Standard_Parallel_1",34.33333333333334],
    PARAMETER["Standard_Parallel_2",36.16666666666666],
    PARAMETER["Latitude_Of_Origin",33.75],
    UNIT["Foot_US",0.3048006096012192]]

I think QGIS does something similar when it defines custom projections for shapefiles that don't use standard ones:
http://docs.qgis.org/2.2/en/docs/user_manual/working_with_projections/working_with_projections.html

Perhaps when loading shapefiles with unrecognized SRIDs, we should define a custom projection, so that we can load and reproject. Thoughts?

Alternatively, we can preload some ESRI projections, but I don't like this solution because it doesn't cover all possible custom projections and requires specifying the SRID when loading the shapefile (since prj2epsg doesn't detect it).
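
For reference, the same srtext-to-proj4text conversion that gdalsrsinfo performs can be sketched with the GDAL/OGR Python bindings (the file path is illustrative; whether spandex should then insert the result into spatial_ref_sys under a custom SRID is the open question above):

from osgeo import osr

with open('pittzone.prj') as f:
    srs = osr.SpatialReference()
    srs.ImportFromESRI([f.read()])   # handles ESRI-flavored WKT from a .prj file

proj4text = srs.ExportToProj4()      # for the proj4text column
srtext = srs.ExportToWkt()           # for the srtext column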

Spandex/synthesis container size when count is not None

In spandex/targets/synthesis, when a count column is passed, the target value is matched by the sum of the count column rather than by counting rows. In the allocation process, however, the size of the containers (the container_size array) is calculated by subtracting the number of rows in each geo_id_col group from the corresponding value of the capacity column (or capacity expression). The behavior a user expects when passing a count column is that container size is also calculated from the sum of the count-column values belonging to each geo_id_col element.
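
A small, hypothetical illustration of the difference (not the spandex code):

import pandas as pd

df = pd.DataFrame({'geo_id': [1, 1, 2], 'count': [3, 2, 4]})   # e.g. buildings with unit counts
capacity = pd.Series({1: 10, 2: 5})                            # capacity per container

rows_used = df.groupby('geo_id').size()                 # what the code measures today
counts_used = df.groupby('geo_id')['count'].sum()       # what a user passing `count` expects

container_size_now = capacity - rows_used               # {1: 8, 2: 4}
container_size_expected = capacity - counts_used        # {1: 5, 2: 1}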

Set/assert value function

Function used to make and track one-off fixes/assertions/look-up-based-corrections to the data.

Examples:


assert(buildings, 'sqft_per_unit','>250')
assert(buildings, 'non_residential_sqft',0,'building_type_id=1')
assert(buildings, 'non_residential_sqft','footprint_area*stories','building_type_id>2')
assert(parcels, 'land_use_type_id',10,'parcel_id==2314')
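
A hedged sketch of such a helper on a pandas DataFrame (named set_value here only to avoid Python's assert keyword; the filter and value expressions follow the examples above):

def set_value(df, column, value, where=None):
    """Set `column` to `value` (a scalar, or an expression string evaluated
    against the frame) on rows matching the optional `where` filter."""
    mask = df.eval(where) if where else slice(None)
    values = df.eval(value) if isinstance(value, str) else value
    out = df.copy()
    out.loc[mask, column] = values
    return out

# e.g. parcels = set_value(parcels, 'land_use_type_id', 10, where='parcel_id == 2314')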

Synthesize to match function

Pandas-based function to synthesize new things in the disaggregate data to match an aggregate target value. Disaggregate things are replicated to match the target, and new things are then assigned a location_id within the aggregate geography (i.e. an allocation step).

Examples:


synthesize_to_match(buildings, 'residential_units','block_id', census_block_counts)
synthesize_to_match(buildings, 'residential_units','tract_id', acs_sf_unit_estimate)
synthesize_to_match(buildings, 'non_residential_sqft','zone_id',sqft_implied_from_employment)
synthesize_to_match(buildings, 'residential_units','tract_id', mf_units, allocation_criteria)

Clarify the role of conventions so that users know whether spandex requires them to use exec_sql to edit databases in all cases

It's unclear to me as a user whether I can edit a database that Spandex is using/managing outside of the Spandex ORM without breaking the Spandex database class's understanding of the schema.

In short, after dropping and then re-adding a table in psql, the Spandex database class does not seem to show any of the existing tables in my public schema when I inspect the database with it.

The long version:
This is a pretty simple and use-case-specific SQL query that takes a few minutes:

https://github.com/synthicity/spandex/blob/master/spandex/spatialtoolz.py#L495-L513

It works fine when you run this script from start to finish:

https://github.com/synthicity/bayarea_urbansim/blob/master/data_regeneration/run.py

Unfortunately, running that script from start to finish takes more than 12 hours on a well-provisioned (and tuned) machine.

As a user, it would be nice to be able to just call that specific SQL query on an arbitrary table. However, it seems that there may be some conventions or dependencies that I am not following in calling it.

In particular, I suspect that I am getting an error because I am not calling that function on a table that was specifically created or registered with one of the several ORMs (two, if you count Spandex as an ORM) that seem to be in use in this repository.

The error is below. As a user, this means I will probably rewrite the query from the ORM language into SQL in order to accomplish my larger goal of reducing the run time of data regeneration.

---geom_aggregation1 took 7.37857508659 seconds ---
/home/vagrant/anaconda/lib/python2.7/site-packages/sqlalchemy/dialects/postgresql/base.py:2079: SAWarning: Did not recognize type 'bpchar' of column 'county_id'
  name, format_type, default, notnull, domains, enums, schema)
/home/vagrant/anaconda/lib/python2.7/site-packages/sqlalchemy/dialects/postgresql/base.py:2079: SAWarning: Did not recognize type 'unknown' of column 'imputation_flag'
  name, format_type, default, notnull, domains, enums, schema)
PARCEL AGGREGATION:  Merge geometries (and aggregate attributes) based on within-interior-ring status
Traceback (most recent call last):
  File "geom_aggregation2.py", line 16, in <module>
    df = geom_unfilled(t.public.parcels, 'unfilled')
AttributeError: type object 'public' has no attribute 'parcels'
Traceback (most recent call last):
  File "geom_test.py", line 30, in <module>
    check_run('geom_aggregation2.py')
  File "geom_test.py", line 22, in check_run
    return subprocess.check_call([python, path])
  File "/home/vagrant/anaconda/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vagrant/anaconda/bin/python', 'geom_aggregation2.py']' returned non-zero exit status 1

Additional raster functions

Populate parcel fields with values from raster datasets.

Examples:


aggregate(parcels, slope, how='mean')
aggregate(parcels, altitude, how='centroid')
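
One way the 'mean' case could look using the rasterstats package (an assumption; spandex might instead lean on PostGIS raster functions), with illustrative input paths:

import geopandas as gpd
from rasterstats import zonal_stats

parcels = gpd.read_file('parcels.shp')
stats = zonal_stats(parcels, 'slope.tif', stats=['mean'])   # one dict per parcel
parcels['slope_mean'] = [s['mean'] for s in stats]

# the how='centroid' case could instead sample the raster at each parcel centroid,
# e.g. with rasterstats.point_query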

Extract building data from parcels function

Function that extracts building data from parcel data, for the case where building information is embedded in the parcel data.

Example:


buildings = extract_buildings(parcels)
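
A rough sketch under the assumption that each parcel row carries at most one building's attributes as columns (column names illustrative):

def extract_buildings(parcels, cols=('year_built', 'building_sqft', 'residential_units')):
    buildings = parcels[list(cols)].copy()
    buildings['parcel_id'] = parcels.index        # keep the link back to the parent parcel
    buildings = buildings.reset_index(drop=True)  # new sequential building ids
    buildings.index.name = 'building_id'
    return buildings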

Scale to match function

Pandas-based function to scale values in the disaggregate data to match an aggregate target value.

Examples:


scale_to_match(buildings, 'unit_price', 'tract_id', median_home_values)
scale_to_match(buildings, 'year_built', 'bg_id', acs_mean_year_built)
scale_to_match(buildings, 'non_residential_sqft','zone_id', sqft_implied_from_employment, type='sum')

Clear attributes function

Function used to clear attributes/agents from parcels. Applied when land is to be treated off-model or when land is known to be vacant.

Examples:


clear_attributes(parcels, vacant)
clear_attributes(parcels, gov_land)
clear_attributes(parcels, tribal_land)
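
A possible shape for this, assuming the second argument is a boolean mask or index of parcels to clear and that "clearing" means zeroing development-related columns (all names illustrative):

def clear_attributes(parcels, which, cols=('residential_units', 'non_residential_sqft',
                                           'improvement_value')):
    out = parcels.copy()
    out.loc[which, list(cols)] = 0   # treat the selected parcels as vacant land
    return out

# e.g. parcels = clear_attributes(parcels, parcels.land_use_type_id == GOV_LAND_TYPE)  # illustrative mask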
