tombolodigitalconnector's Introduction

Tombolo Digital Connector

The Tombolo Digital Connector is an open source tool that enables users to combine datasets from different sources in an efficient, transparent and reproducible way.

There are three particularly important parts to the Tombolo Digital Connector:

  • Importers
    • Built-in importers harvest a range of data sources into the centralised data format. Examples include data from ONS, OpenStreetMap, NOMIS, the London Air Quality Network and the London Data Store. We welcome the creation of additional importers.
  • Centralised data format
    • All data imported into the Tombolo Digital Connector adopts the centralised data format. This makes it easier to combine and modify data from different sources.
  • Recipes
    • Users generate recipes with a declarative 'recipe language' to combine the data in different ways. This combination can generate new models, indexes and insights. For example, existing recipes can generate models of social isolation, calculate the proportion of an area covered by greenspace and even generate an active transport index. We welcome the creation of additional recipes.

For further information see the documentation.

The Challenge

Contributing

Looking to get involved? Have a look at the Open Source Community milestone where we have selected low hanging fruit for you to easily get involved and contribute. Read our Guide to contribution for details.

Requirements

To get started you will need to install the requirements to run the Digital Connector.

Note: you’ll need to have administrator rights on your machine to install these - make sure that you do before you proceed.

Install the following via the link through to their installation page:

After the successful installation of the requirements, you can use the Digital Connector by following the instructions in the quick start section or by going through the intro tutorial in the documentation.

Installation Guides

Quick start

This tutorial will guide you through a quick start on macOS.

A note about the Terminal

The Terminal application can be found in the Applications -> Utilities folder or quickly accessed through Spotlight. It is pre-installed in macOS so there is no need to install it.

You will need this application to run some of the commands of this tutorial. When you enter a command and press return/enter, the terminal will execute it and complete the task.

Make sure to press return after typing a command before you enter the next one.

Let's start

  • Open the Terminal. All the following steps will operate in it.

  • Get the Digital Connector code to your local machine by cloning its repository.

    git clone https://github.com/FutureCitiesCatapult/TomboloDigitalConnector

    If successful, you will see a log similar to the below.

    $ git clone https://github.com/FutureCitiesCatapult/TomboloDigitalConnector
    Cloning into 'TomboloDigitalConnector'...
    remote: Counting objects: 15761, done.
    remote: Compressing objects: 100% (184/184), done.
    remote: Total 15761 (delta 90), reused 193 (delta 49), pack-reused 15487
    Receiving objects: 100% (15761/15761), 178.89 MiB | 3.04 MiB/s, done.
    Resolving deltas: 100% (7647/7647), done.
  • Go to the Digital Connector root directory and run ./setup/setup_osx.sh

If successful, the final output will look like the following.

$ gradle test
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:compileTestJava UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
> Building 85% > :test > 50 tests completed
:test

BUILD SUCCESSFUL

Total time: 4 mins 50.919 secs

If the tests fail, check that the PostgreSQL server is running and that the requirements are properly installed by going through the previous steps.

Below are a couple of examples of what might have gone wrong if the tests fail.

uk.org.tombolo.core.AttributeTest > testUniqueLabel FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

uk.org.tombolo.core.AttributeTest > testWriteJSON FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

uk.org.tombolo.core.DatasourceTest > testWriteJSON FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

The error log above is produced when the PostgreSQL server is not running, or when the tombolo_test database has not been set up. If the server is not running, start it with the following command.

pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start
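To tell the two causes apart, it can help to check whether anything is accepting connections on the PostgreSQL port. The following is a minimal Python sketch, not part of the Digital Connector; the helper name `port_is_open` is hypothetical, and `localhost:5432` is assumed from the `databaseURI` used in the test settings.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Assumption: PostgreSQL listens on localhost:5432, as in the test databaseURI.
    if port_is_open("localhost", 5432):
        print("Port 5432 is open; check that the tombolo_test database exists.")
    else:
        print("Nothing is listening on port 5432; start the PostgreSQL server.")
```

An open port only shows that some server is listening; it does not confirm that the tombolo_test database or user exists.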

If you see the following error instead, it means that the settings files were not renamed successfully.

FAILURE: Build failed with an exception.

* Where:
Build file '/TomboloDigitalConnector/build.gradle' line: 159

* What went wrong:
Execution failed for task ':test'.
> Test environment not configured. See the README.

If you see other errors, try to go back and follow the steps again.

Run the Digital Connector

Now you are all set to run a task on the Digital Connector.

The next step is to run an example that shows how the Digital Connector combines different datasets. We’re using an example that shows the relationship between air pollution (demonstrated in this example by NO2 levels), and car and bicycle traffic in every borough in London. You can read more about this example here.

When you’ve run this example, you can expect a map that looks like this:

[Figure: Final Output]

To get started:

  • Run the following command in the Terminal.

    gradle runExport -Precipe='src/main/resources/executions/examples/london-cycle-traffic-air-quality-lsoa-backoff.json' -Poutput='london-cycle-traffic-air-quality-lsoa-backoff-output.json'
  • You can expect it to take around 1.5 minutes to generate the output, which will be saved in the current directory. Change the path in the command in case you want it saved elsewhere.

    The output will look similar to the following:

    {
      "type":"FeatureCollection",
      "features":[
        {
          "type":"Feature",
          "geometry":{
            "type":"Polygon",
            "coordinates":[[[-0.0802,51.5069],[-0.1092,51.5099],[-0.1114,51.5098],
                            [-0.1116,51.5153],[-0.1053,51.5185],[-0.0852,51.5203],
                            [-0.0784,51.5215],[-0.0802,51.5069]]]
          },
          "properties":{
            "label":"E09000001",
            "name":"City of London",
            "Nitrogen Dioxide":81.3333333333333,
            "Bicycle Fraction":0.25473695591455 
          }
        }, ...
        ...
      ]
    }
  • Once you have your output, you can open it with a geospatial visualisation tool. For this example, we recommend QGIS, and here you can find a guide on how to use it.
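Before opening the file in a visualisation tool, the output can also be inspected programmatically. The following is a minimal Python sketch, assuming the output is a GeoJSON FeatureCollection shaped like the excerpt above; the function name `summarise_features` is hypothetical, and the sample is the single feature from that excerpt.

```python
import json


def summarise_features(geojson_text: str) -> list:
    """Return the properties of every feature in a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return [feature.get("properties", {}) for feature in collection["features"]]


# The single feature shown in the output excerpt of this README.
sample = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-0.0802, 51.5069], [-0.1092, 51.5099], [-0.1114, 51.5098],
                         [-0.1116, 51.5153], [-0.1053, 51.5185], [-0.0852, 51.5203],
                         [-0.0784, 51.5215], [-0.0802, 51.5069]]]
      },
      "properties": {
        "label": "E09000001",
        "name": "City of London",
        "Nitrogen Dioxide": 81.3333333333333,
        "Bicycle Fraction": 0.25473695591455
      }
    }
  ]
}
"""

for props in summarise_features(sample):
    print(props["label"], props["name"], props["Nitrogen Dioxide"], props["Bicycle Fraction"])
```

To inspect a real run, read the file given in -Poutput and pass its contents to the same function.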

We need your feedback!
If you have any issues with setting up the tool, or running the tutorial, or if you have some advice about how we can do this better, please contact us by creating an issue. Our goal is for someone to get back to you within 24 hours.

Run tests

gradle test

If you use the IntelliJ JUnit test runner, you will need to add the following to your VM Options in your JUnit configuration (Run -> Edit Configurations -> All under JUnit, and Defaults -> JUnit):

-enableassertions
-disableassertions:org.geotools...
-Denvironment=test
-DdatabaseURI=jdbc:postgresql://localhost:5432/tombolo_test
-DdatabaseUsername=tombolo_test
-DdatabasePassword=tombolo_test

Local deploy

To deploy to your local Maven installation (~/.m2 by default):

gradle install

Run Tasks

Run export

We use the Gradle task runExport to run exports. The parameters are as follows:

gradle runExport -Precipe='path/to/spec/file.json' -Poutput='output_file.json' -Pforce='com.className' -Pclear=true

For example, this calculates the proportion of cycle traffic received at a traffic counter relative to the total traffic in a given borough and outputs the results to the file reaggregate-traffic-count-to-la_output.json:

gradle runExport -Precipe='src/main/resources/executions/examples/reaggregate-traffic-count-to-la.json' -Poutput='reaggregate-traffic-count-to-la_output.json'
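When exports are scripted, the command line can be assembled from its parameters. The following is a minimal Python sketch; the helper `run_export_command` is hypothetical and not part of the Digital Connector, and it only uses the -Precipe, -Poutput, -Pforce and -Pclear parameters listed above.

```python
import shlex


def run_export_command(recipe, output, force=None, clear=False):
    """Build the argument list for the runExport Gradle task."""
    args = ["gradle", "runExport", f"-Precipe={recipe}", f"-Poutput={output}"]
    if force:
        # Class name passed to -Pforce, as in the parameter listing above.
        args.append(f"-Pforce={force}")
    if clear:
        args.append("-Pclear=true")
    return args


cmd = run_export_command(
    "src/main/resources/executions/examples/reaggregate-traffic-count-to-la.json",
    "reaggregate-traffic-count-to-la_output.json",
)
# Print a shell-safe rendering of the command for review.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the Digital Connector root directory to execute the export.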

Export data catalogue

We use the Gradle task exportCatalogue to export a JSON file detailing the capabilities of the connector and explore the data catalogue.

gradle exportCatalogue -Poutput=catalogue.json

Importer Info

We use the Gradle task info to get details about the available importers.

gradle info

lists all the importers available in the Digital Connector.

gradle info -Pi='uk.org.tombolo.importer.dft.TrafficCountImporter'

lists all the details of the given importer: Provider, SubjectTypes, Attributes, Datasourceids and Dataurl.

gradle info -Pp -Pi='uk.org.tombolo.importer.dft.TrafficCountImporter'

lists only the Datasourceids, Dataurl and Provider. The options -Pa and -Ps list the Attributes and SubjectTypes respectively.

Note: Datasourceids and Dataurl will always be provided irrespective of the option given.

Start/Stop server

If you need to start or stop the PostgreSQL server (on macOS), use the following commands.

# to start
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start

# to stop
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log stop

Implementations

License

MIT

When using the Tombolo or other GitHub logos and artwork, be sure to follow the GitHub logo guidelines.

tombolodigitalconnector's People

Contributors

algogator, anotherstarburst, arya-hemanshu, borkurdotnet, brettminnie, eddiejaoude, lorenaqendro, neoeno, sassalley, sebkur, thanosbnt

tombolodigitalconnector's Issues

OSM public realm importer

Open Street Map importers for public realm data (trees, benches, greenspace etc.)

Milton Keynes Workshop output

Depends on Overpass Importer.

Added by: Borkur

OSM cycle importers

Open Street Map importers for cycling infrastructure, facilities (office and home), services (bikeshops, bikerepair, etc).

Milton Keynes Workshop output

Added by: Borkur

City Index: Active Transport

Example recipes for a few Active Transport Accessibility Levels indices.

Milton Keynes Workshop output

Added by: Borkur

Establish default walkability/ cyclability scores but also let the user customise them.

Milton Keynes Workshop output.

Added by: Ioanna

Productivity impact of active transport

Model estimating the productivity impact of active transport. Productivity is potentially reduced by longer travel time but offset by fewer sick days.

Milton Keynes Workshop output. Needs expertise in productivity estimation.

Added by: Borkur

Query expansion in search

An extension of the searchable data/field catalogue with intelligent query expansion: the input query is treated as a concept and mapped to a range of exact search queries for related concepts.

Depends on: #87 #105

Clustering of Subjects as Field

Given specified fields, cluster a set of Subjects based on those field values. Enable options for pre-cluster normalisation, number of clusters and cluster method if possible.

Added by: Joe

Materialise fields

Materialising the values for a field calculation. The values can then be used downstream in other fields/models.

Added by: Ioanna

Plugin for QGIS

Plugin for QGIS to make it easier to explore and import data into QGIS (To be specified further)

Added by: Ed

Real-time data importer

Importer of real-time data. The TfL API provides data on road and transport disruptions.

Added by: Ioanna

Spatial NA Filler

If a value does not exist for a geography, can you back off to the value of a parent geography or to an otherwise modelled value (e.g. an average)?

Added by: Joe
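The back-off idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Digital Connector's implementation: the function name `fill_with_backoff`, the subject labels and the values are all hypothetical, and the parent relation is assumed to be available as a simple child-to-parent mapping.

```python
def fill_with_backoff(values, parent_of):
    """Fill missing values by backing off to the nearest ancestor that has one.

    values: subject -> value, with None marking a missing value.
    parent_of: subject -> parent subject (hypothetical hierarchy).
    """
    filled = {}
    for subject, value in values.items():
        current = subject
        seen = set()  # guards against cycles in the parent mapping
        while value is None and current in parent_of and current not in seen:
            seen.add(current)
            current = parent_of[current]
            value = values.get(current)
        filled[subject] = value
    return filled


# Hypothetical subjects: two LSOAs inside one borough.
values = {"lsoa_a": 10.0, "lsoa_b": None, "borough_x": 12.5}
parent_of = {"lsoa_a": "borough_x", "lsoa_b": "borough_x"}
print(fill_with_backoff(values, parent_of))
# lsoa_b takes the borough value 12.5; the other entries are unchanged
```

A modelled fallback such as an average could replace the final None for subjects whose ancestors are all missing.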

Generic GIS importers

A template for how data should be formatted for automatic importing into the Digital Connector data format together with the vanilla importer for that template (e.g. for shapefiles, geojson, etc.)

Added by: Ioanna

Model records comparison

Being able to compare between timed values of the same model that has been processed in two time periods.

Milton Keynes Workshop output.

Added by: Ioanna

Generic CSV importer

An instance of Template Importers where we handle a standard csv file with attribute id in a header and subject id as 1st column (and optionally a geography as 2nd column)

Not clear how we handle timestamps and time-series here.

Added by: Ioanna
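The standard layout described above can be sketched as follows. This is a minimal Python illustration, not an existing importer: the function name `parse_attribute_csv` and the sample rows are hypothetical, all values are assumed to be numeric, and the optional geography column and timestamps are not handled.

```python
import csv
import io


def parse_attribute_csv(text):
    """Parse a CSV with attribute ids in the header and subject ids in the first column.

    Returns a list of (subject_id, attribute_id, value) tuples; empty cells are skipped.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    attribute_ids = header[1:]  # the first header cell labels the subject column
    records = []
    for row in reader:
        if not row:
            continue
        subject_id, cells = row[0], row[1:]
        for attribute_id, cell in zip(attribute_ids, cells):
            if cell.strip():
                records.append((subject_id, attribute_id, float(cell)))
    return records


# Hypothetical sample: two subjects, two attributes, one missing cell.
sample_csv = "subject,population,area\nsubject_a,100,2.5\nsubject_b,,3.0\n"
for record in parse_attribute_csv(sample_csv):
    print(record)
```

Emitting one record per subject and attribute keeps the result close to a long format, which leaves room for adding a timestamp field once that question is settled.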

OSM Green space importer

The proportion of each LSOA by land use using Open Street Map Importer.

Depends on the OSM Overpass Importer.

Original idea (deprecated):

Data sourced here: http://www.neighbourhood.statistics.gov.uk/dissemination/datasetList.do?JSAllowed=true&Function=&%24ph=60&CurrentPageId=60&step=1&CurrentTreeIndex=-1&searchString=&datasetFamilyId=1201&Next.x=15&Next.y=19&nsjs=true&nsck=false&nssvg=false&nswid=2560 | NOTE: this data is old and no longer updated; we need to discuss using Open Street Map as an alternative.

Added by: Joe

HEAT model

Implementation of the HEAT model

Milton Keynes Workshop output. Needs details on the model or a means of calling the online version.

Added by: Borkur

Searchable recipe repository

Searchable repository of recipes along with code snippets of how to add them to specification files.

Added by: Borkur

Field catalogue tagging

A tagging system for all fields of datasets. When users search for a subject in the data catalogue, they could see, apart from the datasets, the fields that are specific to their search.

Added by: Ioanna

Visualisation of fields

Visualise charts that show the different origins of travel to work. A circular chart can show the direction of the travel.

Milton Keynes Workshop output.

Added by: Ioanna

AT Environmental benefit

Model estimating the environmental benefit of active transport.

Milton Keynes Workshop output.

Added by: Borkur

Data catalogue tagging

Search engine for identifying all available datasets specific to a subject. Introduce a tagging system for all datasets the DC can import.

Depends on: #87

Added by: Ioanna

Export field histograms

Export field value histograms for a set of fields and a set of (potentially clustered) Subjects.

Added by: Borkur

Recoding / Mapping Field

Recode or map a field from its current values to new values based on a given mapping function.

Added by: Joe
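As an illustration of what such a field could do, here is a minimal Python sketch. The function name `recode_field` and the sample values are hypothetical; the mapping is assumed to be either a lookup table or a function.

```python
def recode_field(values, mapping, default=None):
    """Map each subject's value to a new value.

    mapping may be a dict (lookup table) or a callable (mapping function);
    default is used for values missing from a dict mapping.
    """
    if callable(mapping):
        mapper = mapping
    else:
        mapper = lambda value: mapping.get(value, default)
    return {subject: mapper(value) for subject, value in values.items()}


# Hypothetical categorical field recoded with a lookup table.
print(recode_field({"s1": "A", "s2": "B", "s3": "C"}, {"A": 1, "B": 2}, default=0))
# Hypothetical numeric field recoded with a function (binning into two classes).
print(recode_field({"s1": 0.2, "s2": 0.8}, lambda v: "high" if v >= 0.5 else "low"))
```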

Active travel indices

Establish default walkability/cyclability scores but also let the user customise them.

Milton Keynes Workshop output.

Added by: Ioanna

Model storage

Saving a new model/recipe, meaning storing all processes that were needed to create a particular model. With this the user can review the process at any time.

The model recipe storage is already implemented. The version control may be out of the scope of the project.

Added by: Ioanna

How do I query 'Greater London'?

NOT URGENT

The current match rule in the export specification enables pattern matching to select cities.

e.g.:

"subjectType" : "lsoa", "matchRule": { "attribute": "name", "pattern": "Manchester%" }

This works well in most cases. However, if I want to select Greater London, the most appropriate method may be to repeat something like the following for every region (of the chosen granularity):

"geoMatchRule" : { "geoRelation" : "within", "subjectSpecifications" : [ { "subjectType" : "localAuthority", "matchRule" : { "attribute" : "name", "pattern" : "Greenwich" } } ] }

Might there be an easier way to do this?
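Until there is a shorter syntax, the repeated specification can at least be generated rather than written by hand. The following Python sketch builds the rule from a list of names; the function name `geo_match_rule` is hypothetical, the keys are copied from the snippet in this issue, and it is an assumption that the Digital Connector accepts several entries in subjectSpecifications.

```python
import json


def geo_match_rule(names, subject_type="localAuthority", relation="within"):
    """Build a geoMatchRule with one subject specification per name."""
    return {
        "geoMatchRule": {
            "geoRelation": relation,
            "subjectSpecifications": [
                {
                    "subjectType": subject_type,
                    "matchRule": {"attribute": "name", "pattern": name},
                }
                for name in names
            ],
        }
    }


# Three London boroughs for illustration; a full list would be needed in practice.
boroughs = ["Greenwich", "Camden", "Hackney"]
print(json.dumps(geo_match_rule(boroughs), indent=2))
```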

Visualisation of overlaid fields

An application where you can view fields of multiple datasets overlaid on a map.

Do we need an application? Can we do a QGIS plugin for this?

Added by: Ioanna

Safety score

Example recipes for active transport safety score.

Milton Keynes Workshop output

Added by: Borkur

Searchable data catalogue

Search engine for searching what data is importable together with code samples of how to use that data in specification files.

Added by: Borkur

Rank Field

Transformation field that outputs the rank of a subject according to how it ranks among all other subjects in terms of another field/attribute.

Added by: Joe
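A minimal Python sketch of the ranking logic follows. The function name `rank_field` and the sample values are hypothetical, and dense ranking (tied values share a rank, with no gaps afterwards) is an assumption, since the issue does not say how ties should be handled.

```python
def rank_field(values, descending=True):
    """Return each subject's rank (1 = first) by value; tied values share a rank."""
    ordered = sorted(set(values.values()), reverse=descending)
    position = {value: index + 1 for index, value in enumerate(ordered)}
    return {subject: position[value] for subject, value in values.items()}


# Hypothetical field values for three subjects.
print(rank_field({"a": 0.25, "b": 0.4, "c": 0.25}))
# → {'a': 2, 'b': 1, 'c': 2}
```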

Formula calculation

Standard model outputs (e.g. PTAL/PTAL adjusted to a specific formula the user uses). Apply formula to new field calculation

Note: This needs to be more specific.

Added by: Ioanna

Catchment Analysis as Field

Catchment analysis inside the Digital Connector. In this case it will be a field assigning a time/cost to households based on their travel distance to a GP. This can be implemented by FCC in the Digital Connector with a methodology designed by Space Syntax.

Catchment score could be:

  • Closest
  • Count within a limit
  • Sum within a limit

Catchment input

  • Points
  • Network

Catchment output

  • New attributes to network nodes based on catchment of points

Added by: Borkur
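The three catchment scores listed above can be sketched in Python as follows. This is an illustration only: the function name `catchment_scores` and the sample points are hypothetical, straight-line distance stands in for the network time/cost, and "sum within a limit" is assumed to sum a per-point weight.

```python
import math


def catchment_scores(origin, points, limit):
    """Compute the closest, count-within-limit and sum-within-limit scores for one origin.

    origin: (x, y) tuple.
    points: list of (x, y, weight) tuples.
    limit: maximum distance for a point to count as inside the catchment.
    """
    distances = [
        (math.hypot(x - origin[0], y - origin[1]), weight)
        for x, y, weight in points
    ]
    within = [(d, w) for d, w in distances if d <= limit]
    return {
        "closest": min((d for d, _ in distances), default=None),
        "count_within_limit": len(within),
        "sum_within_limit": sum(w for _, w in within),
    }


# Hypothetical points at distances 5, 10 and 1 from the origin.
points = [(3, 4, 10), (6, 8, 20), (0, 1, 5)]
print(catchment_scores((0, 0), points, limit=5))
# → {'closest': 1.0, 'count_within_limit': 2, 'sum_within_limit': 15}
```

A network-based version would replace `math.hypot` with a shortest-path cost and write the scores back to the network nodes.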

Geographic disaggregation

Disaggregate data from a high level geography to a lower level geography according to a disaggregation model. E.g. disaggregate LSOA level IMD value to postcodes or even households.

Added by: Eime
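Two simple disaggregation models can be sketched in Python. This is an illustration under assumptions, not a proposed design: the function name `disaggregate`, the subject labels and the values are hypothetical. Without weights each child inherits the parent value (suitable for indices or rates); with weights the parent value is split proportionally (suitable for counts).

```python
def disaggregate(parent_values, children_of, weights=None):
    """Disaggregate parent-level values to child geographies.

    parent_values: parent -> value.
    children_of: parent -> list of child subjects.
    weights: optional child -> weight used for a proportional split.
    """
    result = {}
    for parent, value in parent_values.items():
        children = children_of.get(parent, [])
        if weights is None:
            # Copy model: every child inherits the parent value.
            for child in children:
                result[child] = value
        else:
            # Proportional model: split the parent value by child weight.
            total = sum(weights[child] for child in children)
            for child in children:
                result[child] = value * weights[child] / total if total else None
    return result


# Hypothetical LSOA with two postcodes.
children_of = {"lsoa_a": ["pc_1", "pc_2"]}
print(disaggregate({"lsoa_a": 120}, children_of))
print(disaggregate({"lsoa_a": 120}, children_of, weights={"pc_1": 1, "pc_2": 3}))
```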

GP qos importer

Importer for GP quality of service ratings. Data from PhE where we need to join on practice code.

Further information on the PhE datasource to be provided by @Anafi (Ioanna)

Spatial/Attribute join

Spatial/attribute joins to support one-to-one, one-to-many and many-to-one relationships between two datasets.

Can we use the same feature to join more than two datasets at a time, e.g. join street segments to buildings and address points to buildings (Greenwich use case)?

Added by: Ioanna

Cost based catchment

Methodology for calculating catchment based on transport cost.

Depends on basic catchment analysis

Added by: Borkur
