GTFS Data Pipeline for TfNSW Bus Datasets

Introduction

Research Project Title: Smart City Applications in Land Use and Transport (SCALUT)

This is a data pipeline developed as part of the research project, SCALUT, at the University of Sydney's TransportLab.

The datasets generated using this pipeline have been used to validate the performance of TfNSW's Transit Signal Priority Request via the Public Transport Information and Priority System (PTIPS).

The data pipeline is written in Python and has been tested to work on Windows, Linux and Mac using the Version 1 GTFS TfNSW Bus Datasets.

Note: A separate data pipeline is currently being developed and tested to work with a wider collection of GTFS datasets.

Installation

You can either download or clone this repository from GitHub, or install the package via pip:

pip install gtfs_dpl

Data Availability Statement

The datasets generated will be made available to the public on the University of Sydney Data Repository.

Ongoing static and realtime datasets are available on the Transport for NSW Open Data Hub.

Data Pipeline Directory Structure

GTFS_TfNSW_Bus_DataWareHouse
├───10_Raw_PB
│   └───FileTP
├───10_Raw_Static
├───10_TfNSW_Traffic_Lights_Location
├───10_TfNSW_Traffic_Volume_Viewer
├───11_CSV_Raw_TU
│   └───FileTP
├───11_CSV_Raw_VP
│   └───FileTP
├───12_CSV_Transformed_TU
│   └───FileTP
├───12_CSV_Transformed_VP
│   └───FileTP
├───12_CSV_Transformed_VP_byAgency
│   └───FileTP
├───13_CSV_Cleaned_Unique_TU
│   └───FileTP
├───13_CSV_Cleaned_Unique_TU_byAgency
│   └───FileTP
├───21_SA_Static
│   ├───GTFS_Static_StaticId
│   └───TL_Location_StaticId
├───22_CSV_Fu_Nodes_Links
│   └───FileTP
├───22_SHP_Fu_Nodes_Links
│   └───FileTP
├───22_SHP_VP_GIS
│   └───FileTP
└───22_SHP_VP_GIS_byAgency
    └───FileTP
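The tree above can be recreated with a short script before running the pipeline. This is only a sketch: it assumes that each `FileTP` subfolder is named after the `<FileTP>` argument passed to the scripts, and it leaves out the two `StaticId` subfolders under `21_SA_Static` because their exact names are not given here. `LAYOUT` and `make_skeleton` are hypothetical names and are not part of the package.

```python
from pathlib import Path

# Folder names copied from the tree above. True marks folders that hold a
# FileTP subfolder (assumed to be named after the <FileTP> argument).
LAYOUT = {
    "10_Raw_PB": True,
    "10_Raw_Static": False,
    "10_TfNSW_Traffic_Lights_Location": False,
    "10_TfNSW_Traffic_Volume_Viewer": False,
    "11_CSV_Raw_TU": True,
    "11_CSV_Raw_VP": True,
    "12_CSV_Transformed_TU": True,
    "12_CSV_Transformed_VP": True,
    "12_CSV_Transformed_VP_byAgency": True,
    "13_CSV_Cleaned_Unique_TU": True,
    "13_CSV_Cleaned_Unique_TU_byAgency": True,
    "21_SA_Static": False,  # StaticId subfolders are not created here
    "22_CSV_Fu_Nodes_Links": True,
    "22_SHP_Fu_Nodes_Links": True,
    "22_SHP_VP_GIS": True,
    "22_SHP_VP_GIS_byAgency": True,
}


def make_skeleton(root, file_tp):
    """Create the folder tree under root and return the created paths."""
    root = Path(root)
    created = []
    for name, has_file_tp in LAYOUT.items():
        folder = root / name / file_tp if has_file_tp else root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created
```

Calling `make_skeleton("GTFS_TfNSW_Bus_DataWareHouse", file_tp)` with your own `file_tp` value is idempotent, so it can be run again for each new period without touching existing data.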

Usage instructions

1.1 Convert .PB.GZ (Gzipped Protocol Buffer) to .CSV Files

python TU_PBtoCSV.py <DataDir> <FileTP>
python VP_PBtoCSV.py <DataDir> <FileTP>

Note: The tfnsw_gtfs_realtime_pb2.py file must be stored in the same folder as these scripts.
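For readers who want to inspect the raw files before converting them, the decompression half of this step can be sketched with the standard library alone. The helper names `read_pb_gz` and `iter_raw_feeds` are hypothetical, and this is not how `TU_PBtoCSV.py` or `VP_PBtoCSV.py` are necessarily written. Parsing the resulting bytes still needs the generated protobuf bindings, shown only as a comment because the classes exposed by `tfnsw_gtfs_realtime_pb2` are an assumption.

```python
import gzip
from pathlib import Path


def read_pb_gz(path):
    """Return the decompressed bytes of one gzipped protocol buffer file."""
    with gzip.open(path, "rb") as handle:
        return handle.read()


def iter_raw_feeds(folder):
    """Yield (file name, raw bytes) for every .pb.gz file, sorted by name."""
    for path in sorted(Path(folder).iterdir()):
        # Match the extension case-insensitively (.pb.gz or .PB.GZ).
        if path.is_file() and path.name.lower().endswith(".pb.gz"):
            yield path.name, read_pb_gz(path)


# Parsing the bytes requires the generated bindings. Assumption: the module
# exposes the standard GTFS-realtime FeedMessage class.
#   import tfnsw_gtfs_realtime_pb2 as pb
#   feed = pb.FeedMessage()
#   feed.ParseFromString(raw_bytes)
```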

1.2 Transform .CSV Files

python TU_Transform.py <DataDir> <FileTP> <FileIdStatic>
python VP_Transform.py <DataDir> <FileTP>

1.2A Transform .CSV Files by Agency (Daily to Monthly)

python VP_Transform_byAgency.py <DataDir> <FileTP> <FileIdStatic> <DaysInMonth> <Flt_Agency>

1.3 Prepare Cleaned Unique Datasets

python TU_ClnUnique_byAgency.py <DataDir> <FileTP> <FileIdStatic> <DaysInMonth> <Flt_Agency>

Usage example

The package comes with some example data for you to explore. If you installed the package via pip, the install path is listed in the "Location" field of the output of the following command:

pip show gtfs_dpl
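The same path can also be looked up from Python. The sketch below assumes that the import name matches the pip name (`gtfs_dpl`) and that the data sits in an `example_data` folder inside the package, as the paths in this section suggest. `example_data_dir` is a hypothetical helper, not a function provided by the package.

```python
import importlib.util
from pathlib import Path


def example_data_dir(package="gtfs_dpl", subfolder="example_data"):
    """Return <package dir>/<subfolder>, or None if the package is not found."""
    spec = importlib.util.find_spec(package)
    # Plain modules (not packages) have no submodule_search_locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = Path(list(spec.submodule_search_locations)[0])
    return package_dir / subfolder
```

The function only builds the path; it does not check that the folder exists, so it is worth confirming with `.is_dir()` before passing the result to the scripts.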

To process the example data included with the package, you can run:

python TU_PBtoCSV.py /path/to/gtfs_dpl/example_data/ <FileTP>
python VP_PBtoCSV.py /path/to/gtfs_dpl/example_data/ <FileTP>
python TU_Transform.py /path/to/gtfs_dpl/example_data/ <FileTP> <FileIdStatic>
python VP_Transform.py /path/to/gtfs_dpl/example_data/ <FileTP>
python VP_Transform_byAgency.py /path/to/gtfs_dpl/example_data/ <FileTP> <FileIdStatic> <DaysInMonth> <Flt_Agency>
python TU_ClnUnique_byAgency.py /path/to/gtfs_dpl/example_data/ <FileTP> <FileIdStatic> <DaysInMonth> <Flt_Agency>
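The six commands above can also be chained from a small driver script. This is a sketch under assumptions: it takes the step numbering (1.1 to 1.3) as the intended run order, expects to be run from the folder that contains the pipeline scripts, and treats every argument as an opaque string supplied by you. `build_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys


def build_commands(data_dir, file_tp, file_id_static, days_in_month, flt_agency):
    """Return one argument list per pipeline step, in the documented order."""
    by_agency = [file_tp, file_id_static, str(days_in_month), flt_agency]
    steps = [
        ("TU_PBtoCSV.py", [file_tp]),
        ("VP_PBtoCSV.py", [file_tp]),
        ("TU_Transform.py", [file_tp, file_id_static]),
        ("VP_Transform.py", [file_tp]),
        ("VP_Transform_byAgency.py", by_agency),
        ("TU_ClnUnique_byAgency.py", by_agency),
    ]
    # sys.executable runs each script with the current Python interpreter.
    return [[sys.executable, script, data_dir, *args] for script, args in steps]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failing step."""
    for command in commands:
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it easy to print and review the commands before anything is run.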

Contributors

benjym, teckkean, tim-xian