parkapi's People

Contributors

augustqu, balzer82, defgsus, hbruch, jb3-2, jklmnn, justusadam, kawie, kiliankoe, leonardehrenfried, lucaswo, manuelciosici, mic92, mtrnord, nicomue7, robtranquillo, sibbl, stepan-romankov, ubahnverleih, ylabonte

parkapi's Issues

IDs?

@jklmnn suggested that we remove city IDs altogether and reference cities by their real names only. I'm not quite sure how we'd deal with duplicates, but given the current metadata format we'd run into that problem anyway. Otherwise it would make things much easier.

Any suggestions?
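
A minimal sketch of how name-based references could stay unique; the slug helper and the region suffix for duplicates are assumptions, not existing code:

import re

def city_slug(name, region=None):
    """Derive an identifier from the real name; an optional region
    suffix disambiguates duplicates (e.g. two cities named Neustadt)."""
    slug = re.sub(r"[^a-z0-9]", "", name.lower())
    if region:
        slug += "-" + re.sub(r"[^a-z0-9]", "", region.lower())
    return slug

print(city_slug("Dresden"))              # dresden
print(city_slug("Neustadt", "Sachsen"))  # neustadt-sachsen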

Verify integrity of scraped data

We should be notified as soon as possible when the format of a source page changes. A big plus would be Slack integration 🎉

It'd also be nice to periodically and automatically update the test fixtures in this repo.
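
A minimal sketch of such a check, assuming lot dicts with the field names that appear in the JSON examples further down this issue list:

LOT_KEYS = {"name", "free", "total", "coords"}

def missing_keys(lots):
    """Collect keys that disappeared from the scraped lots, so a watcher
    (e.g. a Slack bot) can alert us that a source page probably changed."""
    missing = set()
    for lot in lots:
        missing |= LOT_KEYS - lot.keys()
    return missing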

sample_city.py should run

The example script should be runnable by the server. It doesn't have to produce meaningful data, but the server shouldn't crash (as it does now); it should at least respond with a JSON error. The example should also contain all important objects and variables (possibly with dummy data or none at all, but syntactically correct).
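
A hedged sketch of the minimal shape that would keep the server alive; the function name and the return format are assumptions based on the other issues here:

def parse_html(html):
    """Parse the downloaded page. Dummy output is fine, as long as it is
    syntactically complete enough for the server to serve it."""
    return {"lots": [], "last_updated": None}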

<city>.py: badly defined variables

The variables

data_url = ""
data_source = ""
city_name = ""
file_name = ""
detail_url = ""

are somewhat redundant. Please make the purpose of each variable clear from its name, or add a comment; see the sketch below.
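
A possible commented version; the descriptions are assumptions about how each variable is currently used:

data_url = ""     # URL of the page the scraper downloads
data_source = ""  # attribution: who provides the data
city_name = ""    # human-readable city name
file_name = ""    # basename used for fixtures and output files
detail_url = ""   # per-lot detail page, if the source offers one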

Adjust for server timezone

Otherwise the last-downloaded and last-updated times can differ by several hours, which is kinda weird...
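
One way to avoid this (a sketch using zoneinfo from Python 3.9+; pytz would be the equivalent on older versions): store both timestamps timezone-aware and compare them in UTC.

from datetime import datetime, timezone
from zoneinfo import ZoneInfo

downloaded_at = datetime.now(timezone.utc)
# If a source page prints local time, attach its zone before comparing:
updated_at = datetime(2015, 11, 2, 14, 30, tzinfo=ZoneInfo("Europe/Berlin"))
print(updated_at.astimezone(timezone.utc))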

Database connection

Opened this issue to collect some stuff about the database.

I'm planning to write a database connector that connects to a PostgreSQL db. It will store rows with id, timestamp_updated, timestamp_downloaded, city and data attributes; data contains a JSON dump of whatever the scraper acquires (Postgres is pretty damn nice!).

The scraper is then run periodically and talks to the db connector. Before the connector saves anything to the database, it first verifies that the data looks OK (#6). If it doesn't, it'd probably be best to notify us about it. A Slack bot would be pretty damn sweet!
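
A hedged sketch of the connector; the column names come from this issue, while the JSONB type and the function shapes are assumptions:

import json
import psycopg2

SCHEMA = """
CREATE TABLE IF NOT EXISTS parkapi (
    id SERIAL PRIMARY KEY,
    timestamp_updated TIMESTAMP,
    timestamp_downloaded TIMESTAMP,
    city TEXT,
    data JSONB
)
"""

def ensure_schema(conn):
    with conn.cursor() as cur:
        cur.execute(SCHEMA)
    conn.commit()

def save(conn, city, updated, downloaded, data):
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO parkapi"
            " (timestamp_updated, timestamp_downloaded, city, data)"
            " VALUES (%s, %s, %s, %s)",
            (updated, downloaded, city, json.dumps(data)),
        )
    conn.commit()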

Best practice for missing data

sample_city.py should document how to handle missing data fields: should the lines be commented out, deleted, or should the values be set to false, null, ...?
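
One possible convention, as an assumption pending a decision here: omit unknown fields entirely instead of inventing values, so consumers can tell "unknown" apart from "zero" or "false".

lot = {
    "name": "Altmarkt",
    "free": 42,
    # "total" omitted: the source doesn't publish a capacity
}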

Improve scraping 'logistics'

Specifically:

  • Specify data as generically as possible in the city files (config files alone would be perfect, but that's probably not feasible), so that cities can easily be added without writing a Python file that handles all the scraping and returns a finished dict
  • Write a separate scraper that handles all the fetching and brings the data into the correct format for every city, so that surprisingly divergent output is no longer possible
  • Dynamically import city files from the scraper (see the sketch after this section)
  • Handle a database that both scraper and server can talk to

For the time being the server will probably talk to the scraper directly, but the scraper should soon be able to run on its own and store its results in a database (I really like the idea of just throwing the JSON into a MongoDB instance; clear up #4 first though!), which the server then queries for current data. That way the scraper can run periodically (as easily as via cron) and the server only touches previously saved data (usually the most current) without the two getting in each other's way.
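
A minimal sketch of the dynamic import, assuming each city lives in cities/<name>.py and exposes the module-level variables and parse function discussed in the other issues:

import importlib
from pathlib import Path

def load_city_modules(package="cities"):
    """Import every city module so the scraper can treat them uniformly.
    Assumes the directory is a package (i.e. it has an __init__.py)."""
    modules = {}
    for path in Path(package).glob("*.py"):
        if path.stem != "__init__":
            modules[path.stem] = importlib.import_module(f"{package}.{path.stem}")
    return modules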

Separate data structure for geolocation

Instead of saving the geolocations inside the city scraper (<city>.py), they should be stored in a separate file. That makes manual updates easy, in addition to the automatic update that happens whenever a new parking lot is scraped.

The new file should live next to <city>.py in /cities and should have the suffix .geo, like this:

cities/
  dresden.py
  dresden.geo
  Luebeck.py
  Luebeck.geo

and the content should look like this:

{
  "Altmarkt": { "lat": 51.05031, "lon": 13.73754 },
  "An der Frauenkirche": { "lat": 51.05165, "lon": 13.7439 }
}

As a later improvement we could switch to GeoJSON, so GitHub can render the file on a map by itself:
https://help.github.com/articles/mapping-geojson-files-on-github/
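
A minimal sketch of merging such a .geo file into scraped lots; the lot structure follows the JSON examples in these issues:

import json

def attach_coords(lots, geo_path):
    with open(geo_path, encoding="utf-8") as f:
        geo = json.load(f)
    for lot in lots:
        if lot["name"] in geo:
            lot["coords"] = geo[lot["name"]]  # {"lat": ..., "lon": ...}
    return lots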

[Lübeck] Geodata

Geodata currently exists, but it's incomplete.

If anyone finds a way to gather the coordinates from here (there's some stuff happening in kwl_maps.js and elsewhere on that page, but I can't find the data), please add them to Luebeck.geojson.
Or from anywhere else, of course...

Add forecast data

Include @balzer82's forecast data. Maybe a route like /city/forecast/daterange (see the sketch below)? CSV would probably still be the best option for this, I guess.
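
A hedged sketch of the route, assuming a Flask-style server; the helper is hypothetical:

from flask import Flask, Response

app = Flask(__name__)

def load_forecast_csv(city, start, end):
    """Hypothetical helper that would read @balzer82's CSV for the range."""
    return "timestamp,free\n"

@app.route("/<city>/forecast/<start>/<end>")
def forecast(city, start, end):
    return Response(load_forecast_csv(city, start, end), mimetype="text/csv")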

Invalid free counts

{
  "address": "Ferdinandplatz",
  "coords": {
    "lat": 51.04645,
    "lng": 13.73988
  },
  "forecast": false,
  "free": 8259,
  "id": "dresdenferdinandplatz",
  "lot_type": "Parkplatz",
  "name": "Ferdinandplatz",
  "region": "Prager Strasse",
  "state": "open",
  "total": 140
}

Just got this in Dresden... 8259 free spaces would be nice, but that's probably not correct.
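
A minimal plausibility check for this class of bug (a sketch, not existing code):

def plausible(lot):
    return 0 <= lot["free"] <= lot["total"]

lot = {"name": "Ferdinandplatz", "free": 8259, "total": 140}
if not plausible(lot):
    print("implausible count for %s: %d/%d"
          % (lot["name"], lot["free"], lot["total"]))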

Python version

Shouldn't we just go with Python 3 and keep things clean there?

Legal stuff

It might not be a bad idea to check the imprints of all the sites we're scraping for clauses that might prohibit scraping, and to contact the relevant departments to make sure it's all good.

It might also be a good idea to include a link to the original page, a name, or a copyright notice (or all of the above?) in the JSON output; see the example below.
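
A hedged example of what such attribution could look like in the output; the field names and values are assumptions pending a decision in this issue:

{
  "last_updated": "2015-11-02T14:30:00",
  "data_source": {
    "url": "http://example.com/parken",
    "name": "Landeshauptstadt Beispielstadt"
  },
  "lots": []
}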
