scrapers-ca's Introduction

Open Civic Data Technical Documentation

This repository contains documentation for developers including:

  • Writing Scrapers using Pupa
  • Open Civic Data's Data Type Specifications
  • Open Civic Data Proposals

Read these docs at https://open-civic-data.readthedocs.io/en/latest/

scrapers-ca's People

Contributors

agarrow, akarshikalowe, alberto56, belambic, cmonagle, cmyr, dcorrech, dependabot[bot], djac, drmeers, elynch303, icolwell, jpmckinney, kporras07, matthewleon, menerve, michaelmulley, mirabuck, patcon, ponsasinorum, pre-commit-ci[bot], rafe-murray, seamuslee001

scrapers-ca's Issues

Add district_name for mayors

All representatives should have a district_name for CSV exports to work nicely. This will automatically be completed once all Scraperwiki scrapers are converted to Pupa.

Wilmot, ON

A shapefile was recently added to Represent.

Fix broken scrapers-ca scrapers

Work through the broken scrapers at http://scrapers.herokuapp.com/

  • ca_qc_mercier
  • ca_qc_montreal_est

Ignore:

  • jurisdictions ending in _municipalities
  • ca_on_toronto (website frequently fails)

Previous issue content

Reserved for Matthew Leon:

  • ca_mb

Priority (already in Represent):

  • ca_nb_fredericton
  • ca_nb_moncton
  • ca_on_richmond_hill
  • ca_qc_brossard
  • ca_qc_senneville

Rewrite Scraperwiki scrapers in scrapers-ca

Eliminate warnings in scrapers_ca_app

See http://scrapers.herokuapp.com/warnings/

Valid warnings have no checkbox:

  1. ca_nb_moncton (1) voice (same type and note)
  2. ca_on_markham (3) multiple urls
  3. ca_on_milton (7) some use webforms
  4. ca_on_oakville (12) multiple emails
  5. ca_on_peel (28) post (handled with boundary_url)
  6. ca_on_pickering (2) multiple urls
  7. ca_on_sault_ste_marie (1) email (Frank Manzo, Ward 6 has none)
  8. ca_on_toronto (1) voice (same type and note)
  9. ca_pe_charlottetown (1) email (Edward Rice, Ward 1 has none)
  10. ca_qc (26) email (many people do not yet have emails)
  11. ca_qc_brossard (1) email (Francine Raymond, District 3 has a typo in her name)

Resolved:

  • ca_mb_winnipeg (local part of email is in email webform URL)
  • ca_ns (emails are JavaScript-encoded)
  • ca_on_london (emails are in JavaScript)
  • ca_on_milton (some have emails, some use webforms)
  • ca_on_newmarket (loads of warnings, check the scraper)
  • ca_on_peterborough (emails are on same page)
  • ca_qc_saint_jean_sur_richelieu (emails are on councillor's individual page)
  • ca_sk_saskatoon (emails are on http://apps2.saskatoon.ca/app/aForms/councillor.aspx)
  • ca_sk_saskatoon: email, Ward 3, Ward 6

ocd-division-ids: URL scrapers

I have a very complex script that scrapes the URLs of municipalities across Canada from the FCM's website and matches each one to a census subdivision by name and province. It produces this file.

There are other sources of URLs, such as CivicInfoBC. Rather than building on my script, I would start with straight name matching against the list of census subdivisions for BC (keep only those whose code starts with "59"). Only a few subdivisions in BC share a name; for those, you need to match on name plus type:

  • Langley
  • North Vancouver
  • Esquimalt
  • Sechelt
  • Okanagan 1
  • Alert Bay

If this works, we can add others for other provinces. You can write the script in a fork of ocd-division-ids.
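
As a starting point, the matching described above could look roughly like the sketch below. It is only an illustration: the record layout (dicts with `code`, `name`, `type` and `url` keys), the function names `normalize` and `match_urls`, and the idea of returning the unmatched records for manual review are all assumptions of mine, not something that exists in ocd-division-ids today.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase a name and collapse runs of whitespace (hypothetical helper)."""
    return " ".join(name.casefold().split())


def match_urls(subdivisions, sources, province_prefix="59"):
    """Match scraped URL records to census subdivision codes by name.

    `subdivisions` and `sources` are lists of dicts; the key names are an
    assumed layout, not an existing format. Returns a tuple of
    ({code: url}, [records that could not be matched]).
    """
    # Index only the subdivisions of one province ("59" is BC) by name.
    by_name = defaultdict(list)
    for sub in subdivisions:
        if sub["code"].startswith(province_prefix):
            by_name[normalize(sub["name"])].append(sub)

    matched, unmatched = {}, []
    for source in sources:
        candidates = by_name.get(normalize(source["name"]), [])
        if len(candidates) > 1:
            # Several subdivisions share this name: match on name plus type.
            candidates = [c for c in candidates if c["type"] == source.get("type")]
        if len(candidates) == 1:
            matched[candidates[0]["code"]] = source["url"]
        else:
            # No match, or still ambiguous: leave it for manual review.
            unmatched.append(source)
    return matched, unmatched
```

Anything that ends up in the unmatched list (no name match, or a shared name without a usable type) would still need to be resolved by hand.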
