

Dutch Name Statistics

Scrape Dutch names stats, enrich them with cohort tables and serve them via a web app.

The goal of this project is to obtain the expected number of people alive with a given name. Birth rates per name are scraped from the Nederlandse Voornamenbank (set up by the Meertens Instituut). The expected number of people alive for a given name and gender is calculated using the yearly life expectancies given by the CBS. This project is inspired by/copied from FiveThirtyEight's excellent article How to Tell Someone’s Age When All You Know Is Her Name.
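The calculation described above can be sketched as a sum over birth cohorts: births in a given year times the probability that someone born in that year is still alive. This is a minimal illustration, not the project's actual code. The function name, the dictionary inputs, and the numbers are all assumptions, and it presumes the CBS life expectancy figures have already been turned into a survival probability per birth year.

```python
def expected_alive(births_by_year, survival_by_year):
    """Expected number of people still alive, summed over birth cohorts.

    births_by_year:   {birth_year: number of births with the name}
    survival_by_year: {birth_year: probability that someone born in that
                       year is still alive today} (hypothetical input)
    """
    total = 0.0
    for year, births in births_by_year.items():
        # Cohorts without a survival estimate contribute nothing.
        total += births * survival_by_year.get(year, 0.0)
    return total


# Made-up numbers, for illustration only.
births = {1930: 1000, 1960: 2000, 1990: 500}
survival = {1930: 0.25, 1960: 0.75, 1990: 1.0}
estimate = expected_alive(births, survival)
```

Running the same function once per name and gender would give one estimate for each combination.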

Status

  • Scraping: done
  • Data manipulation: done
  • Web app: WIP (see branch website)

Usage

    Scraping

    Scraping is done with Scrapy and consists of two stages. First, all the names on the website are collected with their summary statistics. From those statistics, the subset of names with yearly rates is determined, and those names are then scraped.

  • Change directory to [dutch-names/spiders](spiders):

        cd spiders

  • Scrape the list of names and their summary statistics:

        scrapy crawl meertens_list -o list.json

  • Scrape the yearly rates for the names that have them:

        scrapy crawl meertens_details -o details.json
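The selection step between the two crawls (deciding which names have yearly rates) might look roughly like the sketch below. The record layout is an assumption: the `name` and `count` keys, the `min_count` threshold, and the helper name are hypothetical and may not match what the spiders actually write to list.json.

```python
import json


def names_with_yearly_rates(records, min_count=1):
    """Return the names whose summary statistics suggest yearly data exist.

    Assumes each record looks like {"name": str, "count": int}; both keys
    are hypothetical and may not match the real spider output.
    """
    return sorted(
        record["name"]
        for record in records
        if record.get("count", 0) >= min_count
    )


# Inline sample standing in for the contents of list.json.
sample = json.loads('[{"name": "Jan", "count": 120}, {"name": "Xq", "count": 0}]')
selected = names_with_yearly_rates(sample)
```

With a real list.json, the sample would be replaced by `json.load` on the opened file.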

    Data manipulation

    See the IPython Notebook "How to Tell Someone’s Age When All You Know Is Her Dutch Name".

    Web app (WIP)

    python app/app.py

    Tools

    The project uses the following tools:

  • [d3.js](http://d3js.org/)
  • [IPython Notebook](http://ipython.org/notebook.html)
  • [Flask](http://flask.pocoo.org/)
  • [MongoDB](https://www.mongodb.org/)
  • [pandas](http://pandas.pydata.org/)
  • [Scrapy](http://scrapy.org/)