trovemap

This is an entry for GovHack. TroveMap lets you navigate a graph of word maps generated from Sydney Morning Herald newspaper articles, using search terms from the Queensland library. Words are connected when they often co-occur in the same article. The edges are then filtered against a threshold, both overall and per edge.
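
As a rough illustration of the co-occurrence idea, here is a minimal C++17 sketch. It is not taken from Processor.cpp: the names `countCooccurrences`, `filterEdges` and `minCount` are hypothetical, and it applies only a single global threshold, assuming each article has already been reduced to its set of recognized keywords.

```cpp
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An undirected edge between two keywords, stored in sorted order.
using Edge = std::pair<std::string, std::string>;

// Count, for every pair of keywords, how many articles contain both.
// Each article is given as the set of recognized keywords found in it.
std::map<Edge, int> countCooccurrences(
    const std::vector<std::set<std::string>>& articles) {
    std::map<Edge, int> counts;
    for (const auto& keywords : articles) {
        // std::set iterates in sorted order, so each pair (a, b) has a < b.
        for (auto a = keywords.begin(); a != keywords.end(); ++a) {
            for (auto b = std::next(a); b != keywords.end(); ++b) {
                ++counts[{*a, *b}];
            }
        }
    }
    return counts;
}

// Keep only edges seen in at least minCount articles.
// minCount is an assumed, hand-tuned threshold, not a value from the project.
std::map<Edge, int> filterEdges(const std::map<Edge, int>& counts, int minCount) {
    std::map<Edge, int> kept;
    for (const auto& [edge, n] : counts) {
        if (n >= minCount) kept.emplace(edge, n);
    }
    return kept;
}
```

Raising `minCount` thins out the hairball at the cost of dropping weaker but possibly interesting links.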

Processor.cpp goes through Trove articles and downloads them. This took a long time, but once downloaded, the articles are cached on disk, and processing them yields a graph file.
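
The download-then-cache pattern could look roughly like the sketch below. This is an assumption about the general shape, not the code in Processor.cpp: `loadArticle`, the `.txt` file layout, and the `download` callback (a stand-in for the real Trove request) are all hypothetical, and the article id is assumed to be safe to use as a file name.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Return the article body for `id`. If it is already in `cacheDir`, read it
// from disk; otherwise call `download` and save the result for next time.
std::string loadArticle(
    const fs::path& cacheDir, const std::string& id,
    const std::function<std::string(const std::string&)>& download) {
    const fs::path file = cacheDir / (id + ".txt");
    if (fs::exists(file)) {
        // Cache hit: read the whole file back into a string.
        std::ifstream in(file, std::ios::binary);
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }
    // Cache miss: fetch once, then persist the body.
    std::string body = download(id);
    fs::create_directories(cacheDir);
    std::ofstream out(file, std::ios::binary);
    out << body;
    return body;
}
```

With a cache like this, re-running the processing step after an algorithm tweak only touches the disk, not the network.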

The only keywords recognized are search phrases that exist in the list provided by the Queensland library. This means that the graph ends up being a map of the interests of the public, mixed with the realities of the time. For example, the word "palestine" is next to "immigrants", and "jerusalem" is not far away in the graph either.
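
One simple way to restrict keywords to a fixed phrase list is whole-word substring matching on normalized text, sketched below. The function names `normalize` and `recognizedKeywords` are hypothetical, and the real matching rules in the project may well differ (for example in how plurals or punctuation are handled).

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Lowercase the text and collapse every run of non-alphanumeric characters
// into one space, with a space at both ends, so that phrases can be matched
// on whole-word boundaries.
std::string normalize(const std::string& text) {
    std::string out = " ";
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            out += static_cast<char>(std::tolower(c));
        } else if (out.back() != ' ') {
            out += ' ';
        }
    }
    if (out.back() != ' ') out += ' ';
    return out;
}

// Return the phrases from the list that occur in the article as whole words.
std::set<std::string> recognizedKeywords(
    const std::string& article, const std::vector<std::string>& phrases) {
    const std::string haystack = normalize(article);
    std::set<std::string> found;
    for (const auto& phrase : phrases) {
        const std::string needle = normalize(phrase);
        // A needle of size 1 is just " ", meaning the phrase had no letters.
        if (needle.size() > 1 && haystack.find(needle) != std::string::npos) {
            found.insert(phrase);
        }
    }
    return found;
}
```

Padding both strings with spaces keeps "palestine" from matching inside "palestinian", at the cost of missing inflected forms.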

The main hairball of the graph consists mostly of very common topics/words and World War 2 related topics. However, there are also cliques of topics formed from the newspaper articles, such as the planets (mercury, jupiter) and classical musicians (beethoven, mozart, etc.).

Some articles contain an excessive number of names, which skews the importance of certain words. Examples are birth lists, death lists, and similar notices. I've tried my best to classify and ignore those articles, but there may be a way to tell the Trove API not to return them. I ran out of time before I could find out.

The history slider at the bottom colors the nodes differently and gives them different sizes / opacities. The color reflects the change in occurrence from the previous period, while opacity / size reflects raw numbers. You can see interesting trends there, for example the sudden spike in mentions of "births" during January 1946. If you tap on a node in this mode, you can see its frequency history at the bottom.
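
The two quantities behind the slider can be sketched as below. This is only an illustration under assumptions: `PeriodStyle` and `styleHistory` are hypothetical names, the weight is scaled against the keyword's own peak count, and the app's actual mapping from numbers to color, size and opacity is not documented here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct PeriodStyle {
    int change;     // count minus the previous period's count (drives color)
    double weight;  // raw count scaled to [0, 1] (drives size / opacity)
};

// Given one keyword's occurrence count per period, compute the change from
// the previous period and a normalized weight for each period.
std::vector<PeriodStyle> styleHistory(const std::vector<int>& counts) {
    std::vector<PeriodStyle> styles;
    if (counts.empty()) return styles;
    const int peak = *std::max_element(counts.begin(), counts.end());
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // The first period has no predecessor, so its change is zero.
        const int previous = (i == 0) ? counts[0] : counts[i - 1];
        const double weight =
            peak > 0 ? static_cast<double>(counts[i]) / peak : 0.0;
        styles.push_back({counts[i] - previous, weight});
    }
    return styles;
}
```

A sudden spike such as the "births" example would show up as a large positive `change` in one period followed by a large negative one.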

Disclaimer:

The graph results are a bit off -- there are missing words and links. Most likely an algorithm tweak is needed to fix it, but I ran out of time.

There should have been a UI for accessing the Trove articles inside the app, but for now, switching to the iOS Safari app is the way to go. I ran out of time for that as well.
