Code Monkey home page Code Monkey logo

tropology's Introduction

Tropology

Tropology crawls TVTropes.org, converts the relationships between pages into a PostgreSQL database, and helps you visualize relationships between concepts, tropes, creators and material.

This is currently a personal experiment. It'll change, as experiments do, but maybe you'll find use in it as a testing playground.

This is version 1.1-SNAPSHOT. This version is very much a work in progress, as I am making some fundamental changes to use AWS.

You can read more on our site.

Prerequisites

Clojure

You will need Leiningen 2.0 or above installed.

PostgreSQL

My current development environment is PostgreSQL 9.4.5. It expects a database called tropology for the dev environment, and tropology_test for the test environment.

There are migration scripts included, but they act only upon the tables and do not create the databases.

See

scripts/create-test-environment.sh
for how I'm creating a Docker container on OS X for tests.

After you have installed PostgreSQL and created the databases, you'll need to run:

ENV=test lein clj-sql-up migrate
lein clj-sql-up migrate

AWS

I'm in the process of migrating the data storage to AWS. Currently the documents crawled are stored on a S3 bucket.

You'll need to configure your environment to add values for the S3 access key, secret key and endpoint. You can also create a profiles.clj with these values.

{:profiles/dev  {:env {:s3 {:access-key "..."
                            :secret-key "..."
                            :endpoint   "us-west-2"}}}
 :profiles/test {:env {:s3 {:access-key "..."
                            :secret-key "..."
                            :endpoint   "us-west-1"}}}}

Sample database

Here you can find a pg_dump'd copy of the fully scanned site. It's 3.22GBs, and includes the entire CC-licensed pages for those matching our crawl settings.

Import into Postgres using psql as usual.

I'm no longer publishing a database version without the contents, since the current visualization and exploration rely on the HTML to extract the reference descriptions.

I intend to replace that with an S3 bucket dump once the new version is live.

Testing

Clojure tests can be run with lein test, once the test database has been created. For the ClojureScript tests you'll need to install PhantomJS 2, and run lein doo phantom (which can be used to continually run the tests as you work).

A note on Cursive Clojure

Cursive Clojure does not yet support a way to launch a REPL with specific environment profile. Since the application reads its database connection parameters from the environment configuration, if you start a REPL from Cursive and run the tests against it, you'll be running them against the development database and not the test one.

Make sure you either create a REPL profile specifically for the test settings, or just run the tests via lein.

Running

To start a web server for the application, run:

lein cljsbuild once
lein ring server

Then go to http://localhost:3000/ See Using below.

Using

The core of Tropology is the concept exploration. When you first load it, it will display a random trope reference from the anime series Samurai Flamenco.

You can choose to mark a reference snippet as interesting, which adds it to a list of liked items, or just skip it. If you find a trope mentioned interesting, you can also click on the trope link. This will load it as the current article to review, as well as add the text snippet to the list of articles you've liked.

You can also click on Random Article in order to load any random page from TV Tropes.

Click on Show under Relationship graph in order to view the relationship between the items you have liked. Clicking on any of the nodes will show the immediately related concepts, and double clicking will load that article for further exploration.

None of this information is currently saved, since I'm only playing with the trope exploration, but that's on my to do list.

BEWARE: THAR BE SPOILERS! I am not yet applying any style that would hide topic spoilers.

Next steps

Next steps I'm considering are:

  • Save a set of references we've found interesting during the exploration stage.
  • We're currently showing as possible snippets all twikilink elements, but some summary articles use that only to link to other sub-sections and don't contain any actual information. Consider filtering them out, see if can easily differentiate between those that have content and those that don't.
  • Search, to allow you to start your exploration from a preferred topic. Need to decide if this will happen from a topic title or if I want to add full text search (which would significantly increase the database size).
  • Likely a lighter visual theme.

License

Tropology is released under the Eclipse Public License 1.0.

Includes Sigma.js for visualization.

TV Tropes content is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Copyright © 2015 Numergent Limited

tropology's People

Contributors

ricardojmendez avatar

Stargazers

Ishan Marikar avatar Sim avatar jonathan avatar  avatar  avatar Iulian Pleșoianu avatar  avatar NOPPON WONGTA avatar  avatar Matt "Trip" Maker avatar Joseph Kunin avatar ACCOUNT MOVED avatar Aleksei Sotnikov avatar aaron avatar  avatar yāλu avatar Satrio arif wicaksono avatar  avatar Ryuei avatar Ezra Match avatar Tim Feuerbach avatar hamlet avatar Nazeeruddin Ikram avatar Josh Mize avatar AsAddict avatar Marco Dalla Stella avatar

Watchers

 avatar NOPPON WONGTA avatar Ligang Yao avatar

Forkers

cvic mzwaal

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.