overrustle_parser

Some code to extract my messages from this OverRustle logs archive torrent. For those unaware, OverRustle collated logs from popular Twitch channels for a couple of years but was shut down in 2020, so this is just a way to grab some of my old messages so that I still have access to them.

I thought the Twitch data request would have given me my chat logs, but sadly it did not.

Expects:

  • the logs directory (which has a bunch of .7z files in it) as the first argument
  • your twitch username as the second argument
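
The two positional arguments above could be handled along these lines. This is only a sketch, not the code in parse.py: `parse_args` and `list_archives` are hypothetical names, and using `argparse` is an assumption made for illustration.

```python
import argparse
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def list_archives(logs_dir: Path) -> List[Path]:
    """Return the .7z files directly inside logs_dir, in a stable order."""
    return sorted(p for p in logs_dir.iterdir() if p.suffix == ".7z")


def parse_args(argv: Optional[Sequence[str]] = None) -> Tuple[Path, str]:
    """Parse `<logs directory> <twitch username>` from the command line."""
    parser = argparse.ArgumentParser(
        description="Extract one user's messages from the OverRustle archive"
    )
    parser.add_argument("logs_dir", type=Path, help="directory containing the .7z files")
    parser.add_argument("username", help="twitch username to search for")
    args = parser.parse_args(argv)
    if not args.logs_dir.is_dir():
        # parser.error prints the usage message and exits with status 2
        parser.error(f"{args.logs_dir} is not a directory")
    return args.logs_dir, args.username
```

Calling `parse_args()` with no arguments reads from `sys.argv`, so the same function works from the command line and in tests.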

Extracts the .7z files one by one into a temporary directory inside the current directory, searches the extracted logs for messages from the given username, then removes the temporary directory. This can take multiple days to run depending on your computer, since it is a lot of data (~48G when compressed).
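
The extract, scan and clean-up cycle described above could look roughly like this. It is a minimal sketch under stated assumptions, not the implementation in parse.py: the function names are hypothetical, extraction is assumed to go through the `7z` command-line tool, and each log line is assumed to have the shape `[timestamp] username: message`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List


def extract_with_7z(archive: Path, target: Path) -> None:
    # Assumption: the `7z` command-line tool is installed and on PATH.
    subprocess.run(
        ["7z", "x", "-y", f"-o{target}", str(archive)],
        check=True,
        stdout=subprocess.DEVNULL,
    )


def find_user_lines(root: Path, username: str) -> Iterator[str]:
    # Assumption: lines look like "[timestamp] username: message".
    wanted = username.lower()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        with path.open(errors="replace") as f:
            for line in f:
                _, _, rest = line.partition("] ")
                name, sep, _ = rest.partition(": ")
                if sep and name.lower() == wanted:
                    yield line.rstrip("\n")


def process_archive(
    archive: Path,
    username: str,
    extract: Callable[[Path, Path], None] = extract_with_7z,
) -> List[str]:
    # Temporary directory is created inside the current directory.
    tmp = Path(tempfile.mkdtemp(prefix="overrustle_", dir="."))
    try:
        extract(archive, tmp)
        return list(find_user_lines(tmp, username))
    finally:
        # Remove the extracted data even if scanning fails part-way.
        shutil.rmtree(tmp, ignore_errors=True)
```

Processing one archive at a time keeps disk usage bounded by the size of the largest extracted archive rather than the whole data set.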

Saves results to a ./<your username> directory, with one JSON file per channel. A file is saved even if no logs are found, so if this crashes it can be restarted and already processed files will be skipped. To combine those into a single file, you can use jq, like jq '.[]' <./<your username>/* | jq -r --slurp > comments.json
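
The restart behaviour and the combining step can be sketched in Python as well. This is an illustration rather than the code in parse.py: the function names and the `<channel>.json` file naming are assumptions, and each file is assumed to hold a JSON list of message objects.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def save_channel(out_dir: Path, channel: str, messages: List[Dict[str, Any]]) -> Path:
    # Written even when `messages` is empty: the file doubles as a "done" marker.
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{channel}.json"
    target.write_text(json.dumps(messages))
    return target


def already_processed(out_dir: Path, channel: str) -> bool:
    # A channel is skipped on restart if its result file already exists.
    return (out_dir / f"{channel}.json").exists()


def combine(out_dir: Path) -> List[Dict[str, Any]]:
    # Flatten every per-channel list into one list, like the jq pipeline.
    combined: List[Dict[str, Any]] = []
    for path in sorted(out_dir.glob("*.json")):
        combined.extend(json.loads(path.read_text()))
    return combined
```

Writing `json.dumps(combine(out_dir))` to comments.json would give a single file comparable to the jq output.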

Created to be used as part of HPI

Example Usage

git clone https://github.com/seanbreckenridge/overrustle_parser
cd ./overrustle_parser
python3 -m pip install -r ./requirements.txt
python3 parse.py ~/Downloads/OverrustleLogs\ Archive/ moobot

For me, this resulted in:

$ jq <* '.[] | .dt' | wc -l
1585  # number of comments
$ jq -r <* '.[] | .channel' | sort -u | wc -l
43  # from this many channels

To run tests:

python3 -m pip install pytest
python3 -m pytest parse.py
