twitter-archive's Introduction

Twitter Archive Page

This project aims to provide a simple front-end for the Twitter Archive data scraped from Twitter. The entire project is written in JavaScript & HTML and uses a few lightweight libraries for templating and styling. This is done intentionally so that the project can be run and hosted without any server-side code.

Eventually this project will support full local browsing, but for now it still requires a server to run due to limitations of local file access in browsers.

NOTE: This project is still in early development and is not yet ready for public use. It is currently missing many features and may have bugs. If you want to contribute to the project, feel free to contact me on Twitter @uplynxed.

Features

  • Zero server-side code, can be hosted on any static web server.
  • Works with any Twitter account, not just your own, as long as you have a tweets.json file for it.
  • View tweets in a familiar Twitter-like timeline layout.
  • View individual tweets and any backed-up replies to them.
  • View tweets with images, videos, GIFs, quoted tweets, polls, and more (to come).
  • Filter tweets by media, replies, retweets, and more (to come).
  • Search through the entire archive for specific tweets.
  • Set a cutoff date to hide tweets newer than a certain date.
  • Set tweets as favorites and view them in a separate list.
    • Export and import favorites as a JSON file.
  • Responsive design for mobile and desktop.
  • Dark/Light mode support.

Examples

Usage

  1. Download the latest release from the releases page.
  2. Scrape your tweets using the scraper userscript or bookmarklet. (scraper is currently not publicly available, sorry!)
  3. Download the resulting tweets.json file and place it in the same directory as this file.
  4. Find the user ID of the account you want to view (you can use https://tweeterid.com/ to find it) and set it in the config.json file.
  5. Upload the entire directory of the project to a web server. You can use GitHub Pages, Netlify, etc.
    • or run a local web server, using something simple like XAMPP or USBWebserver if you don't have one.
  6. Open this file in a web browser. If you're using a local web server, you can use the URL http://localhost/ to access it.
  7. You're done! You can now view your archived tweets in a web browser.

Dependencies

This project uses the following libraries and frameworks, which are currently loaded in from a CDN (Content Delivery Network) to reduce the size of the project.

Library            | Purpose                         | Link
jQuery 3.6.0       | As a dependency for JSViews     | https://jquery.com/
JSViews 1.0.13     | For templating and data binding | https://www.jsviews.com/
Bootstrap 5.3.0    | For styling and layout          | https://getbootstrap.com/
Font Awesome 6.4.0 | For icons                       | https://fontawesome.com/
Twemoji 14.0.2     | For parsing Twitter emojis      | https://twemoji.twitter.com/

Development

If you want to contribute to this project, you can clone the repository and run it locally. For development, I use Visual Studio Code with the Live Server extension. You can use any web server you want if you don't want to use VS Code. Useful alternatives include XAMPP and USBWebserver.

Since the project doesn't use any server-side code, it's pretty simple to get started with it.

If you need any help, feel free to contact me on Twitter @uplynxed.

Twitter Archive Scraper

The Twitter Archive Scraper is a userscript/bookmarklet that can be used to scrape your tweets from Twitter and save them as a JSON file. It is currently not publicly available, but may be released in the future.

NOTE: Depending on the number of requests, I'm willing to scrape your tweets for you. Contact me on Twitter @uplynxed if you're interested.

License

This project is licensed under the MIT License.

This project is not affiliated with Twitter in any way.

twitter-archive's Issues

‼ web.archive.org mirrors do not work with new media URL substitution

The Wayback Machine (WBM) rewrites all media URLs, even those inside JavaScript and JSON entries.
This means all media URLs are rewritten to their web.archive.org archived counterparts.

However, when we retrieve a URL stored in an HTML attribute, the WBM rewrites that URL as well, and at some point between the value in the HTML and the function in JS the web.archive.org prefix is removed from it.

This is an issue because we store media file replacements with keys set to the (cleaned up) original media file URL.
Because the WBM rewrites the URLs, all keys (and all property values) have the prefix, but the value we retrieved does not, so we cannot use this value to look up its associated media_replacements object. Rewriting the URL to include the prefix again might not be an option either, as the timestamp of when the URL was archived is part of that prefix, and there is probably no way to tell what it might be.

Potential fixes:

  • Figure out how to get the full value from the html attribute without WBM interfering
  • Figure out how to rewrite the URL again to include the web.archive.org prefix
  • Rewrite the media_replacements object and related methods to avoid using a full URL as the index, perhaps just filename and filetype? Might have to account for potential conflicts with matching filenames/filetypes, but at least it's unlikely when it comes to media pulled from Twitter itself.
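
The third option could look roughly like the sketch below. It assumes the cleaned-up media URLs end in a filename with an extension; `mediaKey`, `findReplacement` and the shape of the replacements table are hypothetical names for illustration, not the project's actual code.

```javascript
// Hypothetical helper: derive a lookup key from a media URL so that a
// Wayback Machine prefix does not change the key.
function mediaKey(url) {
  // Wayback URLs embed the original URL after the timestamp segment, e.g.
  // https://web.archive.org/web/20230101000000im_/https://pbs.twimg.com/media/abc.jpg
  const wayback = /^https?:\/\/web\.archive\.org\/web\/[^/]+\/(.+)$/i.exec(url);
  const original = wayback ? wayback[1] : url;

  // Drop query string and fragment, then keep only the last path segment.
  // Case is preserved because Twitter media names are case-sensitive.
  const path = original.split(/[?#]/)[0];
  return path.substring(path.lastIndexOf("/") + 1);
}

// Hypothetical replacements table, keyed by filename instead of full URL.
const mediaReplacements = new Map();
mediaReplacements.set(mediaKey("https://pbs.twimg.com/media/abc.jpg"), {
  local: "media/abc.jpg",
});

// Look up a replacement regardless of whether the URL carries the prefix.
function findReplacement(url) {
  return mediaReplacements.get(mediaKey(url));
}
```

With this keying, the prefixed and unprefixed forms of a URL resolve to the same entry, so the archive timestamp never needs to be known. Two different files that share a filename would still collide, as noted above.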

Picture fallback method causes too many repeated failed requests

The current implementation lets the browser resolve the best working image source in order (original > local > backup). This means that every time we load an image, the browser has to try the failed sources again. This gets especially bad with user avatars.

Given that we will have already resolved most media down to a working source at least once, we should store that source and use that directly for subsequent instances of the media.

We could even store this info in localStorage to save us from having to do it all again on a reload.
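
A minimal sketch of that idea, assuming each media item has a stable key and an ordered list of candidate URLs. `resolveSource` and `probeImage` are hypothetical names, and the cache is any Map-like object.

```javascript
// Sketch: remember which source worked for a media item so later renders
// skip the sources that already failed. `probe` is an async function that
// resolves to true when a URL loads successfully.
async function resolveSource(key, candidates, probe, cache) {
  if (cache.has(key)) {
    return cache.get(key);
  }
  for (const url of candidates) {
    if (await probe(url)) {
      cache.set(key, url);
      return url;
    }
  }
  return null; // nothing worked; left uncached so it can be retried later
}

// Browser-only probe: resolves true on load, false on error.
function probeImage(url) {
  return new Promise((resolve) => {
    const img = new Image();
    img.onload = () => resolve(true);
    img.onerror = () => resolve(false);
    img.src = url;
  });
}
```

To survive a reload, the Map could be serialized with `JSON.stringify([...cache])` into localStorage and restored with `new Map(JSON.parse(...))` on startup.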

Bug: User popover cards not initializing on dynamically rendered tweets (like on scroll)

The following error is thrown on interaction with both working AND broken popovers, and seemingly also on the mouseenter event on the document itself?

tooltip.js:444 Uncaught TypeError: Cannot read properties of undefined (reading 'trigger')
    at un._setListeners (tooltip.js:444:35)
    at new cn (tooltip.js:123:10)
    at new un (popover.js:42:1)
    at HTMLDocument.<anonymous> (tweets.js:1543:13)
    at HTMLDocument.dispatch (jquery.min.js:2:43184)
    at y.handle (jquery.min.js:2:41168)
    at HTMLDocument.c (rocket-loader.min.js:1:9405)
_setListeners @ tooltip.js:444
cn @ tooltip.js:123
un @ popover.js:42
(anonymous) @ tweets.js:1543
dispatch @ jquery.min.js:2
y.handle @ jquery.min.js:2
c @ rocket-loader.min.js:1

Filtering only hides tweet template instances, doesn't actually filter the data

Problems with this approach:

  • When selecting the next set of tweets to render onto the page:
    • No visual update occurs if the next set does not contain any matching tweets (major issue)
    • Inconsistency in number of tweets it appears to render per scroll (minor issue)
    • No visual update means a user might not feel prompted to scroll again, making it appear as if it is loading infinitely.
      • Implement a timeout based end to the "loading more tweets" message

Blocked by #4
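
One way to avoid these problems is to filter the data before choosing what to render, so that each scroll yields a full page of matching tweets. The sketch below is only an illustration: `nextMatchingPage` and the `hasMedia` field are assumed names, not the project's actual API.

```javascript
// Sketch: collect the next page of matching tweets from the data itself.
// Returns the page plus the index to resume from, so the caller can tell
// "no more matches" apart from "still loading".
function nextMatchingPage(tweets, matches, startIndex, pageSize) {
  const page = [];
  let i = startIndex;
  while (i < tweets.length && page.length < pageSize) {
    if (matches(tweets[i])) {
      page.push(tweets[i]);
    }
    i++;
  }
  return { page, nextIndex: i, done: i >= tweets.length };
}

// Example predicate; the field name is an assumption for illustration.
const onlyMedia = (tweet) => tweet.hasMedia === true;
```

Because the loop keeps scanning until the page is full or the data runs out, every scroll either renders something or reports `done`, which removes the need for a timeout on the "loading more tweets" message.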

Media fallback for video content

As with the picture content, video content should cycle through the given URL substitutions until we get one that (properly) works.
Currently, Twitter videos play for a second before they stop. We need to figure out how to keep them playing, or attempt to load a local archived version instead (if available).

Idea: Add about page

Can be used for some boilerplate about text regarding the project as well as some customizable text by the person setting up the archive.

Also potentially a place to display license info, like for the Twemojis #24

Turn more of the index.html body contents into templates

Primarily so that changes to the page layout don't require replacing index.html in active installations. Title and meta tag customization has to be done in index.html itself because of social media link embedding limitations, so replacing the file would discard those customizations.

Related to #5, keep template file size in mind.

Potentially move page layout templates to a separate file from the tweets layout templates?

  • Make a new issue out of this when it comes to this point.

Bug: Sidebar card heights cause issues in some scenarios

Problems happen when the cards fill up a lot of vertical space:

  • Position Sticky means they go over the navbar when you scroll to the footer
    • Other options so far seem to push it over the footer instead
  • Seems to be no good way to make cards fill up the height to a consistent max while also distributing them properly.

I think a programmatic solution is in order, unless Firefox finally releases its :has() CSS functionality.
Related commit: ac6791c

Search - Advanced search

  • Multiple terms support (phrases enclosed in quotes)
  • Modifier support (from:@username, mentions:@username)
  • Replacing former usernames in queries with current ones (if defined in config)
  • Including search results for former usernames when searching for a current one (if defined in config)
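
The first two items could be handled by a small query parser along these lines. This is a sketch under assumptions: `parseQuery` and the shape of its result are hypothetical, and username aliasing from the config is left out.

```javascript
// Sketch: split a search query into plain terms, quoted phrases and
// from:/mentions: modifiers. Everything is lowercased for matching.
function parseQuery(query) {
  const result = { terms: [], from: [], mentions: [] };
  // Match either a "quoted phrase" or a run of non-space characters.
  const tokenPattern = /"([^"]+)"|(\S+)/g;
  let match;
  while ((match = tokenPattern.exec(query)) !== null) {
    if (match[1] !== undefined) {
      result.terms.push(match[1].toLowerCase());
      continue;
    }
    const token = match[2];
    const modifier = /^(from|mentions):@?(\w+)$/i.exec(token);
    if (modifier) {
      result[modifier[1].toLowerCase()].push(modifier[2].toLowerCase());
    } else {
      result.terms.push(token.toLowerCase());
    }
  }
  return result;
}
```

For example, `parseQuery('"hello world" from:@Alice cats')` yields the terms `hello world` and `cats` plus a `from` entry of `alice`. Former usernames could then be mapped onto current ones before the `from` and `mentions` lists are compared against tweets.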

Support official self-archive data format

Twitter offers a way to archive your own account, so look into how the data is saved for that and figure out if we can parse it to be compatible with this project.

  • Potentially and hopefully allow merging of self-archived data with scraped data for a more complete data set.
  • Look into how and if it backs up media, and how and if we can use that to save unnecessary download traffic on #26

Create a compiled html containing all necessary scripts for downloading and local viewing

The current page loads in JS and JSON from external files and remote locations, which does not work for local files.

Solution:
Compile all files into a single html page and serve it as a download option.

Issues:

  • The loadConfig() and loadTweets() functions will need refactoring somehow to not load from external files if the contents of those files are already present in the html.

TODO:

  • Refactor loadConfig() and loadTweets() to fix the issue mentioned above
  • Write a JS function to compile all this and start a download (perhaps checking to see if it's not running compiled already).
    • If the above fails to work, do it via PHP. It only needs to run on a server anyhow.
  • Make sure it is dynamic so it stays up to date with any changes.
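
The refactor could be approached as in the sketch below, which prefers data already embedded in the page and only falls back to a network request when it is missing. `loadJson` and the `embedded` object are hypothetical; the real loadConfig() and loadTweets() may be structured differently.

```javascript
// Sketch: return embedded data when the compiled page provides it,
// otherwise fetch the file. `fetchFn` is passed in so the browser's
// fetch (or a stub) can be used.
async function loadJson(name, embedded, fetchFn) {
  if (embedded && Object.prototype.hasOwnProperty.call(embedded, name)) {
    return embedded[name];
  }
  const response = await fetchFn(name);
  if (!response.ok) {
    throw new Error(`Failed to load ${name}: HTTP ${response.status}`);
  }
  return response.json();
}
```

In a compiled build, `embedded` could be an object such as `{ "config.json": {...}, "tweets.json": [...] }` written into the page by the compile step. In the hosted version it would simply be undefined, so the same function serves both cases.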

Implement sorting feature

Allow sorting by:

  • Date (Newest)
  • Date (Oldest)
  • Likes (Most)
  • Likes (Least)
  • Retweets (Most)
  • Retweets (Least)
  • Replies (Most)
  • Replies (Least)
  • Bookmarks (Most)
  • Bookmarks (Least)

Blocked by #3 as we need to filter the list before sorting.
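
As a rough sketch, each option in the list can be expressed as a field plus a direction. The tweet field names used here (`created_at`, `likes`, and so on) are assumptions for illustration and may not match the actual tweets.json format.

```javascript
// Sketch: map each sortable field to a function returning a number.
// Field names are hypothetical.
const SORT_FIELDS = {
  date: (t) => new Date(t.created_at).getTime(),
  likes: (t) => t.likes,
  retweets: (t) => t.retweets,
  replies: (t) => t.replies,
  bookmarks: (t) => t.bookmarks,
};

// direction is "asc" (oldest/least first) or "desc" (newest/most first).
function sortTweets(tweets, field, direction) {
  const valueOf = SORT_FIELDS[field];
  if (!valueOf) {
    throw new Error(`Unknown sort field: ${field}`);
  }
  const sign = direction === "asc" ? 1 : -1;
  // Copy first so the already-filtered list stays untouched.
  return [...tweets].sort((a, b) => sign * (valueOf(a) - valueOf(b)));
}
```

"Date (Newest)" would then be `sortTweets(filtered, "date", "desc")` and "Likes (Least)" would be `sortTweets(filtered, "likes", "asc")`, applied to the filtered list.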

Implement theme switching feature

Predetermined in the config.json file with the options "light", "dark" and "auto".

  • Implement auto option to adhere to a user's browser settings

  • Implement the option to allow or disallow theme toggling

  • Implement theme toggling

  • May require some kind of settings modal or offcanvas sidebar.
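
A minimal sketch of the "auto" option, assuming the configured value is passed in as a string. `resolveTheme` and `applyTheme` are hypothetical names; `data-bs-theme` is the attribute Bootstrap 5.3 uses for its color modes, and `prefers-color-scheme` is the standard media query for the user's browser setting.

```javascript
// Sketch: turn the config value into a concrete theme name.
// Anything other than "light" or "dark" is treated as "auto".
function resolveTheme(configTheme, prefersDark) {
  if (configTheme === "light" || configTheme === "dark") {
    return configTheme;
  }
  return prefersDark ? "dark" : "light";
}

// Browser-only: read the user's preference and apply the theme to <html>.
function applyTheme(configTheme) {
  const prefersDark = window.matchMedia("(prefers-color-scheme: dark)").matches;
  document.documentElement.setAttribute(
    "data-bs-theme",
    resolveTheme(configTheme, prefersDark)
  );
}
```

A theme toggle could call the same `setAttribute` line with the opposite value, gated by whatever config flag ends up allowing or disallowing toggling.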

Finish implementation of automatic media file back up list generation

Potentially making use of the newer substituteMedia() function to make sure we grab the highest quality media.

  1. Write function to get all of the media of at least the main user account in a list
  2. Write function to go through the list and download the media
  • Make sure the media is successfully downloaded
  • If not, store the list of unsuccessful downloads and try again later until the list is done.
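
The two steps might be sketched as follows. The tweet fields (`user_id`, `media`, `url`) and both function names are assumptions for illustration, and `download` stands in for whatever actually fetches and stores a file.

```javascript
// Step 1 (sketch): collect the unique media URLs of one account.
// Field names are hypothetical.
function collectMediaUrls(tweets, userId) {
  const urls = new Set();
  for (const tweet of tweets) {
    if (tweet.user_id !== userId) continue;
    for (const media of tweet.media || []) {
      urls.add(media.url);
    }
  }
  return [...urls];
}

// Step 2 (sketch): download every URL, retrying failures in later rounds.
// `download` is an async function that throws when a download fails.
async function downloadAll(urls, download, maxRounds = 3) {
  let pending = [...new Set(urls)];
  const done = [];
  for (let round = 0; round < maxRounds && pending.length > 0; round++) {
    const failed = [];
    for (const url of pending) {
      try {
        await download(url);
        done.push(url);
      } catch (err) {
        failed.push(url);
      }
    }
    pending = failed; // only the failures are tried again
  }
  return { done, failed: pending };
}
```

Whatever is left in `failed` after the last round can be stored and fed back into `downloadAll` later, which matches the "try again later until the list is done" step.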
