hodgesmr / mastodon_digest

A Python script that aggregates recent popular posts from your Mastodon timeline

License: BSD 3-Clause "New" or "Revised" License

Python 66.82% Jinja 20.48% Dockerfile 4.43% Makefile 8.26%

mastodon_digest's Issues

InverseFollowerWeight crashes when the number of followers is hidden

Using the default options I noticed this error:

Traceback (most recent call last):
  File "/opt/mastodon_digest/run.py", line 190, in <module>
    run(
  File "/opt/mastodon_digest/run.py", line 70, in run
    threshold_posts = threshold.posts_meeting_criteria(posts, scorer)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/mastodon_digest/thresholds.py", line 26, in posts_meeting_criteria
    all_post_scores = [p.get_score(scorer) for p in posts]
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/mastodon_digest/thresholds.py", line 26, in <listcomp>
    all_post_scores = [p.get_score(scorer) for p in posts]
                       ^^^^^^^^^^^^^^^^^^^
  File "/opt/mastodon_digest/models.py", line 21, in get_score
    return scorer.score(self)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/mastodon_digest/scorers.py", line 75, in score
    return super().score(scored_post) * super().weight(scored_post)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/mastodon_digest/scorers.py", line 38, in weight
    weight = 1 / sqrt(scored_post.info["account"]["followers_count"])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: math domain error

Bug is on this line:

mastodon_digest/scorers.py

Line 36 in f70a018

weight = 1 / sqrt(scored_post.info["account"]["followers_count"])

Logging scored_post.info["account"]["followers_count"] confirmed that I'm following an account which has hidden its number of followers.

For that account followers_count is -1, so sqrt fails with the math domain error above.

I haven't read the algorithm yet to suggest what's the reasonable thing to do in this case. Perhaps someone will be quicker than me :)
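One possible guard is sketched below. It assumes that a hidden count always arrives as -1 (as observed above) and that a neutral weight of 1.0 is an acceptable fallback. Both are assumptions, not the project's decision, and the function name is hypothetical.

```python
from math import sqrt


def inverse_follower_weight(followers_count):
    """Return 1/sqrt(followers), or 1.0 when the count cannot be used."""
    # Assumption: hidden counts show up as -1. Zero would also break the
    # division, so every non-positive (or missing) value gets the neutral
    # fallback weight of 1.0.
    if followers_count is None or followers_count <= 0:
        return 1.0
    return 1 / sqrt(followers_count)


print(inverse_follower_weight(-1))  # 1.0
print(inverse_follower_weight(4))   # 0.5
```

Whether a hidden count should be treated as "few followers" (high weight) or "unknown" (neutral weight) is a scoring decision the maintainers would need to make.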

Configure ranges of scores

Thank you for this really awesome tool, putting the algorithm into the hands (and CI-pipelines ;) of the people. I can think of some use cases for not just focusing on the 'top scorers' but also the overlooked low and medium scorers. I think this is partly the intent of #10 - but I would also like to see the page having several sections, with separate result sets.

Idea: Add digest statuses to Mastodon Bookmarks

Hello! Thinking aloud momentarily, would love your thoughts and gut check on this.

I've been thinking a bit about how to integrate this kind of digest experience more natively into Mastodon. Generating the standalone HTML is a great proof of concept, but user experience and availability would be improved if we had a way to get the results of the digest into Mastodon itself.

In an ideal world (imho, and I'm very biased having worked on this feature at Twitter previously) Mastodon's platform would implement something akin to Twitter's Collections API (aka “Custom Timelines”), which are effectively “lists for statuses”: A named data store into which people can curate posts in an arbitrary order. An application like Mastodon Digest would then add its filtered posts and boosts into a “Mastodon Digest” timeline, which the user could browse through any client.

Since Mastodon Collections don't exist today, I wonder if this could be prototyped by overriding the Mastodon Bookmarks feature. With a few additions for polish, I imagine this:

User would create a token with posting and bookmarks write permission, rather than read only.
When a run has matching posts, Mastodon Digest creates a private post “Mastodon Digest for December 19th 2022”, posts it, and adds it to the user's Bookmarks. This would serve as a chapter heading for the posts we've digested.
Each filtered post gets added to the user's Bookmarks.
The user can then view their bookmarks in any Mastodon client, and act on them in their own instance as they see fit.

Obvious caveat: This is a very opinionated use of the Bookmarks feature and I'm sure wouldn't align with how some people already use it. That's fine. It might work for many people and maybe demonstrate the value of arbitrary custom timelines in Mastodon.

You might activate it with a -b option, allowing the existing functionality to be used to preview and refine the filtering before writing anything to production.

Further thoughts

Bookmarks does seem to sort posts in reverse-chronological order of when the post was bookmarked, which is what this feature would need in order to ensure the new digest is at the top regardless of post age.
I'm not sure what happens if you Bookmark a post twice through the API. I'd hope the duplicate just gets silently ignored, but this should be verified.
There's probably an interesting exploration to be done regarding the best order to insert digest posts into Bookmarks: If you enter them in reverse-rank order and enter the “Mastodon Digest for…” post last, you'd create the optimal reading experience. But that may be fighting too hard against Mastodon's design, and it might be better to add posts chronologically and have the user scroll backwards.
Could also have an “end of digest” post added, so that if a user is using Bookmarks for things besides Digests, they'd be better able to identify them in between the digests.
Running in this kind of native-and-headless mode would also lend itself to running Digest as a scheduled service for multiple-users.

Love your thoughts. Thanks.
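The reverse-rank insertion order described above can be sketched as follows. This is a minimal sketch, assuming a Mastodon.py client whose token has write access to statuses and bookmarks. It uses the library's status_post and status_bookmark calls. The function name is hypothetical, and using visibility="direct" for the heading post is an assumption about what "private" should mean here.

```python
def bookmark_digest(client, ranked_posts, heading):
    """Bookmark digest posts so the heading ends up on top of Bookmarks.

    client: an authenticated mastodon.Mastodon instance (or any object
            exposing the same status_post / status_bookmark methods).
    ranked_posts: status dicts, best post first.
    heading: text of the "chapter heading" post.
    """
    # Bookmarks are listed most-recently-bookmarked first, so the posts
    # are inserted from worst to best.
    for post in reversed(ranked_posts):
        client.status_bookmark(post["id"])

    # The heading is posted and bookmarked last so that it sits above
    # the digest posts. "direct" visibility is an assumption.
    heading_status = client.status_post(heading, visibility="direct")
    client.status_bookmark(heading_status["id"])
    return heading_status["id"]
```

The sketch does not address the open question of what happens when a post is bookmarked twice.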

Not all boosts, stars, replies, followers, etc of a post are taken into account for scoring

Since the user's home instance is not aware of all followers of the author and all boosts, stars, etc, related to the post, the Scorer works with incomplete information when calculating the score for the post based on the information retrieved from the timeline request to the home instance.

It would be more accurate to query the information about the post and the user from their respective home instance.
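As a rough illustration of where such a query could go, the sketch below derives the public status endpoint on the author's instance from a post's url field. It assumes the post originated on a Mastodon server, where the URL ends in the numeric status ID. Other fediverse software uses different URL shapes, and the function name is hypothetical.

```python
from urllib.parse import urlparse, urlunparse


def origin_status_api_url(post_url):
    """Map https://host/@user/<id> to https://host/api/v1/statuses/<id>."""
    parts = urlparse(post_url)
    # Assumption: the last path segment is the status ID on the origin
    # instance, which holds for Mastodon-style URLs only.
    status_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if not status_id.isdigit():
        raise ValueError(f"cannot derive a status ID from {post_url!r}")
    return urlunparse(
        (parts.scheme, parts.netloc, f"/api/v1/statuses/{status_id}", "", "", "")
    )


print(origin_status_api_url("https://example.social/@alice/109876"))
# https://example.social/api/v1/statuses/109876
```

Fetching each post from its origin instance adds one request per post, so the accuracy gain would have to be weighed against run time and remote rate limits.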

two minor bugs

Some instances don't allow their posts to be embedded in an iframe, probably via X-Frame-Options headers.
GoToSocial instances just show an image of a sloth (this is probably their fault).

Don't aggregate posts with opt-out hashtags

Skip posts authored by anyone with the #nobot or #noindex tag in their bio.
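A minimal sketch of such a filter, assuming each post is a status dict whose account entry carries the bio in the API's note field (which is HTML, so tags are stripped before matching). The function names are hypothetical.

```python
import re

OPT_OUT_TAGS = ("nobot", "noindex")
_HTML_TAG = re.compile(r"<[^>]+>")
_OPT_OUT = re.compile(r"#(?:" + "|".join(OPT_OUT_TAGS) + r")\b", re.IGNORECASE)


def author_opted_out(post):
    """Return True if the author's bio contains an opt-out hashtag."""
    bio_html = (post.get("account") or {}).get("note") or ""
    # The bio is HTML and hashtags may be wrapped in <a>/<span> elements,
    # so markup is removed before looking for the tag text.
    bio_text = _HTML_TAG.sub("", bio_html)
    return bool(_OPT_OUT.search(bio_text))


def drop_opted_out(posts):
    """Keep only posts whose authors have not opted out."""
    return [p for p in posts if not author_opted_out(p)]
```

For boosts, the same check would presumably need to be applied to the original author as well as the boosting account.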

Specifying username is potentially unnecessary

I think mastodon_username at

mastodon_digest/run.py

Line 48 in 3f2ea4d

mastodon_username: str,

is unnecessary because you can get the logged in account via mastodon.me() (as specified in https://mastodonpy.readthedocs.io/en/stable/15_everything.html#mastodon.Mastodon.me). I removed the username requirement for https://fediview.com by doing this.

There might be a use case where someone wants to log in with one account, but filter out interactions from another account, but that seems like an edge case?

Let me know if you'd like a PR for this and I'd be happy to create one.
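The idea could look roughly like this. It is a sketch, assuming an authenticated Mastodon.py client. The function name and the optional override (to keep the edge case above possible) are hypothetical.

```python
def resolve_username(client, configured_username=None):
    """Return the username to filter on, defaulting to the logged-in account."""
    # An explicit override keeps the "log in as one account, filter
    # another" edge case available.
    if configured_username:
        return configured_username
    # Mastodon.me() returns the account dict of the token's owner.
    return client.me()["username"]
```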

Posts limited to 150px height

At some point the embedded posts all started to render for me with a 150px height for each iframe which means I have to go and scroll within each iframe to be able to read the contents of each post. This is running a Docker image built from d91876a (but I also had this problem in version 0.0.12 which I ran until today).

The small, fixed height makes essentially all posts cut off (if they are longer than one line of text). Since I think this happened without me having updated Mastodon Digest, maybe this is caused by some change in cross-origin behavior in more recent Mastodon versions?

In worst case I can adjust the stylesheet locally to have a chosen height value for iframe.mastodon-embed that works for most posts so I don't have to scroll in each iframe but just the overly long ones. (At the cost of making short posts be unnecessarily long). But it would be nice to have each post have just the right height needed.

Example:

Here is the computed layout in two browsers.

Chrome (version 111.0.5563.64 running on version 12.5.1 of MacOS):

Safari (version 15.6.1 running on version 12.5.1 of MacOS):

The bot should specify a user agent

The Mastodon.py library gives "mastodonpy" as its default user agent. It is possible to specify a different user agent, which would allow instance operators to block this bot script (or distinguish it from others). PR coming.
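For illustration, a sketch of passing a custom user agent through the Mastodon.py constructor, which accepts a user_agent argument. The agent string "mastodon_digest" and both helper names are assumptions. The announced PR may choose differently.

```python
def client_kwargs(base_url, access_token, user_agent="mastodon_digest"):
    """Collect constructor arguments, including a distinguishable user agent."""
    return {
        "api_base_url": base_url,
        "access_token": access_token,
        # Replaces the library default of "mastodonpy".
        "user_agent": user_agent,
    }


def build_client(base_url, access_token, user_agent="mastodon_digest"):
    """Create the client. Mastodon.py is imported lazily on first use."""
    from mastodon import Mastodon

    return Mastodon(**client_kwargs(base_url, access_token, user_agent))
```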

Allow specification of output file and make a more unique identifier for the default output file

I'd suggest that the -o argument should be usable with either a directory or file parameter.

If given a filename, it should have the extension html and should not exist.

If given a directory, it should exist, and the filename should give some information about the run-conditions (like scorer used, timestamp, and time range, for instance).
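The rules above can be sketched as a small helper. The function name, its parameters, and the generated filename pattern are all hypothetical. Only the three rules themselves come from the suggestion.

```python
from datetime import datetime, timezone
from pathlib import Path


def resolve_output_path(target, scorer, hours, now=None):
    """Turn the -o argument into the path of the HTML file to write."""
    now = now or datetime.now(timezone.utc)
    path = Path(target)

    if path.is_dir():
        # Existing directory: generate a name describing the run.
        stamp = now.strftime("%Y%m%dT%H%M%SZ")
        return path / f"digest_{scorer}_{hours}h_{stamp}.html"

    # Otherwise the argument is treated as a filename.
    if path.suffix.lower() != ".html":
        raise ValueError(f"{path} is not an existing directory or an .html file")
    if path.exists():
        raise FileExistsError(f"{path} already exists")
    return path
```

A timestamp with one-second resolution keeps the default name unique for scheduled runs. Adding the threshold to the name would follow the same pattern.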

"Make Run" is not working

I checked out tag 0.3.1 and tried to build and run the docker container, but the run command did not work

make build:

docker build -f Dockerfile \
-t hodgesmr/mastodon-digest:0.3.1 \
-t hodgesmr/mastodon-digest:latest . \
--build-arg VERSION=0.3.1 \
--build-arg BUILD_DATE="Fr 14. Apr 08:57:40 UTC 2023" \
--build-arg VCS_REF=be98741 \
--build-arg NAME=mastodon-digest \
--build-arg VENDOR="Matt Hodges" \
--build-arg ORG=hodgesmr \
--build-arg WORKDIR="/opt/mastodon-digest"
[+] Building 19.6s (14/14) FINISHED
 => [internal] load build definition from Dockerfile  0.0s
 => => transferring dockerfile: 1.05kB  0.0s
 => [internal] load .dockerignore  0.0s
 => => transferring context: 2B  0.0s
 => [internal] load metadata for docker.io/library/python:3.11-slim-bullseye  0.5s
 => [internal] load build context  0.0s
 => => transferring context: 774B  0.0s
 => [1/9] FROM docker.io/library/python:3.11-slim-bullseye@sha256:286f2f1d6f2f730a44108656afb04b131504b610a6cb2f3413918e98dabba67e  0.0s
 => CACHED [2/9] WORKDIR /opt/mastodon-digest  0.0s
 => CACHED [3/9] COPY requirements.txt .  0.0s
 => [4/9] RUN mkdir -p venvs  0.4s
 => [5/9] RUN python3 -m venv venvs/mastodon-digest  4.3s
 => [6/9] RUN venvs/mastodon-digest/bin/pip install --upgrade pip  3.0s
 => [7/9] RUN venvs/mastodon-digest/bin/pip install -r requirements.txt  9.6s
 => [8/9] COPY templates/ ./templates/  0.1s
 => [9/9] COPY *.py ./  0.1s
 => exporting to image  1.5s
 => => exporting layers  1.5s
 => => writing image sha256:5ba9c5b3b324ad5501f9bb13a2bfc67c738e4ffacd5ef077c16e00edf83926ef  0.0s
 => => naming to docker.io/hodgesmr/mastodon-digest:0.3.1  0.0s
 => => naming to docker.io/hodgesmr/mastodon-digest:latest

make run:

docker run --env-file .env -it --rm -v "/render:"/opt/mastodon-digest"/render" hodgesmr/mastodon-digest
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "venvs/mastodon_digest/bin/python3": stat venvs/mastodon_digest/bin/python3: no such file or directory: unknown.
make: *** [Makefile:51: run] Fehler 127

I also tried:

sudo docker run --env-file .env -it --rm -v "/render":"/opt/mastodon-digest" hodgesmr/mastodon-digest
sudo docker run --env-file .env -it --rm -v /render:/opt/mastodon-digest hodgesmr/mastodon-digest

but got the same error message

Local setup not working on windows

When following the local setup instructions on Windows, multiple steps fail.
If it is too complex to adapt them for Windows, please consider indicating that the local instructions are Linux-only.

Custom account amplification

In the discussion on Mastodon I saw the idea of letting users provide a list of accounts to boost in the digest, and I implemented a very basic version of this using a configuration file. I think the approach could be used for more options. See my fork here: https://github.com/leoluecken/mastodon_digest

@hodgesmr Would that be something that you'd be interested to include here?

Working with the Hometown fork

I'm on a Mastodon instance running the Hometown fork, and am getting this error:

raise MastodonVersionError("Version check failed (Need version " + version + ")")
mastodon.errors.MastodonVersionError: Version check failed (Need version 2.4.3)

This makes me wonder whether mastodon_digest works with Hometown. My instance runs Hometown v1.0.5+3.5.2 (i.e., Hometown 1.0.5 and Mastodon 3.5.2).

Parse and build URLs with urllib

Most URLs in the app are parsed and constructed with string concatenation. Move that work to urllib.parse's urlparse and urlunparse.
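A small sketch of what that could look like for one kind of URL. The helper name and the profile-URL example are hypothetical and not taken from the codebase.

```python
from urllib.parse import urlparse, urlunparse


def profile_url(base_url, username):
    """Build https://host/@username from an instance base URL."""
    parts = urlparse(base_url)
    # Only scheme and host are kept from the base URL. Path, params,
    # query and fragment are set explicitly, which avoids problems such
    # as doubled or missing slashes from string concatenation.
    return urlunparse(
        parts._replace(path=f"/@{username}", params="", query="", fragment="")
    )


print(profile_url("https://example.social/", "alice"))
# https://example.social/@alice
```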

Switch to poetry for dependency management?

I'm game to add some tests to this. (Ulterior motive: I want to use it as a back end for something like icymi_law.) It will be easier for me to do this if we use poetry for dependency management rather than requirements.txt. I am happy to do this myself. Is that cool with you?

Idea: Implement Mastodon timeline and login API

If this ran as a service and implemented enough of the Mastodon login and timeline APIs (https://docs.joinmastodon.org/methods/timelines/), and exposed the digests as timelines, third party clients that support multiple accounts (e.g., Tusky) could integrate with this very easily.

Idea: Send via email with Zapier

My ideal way of getting this digest would be via email. For self-hosted instances, Zapier email would suffice. I think what would work would be:

mastodon_digest has access to a Zapier webhook key
a compact template designed for email
the tool POSTs to the webhook, with the email body already in HTML

Kick that off in a cronjob, with a pretty basic Zapier task, and it should just work.
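The POST step might be sketched as below, using only the standard library. The JSON field names subject and html are assumptions (the Zap would need to map them to the email fields), and both function names are hypothetical.

```python
import json
from urllib.request import Request, urlopen


def build_webhook_request(webhook_url, subject, html_body):
    """Prepare a JSON POST carrying the rendered digest."""
    # Assumption: the receiving Zap reads "subject" and "html" from the
    # JSON payload. Any field names would work if the Zap is set up to match.
    payload = json.dumps({"subject": subject, "html": html_body}).encode("utf-8")
    return Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_digest(webhook_url, subject, html_body):
    """Send the digest and return the HTTP status code."""
    request = build_webhook_request(webhook_url, subject, html_body)
    with urlopen(request, timeout=30) as response:
        return response.status
```

The webhook URL contains the key, so it would belong in the same environment-based configuration as the Mastodon token.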

Version check failed (Need version 2.4.3)

Running python3 run.py I get the following output:

Building digest from the past 12 hours...
Traceback (most recent call last):
  File "/home/[...]/mastodon_digest/run.py", line 159, in <module>
    run(
  File "/home/[...]/mastodon_digest/run.py", line 53, in run
    posts, boosts = fetch_posts_and_boosts(hours, mst, mastodon_username, timeline)
  File "/home/[...]/mastodon_digest/api.py", line 20, in fetch_posts_and_boosts
    filters = mastodon_client.filters()
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.10/3.10.8/lib/python3.10/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.10/3.10.8/lib/python3.10/site-packages/mastodon/utility.py", line 42, in wrapper
    raise MastodonVersionError("Version check failed (Need version " + version + ")")
mastodon.errors.MastodonVersionError: Version check failed (Need version 2.4.3)

My instance is running v4.0.2.

Note: [...] is modified by me

Avoid unnecessary calls to stats.percentileofscore

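This is not the main bottleneck, but a simple fix should speed up posts_meeting_criteria in thresholds.py: the current code calls stats.percentileofscore once per post, whereas the percentile cutoff only needs to be computed once.

Here's the diff (sorry for the inconvenience :)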

diff --git a/thresholds.py b/thresholds.py
index 739d869..1524e69 100644
--- a/thresholds.py
+++ b/thresholds.py
@@ -24,13 +24,8 @@ class Threshold(Enum):
         """Returns a list of ScoredPosts that meet this Threshold with the given Scorer"""
 
         all_post_scores = [p.get_score(scorer) for p in posts]
-        threshold_posts = [
-            p
-            for p in posts
-            if stats.percentileofscore(all_post_scores, p.get_score(scorer))
-            >= self.value
-        ]
-
+        q = stats.scoreatpercentile(all_post_scores, per=self.value)
+        threshold_posts = [p for p, s in zip(posts, all_post_scores) if s >= q]
         return threshold_posts
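
The same idea in a self-contained form: the scores are computed once, the cutoff is computed once, and each post is compared against it. This sketch swaps scipy for a plain linear-interpolation percentile so that it runs on the standard library alone. It is a stand-in for illustration, not the project's code, and the function names are hypothetical.

```python
def score_at_percentile(scores, per):
    """Linear-interpolation percentile of scores (per in 0..100)."""
    ordered = sorted(scores)
    if not ordered:
        raise ValueError("scores must not be empty")
    position = per / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def posts_meeting_threshold(posts, scores, per):
    """Keep posts whose score is at or above the percentile cutoff."""
    cutoff = score_at_percentile(scores, per)  # computed once, not per post
    return [p for p, s in zip(posts, scores) if s >= cutoff]


print(posts_meeting_threshold(list("abcde"), [1, 2, 3, 4, 5], 50))
# ['c', 'd', 'e']
```

Note that comparing against a score cutoff is not guaranteed to select exactly the same posts as comparing each post's percentile rank, especially with many tied scores, so the change is worth checking against a real timeline.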
