social-protocols / news
Quality News - Towards a fairer ranking formula for Hacker News
Home Page: https://news.social-protocols.org
License: Apache License 2.0
This would have the same ordering as the HN new page, of course.
Make "X quality" into a link to a score page that shows stats and history of the story. Ideas:
Requests for the front page are delayed while a rankcrawler update is in progress, because SQLite delays that query. We should serve a prerendered page from memory instead.
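One way to serve a prerendered page from memory is sketched below, in Python for illustration only. The class name `PrerenderedPage` and its methods are hypothetical and not taken from the project; the idea is simply that the crawler stores the finished HTML after each update and request handlers read it without touching the database.

```python
import threading


class PrerenderedPage:
    """Holds the most recently rendered front page in memory (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._html = b""

    def store(self, html: bytes) -> None:
        # Called by the crawler once a new page has been fully rendered.
        with self._lock:
            self._html = html

    def load(self) -> bytes:
        # Called by request handlers; never waits on a database query.
        with self._lock:
            return self._html
```

Readers keep getting the previous page until `store` is called with the new one, so a slow crawl only makes the page slightly stale rather than slow.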
The penalty field is populated with the higher of the most recently calculated penalty and the value of the penalty field from the previous crawl. However, if the story did not appear in the previous crawl (e.g. it dropped out of the top 90) but does appear in this crawl, the previously calculated penalty will not be found. This is because when looking for data from the previous crawl, we select stories where:
sampleTime = (select max(sampleTime) from dataset where sampleTime != (select max(sampleTime) from dataset))
but the time of the previous sample is actually different for each story. The same problem exists for the resubmission time calculation.
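A possible fix is to look up each story's own most recent earlier sample instead of the global previous crawl. The sketch below uses Python's `sqlite3` for illustration. The `dataset` table and `sampleTime` column come from the query above; the `id` and `penalty` column names and the function name are assumptions.

```python
import sqlite3


def previous_penalties(conn: sqlite3.Connection) -> dict:
    """Map story id -> penalty from that story's own latest earlier sample.

    Assumes columns id, sampleTime, penalty (id and penalty are assumed
    names). Stories with no earlier sample map to None.
    """
    rows = conn.execute(
        """
        select cur.id, prev.penalty
        from dataset cur
        left join dataset prev
          on prev.id = cur.id
         and prev.sampleTime = (
               select max(sampleTime) from dataset
               where id = cur.id and sampleTime < cur.sampleTime
             )
        where cur.sampleTime = (select max(sampleTime) from dataset)
        """
    ).fetchall()
    return dict(rows)
```

With this lookup, a story that skipped one or more crawls still finds the penalty recorded the last time it was seen.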
A version of our "top" page that shows:
The difference in sitewide expected upvotes would be one way to measure comparative value created. This would be the sum of (quality*expectedUpvotesAtRank) for all stories on the page. If an upvote is a proxy of value for users, this is a measure of "net value": value per unit of attention consumed.
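As a rough illustration of that sum, here is a minimal Python sketch. The function name and the input shapes (a list of `(rank, quality)` pairs and a dict of expected upvotes per rank) are assumptions made for the example, and the numbers are made up.

```python
def sitewide_expected_upvotes(page, expected_upvotes_at_rank):
    """Sum quality * expectedUpvotesAtRank over all stories on a page.

    page: list of (rank, quality) pairs (assumed shape).
    expected_upvotes_at_rank: dict mapping rank -> expected upvotes at that rank.
    """
    return sum(quality * expected_upvotes_at_rank[rank] for rank, quality in page)


# Made-up numbers: the same two stories, ordered two different ways.
expected = {1: 10.0, 2: 5.0}
better_story_first = [(1, 2.0), (2, 1.0)]
better_story_second = [(1, 1.0), (2, 2.0)]
difference = (
    sitewide_expected_upvotes(better_story_first, expected)
    - sitewide_expected_upvotes(better_story_second, expected)
)
```

Comparing the two totals for the HN and QN orderings of the same minute would give the difference described above.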
The problem is we are crawling:
https://news.ycombinator.com/newest?p=2
https://news.ycombinator.com/newest?p=3
But for the new page it should be:
https://news.ycombinator.com/newest?n=31
https://news.ycombinator.com/newest?n=61
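The URLs above suggest an offset of 30 stories per page. A small Python sketch of generating them follows; the function name and the `STORIES_PER_PAGE` constant are assumptions inferred from the `n=31` and `n=61` examples, and whether HN needs further query parameters is not covered here.

```python
STORIES_PER_PAGE = 30  # inferred from n=31 and n=61 above


def newest_page_url(page: int) -> str:
    """Return the crawl URL for a 1-based page number of the new page."""
    base = "https://news.ycombinator.com/newest"
    if page <= 1:
        return base
    return f"{base}?n={(page - 1) * STORIES_PER_PAGE + 1}"
```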
Move the Causal Model discussion to the "Further Improvements" section. Clarify the description of "Hypothetical Upvotes".
How much does cumulative attention correlate with age?
Might as well try to capture a little SEO traffic.
It's only item?id=33600715, without a url.
Since our database is growing indefinitely, we need a strategy to deal with limited storage.
At first, we could just delete old data from the database.
Later, we can work out how to store the huge aggregated dataset.
This blog post has some ideas: https://www.righto.com/2013/11/how-hacker-news-ranking-really-works.html
I am seeing strange newRank data on stories such as this one (newRank going from 31 to 1, then from 60 to >91).
This will be useful for analysis, as well as creating graphs that show history vs. upvotes.
Should we just use MIT?
I see stories being penalized that don't seem like they should be penalized. Such as this one: a YC launch.
Table with aggregate stats (average age, score, weighted average quality) for each minute, for both the HN and QN front pages.
Also related: when there are 0 comments, the link should say "discuss" instead of "zero comments".
Once high quality stories disappear from the official pages, they don't receive any more rank information and therefore don't accumulate more attention.
We can sync these headers with our minutely page generation pattern. The crawl happens every minute on the minute, though it can take a few seconds. Roughly we can tell browsers to cache each page until, say, 10 seconds after the minute mark.
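A minimal Python sketch of that calculation is below. The function name, the `public, max-age` form of the header, and the 10-second default offset are assumptions for illustration; the offset mirrors the "say, 10 seconds" above.

```python
def cache_control_for(now_epoch_seconds: int, offset_seconds: int = 10) -> str:
    """Build a Cache-Control value that expires at the next refresh point.

    The refresh point is offset_seconds after each minute mark, giving the
    crawl a few seconds to finish before browsers fetch the new page.
    """
    second_of_minute = now_epoch_seconds % 60
    remaining = (offset_seconds - second_of_minute) % 60
    if remaining == 0:
        # Exactly at the refresh point: the page is fresh for a full minute.
        remaining = 60
    return f"public, max-age={remaining}"
```

A request arriving 5 seconds into a minute would be cached for 5 seconds, while one arriving at second 11 would be cached for 59 seconds.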
The larger margins on iPhone result in fewer stories on the front page.
Include example of over-ranked and under-ranked story with charts.
expectedUpvoteShare (in deltaExpectedUpvotes.go) returns the share of upvotes historically received on average at each rank. We want instead the share of upvotes that the average story would receive if we decided to show it at that rank.
This is different for the same reason that the probability that a hospitalized person will die is not the same as the probability that you will die if you choose to visit the hospital. The effect of rank on upvote rate is confounded by the fact that highly upvoted stories are more likely to be placed at rank 1, in the same way that the effect of hospitals on death is confounded by the fact that people who are dying are more likely to be placed in a hospital.
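A toy Python example of the gap between the two quantities is sketched below. It assumes a simple multiplicative model (upvotes = quality × rank effect), and all names and numbers are made up for illustration; it is not the project's model.

```python
def observed_share(avg_quality_at_rank, rank_effect):
    """Share of upvotes historically received at each rank.

    Mixes the effect of rank with the quality of the stories that
    happened to be placed at that rank.
    """
    upvotes = {r: avg_quality_at_rank[r] * rank_effect[r] for r in rank_effect}
    total = sum(upvotes.values())
    return {r: u / total for r, u in upvotes.items()}


def causal_share(rank_effect):
    """Share of upvotes an average story would receive if shown at each rank."""
    total = sum(rank_effect.values())
    return {r: e / total for r, e in rank_effect.items()}


# Made-up numbers: rank 1 gets 3x the attention of rank 2, and the stories
# historically shown at rank 1 were twice as good as those at rank 2.
rank_effect = {1: 3.0, 2: 1.0}
avg_quality_at_rank = {1: 2.0, 2: 1.0}
observed = observed_share(avg_quality_at_rank, rank_effect)
causal = causal_share(rank_effect)
```

In this toy setup the observed share at rank 1 is 6/7, while the share attributable to rank alone is 3/4, so using the historical share overstates the effect of rank.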
Can use contents of writeup.md
The general idea is to put more weight on more recent data when calculating upvote rate. For example, we could use the last N units of attention (expectedUpvotes). We don't want to just use the last N datapoints, because if a story is receiving little attention those datapoints provide little information about the true upvote rate. On the other hand, if a story is at rank 1, the number of upvotes during 10 minutes is probably a good estimate of the true upvote rate.
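One way to read "the last N units of attention" is sketched in Python below: walk backwards through the samples until N units of expected upvotes have been accumulated, taking only a fraction of the oldest sample needed. The function name, the input shape, and the proportional split of the oldest sample are assumptions.

```python
def recent_upvote_rate(samples, attention_window):
    """Upvote rate over the most recent attention_window units of attention.

    samples: chronological list of (upvotes, expected_upvotes) per interval
    (assumed shape). Returns None if no attention was recorded at all.
    """
    upvotes = 0.0
    attention = 0.0
    for interval_upvotes, interval_attention in reversed(samples):
        if interval_attention <= 0:
            continue  # intervals without attention carry no information
        needed = attention_window - attention
        if needed <= 0:
            break
        # Take only as much of this interval as is needed to fill the window.
        fraction = min(1.0, needed / interval_attention)
        upvotes += interval_upvotes * fraction
        attention += interval_attention * fraction
    return upvotes / attention if attention > 0 else None
```

A story at rank 1 fills the window within a few recent samples, while a story receiving little attention reaches further back in time, which matches the intent described above.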
https://realfavicongenerator.net/
Serve the static files from a directory and have them cached in the browser for a week.
[10:33 AM, 11/16/2022] Jonathan Warden: I think we should remove the upvote button. On HN that button only appears if users are logged in. I think that showing will frustrate some users especially if they are not logged in. It also takes up space.
[10:33 AM, 11/16/2022] Felix: ok, I'm fine with that.