Short form content feeds are like a dump truck unloading a pile of random items into your driveway. Is it actually better if all that stuff you didn't ask for is real information that needs to be organized and pieced together with what you already know, without any of the context that would help you do that? Or is it better if you know 99% of it is trash and you don't need to remember any of it?
I think a tool like this is great for people who want to use short form content intentionally, and personally that only happens when I'm bored and in need of a new topic to research. I think of all short form content like marketing/ads: just showcasing something I might be interested in digging into on my own. It's how I used the StumbleUpon website back in the day.
But I have noticed I am rarely using short form content with intention. It's because I want to check what my friends have posted, and then with the extra downtime I scroll a little bit, and sometimes get stuck.
These attempts to make educational short form content still suffer from the same drawbacks [0][1], so I wonder how effective they truly are.
[0] https://cyberpsychology.eu/article/view/33099
[1] https://www.tandfonline.com/doi/10.1080/09658211.2025.252107...
Then you go about your regular day and suddenly everything feels harder in comparison. You have to think about what you're doing, you have to coordinate or plan your actions, you have to put work in. The swiping rots your ability to maintain and coordinate a chain of actions.
It weakens your ability to have intent.
How quickly the times change... Back in my day, we capped bundles at 1 MB, and even that felt large.
All of the code is unminified/unobfuscated in the index.html file; you can right-click the page and view its source.
I sort of need to pull all the data at initialization because I need to map out how every post affects every other post - the links between posts are what take up the majority of the storage, not the text inside the posts. It's also kind of the only way to preserve privacy.
Why are you serving so much data personally instead of just reformatting theirs?
Even if you're serving it locally... I mean, a regular 100 Mbit line should easily support tens or hundreds of users of a text site...
What am I missing?
Because then you only need to download 40 MB of data and do minimal processing. If you were to take the dumps from Wikimedia, you would need to download 400 MB of data and do processing on it that takes minutes.
And it's also kind of rude to hotlink half a gig of data on someone else's site.
> What am I missing?
40 MB is 320 megabits, so serving even 3 visitors per second maxes out a gigabit connection.
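Spelled out:

    40 MB × 8 bits/byte = 320 Mbit per visitor
    320 Mbit × 3 visitors/s = 960 Mbps ≈ a saturated gigabit link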
All I'm getting from your server is a title, a sentence, and an image.
Why not give me, say, the first 20 and start loading the next 20 when I reach the 10th?
That way you're not getting hit with 40 MB on every single visit, just a couple of MB up front and a couple more per scroll for users who are actually using the service.
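Something like this, roughly - the endpoint name is made up, since I obviously haven't seen your backend:

    // Hypothetical sketch of chunked loading; '/feed' is an invented endpoint.
    const PAGE = 20;
    let cards = [];
    let pending = null;

    async function fetchPage(offset) {
      const res = await fetch(`/feed?offset=${offset}&limit=${PAGE}`);
      cards = cards.concat(await res.json());
    }

    function onCardShown(index) {
      // Once the user reaches the middle of the current batch,
      // prefetch the next one so they never see a spinner.
      if (index >= cards.length - PAGE / 2 && pending === null) {
        pending = fetchPage(cards.length).finally(() => { pending = null; });
      }
    }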
Look at your logs. How many people only ever got part of the 40 MB and clicked off because you're getting DDoSed? Every time that's happened (more than a few times, based on the HN posts), you've not only lost a user, you've also weakened the experience of someone who chose to wait, by insisting they download the entire 40 MB before seeing anything.
I am just having trouble understanding why you've decided to make me and your server sit through a 40 MB transfer for text and images...
Because you need all of the cross-article link data, which is the majority of the 40 MB, to run the algorithm. The algorithm does not run on the server, because I care about both user privacy and internet preservation.
Once the 40 MB is downloaded, you can go offline and the algorithm will still work. If you save the index.html and the 40 MB file, you can run the entire thing locally.
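To give a flavor of why the link graph has to live client-side, here is a minimal sketch of this style of scoring - not the actual code (view-source for that), just the idea, with hypothetical `id` and `links` fields:

    // Minimal sketch, not the real implementation. Assumes each article
    // has an `id` and a `links` array of article ids - those links are
    // the bulk of the 40 MB payload.
    const affinity = new Map(); // article id -> score

    function onLike(article) {
      // Liking an article boosts everything it links to.
      for (const id of article.links) {
        affinity.set(id, (affinity.get(id) || 0) + 1);
      }
    }

    function nextCard(articles, seen) {
      // Ranking every unseen article needs the whole graph in memory,
      // which is why all the data is downloaded up front.
      let best = null;
      let bestScore = -Infinity;
      for (const a of articles) {
        if (seen.has(a.id)) continue;
        const score = (affinity.get(a.id) || 0) + Math.random(); // exploration tie-break
        if (score > bestScore) { bestScore = score; best = a; }
      }
      return best;
    }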
> actually using the service
This is a fun website, it is not a "service".
> you've not only lost a user but weakened the experience of someone that's chosen to wait by increasing their load time
I make websites for fun. Losing a user doesn't particularly affect me; I don't plan on monetizing this, I just want people to have fun.
Yes, it is annoying that people have to wait a bit for the page to load, but that is only because the project has hundreds of thousands more eyes on it than I expected within the first few hours. I expected a few hundred visits in that time, in which case bandwidth wouldn't have been an issue whatsoever.
> I am just having trouble understanding why you've decided to make me and your server sit through a 40 MB transfer for text and images...
Running the algorithm locally, privacy, stability, preservation, the ability to look at and play with the code, the ability to go offline, easy maintenance and hosting, etc.
Besides, sites like Twitter use up like a quarter of that for the JavaScript alone.
You did a great job and I love hearing that you did it all by hand in a day rather than having AI make it for you.
Also, all three of the examples are projects that have years of dev effort and hosting infrastructure behind them - Xikipedia is a project I threw together in less than a day for fun, I don't want to put effort into server-side maintenance and upkeep for such a small project. I just want a static index.html I can throw in /var/www/ and forget.
And re: hosting, my bare metal box is fine. It's just slow right now because it's getting a huge spike of attention. I don't want to pay for a CDN, and I doubt I could host a file getting multiple gigabits per second of traffic for free.
Thank you for making my day a little brighter.
What I appreciate most about this string of comments (from OP) is the digging into "doing it for fun": hosting on your own machine, wanting simplicity for yourself as the maintainer and builder. This has been a big focus for me over a number of years, and it leads to things that aren't efficient, or scalable, or even usable by others, but they bring me joy, and that is more than enough for most things.
The reality is that there are of course ways to make this more efficient AND it simply doesn't need to be.
Good job on making something that people are clearly interested in; it brought me some joy clicking around and learning some things.
If you want it to be more than just this, of course you'll have to make it faster or give it a different interface: an installable, offline type of thing, so we expect a bundle download and are fine with waiting. For example, I can see this being kinda nice as a native app.
If you don't want it to be more than this, that's okay too.
Regardless, well done
Having too many users is a pretty good problem to have anyway!
The United States Virgin Islands are a group of islands in the Caribbean Sea. They are currently owned and under the authority of the United States Government. They used to be owned by Denmark (and called Danish West Indies). They were sold to the U.S. on January 17, 1917, because of fear that the Germans would capture them and use them as a submarine base in World War I.
https://simple.wikipedia.org/wiki/United_States_Virgin_Islan...
I think it would be nice if you could do a non-Simple-English version, but nevertheless I'm happy with what you've put together, and I've added a shortcut to my phone. Please don't let the negativity stop you from continuing to work on it.
Thank you.
I've been swiping a lot for the last 10 minutes and I'm not sure how much it's learning. I have some feedback.
- I have never liked or clicked a biography, but it keeps suggesting vast numbers of them
- It does not seem to update the score based on clicking vs. liking vs. doing both. I would assume clicking is a solid form of engagement that should be taken into consideration (see the sketch after this list)
- It would be interesting to see some stats. I have no idea how many articles I've scrolled through or the actual time spent on liked vs. disliked article previews. If you could add that kind of insight, I'd love to see it
- A negative feedback mechanism would be interesting as well. There's no way to signal whether I'm just neutral about something (and swipe through) or actively negative about it (which is a form of engagement a real doomscroll feed would use to show me such content once in a while)
- Since this website has already shown me multiple pages about things I'm now learning about thanks to it, it might benefit from a "share" button (another engagement signal), as HN folks are likely to want to share things they've just learned on HN
- Would you be willing to make the experiment open source?
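On the clicking-vs-liking point, differential signal weights would do it - a hypothetical sketch (the weights are invented, and `article.links` assumes the link-graph structure OP described elsewhere in the thread); a negative weight would also cover the active-dislike idea:

    // Hypothetical sketch: weight engagement signals differently
    // instead of treating only likes as signal. Numbers are made up.
    const SIGNAL_WEIGHTS = { like: 1.0, click: 0.5, share: 1.5, dislike: -1.0 };

    function recordEngagement(affinity, article, signal) {
      const w = SIGNAL_WEIGHTS[signal] ?? 0;
      for (const id of article.links) {
        affinity.set(id, (affinity.get(id) || 0) + w);
      }
    }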
It was time-wasting or avoidant behavior for sure, and often described negatively. But at least you were learning something.
Silicon Valley then spent the next few decades trying to understand that behavior so they could isolate it, strip all positive value out of it, and make it highly profitable.
You need to like them to update the weights of the algo; it works well.
Update: it does not run properly in Firefox on iOS. After it has loaded to 100%, the site refreshes.
Coincidence?! Yeah, probably.
It's kind of like the opposite of my Wikipedia project, Redactle.net, which takes a lot of effort.
OP, since you're encountering load issues, I would suggest narrowing your corpus to Wikipedia's vital articles, level 3, and caching all the content, since it's only 1,000 articles.
Should be: Please only continue if you're not at work.
Yeah, it got really sticky real fast. Starting from the random (?) initial selection, I couldn't recognize anything but popular TV shows; it immediately over-indexed on that content, and I had to fight for my life to see anything else in the feed that I would recognize and consider a good algorithmic pick for my interests.
Which is brilliant, because Instagram has the same issue for me: absolute metric tons of garbage, and whenever there is a gem in that landfill of a feed that I interact with positively, it's nothing but more of that on my feed for weeks until I grow sick of it. In conclusion, Instagram could have used this 30-line algorithm and I'd have the exact same experience when using it.
Algorithmic feeds are obviously problematic for turning several generations into lobotomized zombies, but they are also just not very good at nuance, so it's not even a case of something that's bad for you but feels so good. It's just something that's bad, but able to penetrate the very weak defenses human psychology has against addiction and short-term gratification, and there is no incentive to improve them for the sake of the user as long as they work for the advertisers.
But then I look at the comments, and it really looks like some people want this.
Now I'm depressed.
DuckDB loaded in the browser via WebAssembly and Parquet files in S3.
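That is, query the data in place instead of shipping it all up front. A rough sketch of the usual duckdb-wasm bootstrap (the bucket URL is a placeholder):

    // Rough sketch using @duckdb/duckdb-wasm; assumes an ES module context.
    import * as duckdb from '@duckdb/duckdb-wasm';

    const bundle = await duckdb.selectBundle(duckdb.getJsDelivrBundles());
    const workerUrl = URL.createObjectURL(
      new Blob([`importScripts("${bundle.mainWorker}");`], { type: 'text/javascript' })
    );
    const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl));
    await db.instantiate(bundle.mainModule, bundle.pthreadWorker);

    const conn = await db.connect();
    // DuckDB reads Parquet over HTTP with range requests, so only the
    // row groups this query touches get downloaded, not the whole file.
    const page = await conn.query(
      "SELECT title, summary FROM read_parquet('https://my-bucket.s3.amazonaws.com/articles.parquet') LIMIT 20"
    );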
Please consider taking an hour to push this to GitHub with a quick readme. Scientists and developers would get it. We have been building a torrent-based alternative to YouTube for a few years, and there isn't much knowledge out there about operational front-page algorithms.
I easily have over 100 tabs of Wikipedia open at any one time, reading about the most random stuff ever. I'm the guy who will unironically look up the food I'm eating on Wikipedia while I'm eating it.
No need to try to make it "doomscrollable" when it's already got me by the balls.
I gave up after about a minute.
And a plug for my own (fiendishly difficult) Wikipedia-based game:
The difference is just that this web page shows you how much data it transfers instead of doing it in the background.