The loading issue is just a hug of death: the site's currently getting multiple visitors per second, and that requires more than a gigabit of bandwidth to handle.

I sort of need to pull all the data at initialization because I need to map out how every post affects every other - the links between posts are what take up the majority of the storage, not the text inside the posts. It's also kind of the only way to preserve privacy.

reply
I think I'm missing something, but does every user get the same 40MB? If so, can you just dump the file on a CDN?
reply
I feel very strongly that you should be able to serve hundreds or thousands of requests at gbps speeds.

Why are you serving so much data personally instead of just reformatting theirs?

Even if you're serving it locally...I mean a regular 100mbit line should easily support tens or hundreds of text users...

What am I missing?

reply
> Why are you serving so much data personally instead of just reformatting theirs?

Because then you only need to download 40MB of data and do minimal processing. If you were to take the dumps off of Wikimedia, you would need to download 400MB of data and spend minutes processing it.

And also it's kind of rude to hotlink half a gig of data from someone else's site.

> What am I missing?

40MB per second is 320mbps, so even 3 visitors per second maxes out a gigabit connection.
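The arithmetic can be checked in a couple of lines (JavaScript just for illustration; the 40MB payload is from the thread, the visitor rates are example inputs):

```javascript
// Bandwidth needed to push a fixed-size payload to N visitors per second.
// 1 MB = 8 megabits, so MB per second * 8 gives megabits per second.
function requiredMbps(payloadMB, visitorsPerSec) {
  return payloadMB * 8 * visitorsPerSec;
}

// 40MB to one visitor per second is 320 Mbps;
// three visitors per second is 960 Mbps, nearly a full gigabit.
```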

reply
no but...why are you passing 40mb from your server to my device in a lump like that?

All I'm getting from your server is a title, a sentence, and an image.

Why not give me say the first 20 and start loading the next 20 when I reach the 10th?

That way you're not getting hit with 40MB on every single visit, just a couple of MB per click and a couple more per scroll from users that are actually using the service.
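The "load 20, prefetch at the 10th" idea above could be sketched like this (a minimal sketch; the page size, the threshold, and all names are made up for illustration, not the site's actual code):

```javascript
const PAGE_SIZE = 20;   // items fetched per request
const PREFETCH_AT = 10; // start fetching the next page at this offset

// Which page a given item index belongs to.
function pageFor(index) {
  return Math.floor(index / PAGE_SIZE);
}

// True once the reader scrolls past the threshold of the newest page
// and the next page hasn't been requested yet.
function shouldPrefetchNext(visibleIndex, loadedPages) {
  const offsetInPage = visibleIndex % PAGE_SIZE;
  return offsetInPage >= PREFETCH_AT && !loadedPages.has(pageFor(visibleIndex) + 1);
}
```

In a browser this predicate would typically be driven by a scroll or IntersectionObserver callback, with each prefetch hitting a small JSON endpoint instead of the full blob.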

Look at your logs. How many people only ever got the first 40 and clicked off because you're getting ddosed? Every single time that's happened (which is more than a few times based on HN posts), you've not only lost a user but weakened the experience of someone that's chosen to wait by increasing their load time by insisting that they wait for the entire 40MB download.

I am just having trouble understanding why you've decided to make me and your server sit through a 40MB transfer for text and images...

reply
> no but...why are you passing 40mb from your server to my device in a lump like that?

Because you need all of the cross-article link data, which is the majority of the 40mb, to run the algorithm. The algorithm does not run on the server, because I care about both user privacy and internet preservation.

Once the 40MB is downloaded, you can go offline, and the algorithm will still work. If you save the index.html and the 40MB file, you can run the entire thing locally.
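The thread doesn't show the actual data format, but a toy adjacency map illustrates the point: each entry stores a short snippet plus its outgoing links, and a scorer can run entirely in the browser once the blob is downloaded. Everything below is hypothetical, not Xikipedia's real code:

```javascript
// Hypothetical shape of the link data: ids mapped to titles and outgoing links.
// The link arrays, not the text, are what dominate the payload.
const articles = {
  1: { title: "A", links: [2, 3] },
  2: { title: "B", links: [1] },
  3: { title: "C", links: [1, 2] },
};

// Count how many outgoing links two articles share.
function overlap(a, b) {
  return a.filter((x) => b.includes(x)).length;
}

// Rank an article's neighbours by shared links. This runs entirely
// client-side, so no browsing data leaves the device and it keeps
// working offline once the blob is saved.
function relatedTo(id) {
  return articles[id].links
    .map((l) => ({ id: l, shared: overlap(articles[id].links, articles[l].links) }))
    .sort((a, b) => b.shared - a.shared)
    .map((x) => x.id);
}
```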

> actually using the service

This is a fun website, it is not a "service".

> you've not only lost a user but weakened the experience of someone that's chosen to wait by increasing their load time

I make websites for fun. Losing a user doesn't particularly affect me, I don't plan on monetizing this, I just want people to have fun.

Yes, it is annoying that people have to wait a bit for the page to load, but that is only because the project got hundreds of thousands more eyes within the first few hours than I expected. I expected a few hundred visits in that time, in which case the bandwidth wouldn't have been an issue whatsoever.

> I am just having trouble understanding why you've decided to make me and your server sit through a 40MB transfer for text and images...

Running the algorithm locally, privacy, stability, preservation, ability to look at and play with the code, ability to go offline, easy to maintain and host etc.

Besides, sites like Twitter use up like a quarter of that for the JavaScript alone.

reply
It's incredible how rude and entitled people are about a toy site. It's like they are looking for any reason to take a shit all over it.

You did a great job and I love hearing that you did it all by hand in a day rather than having AI make it for you.

reply
I believe in privacy but generally people are fine with rec algorithms running on a server if it's transparent enough/self hostable. Mastodon/DuckDuckGo/HN/etc all don't need to download a huge blob locally. (If you do want it to run locally, hosting the blob on a CDN or packaging this as an app and letting someone else host it would probably improve the experience a lot)
reply
Mastodon/HN do not have a personalized weighted algorithm. On HN you see what everyone else sees, and on Mastodon the feed is chronological. DuckDuckGo offers some privacy, but still sends your search queries to Bing.

Also, all three of the examples are projects that have years of dev effort and hosting infrastructure behind them - Xikipedia is a project I threw together in less than a day for fun, I don't want to put effort into server-side maintenance and upkeep for such a small project. I just want a static index.html I can throw in /var/www/ and forget.

And re: hosting, my bare metal box is fine. It's just slow right now because it's getting a huge spike of attention. I don't want to pay for a CDN, and I doubt I could host a file getting multiple gigabits per second of traffic for free.

reply
I really like how you have done things. Didn’t mind the waiting time.

Thank you for making my day a little brighter.

reply
Seconding this—I had to wait a little bit to download it and play around and have some fun with it. I didn't mind.

What I appreciate the most about this string of comments (from OP) is the digging into "doing it for fun": hosting on your own machine, wanting simplicity for you as the maintainer and builder. This has been a big focus for me over a number of years, and it leads to things that aren't efficient, scalable, or even usable by others, but they bring me joy and that is more than enough for most things.

The reality is that there are of course ways to make this more efficient AND it simply doesn't need to be.

Good job on making something that people are clearly interested in, it brought me some joy clicking around and learning some things.

If you want it to be more than just this, of course you'll have to make it faster or have it be a different interface—installable offline typa thing so we can expect a bundle download and be fine with waiting. For example I can see this as a native app being kinda nice.

If you don't want it to be more than this, that's okay too.

Regardless, well done

reply
Yeah, that's fair, though I'd think you can get a CDN/someone else to host the blob for fairly cheap/free.

Having too many users is a pretty good problem to have anyway!

reply
(Could have the client download the blob from where the repo is hosted on GitHub, which takes under a second for me to download: https://github.com/rebane2001/xikipedia/raw/refs/heads/mane/...)
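Following that suggestion, the client could try a mirror first and fall back to the origin server. This is a sketch only; the mirror list is a placeholder, and the fetch function is injectable purely so the retry logic can be exercised without a network:

```javascript
// Try each mirror in order and return the first blob that downloads OK.
async function fetchBlobFrom(urls, fetchFn = globalThis.fetch) {
  for (const url of urls) {
    try {
      const res = await fetchFn(url);
      if (res.ok) return await res.arrayBuffer();
    } catch {
      // network error: fall through to the next mirror
    }
  }
  throw new Error("all mirrors failed");
}
```

A real deployment would put the GitHub raw URL (or any CDN copy) first in the list, keeping the origin box as the fallback.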
reply
Who made you do anything? It's a fun website. If you don't like it, move along or make one yourself. I could understand if you were paying for something, but this is free.
reply
Why not…. Load it on demand?
reply
That's my point. So confused. Got a ton of users clicking off because of this.
reply
The point you're missing is that this website is actually a submarine ad for the domain, xikipedia.org, which the owner is probably trying to sell.
reply
That's a very silly claim considering I bought the domain the same day I released the project. I'm sure whoever would've been interested in buying the domain could've already swept it up for 10 bucks before me.
reply
It's blocked for me :( I think it must have been a typosquatted domain before.
reply