Thanks for all the effort put into the site!
I have very mixed feelings about Standard Ebooks and would much prefer being able to use Project Gutenberg directly, but one good thing Standard Ebooks does is that every book has an associated git repository (on GitHub), so it's (in principle) possible to see a history of fixes to the text over time.
Less than three is a classic!
Edit: welcome to your first comment after 9 years on HN btw, nice to have you here!
I was unable to load it initially (got an error from firefox) and had to re-attempt. Still slow if one forces a reload (shift-r, etc, to not use local cache).
I have about 50k of the books, I would have used a torrent of just the txt files if it was prominent.
The ISP actually knows which subscriber is on that line, can send them notices, block them, terminate them... loads of things that you simply cannot do because you have no relation to this person. And frankly I wouldn't want to need to have a personal relation with every website that I visit; my ISP can reach me if there is anything relevant to continued use of the internet. From personal experience, when I was a teenager, the ISP cutting our household off after an abuse report was an effective way of stopping what I was doing
Keep up the good work!
I've since discontinued hosting it, but happy to add you all and merge into an official PG offering: https://www.reddit.com/r/SideProject/s/VtYKxjrMme
On the site I noticed the library boxes have roughly a single extra line causing a scrollbar to appear and the last line to be chopped off https://i.imgur.com/PQ8T0qc.png is there an issues/bug portal to properly submit these kinds of things?
autocat3 and gutenbergsite are repos responsible for generating gutenberg.org
(I can’t quite tell if that’s an egregious abuse of the site or you’re perfectly fine to share without human eye balls hitting your www?)
https://www.gutenberg.org/ebooks/offline_catalogs.html
Perhaps you can find the information you are looking for there.
However if you plan on scraping or otherwise hitting them with a ton of traffic, consider at least to donate a good amount for the traffic you cause them. It ain't free after all.
> All Project Gutenberg metadata are available digitally in the XML/RDF format. This is updated daily (other than the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling or roboting the website.
And strongly consider a donation! (My addition)
https://www.gutenberg.org/ebooks/offline_catalogs.html#the-p...
Don't hit the site with agent. The section furtherst bottom machine readable.