I have very mixed feelings about Standard Ebooks and would much prefer being able to use Project Gutenberg directly, but one good thing Standard Ebooks does is that every book has an associated git repository (on GitHub), so it's (in principle) possible to see a history of fixes to the text over time.
Edit: welcome to your first comment after 9 years on HN btw, nice to have you here!
I was unable to load it initially (got an error from Firefox) and had to re-attempt. Still slow if one forces a reload (shift-r, etc., to bypass the local cache).
I have about 50k of the books; I would have used a torrent of just the txt files if one had been prominently offered.
The ISP actually knows which subscriber is on that line, can send them notices, block them, terminate them... loads of things that you simply cannot do because you have no relation to this person. And frankly I wouldn't want to need to have a personal relation with every website that I visit; my ISP can reach me if there is anything relevant to continued use of the internet. From personal experience, when I was a teenager, the ISP cutting our household off after an abuse report was an effective way of stopping what I was doing.
Less than three is a classic!
Keep up the good work!
On the site I noticed the library boxes have roughly a single extra line, causing a scrollbar to appear and the last line to be chopped off (https://i.imgur.com/PQ8T0qc.png). Is there an issues/bug portal to properly submit these kinds of things?
autocat3 and gutenbergsite are repos responsible for generating gutenberg.org
(I can’t quite tell whether that’s an egregious abuse of the site or whether you’re perfectly fine with sharing without human eyeballs hitting your www?)
https://www.gutenberg.org/ebooks/offline_catalogs.html
Perhaps you can find the information you are looking for there.
However, if you plan on scraping or otherwise hitting them with a ton of traffic, at least consider donating a good amount for the traffic you cause them. It ain't free after all.
Don't hit the site with an agent. The section furthest down the page is the machine-readable one.
> All Project Gutenberg metadata are available digitally in the XML/RDF format. This is updated daily (other than the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling or roboting the website.
And strongly consider a donation! (My addition)
https://www.gutenberg.org/ebooks/offline_catalogs.html#the-p...
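For anyone wondering what "use one of these files as input" could look like in practice, here is a minimal sketch. It does not touch the network: it parses an inline, heavily simplified record that I wrote by hand to resemble one catalog entry. The namespace URIs and the `pgterms:ebook` / `dcterms:title` element names are my assumptions about the RDF layout, and `parse_record` is a hypothetical helper, so check them against a real file from the catalog before relying on this.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces for the catalog's XML/RDF records (verify against a real file).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "pgterms": "http://www.gutenberg.org/2009/pgterms/",
}

# Hand-written, simplified stand-in for one record; real records carry far more fields.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:pgterms="http://www.gutenberg.org/2009/pgterms/">
  <pgterms:ebook rdf:about="ebooks/1513">
    <dcterms:title>Romeo and Juliet</dcterms:title>
  </pgterms:ebook>
</rdf:RDF>"""


def parse_record(rdf_text):
    """Return the ebook id and title from one RDF record, or None if absent."""
    root = ET.fromstring(rdf_text)
    ebook = root.find("pgterms:ebook", NS)
    if ebook is None:
        return None
    # Attributes need the namespace in Clark notation: {uri}localname.
    about = ebook.get(f"{{{NS['rdf']}}}about", "")
    title = ebook.findtext("dcterms:title", default="", namespaces=NS)
    return {"id": about.rsplit("/", 1)[-1], "title": title}


record = parse_record(SAMPLE_RDF)
print(record)
```

The point is that one local pass over the downloaded catalog can feed a database, so nothing has to crawl the website page by page.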
> Michael S. Hart began Project Gutenberg in 1971 with the digitization of the United States Declaration of Independence.[5] Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. […] This computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed one day the general public would be able to access computers and decided to make works of literature available in electronic form for free. […]
https://www.gutenberg.org/about/background/history_and_philo...
Any idea what's happening? I thought PG published public domain books...
Full story (in Italian) at https://www.wired.it/internet/web/2020/06/30/progetto-gutenb...
Apparently this situation hasn't been resolved yet.
Technically, I can also just directly pull the epub from Project Gutenberg, but sometimes the formatting leaves a lot to be desired.
Once you get an e-reader that runs a semi-capable OS (ex - stock android, even an older version), it's hard to go back to something like a kindle.
https://www.gutenberg.org/cache/epub/1513/pg1513-images.html
https://standardebooks.org/ebooks/william-shakespeare/romeo-...
Each has its particular advantages relative to the other ...
Also one should probably compare the former to the single-page version on standardebooks: https://standardebooks.org/ebooks/william-shakespeare/romeo-...
https://www.gutenberg.org/policy/license.html
[Way back in the early days of the iPhone, I sold a book reading app which was backed directly by Project Gutenberg texts, called “Eucalyptus”. I sent 20% of the gross profits to PG - which was never less than very supportive of the app - and felt good about doing so.]
e-book app Gutebooks (in addition to their audio app), but it seems to have been deprecated (I'm no longer able to connect to the server on my copy, which I only got 'cause there was an in-app purchase to fund Project Librivox).
FWIW, Barnes & Noble has been plundering the public domain using a book composition/keying house in the Philippines to make their public domain books which they make available in their stores --- Amazon apparently has a similar setup for the Kindle Store:
https://www.amazon.com/Public-Domain-Books-Kindle-Store/s?k=...
Rather a shame that PG didn't monetize by putting their books up there pre-emptively.
Why is it 'plundering' for B&N to print physical books and transport them to their brick-and-mortar stores to sell? There are real costs associated with doing so. It would not have zero cost for me to print and bind a copy myself at home.
If Amazon is going to sell public domain texts, then it would make sense to source them from PG and send some money from those sales to the non-profit. Similarly, they could then funnel reports of typos to PG for review and correction (it was a bit of a struggle the last time I tried to get a text corrected, and the project founder/director actually stepped in on my behalf).
https://play.google.com/store/apps/details?id=biz.bookdesign...
should ~~be~~ EDIT have been ENDEDIT opensource --- it does at least work to support Project Librivox (or at least that's my understanding)
Seems to no longer be available (see below)
But yes, generally I agree with your point. A library of 75k books seems pretty valuable to have direct access to.
https://dave.autonoma.ca/blog/2020/04/11/project-gutenberg-p...
> 23644 downloads in the last 30 days.
I wonder if this is bot behavior? 23k downloads feels like a lot?
[0] https://www.gutenberg.org/browse/scores/top [1] https://www.gutenberg.org/ebooks/24855
I like a styled formatted book—would prefer PDFs. (I know, not a popular format apparently.)
I like the idea of Project Gutenberg, but I guess I've found that I prefer the book scans on archive.org.
My go-to example is Lewis Carroll's "Through the Looking Glass" with the fantastic art of John Tenniel and Carroll's sometimes creative formatting of the prose…
I see they (Project Gutenberg) have ePub now, which can be good if well done.
(If not well done it can be a kind of mess. Re-flowable "HTML", paginated… Anyone ever try to print a long web page and did you enjoy the result? Perhaps that is as much on the ePub reader though.)
(I worked on iBooks for the Mac like 15 years ago—it's where I got to dive into the ePub format. A lot has changed in the standard since I am sure.)
EDIT: looks like EPUB3 has a "paginated" mode as well as more sophisticated layout tags.
Also appears to have support for ruby and vertical writing modes. This was not yet supported in WebKit when I worked on iBooks. Somehow, this white guy from Kansas (who knows no language other than English) got tapped to implement the vertical TOC for Asian languages. Also tasked with annotating the ePUB pages to display (also vertical) ruby text…
I've read more (meaningful) text on PG than on any other digital platform. Huge fan. Thanks for all the work and for keeping it clean and free.
https://www.fadedpage.com/ from Canada I think
https://runeberg.org/ from Sweden
The previous version of the site had two major flaws:
1. The search bar had been removed from the top of the page, and hidden behind a "Click here to search" (or similar) link partway down the page
2. Once you opened that page, the coloring of the site was so washed out on e-ink that the text input was hard to find.
Thanks for fixing it!
You can download books in most browsers. I know Amazon have done things to make life difficult for other stores in the past.
• On the one hand, E Ink devices have a fairly known set of limitations, and it would be ridiculous for me to expect them to render the whole web well.
• On the other hand, it's good for website designs to consider the kind of devices employed by their users. Using a Kindle to access Gutenberg is likely less of an edge case than it would be for other sites, so it's worth the extra design work.
(Keep in mind that -- given my sibling comment -- this is all theoretical. The latest iteration of Gutenberg's site is much better than the previous version)
And yes, the text needed a lot of processing to make it right.
Now, in my early fifties and with declining eyesight, that's out of reach.
Thanks for sticking with the project!
https://www.gutenberg.org/ebooks/feeds.html
Every day you'll get much more than you're bargaining for, right into your feed or inbox. Easily download the books you're interested in and put them on your Kindle.
I've heard good things. Also - Sherlock Holmes :)
could be a trick to ease that fear :D
If you ask it to assess the relevance of the text in the present day it will also do that very nicely, highlighting the places where the text shows old-fashioned viewpoints that would be sharply criticized today.