Show HN: I built a tech news aggregator that works the way my brain does

(deadstack.net)

188 points

by dreadsword 107 days ago

97 comments

by nathanwallace 107 days ago

A similar site I've enjoyed for >15 years (!!) is https://techmeme.com/

I also use its sister aggregator site for political news every day - https://www.memeorandum.com/

by dreadsword 107 days ago

Yes - Techmeme is definitely the archetype and a great product, and I have spent lots of time there over the years as well!

by taftster 107 days ago

I'm not sure, something about the "Recent Stories Summary" section (first view) is hard to read. The spacing is wrong. And the blue font. Someone mentioned Garamond too.

It's creating a "wall of text" effect to me and I'm not able to quickly skim and allow my eye to catch the bits that are interesting to me.

As a comparison, the HN homepage is very accessible to me for skimming and finding things to click into (like this entry).

UI is often quite subjective, understood. But I can't really "scan" the first view fast enough. It's all blending together and causes extra processing on my mind.

by dreadsword 107 days ago

I hear you - there's something to be done there. My initial thought was to stay as close to convention as I could (links are blue!), but as the RECENT list gets long, it definitely gets less scannable.

Thank you for the feedback!

by taftster 107 days ago

I hear you about "links are blue" ... except when you are a link aggregator.

The "links are blue" design from early HTML was meant to highlight links in the context of a paragraph of prose, not a list of link items. "Blue" means something special about the text in the context of the text around it.

In this case, the blue font is distracting because the links are the content. You don't need the blue to help your links "stand out". Because the links are normal text, using a normal palette would be appropriate.

I don't mind some subtle clues that these are links. Underlines, slight grey text. Or even a subtle hover effect. Two cents.

by gneuron 106 days ago

Put the number of occurrences (I assume that's the # at the end) first to help with signal to noise ratio based on quantity of coverage maybe? I'd also take a look at news minimalist if I were you, and how they used significance scoring as a fill in vs upvotes to provide additional signal: https://www.newsminimalist.com/

It's quite scannable, but obviously you're doing reverse chrono order so up to you how best to solve the UI issue.

by dreadsword 107 days ago

How about now? Story titles are still clickable links, but are black. Made the story count a link as well, and kept it blue as a visual cue.

by felideon 107 days ago

Some additional feedback:

There's no reason for both the story count and the story summary to be clickable. It's confusing because:

(a) It's not clear what the number in parentheses even means (until you click and infer)

(b) Separate links make you think they lead to different pages

Also, echoing another comment, it's not really clear what "incoming" and "outgoing" stories mean. Maybe "new" vs. "stale"?

by taftster 107 days ago

Better, to be honest. Keep refining of course. But this is definitely more readable.

I admit that straight black is not quite the right answer either. A slightly toned down dark grey would be nice. And again, subjectively, I like how HN has a row of non-link smaller (lighter shaded) text under each listing, which I think plays nice for the white space between each item.

by bcrl 107 days ago

Personally, I far prefer black over grey. Grey is really hard to read across a variety of lighting conditions and devices. The older you get, the more important contrast becomes.

by taftster 107 days ago

Fair and good point. Sharp black bothers me, so just adding in a little hue is nice for my eyes. But that's me, of course.

Noting also that this text is #000000 black, per the CSS. Maybe the background color helps soften it a little? Like contrast white/black is hard on the eyes, but HN is not?

by MiiMe19 107 days ago

This is exactly the stuff that I think LLMs are best at. We have created the world's coolest string manipulator, and this is exactly the kind of thing I think LLMs are best suited for. Awesome job!

by Diti 106 days ago

How do you ensure the titles aren’t confabulated? I’ve used Kagi News recently and it summed up the articles about France wrong (that’s the only section in which I could reliably spot the made-up stuff).

by dreadsword 107 days ago

Cheers and thanks for the kind words! And yes - LLMs (at least o3-mini) do a great job as my editorial team - the site is 100% automated.

by lateforwork 107 days ago

Love it, but the body font (Garamond) is not easy on the eyes. Garamond is one of my favorite fonts in print and at not-too-small sizes. On the screen it doesn't look good because the thin parts of the characters get too thin (or, as font experts call it, too much contrast).

by dreadsword 107 days ago

Noted, thank you! I haven't put a tonne into readability, other than some basics - I prefer a serif'd font, and I made sure the background was easier on the eyes than #FFFFFF haha

by NetOpWibby 107 days ago

I love that your site comes with an overview instead of clicking away to another site immediately. Feels snappy and looks good. I can see this being my news roundup. Great work!

by dreadsword 107 days ago

Cheers and thank-you!

by swader999 107 days ago

Maybe I'm the only one, but I would love a feed that never again showed me items that I've already scrolled past without engaging the first time.

by facundo_olano 107 days ago

I built that feature (auto mark as read by scrolling) into my feed reader, if you’re up for self-hosting and curating your own sources:

https://github.com/facundoolano/feedi

I did try to build a public facing news aggregator with a similar ux but I couldn’t pull it off purely based on client side state (and I didn’t want to do user management)

by dreadsword 106 days ago

I'm with you on not wanting to do user management, feature creep, user data security concerns, etc.

by stevage 107 days ago

Me too. That's the number one thing I always wish for with every feed. And if I reach the end of the feed, that's fine.

by taftster 107 days ago

Right, I'm so tired of "infinite scroll". There's a mental/emotional reward for actually reaching The End of something.

by dasil003 107 days ago

I wonder how much this is a factor in the widespread mental health malaise that is often attributed to tech these days? Certainly there are plenty of factors to go around, but consider the connotation of "scrolling" and how common it is as a default replacement for boredom in modern life, and suddenly it seems quite insidious.

by taftster 106 days ago

Super insightful. I feel the same way. I can't mentally "conclude" my read for the day, because there is always just One More Article that is just under the threshold.

An extension of Fear of Missing Out, basically. And yes, I think it causes mental exhaustion and might be directly related to some mental disorders that we have really yet to understand.

by dreadsword 106 days ago

Yes - I was actually deliberate about leaving infinite scroll out; I started w/ scrolling on tag pages, for example, but switched them to paginated - largely for the feeling you described.

by vivzkestrel 106 days ago

Amazing, lots of questions if you don't mind answering: 1) Is this written in Python? 2) If yes, does it use feedparser? 3) How are you storing these feeds in the database? 4) How are you handling CDATA or HTML-based feeds that return lots of HTML? Do you sanitize before storing, or store directly in the database as a CDATA string? 5) How do you handle edge cases and anomalies across different feed providers?

by dreadsword 106 days ago

Send me more questions, and I'll send you more answers!

by dreadsword 106 days ago

Dude, it's old school LAMP stack all day. I use SimplePie to handle & sanitize feeds, storing text only (stripped of HTML). Edge cases are pretty smoothed out by SimplePie!
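For readers asking about the "text only (stripped of HTML)" step: the site itself uses SimplePie under PHP, so the following is only a rough Python sketch of the same idea using the standard library. The class and function names (`_TextExtractor`, `strip_html`) are made up for illustration and are not part of SimplePie or the site's code.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space so adjacent block elements do not run together,
        # then collapse repeated whitespace. A side effect is that inline tags in
        # the middle of a word would also introduce a space.
        return " ".join(" ".join(self._chunks).split())


def strip_html(fragment: str) -> str:
    """Return the visible text of an HTML fragment, with tags removed."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return parser.text()
```

Calling `strip_html` on a feed item's description before writing it to the database would give a plain-text value to store. A real feed sanitizer handles far more edge cases than this.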

by fuddle 107 days ago

Cool site, an About page would be useful. It's hard to tell how the site works.

by dreadsword 107 days ago

Fair enough - it's honestly not something I expected anyone to be interested in enough that an About page would be required.

At a high level, it reads RSS feeds from a number of sources, and uses LLMs to identify clusters of stories about the same thing, group them, tag them, and designate them a "top" story or not. That's it.

The biggest thing I've learned in all of this is that o3-mini is far and away the best at following instructions (for this use case). Periodically I'll cycle through the models available on Groq, and always come back to o3-mini.
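As a rough illustration of only the first step described above (reading RSS items before any LLM work happens), here is a minimal Python sketch using the standard library. It is not the site's code, which is PHP; `parse_rss_items` and `SAMPLE_FEED` are hypothetical names, and the sample feed is invented.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text: str) -> list[dict]:
    """Extract title, link and publication date from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when the element is missing, hence the "or".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


# Invented sample feed, used only to show the shape of the output.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>First story</title><link>https://example.com/1</link>
<pubDate>Mon, 06 Jan 2025 10:00:00 GMT</pubDate></item>
<item><title>Second story</title><link>https://example.com/2</link></item>
</channel></rss>"""

for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["title"], entry["link"])
```

The resulting list of titles is the kind of input that could then be handed to an LLM for clustering and tagging. That step is not shown here.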

by gitaarik 106 days ago

Very nice, I've been working on something similar, but for regular news. But I want to summarize complete articles, and RSS only provides the headlines and sometimes the first paragraph of an article.

So I decided to write web crawlers, but then you run into CAPTCHA stuff. So I instead used Selenium to automate my browser to fetch the news articles. That worked well, but I haven't worked on it since.

Now I'm thinking that with all these AI browsers around these days, maybe that's actually easier than doing it with Selenium. But haven't researched it properly yet.

In any case, the LLM work of detecting whether two articles are reporting the same news, and summarizing the story, is the same in your project. So in case your project is open source, I would be interested in that part.

by amatecha 107 days ago

This is probably a dumb question, but... what do "incoming" and "outgoing" mean?

by dreadsword 107 days ago

Oh man, don't ask - not a dumb question at all. I'll reshare what I put in another comment that answers it, but bottom line is they're a design gap in the context of /recent.

You're right --- incoming & outgoing end up being redundant on the "Recent" view. Where they're (more) relevant is in the "Top" view where the LLM editor has picked a subset of stories to be categorized as top and incoming/outgoing are the ones that didn't make the cut, organized by timeliness.

Definitely a gap in design!

by amatecha 107 days ago

Oh, sure, but I literally just don't understand what their meaning is >_>

by thekevan 107 days ago

I assumed it meant stories that trended highly and are now fading in popularity (outgoing), and stories that are newly trending and may be on a fast ascent (incoming).

Sort of a combo of "in case you missed it" and "the next new big stories".

by ying_zh 106 days ago

Nice work! I built a simplified version for my daily routine of searching and reading industry and research papers: https://multimodal-scout.app/. My main content sources are HN and Hugging Face trending papers, which already serve as a content filter for me instead of scraping random news. It works well for the domain I’m interested in. I’ve also made it open source, so you can tweak the RSS feed and adapt it to your needs. https://github.com/yingzha/multimodal-scout/

by jantissler 106 days ago

In case you are still reading this: Any plans to add RSS etc.? I might be in a small minority, but for me my feedreader is the central source. If I can't subscribe via feed, it doesn't exist for me. That's the way I'm following Techmeme and also Hacker News (with the minimum points set to 80 to show up in the feed). It's kind of annoying how many sites even in the tech field don't offer an RSS feed anymore.

by dreadsword 106 days ago

Still reading! Yes - RSS is on the roadmap - a /recent feed, and feeds for tags. Cheers!

by al_borland 107 days ago

I like it. It kind of reminds me of the old Fever RSS reader, which would group together similar articles from different sources, and use that to rank how hot a story was.

by dreadsword 107 days ago

Not familiar with fever, but there is something similar buried at the heart of mine - the LLM clusters stories, and they get promoted to public when they reach a threshold of unique sources.

That threshold is a function of the day of the week - on weekends, when the news cycle is quiet, it lowers the bar --- Tuesday to Thursday it's at its most restrictive.
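To make the day-of-week threshold concrete, here is a small Python sketch. The numbers in `MIN_SOURCES_BY_WEEKDAY` are invented purely for illustration (the actual values are not stated anywhere in this thread), and the function names are hypothetical.

```python
from datetime import date

# Hypothetical thresholds, keyed by date.weekday() (Monday=0 ... Sunday=6).
# Lower on weekends, highest Tuesday to Thursday, as described above.
MIN_SOURCES_BY_WEEKDAY = {0: 3, 1: 4, 2: 4, 3: 4, 4: 3, 5: 2, 6: 2}


def promotion_threshold(day: date) -> int:
    """Minimum number of unique sources a cluster needs on the given day."""
    return MIN_SOURCES_BY_WEEKDAY[day.weekday()]


def should_promote(cluster_sources: list[str], day: date) -> bool:
    """True when the cluster has enough distinct sources to go public."""
    unique_sources = set(cluster_sources)
    return len(unique_sources) >= promotion_threshold(day)
```

With these made-up values, a cluster covered by two distinct sources would be promoted on a Saturday but not on a Wednesday.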

by hackncheese 107 days ago

Combines the strength of AI at summarizing text and easy access to the actual information sources for verification, well done well done!

by dreadsword 107 days ago

Cheers and thank you very much --- yes, LLMs are very well suited to editorial tasks!

by jhack 107 days ago

I'm REALLY liking this, way more than I thought I would. Great job! What's your stack if you don't mind my asking?

by dreadsword 107 days ago

Awesome - glad you're enjoying it and thank you for the kind words!

My "Stack" ---- LAMP + o3-mini for editorial tasks + Bootstrap for responsive front end. That is to say: It's old school, and painfully functional. But, light & fast.

by jflskajfsd 107 days ago

[dead]

by _menelaus 107 days ago

This is pretty cool man. How do you cluster the articles into stories? It looks like you did a good job of it.

by dreadsword 107 days ago

Thanks so much for the kind words - it's 100% o3-mini for clustering. I have zero editorial input as to what constitutes a cluster, what's "top" news, etc.

The one subtlety is setting up the LLM to understand whether a new story belongs in an existing cluster, or with > 1 neighbors, constitutes a new cluster. The challenge there is scoping the clustering window (hours of stories for consideration) and topic breadth to avoid creating Katamari-super-clusters that just end up with every story associated to them.

At this point I seem to have found a sweet spot re: the hours window, the frequency of processing, and the design of the prompt such that it's working consistently.

Very few false positives in terms of spurious clusters being created, or potential clusters being missed.
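The clustering-window idea above can be sketched roughly as follows. This is an assumption-heavy illustration, not the site's implementation: the `Story` type, the function names, and the prompt wording are all invented, and no LLM is actually called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Story:
    title: str
    source: str
    published: datetime


def stories_in_window(stories, now, window_hours):
    """Keep only stories published within the last `window_hours` hours."""
    cutoff = now - timedelta(hours=window_hours)
    return [s for s in stories if cutoff <= s.published <= now]


def build_cluster_prompt(new_story, clusters):
    """Build a text prompt asking a model to place one story.

    `clusters` maps a cluster id to the titles already in that cluster.
    """
    lines = [
        "Decide which existing cluster the new story belongs to.",
        "Answer with a cluster id, or NONE if it fits no cluster.",
        "",
        "Existing clusters:",
    ]
    for cluster_id, titles in clusters.items():
        lines.append(f"[{cluster_id}] " + " | ".join(titles))
    lines += ["", f"New story: {new_story.title}"]
    return "\n".join(lines)
```

Narrowing `window_hours` limits how many older stories are offered as candidates, which is one plausible way to keep a single cluster from absorbing everything.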

by tiaremnt 107 days ago

I just configured my own RSS website, only to find this awesome solution. I’m crying right now; if only I had found you earlier, I would have saved so much time. Also, do you have the code publicly available so that I can customize it for my own needs?

by embit 107 days ago

I have done a similar style for tech news, aggregating based on tags. That way I can read tech news on micro topics. https://embit.ca/ Your feedback is appreciated.

by dreadsword 107 days ago

Looking good - keep at it! Use it yourself every day, and iterate it continuously.

by metalliqaz 107 days ago

Really nice and clean, well done.

What is the purpose of having summaries for "Recent", "Incoming", and "Outgoing" all at the top? Seems like all content from the latter two is in the first, right?

by dreadsword 107 days ago

You're right --- incoming & outgoing end up being redundant on the "Recent" view.

Where they're (more) relevant is in the "Top" view where the LLM editor has picked a subset of stories to be categorized as top and incoming/outgoing are the ones that didn't make the cut, organized by timeliness.

Definitely a gap in design!

by dreadsword 107 days ago

Oh I should add: incoming will show stories ~20 minutes before they get picked up for "Top" inclusion, if they're going to make the cut, based on how jobs are scheduled.

by sweenzor 107 days ago

Very cool. Having an immutable record "time machine" you can use to re-find something you remember reading is very humane. I'd love to see this for world news, politics, etc.

by dreadsword 107 days ago

Ah cool! It is built to be extensible, and I'll give you a preview of another vertical here: https://northfeed.ca/

And - did you actually see the time machine at the bottom of the right hand column? Or - was that just a wish list item of yours?

by jains99 106 days ago

Actually, both websites look the same... how? Are they based on the same template?

by stevage 107 days ago

I would love this but with more blogs and fewer product announcements.

by dreadsword 107 days ago

I hear you - today was heavier on product announcements than normal I feel. And re: blogs - for sure: send me suggestions and I'll add them...!

by stevage 106 days ago

https://github.com/surprisetalk/blogs.hn/blob/main/blogs.jso...

by dreadsword 106 days ago

Cheers and thanks for sharing - that's a big list!

by clueless 107 days ago

Looks very similar to https://particle.news. How would you distinguish your approach (other than the tech focus)?

by dreadsword 107 days ago

Cool - particle looks great - I really like how visual it is.

Distinguishing characteristics - personally I get value from the unambiguous timeline (no editorializing in /recent), and (as nice as the visual is) the non-visual, super simplistic presentation & the curated sources (...which I value b/c I curated them myself haha).

So the bottom line is that DS will appeal to a certain kind of obsessive-compulsive news consumer and synthesizer who wants the right balance of signal to noise and a streamlined presentation that doesn't slow them down. I count myself among that group!

by dotneter 106 days ago

I am trying to build something similar but for the tech articles https://fooqux.com/

by tamimio 107 days ago

If you can get rid of the cookies message that would be great, as I will place the site as an app in my phone and that message is annoying to have when I open it.

by dreadsword 107 days ago

You should only see that message once when you first show up, and as annoying as it is, there's a compliance element to it. Let me know if it's persisting for you after accepting!

by I_Nidhi 106 days ago

Interesting idea. Is there a context for how these stories are being picked up, or is it just to compile all the recent stories in one place?

by Jaauthor 106 days ago

Thumbs up! This low-friction aggregator beats Ground News and other aggregators by a mile for simplicity and readability. Nice work!

by dreadsword 106 days ago

Awesome--- thanks so much for the kind words!

by wccrawford 107 days ago

Wow. That's amazing! I've bookmarked it because I think it's one of the best news sites I've seen now.

by dreadsword 107 days ago

Well thank you so much for the very kind words, and don't hesitate to reach out with any feedback!

by iJohnDoe 106 days ago

Very impressive! I like it and continued to browse the content. Will be added to my daily list of sites to check out.

by dreadsword 106 days ago

Awesome - thanks so much for checking it out!

Cheers!

by chicagojoe 107 days ago

This is great! Are you using a news API or pulling in RSS feeds yourself? Is there a list of what sources are included?

by dreadsword 107 days ago

Reading RSS myself, OLD SCHOOL: Cron Jobs. PHP. Hahaha! List of sources: At present, no; but if it's of interest, it would not be hard to add.

I should also add - please post any recommendations re: sources to cover.

by dreadsword 107 days ago

Hey - still thinking about sources here. With the data I have, I could actually do an interesting analysis of news sources - i.e.:

- How often do their stories become members of clusters?
- How "fast" are they to publish on a topic vs. other competitors, i.e., who "breaks" the news?
- What tags (people, companies, topics) does a given source stick close to? Which do they shy away from?

Thanks very much for a really interesting set of ideas to explore!

by dreadsword 106 days ago

Circling back to this: https://deadstack.net/sources

Note that the top news breaker is The Verge, having broken about 10% of stories on my site; TechCrunch is next at 8%, followed by ... MacRumors at 7%.
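One plausible way to compute a "who breaks the news" figure like the one above is to count, for each cluster, which source published first. The sketch below assumes that definition. The function name and the input shape are hypothetical, and the thread does not say how the site actually computes it.

```python
from collections import Counter


def first_publisher_share(clusters):
    """Share of clusters in which each source had the earliest story.

    `clusters` is a list of clusters; each cluster is a list of
    (source, published_timestamp) tuples.
    """
    firsts = Counter()
    for cluster in clusters:
        if not cluster:
            continue  # ignore empty clusters
        source, _ = min(cluster, key=lambda entry: entry[1])
        firsts[source] += 1
    total = sum(firsts.values())
    if total == 0:
        return {}
    return {source: count / total for source, count in firsts.items()}
```

The same per-cluster data could also answer the other questions raised earlier, such as how often a source's stories end up in clusters at all.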

by wmeredith 107 days ago

I've built a couple different versions of this for myself over the years. I like yours! Thanks for sharing.

by dreadsword 107 days ago

Cheers - thank you for the kind words!

by HardwareLust 106 days ago

So the question really is; What is your "signal-to-noise test"?

by jiwidi 107 days ago

Pretty cool! How do you build these "stories" based on news?

by dreadsword 107 days ago

Cheers and thank you! I'll reshare an earlier comment that I think answers your question - let me know:

Thanks so much for the kind words - it's 100% o3-mini for clustering. I have zero editorial input as to what constitutes a cluster, what's "top" news, etc.

At this point I seem to have found a sweet spot re: the hours window, the frequency of processing, and the design of the prompt such that it's working consistently.

Very few false positives in terms of spurious clusters being created, or potential clusters being missed.

by jiwidi 107 days ago

Very interesting, how do you do that? Do you limit what you feed it, or do it via custom instructions? I had a similar case, so I would love to know how you are doing the prompting here.

In my case we went with embeddings and clustering to find papers close to each other, because LLMs were hallucinating.
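For context, the embeddings-and-clustering route mentioned above can be sketched with a simple greedy pass over cosine similarities. This is only an illustrative stand-in, not the commenter's actual method: the vectors here are toy values (real embeddings would come from a model, not shown), and `greedy_cluster` and the `threshold` default are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(vectors, threshold=0.8):
    """Assign each vector to the first cluster whose representative is similar enough.

    Each cluster is a list of indices into `vectors`; the first index in a
    cluster acts as its representative. Returns the list of clusters.
    """
    clusters = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vectors[cluster[0]], vector) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([index])
    return clusters
```

Unlike an LLM, this approach cannot invent titles or memberships, but the result depends heavily on the embedding quality and on the chosen threshold.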

by econ 107 days ago

Clicking "more" for a few extra words feels wrong.

by dreadsword 107 days ago

Where are you seeing that?

by Biologist123 107 days ago

Great idea! May I ask what the information source is?

by dreadsword 107 days ago

Individual feeds from ~100 sites!

by productiveminds 107 days ago

Simplistic site, looks clean and easy to navigate!

by dreadsword 107 days ago

Many thanks for the kind words!

by neilellis 107 days ago

Really good, clean and to the point, love it.

by dreadsword 107 days ago

Cheers - thank-you so much!

by mariusor 106 days ago

If I'm not logged in you don't need cookies. Being blasted in the face with a cookie banner as the first thing on a web page is very disrespectful.

by botanrice 107 days ago

This is neat! Thanks for sharing.

by dreadsword 107 days ago

Cheers and thank you!

by p-s-v 107 days ago

Cool, how did you create it? What's the architecture like?

by dreadsword 107 days ago

Thanks very much! Architecture is truly recidivistic: LAMP, cron jobs, o3-mini, Bootstrap. It works, and it's fast because it's not complicated and b/c I'm doing things like updating hourly vs. real time.

by jarmitage 107 days ago

RSS?

by dreadsword 107 days ago

On the roadmap! RSS by tag (e.g., for https://deadstack.net/tag/quantum) and an RSS feed for /recent are both in progress.