SQLite is all you need for durable workflows


(obeli.sk)

628 points

by tomasol23 hours ago |

by bitexploder22 hours ago|

I started setting up my workflows using Temporal. It deploys as a relatively lightweight local app. For an isolated local installation it uses SQLite. It makes the process of dealing with API retries and organizing workflows and tasks really simple. I recommend giving it a try. It is, philosophically, exactly what this article is suggesting, but it adds an incredibly rich and flexible interface for agents to work with. Additionally, the web UI makes it very easy to inspect workflows, review agent execution, etc. Temporal also encodes much higher reliability into your system, almost for free. Distributed and reliable systems are hard; don't reinvent the wheel, IMO.

If you find yourself wanting things like an easy way to then introspect your SQLite database, figure out what is happening in the workflow, compose individual tasks, make workflows trivially callable, etc, give Temporal a look.

Alongside this, I have mostly moved away from files for agents. Markdown and JSON are great, but also feel like traps when building out smaller local apps. LLMs are great at SQLite and you can render anything you want out of it (Markdown, JSON, etc). It saves a lot of tokens when an agent can just query a specific row instead of having to fire up jq or grep through markdown. You get a nice, portable, self-contained data management system that encourages agents to be more disciplined about how they structure their data than a bunch of files does. It also continues to scale into MySQL/Postgres: if your little local projects start to outgrow SQLite or become more formal, you already have a schema and discipline around data.
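A minimal sketch of the "query one row, render anything" idea, using only Python's standard `sqlite3` module. The `tasks` table, its columns, and the sample rows are hypothetical and chosen purely for illustration.

```python
import json
import sqlite3

# Hypothetical schema: one row per agent task (all names are illustrative).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
    "status TEXT NOT NULL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (title, status, notes) VALUES (?, ?, ?)",
    [
        ("Port parser to Rust", "done", "All unit tests pass"),
        ("Validate finding 12", "pending", None),
    ],
)
conn.commit()


def get_task(task_id):
    """Fetch exactly one row instead of scanning a whole file."""
    row = conn.execute("SELECT * FROM tasks WHERE id = ?", (task_id,)).fetchone()
    return dict(row) if row is not None else None


def render_markdown():
    """Render the same data as a Markdown checklist."""
    lines = []
    for row in conn.execute("SELECT title, status FROM tasks ORDER BY id"):
        box = "x" if row["status"] == "done" else " "
        lines.append(f"- [{box}] {row['title']}")
    return "\n".join(lines)


task_json = json.dumps(get_task(2))  # JSON view of a single row
markdown = render_markdown()         # Markdown view of the whole table
```

The agent only ever sees the row or rendering it asked for, which is where the token savings would come from.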

by fcarraldo15 hours ago|

It sounds like you’re running this mostly on a single machine? Temporal gets much more complex with scale. Cassandra isn’t fun to manage. Ringpop and TChannel are hard to debug when things go wrong. The SQL backend doesn’t support horizontally scaled replicas (just a single instance) due to consistency requirements. Depending on how your code is written, modifying code baked into workflows becomes complex, as anything that modifies the history event ordering breaks determinism in already-deployed workers.

We use it heavily. Everyone who started on it doing simple scripting/automation loves it; everyone who built real production systems on top of it hates it. Possibly operator error, but my experience hasn’t matched the rosy picture painted in these comments.

by inglor8 hours ago|

In the last two years, we built (with a team of 15, now 100) a billion dollar business on top of Temporal that runs business-critical applications for Fortune 500 companies. We couldn't be happier with Temporal.

Determinism sucks, you do have to work hard and make everything idempotent in activities like we would for durable software anyway. The language we used was incorrect (Go) and has a lot of boilerplate compared to alternatives we later investigated (Python and TypeScript). Visibility can be slow and misses information. We needed to write our own APIs to work effectively with Agents for root-cause analysis of failures.

With all the caveats - Temporal is amazing, it feels much better than previous orchestrators I used like Prefect or Airflow. 100% would adopt again.
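To make "everything idempotent in activities" concrete, here is a small sketch in plain Python with `sqlite3`. It does not use Temporal's SDK; `charge_customer`, `run_once`, and the `completed` table are hypothetical names standing in for a side-effecting activity and its bookkeeping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE completed (idempotency_key TEXT PRIMARY KEY, result TEXT NOT NULL)"
)

calls = []  # stands in for an external side effect, e.g. a payment API


def charge_customer(order_id):
    """Hypothetical activity with a side effect we must not repeat."""
    calls.append(order_id)
    return f"charged:{order_id}"


def run_once(key, order_id):
    """Run the side effect at most once per key; retries return the stored result."""
    row = conn.execute(
        "SELECT result FROM completed WHERE idempotency_key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]
    result = charge_customer(order_id)
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO completed VALUES (?, ?)", (key, result))
    return result


first = run_once("order-42", "42")
retry = run_once("order-42", "42")  # simulates the orchestrator retrying
```

A crash between the side effect and the `INSERT` would still repeat the call, so in practice the same key usually also has to be passed to the downstream API for deduplication.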

by bitexploder1 hours ago|

That is the real truth people are voicing when they say Temporal is heavy. They are really saying: Durable, reliable, distributed workloads are hard and it takes effort to manage! And that is true. I know of no systems that make that genuinely easy. It is a hard discipline. Maybe Temporal makes that harder than it should be, but I have no experience there.

There are no free lunches in this space. I have no idea how good or bad Temporal is, since my usage is pretty small and isolated, but software rarely just works and impresses me, and Temporal genuinely did for orchestration on my local machine. I think Netflix's Conductor is another cool option, but I ended up with Temporal due to its license.

by UnfitFootprint4 hours ago|

Could you share a bit more about your learnings on Go + Temporal? That combo was next in line for us to migrate _to_.

by taberiand8 hours ago|

> Depending on how your code is written, modifying code baked into workflows becomes complex, as anything that modifies the history event ordering breaks determinism in already-deployed workers.

I see this as Temporal surfacing inherent complexity of the domain in a way that forces the developer to consider it, rather than introducing extra complexity.

If it didn't make workflow determinism a strict requirement, the requirement would still exist - it would just hurt much worse in production when it's broken.

See also: Rust borrowing

by cnlwsu3 hours ago|

We run a huge production setup where it’s absolutely critical, and we run it successfully, but we use Temporal Cloud. Hosting it yourself is what makes it miserable. https://temporal.io/resources/on-demand/netflix

by chrisss3953 hours ago|

"gets much more complex with scale" feels like the crux of it. Pick the right solution for your intended scale

That said, I appreciate this is hard in practice. We need to start small to manage the development rabbit hole risk, while also wanting to dream big. There is a tension there that I find hard to balance.

by bitexploder11 hours ago|

Yep. Individual systems with yolo agents doing stuff in isolation. I could see how it can get complex. Most distributed systems are. No free lunches I guess. I am not sure what the alternatives are at scale.

by kumulo10 hours ago|

Temporal feels massive. I tried it for a small workflow embedded in my system and it worked fine, but when I thought about scaling it, it just didn't make any sense for my use case.

I also have restate.dev on my research list, which on paper should scale well and be definitely more lightweight and simple to set up. Worth having a look.

by burner_acc0019 hours ago|

give conductor a try. runs with sqlite or if you want to run it on a server mysql or postgres. I know people running with postgres with a decent scale and it works just fine.

https://docs.conductor-oss.org/

by bambax6 hours ago|

My current client has a forest of 90+ SnapLogic pipelines that were badly written and maintained even worse; one of those was completely wrong, in that it generated wrong accounting data which could eventually have financial, fiducial and legal repercussions.

I rewrote the pipeline in Python (a correct version of it) with state management in SQLite and logs in plain old flat files, and everything has been running smoothly ever since. In fact this is the only data flow that has worked without errors or interruptions in the last six months.

Instead of replicating the db file with Litestream I do a remote backup with Restic before and after each run; it's not an exact replacement of Litestream as we could possibly lose a whole run if the machine died / disappeared at the end of a run, but it lets one restore any day very easily. In an ideal world I think we should have both (live replica + backups).

by svara19 hours ago|

Word on HN is that you're either paying more money than you expected for Temporal's managed solution or taking on a substantial ops burden by running their very heavy system yourself.

I wouldn't know, I've not done either, but I'd like to learn more from your or others' experience.

by bitexploder18 hours ago|

I told an agent to set it up for me for some local stuff. It is written in Go. It has a painless path to run on a local SQLite DB. My agents use it to organize and coordinate workflows. It handles retries and long horizon tasks fine. As far as I can tell for the core workflows and tasks pieces it’s great. MIT license. Like anything it isn’t free to manage but it offers a lot in return. High reliability systems are hard. Temporal only solves some of it. It is far better than rolling it yourself.

I think a genuine problem right now is that people are building agentic workflows and learning the hard way that highly reliable agentic workflows are hard. Agents are unreliable. They are not deterministic, and the backing APIs have pretty high error rates. Temporal has solved that pain for me and made it easy to diagnose problems.

I don’t have anything really large scale running. But big enough that it takes billions of tokens and high reliability to finish.

by platz15 hours ago|

whats an example of things that you have your agents do that use workflows and sqlite db

by bitexploder11 hours ago|

Autonomous C to Rust. Automated penetration testing and vuln validation.

by trueno10 hours ago|

you just made me realize how much i wished people stopped talking in abstractions and just stated what they were doing. i hadnt realized how often i saw things like "workflows" and just kinda had my eyes glaze over. none of it ever really clicks until i see the true descriptor of whats going on.

ive been over here using claude relatively simply as of recent, just claude code and i might enter plan mode to do some bigger like scrap together a test suite of some sort, or i just have him scripting and refactoring/reformatting stuff under my direction. i wrote my own cli tool (needed to bake in the snowflake golang driver for external browser sso propagation) and added it as a skill so he can talk to our cloud dbms when im doing analytics things but for the most part its all pretty simple. feel like my productivity is 50x but after over a year with claude ive really backed off on asking him to do insane stuff and mostly keep him churning stuff out for me in domains i know very well.

so i read all this workflow stuff that needs durability and logging and im kind of astounded how many people have their AI stuff just running on their own round the clock. i didn't realize how much of peoples day to days needed to be automated, i don't seem to find myself surrounded by much that should be automated. jira is probably the only thing i need to sit down and automate because its such a translation tax on developers just so business people can feel involved. but outside of that... guess im behind the times, but i dont know if its that. i see the big grand things people use llms for ("im creating the ultimate knowledge base" or "ive automated everything under the sun and im making 10k a week" etc) and i am feeling either too tired, not ambitious enough, or unenthused by the creative and grand ways people are working with AI. seems like everyone has their own "perfect way to use AI" but I can't seem to find the oomph to go beyond using claude as a utility anymore. a year ago (maybe more cant remember anymore its all a blur) with claude in the sonnet era i was so amazed the first thing i did was try to reverse engineer a game using ghidra. had him building test suites to verify the math was correct. we were at this for weeks. my nearby datacenter probably drained 10 lakes. that was just one of _many_ over-ambitious projects i selected because of claude that never saw a finish line.

yesterday i opened beej.us and just started reading. im young and i feel like i somehow went from 'damn this claude shit is pretty cool' to 'AI is whatever its fine' in a year. like the bell curve meme.

by tauwauwau4 hours ago|

Check out Matt Pocock's coding workflow. His approach is repeatable, consistent and is backed by actual theories in large software development.

by simon842 hours ago|

About the same feeling here. I guess not everything is about global banking scale.

I've tried clever tricks to get AI to produce unsupervised stuff and came back from it. The slop and the loss of cognitive knowledge about what it did was uncomfortable to me... I cannot understand how you would hand off a critical job to it.

by kenforthewin17 hours ago|

Could you expand on the "substantial ops burden"? Let's say you're using a managed Postgres instance as the underlying data store, how substantial is the ops burden in that case? I understand that temporal is actually a set of 4 or so microservices on top of a data store, but if you're already running a distributed system backed by k8s or something like that, it doesn't seem like it adds significant incremental ops on top of that. But I could be wrong.

by graerg2 hours ago|

I run my own Temporal service in my k8s cluster; this setup is the backbone for almost all my applications. For simplicity, I opted for the Postgres backend. You still need to run the 4 (?) other services (history, matching, frontend, ui, maybe others, definitely others if you want observability with Prometheus/Grafana, and a tad bit more complexity if you want Tailscale to get in there and poke around).

They ship Helm charts, so reality is somewhere between "helm deploy" and "substantial ops burden". I don't have to touch it very frequently, but that is not to say I don't have to touch it. There are occasional releases, and there have been times where (probably due to my inexperience with Helm) I botched an upgrade and lost some data. And I've been on this journey for years; when I first started, they didn't have a Python SDK and it was one of my (many) excuses to learn Go. But anyway, to your point: yes, if you're comfortable with k8s and Helm then you shouldn't have much of a problem running hundreds of thousands of workflows; if you want to really push the throughput and optimize cost you probably need to get creative with the individual services and look into Cassandra (maybe? idk).

by tempest_17 hours ago|

As a dev I would tell you it's an ops burden.

My devops coworker just shrugs, pumps out some yaml and helm and away it goes.

It really depends on your experience and tolerance for a lot of things.

Usually the maintenance burden doesn't start to make itself known till you get off the happy path or something breaks. Sometimes it can be a long while before that happens, sometimes it happens right away.

by __turbobrew__14 hours ago|

I think it depends a lot on the operational maturity of the company. Some places are running the LGTM observability stack, Sentry for error reporting, 24/7 on-call rotations, playbooks for all alerts, etc. Those organizations will have fewer issues running systems like Temporal because the operational framework is already there.

Other orgs have never heard of alerts or error reporting and naturally will not catch issues until they are catastrophic (for example services that crash frequently in the background go unnoticed until the crash frequency causes a catastrophic failure). In my experience a lot of issues are pretty simple such as running out of memory, CPU throttling, crashes caused by simple bugs (nil panics). If you have good observability you can catch those issues early.

For example: people rag on Ceph that their cluster somehow got into a broken state, but that really only occurs when abuse of the Ceph cluster has gone on long enough that the cluster finally reaches the tipping point where it is unrecoverable. If you set Ceph up, follow the correct replication rules so components are spread across failure domains, and use the metrics and alerts that are distributed with Ceph, it is actually quite hard to break the cluster.

by mnahkies7 hours ago|

In my experience, with a relatively modest number of concurrent workflows (think hundreds) you'll be pushing several thousand transactions per second through that Postgres instance.

As best I can tell it doesn't do any batching of its writes/reads, and it's update-heavy in places rather than append-only (I suspect their cloud version might do some of these things).

It's pretty close to "let's make every function call serialise its parameters/return value, go through a Postgres table and several network hops".

That said, it can be very useful, but it's a heavy tool that's best suited for high value/risk workflows where you're earning enough from the execution that you can afford the overhead (for example, an Uber trip with several dollars of service fees is probably a good fit, unsurprisingly since its roots are at Uber).
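The "every function call goes through a table" description can be sketched in a few lines. This is a toy journal in Python with `sqlite3`, not Temporal's implementation; the `Workflow` class, the `journal` table, and the sample steps are all hypothetical. It also shows why reordering steps in already-running workflows breaks replay.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal (workflow_id TEXT, seq INTEGER, name TEXT, "
    "args TEXT, result TEXT, PRIMARY KEY (workflow_id, seq))"
)


class Workflow:
    """Replays journaled step results in order; runs only steps not yet recorded."""

    def __init__(self, workflow_id):
        self.workflow_id = workflow_id
        self.seq = 0

    def step(self, fn, *args):
        self.seq += 1
        row = conn.execute(
            "SELECT name, result FROM journal WHERE workflow_id = ? AND seq = ?",
            (self.workflow_id, self.seq),
        ).fetchone()
        if row is not None:
            if row[0] != fn.__name__:
                # The code changed the step order after history was written.
                raise RuntimeError("non-deterministic replay: step order changed")
            return json.loads(row[1])
        result = fn(*args)
        with conn:  # one serialised write per function call
            conn.execute(
                "INSERT INTO journal VALUES (?, ?, ?, ?, ?)",
                (self.workflow_id, self.seq, fn.__name__,
                 json.dumps(args), json.dumps(result)),
            )
        return result


executed = []


def fetch_price(item):
    executed.append("fetch_price")
    return {"item": item, "price": 10}


def apply_discount(quote, pct):
    executed.append("apply_discount")
    return quote["price"] * (100 - pct) // 100


def order_workflow(wf):
    quote = wf.step(fetch_price, "widget")
    return wf.step(apply_discount, quote, 20)


first = order_workflow(Workflow("wf-1"))
replayed = order_workflow(Workflow("wf-1"))  # simulates a restart after a crash
```

Every step costs a read and a write against the store, which is the overhead being described; a real system adds network hops on top.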

by edumucelli18 hours ago|

Very heavy indeed. People will confuse the durability that Temporal provides with all the other properties a distributed system needs. They will then think that Temporal will solve all their problems.

by dtech2 hours ago|

Their managed solution is pricey and especially the linear scaling with how much you use it is very meh. It's comparable with AWS lambda which also isn't cheap. However it's minor on a typical cloud bill.

Self-hosting is very easy in my experience. I did it for 2 years, but management wanted to move to Temporal Cloud. They have a Helm chart which just works, including upgrades. This does assume you have the whole k8s shebang set up and working in your company. I never had to touch it outside upgrades, which took maybe 30m including validation.

by parthdesai18 hours ago|

use oban and call it a day: https://oban.pro/

by peterson_lock21 hours ago|

This reads like an advertisement for Temporal :)

by switchbak20 hours ago|

I'm someone else who has inherited a bunch of ad-hoc orchestration systems and also used Temporal quite heavily. The latter does certainly come with some overhead (not so bad in the age of LLMs), but it also guides you along a well-trodden path of good practices. The latter being very important - it means that when you want to take on more advanced capabilities, you probably haven't painted yourself into a corner too badly and can take that on fairly easily. Think: retries, multi-tenancy, multi-lang, observability, etc.

by dust421 hours ago|

I just spent the last two weeks digging into workflow state engines and Temporal was one of the candidates. It is a VC-backed fork of Cadence. They got 0.3B funding, and whatever positive I read about them on the net I take with a big spoon of salt. Just my 2 cents.

by baq19 hours ago|

Low key amazing tech, kinda like clickhouse - nobody is bragging it’s running their business

by pzduniak20 hours ago|

I can vouch for them too, being a super early adopter. One of the best early bets I've ever made. Awesome OSS product, glad the team decided to leave Uber to commercialize it.

by bitexploder18 hours ago|

Well, just my experience. I installed it, had my agents configure it and it immediately solved problems I had with very little friction. Dealing with long running, long horizon agentic tasks that need very high reliability so I don’t have to babysit. I vibed the first version, realized I was reinventing reliable distributed systems. Stopped vibing and started surveying for something that fit :)

by hedgehog14 hours ago|

It does, my experience has been that it adds code complexity, deployment complexity, and performance problems. There are some observability benefits, but other ways to solve that. It's possible there are workloads that fit it but not anything I've personally worked on.

by jawns22 hours ago|

Could you give an example of a case where you'd use SQLite instead of jq or grep through Markdown?

by phamilton21 hours ago|

My favorite lens on SQLite is that it is actually two things:

1. A robust durability implementation
2. A library of high performance data structures and algorithms

The fact that it's SQL is nice, but those two attributes are what make it great.

For example, I'm implementing an in-process event log that I want to be durable. I started simple, but soon saw some edge cases and instead of playing whack-a-mole I just swapped to using SQLite as an ordered kv store that gives me ACID.
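A rough sketch of such an event log, assuming a hypothetical `events` table keyed by a monotonically increasing `seq`. The batch append is atomic: a failure partway through rolls back the whole batch.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk location for the log.
path = os.path.join(tempfile.mkdtemp(), "events.db")


def open_log(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL)"
    )
    return conn


def append(conn, payloads):
    """Append a batch atomically: either every event lands or none do."""
    with conn:
        conn.executemany(
            "INSERT INTO events (payload) VALUES (?)", [(p,) for p in payloads]
        )


def read_from(conn, after_seq):
    """Ordered range scan, like iterating an ordered key-value store."""
    return conn.execute(
        "SELECT seq, payload FROM events WHERE seq > ? ORDER BY seq", (after_seq,)
    ).fetchall()


log = open_log(path)
append(log, [b"created", b"updated"])
try:
    append(log, [b"deleted", None])  # NOT NULL violation rolls back the batch
except sqlite3.IntegrityError:
    pass
log.close()

log = open_log(path)  # reopen, as a restarted process would
events = read_from(log, 0)
```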

Another example: ingesting multiple inter related datasets. Instead of a dozen hash maps in memory, I load them up into sqlite (no persistence) and then slice and dice as I need to.
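As a small illustration of the in-memory case, the sketch below loads two made-up datasets (`users` and `orders`, both hypothetical) into a `:memory:` database and joins them instead of maintaining hand-written hash maps.

```python
import sqlite3

# Two inter-related datasets that would otherwise live in separate dicts.
users = [(1, "ada"), (2, "grace")]
orders = [(100, 1, 30), (101, 1, 12), (102, 2, 7)]

db = sqlite3.connect(":memory:")  # nothing is written to disk
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total INTEGER)"
)
db.execute("CREATE INDEX orders_by_user ON orders (user_id)")
db.executemany("INSERT INTO users VALUES (?, ?)", users)
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Slice and dice with a join and an aggregate.
totals = db.execute(
    "SELECT u.name, SUM(o.total) FROM users u "
    "JOIN orders o ON o.user_id = u.id "
    "GROUP BY u.id ORDER BY u.name"
).fetchall()
```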

It's a super useful tool.

by rsalus16 hours ago|

mirrors my own experience creating a persistent event log. I started with JSON, then JSONL, etc until finally landing on SQLite.

by chaps21 hours ago|

The moment my JSON has any sort of depth and I need to write a parser for it and potentially account for unspecified behavior. JSON's nice when it's nice, but it's terrible when it's terrible. It's 100x easier to write SQL than writing jq and... dear god if I have to use grep -A or -B, I'm doing something wrong. Constraints are actually a good thing!

The underlying database isn't the most important thing. Just use SQL. Its namespacing (eg, through CTEs) is good and you're more likely to have colleagues who know SQL compared to jq.
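For a sense of what SQL over nested JSON with CTEs can look like, here is a sketch using SQLite's `json_each` and `json_extract`. It assumes the SQLite build behind Python's `sqlite3` includes the JSON functions, which is common but not guaranteed; the document shape is made up.

```python
import json
import sqlite3

# Hypothetical nested document: runs, each with a list of timed steps.
doc = {
    "runs": [
        {"id": "a", "steps": [{"name": "fetch", "ms": 120}, {"name": "parse", "ms": 30}]},
        {"id": "b", "steps": [{"name": "fetch", "ms": 400}]},
    ]
}

db = sqlite3.connect(":memory:")
rows = db.execute(
    """
    WITH runs AS (                      -- one row per element of $.runs
        SELECT json_extract(r.value, '$.id') AS run_id, r.value AS run
        FROM json_each(?, '$.runs') AS r
    ),
    steps AS (                          -- one row per step of each run
        SELECT runs.run_id,
               json_extract(s.value, '$.name') AS name,
               json_extract(s.value, '$.ms') AS ms
        FROM runs, json_each(runs.run, '$.steps') AS s
    )
    SELECT run_id, SUM(ms) FROM steps GROUP BY run_id ORDER BY run_id
    """,
    (json.dumps(doc),),
).fetchall()
```

Each CTE names one level of the nesting, which is the namespacing benefit mentioned above.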

by sofixa9 hours ago|

> It's 100x easier to write SQL than writing jq and... dear god if I have to use grep -A or -B, I'm doing something wrong. Constraints are actually a good thing!

As an occasional consumer of JSON/CSV, that's why I really like DuckDB, it's just SQL for such file formats. And it manages to be super fast at it too.

by gopalv20 hours ago|

> an example of a case where you'd use SQLite instead of jq or grep through Markdown?

Usually we end up writing a script to incrementally refresh a data-set I'm analyzing (or have someone send me a copy after they pull it).

I've been using sqlite for anything which needs an UPDATE - modifying a row deep inside the data-set with jsonl is a pain.

My github is full of java programs which update sqlite3 files with threadpools and a single big lock around the UPDATE (& then I write or have an agent write code to analyze it).

DuckDB is slowly replacing it in the context of python, simply because of the ease of pushing a UDF into the SQL.

Also because I really like expressing things as LEAD/LAG with a UDF on top.
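The UPDATE and LAG-plus-UDF pattern also works in SQLite from Python, sketched below with a hypothetical `readings` table and a hypothetical `pct_change` function. Window functions need SQLite 3.25 or newer, which is assumed here.

```python
import sqlite3


def pct_change(prev, cur):
    """Plain Python function exposed to SQL as a scalar UDF."""
    if prev is None or prev == 0:
        return None
    return round(100.0 * (cur - prev) / prev, 1)


db = sqlite3.connect(":memory:")
db.create_function("pct_change", 2, pct_change)

db.execute("CREATE TABLE readings (day INTEGER PRIMARY KEY, value REAL)")
db.executemany(
    "INSERT INTO readings VALUES (?, ?)", [(1, 100.0), (2, 110.0), (3, 99.0)]
)

# Modify one row in place, the operation that is painful with JSONL.
db.execute("UPDATE readings SET value = 120.0 WHERE day = 2")

# LAG supplies the previous row's value; the UDF runs on top of it.
rows = db.execute(
    "SELECT day, pct_change(LAG(value) OVER (ORDER BY day), value) "
    "FROM readings ORDER BY day"
).fetchall()
```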

by dogline18 hours ago|

UDF: User Defined Function

by pokstad18 hours ago|

SQLite is more efficient for large data sets. A single markdown or JSON file needs to be streamed to locate a piece of data O(n). Updating an existing entry in a sequential file is even worse because you have to rewrite the file. SQLite has the data structures to quickly find data in O(log n) time.
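One way to see the scan versus index difference is to ask SQLite for its query plan before and after adding an index. The `entries` table and index name below are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, slug TEXT, body TEXT)")
db.executemany(
    "INSERT INTO entries (slug, body) VALUES (?, ?)",
    [(f"entry-{i}", "x" * 20) for i in range(10_000)],
)


def plan(sql, *params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " ".join(row[3] for row in rows)


query = "SELECT body FROM entries WHERE slug = ?"

before = plan(query, "entry-9999")  # no index: a full table scan, O(n)
db.execute("CREATE INDEX entries_by_slug ON entries (slug)")
after = plan(query, "entry-9999")   # B-tree index lookup, O(log n)
```

The first plan should mention a SCAN of the table and the second a SEARCH using `entries_by_slug`.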

by fragmede21 hours ago|

Honest answer is: whenever your markdown or json files get to be big enough that grep/jq takes long enough that you get bored waiting for it.

by embedding-shape20 hours ago|

> get to be big enough that grep/jq takes long enough

On a modern processor, that's about GBs of data typically, right?

by bitexploder18 hours ago|

Practically yes, but much earlier if agents are touching that data in my experience. Tens of GB even if you design well.

by rick129021 hours ago|

Interesting about the files vs db approach. I have been going back and forth. I landed on db as well.

by password43215 hours ago|

> It saves a lot of tokens when an agent can just query a specific row instead of having to fire up jq or grep through markdown

Just wanted to make sure no one missed this point in your comment, because eventually users will be paying the full cost for tokens instead of VCs paying, with GitHub Copilot's pricing realignment leading the way.

by utopiah9 hours ago|

The cycle of expertise :

- what is X, I just do Y

- wow I can see so many limits of Y, now I do X

- I use X for literally everything

- now that I properly understand the limits of Y but also the heavy constraints of X ... maybe Y is enough

- I use Y for literally everything

rinse & repeat. The thing is with actual usage and actual context one does learn and thus can get away with a lot more "basic" solution but it does require genuinely understanding the limits.

by dmos628 hours ago|

Persistence in folly leads a fool to enlightenment.

by stingraycharles4 hours ago|

Yup, when I look back at the silly stuff I did when I was somewhere in the middle (CQRS + event sourcing I’m looking at you), it’s interesting.

It is a source of expertise, because you really learn a lot from it. But when you become old (43 over here), you really learn to appreciate “boring” solutions.

by roryirvine1 hours ago|

You also begin to recognise that the definition of "boring" changes over time, and - if you wait long enough - fashions begin to repeat themselves.

So, xBase was all you needed in the mid 80s. Then DBM was all you needed in the mid 2000s. Now, in the mid 2020s, we're told that it's SQLite that is all you need.

It was partly true then, and partly true now. But the full story's always been more complicated, so it's still worth considering a range of potential solutions rather than relying on simplistic rules of thumb or slogans.

(Wake me when nosql comes back into fashion, I'll be able to do a great "old man yells at clouds" routine about that one...)

by utopiah4 hours ago|

Boring is the new sexy. /wisdom

by levkk22 hours ago|

I don't understand this obsession with SQLite for real, production apps. SQLite is an embedded database, completely unsuitable for managing concurrency. This is what database _servers_ are for, e.g., Postgres, MySQL, etc. Their entire job is to allow you to modify data from multiple processes, on different machines, at the same time.

This is a foundational principle of computer science. It seems to me that the "SQLite for everything" crowd is a little bit inexperienced.

by jph0021 hours ago|

You seem to have a rather limited understanding of what kinds of concurrency exist and how those needs are best met. Whether something is a server or not is not very relevant to this discussion.

SQLite is an excellent production db for many real world workloads, as has been widely documented. It is very different to Postgres, so requires learning a whole new skill set.

One way to think about it is that SQLite can work well for the parts of your system where there is naturally strong partitioning.

by tasuki20 hours ago|

> SQLite can work well for the parts of your system where there is naturally strong partitioning.

Or the parts of your system that don't have big data and no need for massively concurrent writes. And that's the vast majority of systems!

by MattJ10018 hours ago|

You can do big data in SQLite. Concurrent writes, sure, I'd recommend something else.

If you think the majority of systems require massively concurrent writes, I think you need to look a bit harder. SQLite is, after all, the most widely deployed database system, ever.

by sharts10 hours ago|

Internet Explorer 6 was the most widely deployed awesome piece of software. Those that hated it need to look a bit harder.

by Gud9 hours ago|

It was not really “deployed” by a lot of people in the same sense.

It was forced upon most of us (not me, I used BeOS, then Debian, then FreeBSD).

I deployed phoenix.

by HighGoldstein4 hours ago|

The reason SQLite is the most deployed is that it's used by Android.

by velcrovan1 hours ago|

…and iOS, and Windows, and Mac OS, and Boeing, and Sony, and Firefox and Chrome and Safari…

by HighGoldstein1 hours ago|

Yes, which goes in line with the argument that claiming that it's "the most deployed" as proof of superiority or suitability for any use case is equivalent to claiming the same for Internet Explorer. It's the most deployed because it's bundled in a lot of systems, not because people are purposefully using it as a DBMS.

by therealdrag018 hours ago|

It’s widely deployed as a local DB for local apps like phones, desktops, and web browsers. But it’s not the most used for distributed, concurrent web apps, which db servers were designed for. Maybe people are talking past each other, but that’s the debate I see.

by kellogah17 hours ago|

Not sure why there’s a debate at all. The discussion is on using SQLite instead of jq and markdown files. People got lost on a tangent! :)

by gnabgib17 hours ago|

No it's not.. the context of other threads on this post (which mention jq) do not apply here. How poorly coded are you?

by kellogah16 hours ago|

Like how poorly did Jesus code my DNA? Ask him. By him I mean ChatGPT Jesus mode.

by petre12 hours ago|

We recently just partitioned the data into many SQLite databases and got away with it. It's telemetry data from IoT devices: one device, one database. Backups are an easy rsync job now instead of streaming a multi-gigabyte database with compression that takes hours. Reporting will just open each database and aggregate multi-device data into another database (DuckDB, SQLite or something else, we'll see). DuckDB is not readable when locked, so it's probably also going to be SQLite. Even if it's going to spit out JSON, it will go into SQLite rows instead of many files.
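A minimal sketch of the one-database-per-device layout with a separate reporting database, using `ATTACH DATABASE` to read each device file in turn. The file layout, the `telemetry` and `summary` tables, and the device names are all hypothetical.

```python
import glob
import os
import sqlite3
import tempfile

root = tempfile.mkdtemp()  # hypothetical directory holding one file per device


def record(device_id, ts, temp):
    """Write a reading into that device's own database file."""
    conn = sqlite3.connect(os.path.join(root, f"{device_id}.db"))
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS telemetry (ts INTEGER, temp REAL)")
        conn.execute("INSERT INTO telemetry VALUES (?, ?)", (ts, temp))
    conn.close()


record("dev-a", 1, 20.0)
record("dev-a", 2, 22.0)
record("dev-b", 1, 30.0)

# Reporting: attach each device database and aggregate into one summary table.
report = sqlite3.connect(":memory:")
report.execute(
    "CREATE TABLE summary (device TEXT PRIMARY KEY, samples INTEGER, avg_temp REAL)"
)
for db_path in sorted(glob.glob(os.path.join(root, "*.db"))):
    device = os.path.basename(db_path)[: -len(".db")]
    report.execute("ATTACH DATABASE ? AS src", (db_path,))
    report.execute(
        "INSERT INTO summary SELECT ?, COUNT(*), AVG(temp) FROM src.telemetry",
        (device,),
    )
    report.commit()  # DETACH is not allowed inside an open transaction
    report.execute("DETACH DATABASE src")

summary = report.execute("SELECT * FROM summary ORDER BY device").fetchall()
```

Because each device file is independent, backing one up is a plain file copy, which matches the rsync approach described.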

by datadrivenangel11 hours ago|

Check out Quack for DuckDB.

by leowoo914 hours ago|

What additional skill set do you need to "learn" for SQLite? Copying files around?

by apatheticonion14 hours ago|

For me, I have a use case that needs to support a few thousand users, probably a few hundred concurrently.

The combination of SQLite (libsql, a concurrent implementation of sqlite) and Rust means I can do so from a $2/m VPS and a single server instance.

Backups are done via a cron job that uploads to S3.

Does it pass the "Netflix scale" test? No

But it doesn't need to. I'm not profiting from the service and SQLite offers a path to scale if/when ready because... well it's just SQL and I can literally just swap `libsql::Connection` with `psql::Connection` in my repositories.

Plus upgrading from a $2/m VPS to a $10/month VPS quadruples the number of concurrent users I can support.

IMO, you can vertically scale extraordinarily far with SQLite and an efficient server implementation.

I'd wager that 90% of forum websites, WordPress sites and online shops would be fine with SQLite.

by Rohansi13 hours ago|

> The combination of SQLite (libsql, a concurrent implementation of sqlite) and Rust means I can do so from a $2/m VPS and a single server instance.

You can probably do it with regular SQLite, too. Being limited to a single writer isn't as devastating as it sounds when they get processed very quickly. Probably don't need Rust either but it'll be more efficient than the usual choices.

(Also, it looks like libsql is the same as SQLite? Only Turso has concurrent writes)

by franga200011 hours ago|

And I don't understand the obsession with server-based databases for single apps. Especially in containerised setups, every "app" gets its own database anyways, and if the app is further broken down into services, they usually communicate between each other and not with a shared database. So in those cases, what do you gain by pulling the database out of the "process" and onto the other end of a socket? In most cases, absolutely nothing. So why bother?

Don't get me wrong, I've worked with plenty of server-based databases, including proper dedicated database servers. It's great tech and often the best tool for the job. But not always and I'd argue not in the majority of uses.

reply

upvote

by stingraycharles9 hours ago|

[-]

“Especially in containerised setups, every "app" gets its own database anyways, and if the app is further broken down into services, they usually communicate between each other and not with a shared database.”

You seem to be talking about a vastly different use case.

Containerized apps having their own database? What? Aren’t these types of containers stateless? I always very much try to keep state out of app containers.

What kind of data storage are we talking about?

reply

upvote

by franga20007 hours ago|

[-]

If an app needs a database, it gets a database server container, instead of getting a user and database on a shared database server as things used to be done. Every little django app has its own postgres container. Every wordpress site gets its own mysql container. That is the modern way.

Those database containers get a PVC/volume/mount for their data dirs. The only thing ever connecting to them is their "owner" application container. So at that point, why not drop the postgres container and PVC mount a sqlite directory in the app container? The result is the same.

reply

upvote

by jappgar4 hours ago|

[-]

And when you need to scale to thousands of instances of your microservice?

reply

upvote

by stingraycharles4 hours ago|

[-]

Yeah this is the part I don’t get. It seems like people are talking about 1 distinct app = 1 container and this is the new normal? We’re back to managing cows instead of cattle again?

reply

upvote

by jappgar1 hours ago|

[-]

I just think a lot of people here haven't ever worked on large scale systems. They don't know what they don't know.

reply

upvote

by graerg2 hours ago|

[-]

That's the whole thesis; YAGNI.

reply

upvote

by mxey6 hours ago|

[-]

Yes if you run a database server like an embedded application database, then it won’t be very different from an embedded application database.

reply

upvote

by HDThoreaun8 hours ago|

[-]

Every container gets its own database?

reply

upvote

by franga20007 hours ago|

[-]

Yes? Well, every "app", as I quite explicitly wrote. Look up the docker compose file or helm chart for basically any app. I'm running dozens of apps, each with their own postgres, redis and nginx containers alongside the main application server. That's what the stack is designed for.

reply

upvote

by mxey6 hours ago|

[-]

The Compose file is written like that so you can quickly try the app without setting up extra dependencies. Usually not for production use.

Especially since in production you might want to scale the parts separately. I like to have a Postgres cluster to connect where backup is already handled, and the app then doesn’t have any persistent data, doesn’t need any network volume mounts.

reply

upvote

by abtinf21 hours ago|

[-]

There are many cases where SQLite + a concurrent front end (like a Go net/http server) can handle all the load that a service might ever conceivably have to handle, especially if allowed to scale up hardware over time. You can trivially scale up SQLite to, what, hundreds of thousands of tps?

The only thing you really give up is HA/failover and DR. But there are solutions to deal with those. And single-server systems are generally surprisingly robust (since, in the absence of very complex control planes, uptime goes down with more systems).

reply

upvote

by ummonk12 hours ago|

[-]

Why go through the trouble of shoehorning SQLite into a cloud database by getting solutions for HA/failover and DR, when you can just use Postgres off the shelf?

reply

upvote

by jappgar4 hours ago|

[-]

So you can post about it on HN, obviously

reply

upvote

by kellogah17 hours ago|

[-]

I was thinking of using SQLite on top of k3s/Longhorn to replicate it. Anyone do something similar? Folks mention Litestream and AWS but Jeff Bezos’s biceps are too much for me to handle.

reply

upvote

by andix16 hours ago|

[-]

A Longhorn volume can only be attached to one node at a time. It can share it with other nodes over NFS. I don't think this is going to scale well.

Just use Postgres with ro replicas.

reply

upvote

by horsawlarway15 hours ago|

[-]

I'll echo the other response.

I've had pretty terrible experiences with SQLite and Longhorn/NFS.

It's just not the right database for pretty much ANY network based filesystem, where the locking primitives aren't as robust, and you might get two processes trying to hit it at the same time.

Frankly - they say this themselves: https://sqlite.org/howtocorrupt.html

As someone who runs a fairly big personal cluster backed by a mix of giant NFS storage for media, and relatively large longhorn SSD drives for configs/temp data...

I avoid sqlite backing like the plague. It will get corrupted. Period. It's not the db for this use-case, and I'll take postgres/maria/mysql/mongo/ANYTHING else over it.

If you do it - back it up ALL THE TIME, because it's going to get corrupted.

reply

upvote

by jrockway13 hours ago|

[-]

There is something appealing about "it's just a file" (it really isn't; it has locks and a WAL), but I agree with you.

I think people are afraid to read the documentation for postgres. You can start it up in milliseconds. Fast enough and light enough to run one copy for every test case in your test suite, or whatever you're using it for. (mkdir /tmp/whatever; initdb -D /tmp/whatever --no-instructions -A reject -c listen_addresses= --auth-local=trust --no-sync -c fsync=off -c unix_socket_directories=/tmp/whatever -U postgres --no-locale; postgres -D /tmp/whatever) Now you have a test database that behaves exactly like production because it's exactly like production. (OK, turning fsync off makes it a lot faster than production, so be careful.)

reply

upvote

by tucnak13 hours ago|

[-]

> I think people are afraid to read the documentation for postgres.

Postgres may introduce a single-file embedded filesystem because what the hell, but the irony is all these guys won't even notice it. The same people that say Postgres backups are too hard.

reply

upvote

by peterspath21 hours ago|

[-]

That’s why there are billions of SQLite databases right?

SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild. SQLite is found in:

Every Android device
Every iPhone and iOS device
Every Mac
Every Windows 10/11 installation
Every Firefox, Chrome, and Safari web browser
Every instance of Skype
Every instance of iTunes
Every Dropbox client
Every TurboTax and QuickBooks
PHP and Python
Most television sets and set-top cable boxes
Most automotive multimedia systems
Countless millions of other applications

https://sqlite.org/mostdeployed.html

reply

upvote

by mr_toad21 hours ago|

[-]

That’s a comprehensive list of single user devices.

reply

upvote

by ibejoeb20 hours ago|

[-]

Single-user, a single natural person, doesn't strictly mean single-accessor though. I don't think anyone here is suggesting that sqlite is a viable replacement for any networked client/server postgresql system, but it is certainly capable of handling more than the most basic 1:1 tasks. Beyond that, the point is that you only need a file, so when you have natural data boundaries, a lot of problems decompose to that single user/single concern paradigm.

reply

upvote

by larubbio20 hours ago|

[-]

'production' doesn't equal 'multi-user concurrent access'. There are production uses where sqlite is a valid choice even if it may not be the best choice for multi-user production use cases.

reply

upvote

by therealdrag018 hours ago|

[-]

Strawman? I have seen dozens of these debates and never once have I seen someone question the validity of it for embedded use cases.

reply

upvote

by ksd48221 hours ago|

[-]

levkk is talking about concurrency. The list you gave doesn't explain high concurrency requirements for usage.

reply

upvote

by rpdillon20 hours ago|

[-]

My read is that levkk is conflating concurrency with "real production apps" and this whole thread is starting to surface that "real production apps" and "high concurrency" are not measuring the same thing at all.

Sqlite is used in real production apps more than any other database.

Sqlite is also weak at any sort of write concurrency.

Both can be true.

reply

upvote

by sieabahlpark14 hours ago|

[-]

[dead]

reply

upvote

by DANmode1 hours ago|

[-]

Why doesn’t each of your users have a SQLite database writing up to a main?

You can have as many as you want - and one is often plenty.

reply

upvote

by pibaker21 hours ago|

[-]

GP calls out concurrency as a weakness of SQLite. Most of the examples here don't experience the same load even a moderately sized web service experiences day to day.

And no, being a part of the python standard library doesn't mean it is being used by the average python user. These days I'd say at least half of them are just there for machine learning.

reply

upvote

by OutOfHere18 hours ago|

[-]

SQLite is good for read-concurrency, not great for write-concurrency.

reply

upvote

by simonw18 hours ago|

[-]

SQLite requires writes run sequentially. Most SQLite write operations take single digit milliseconds or even microseconds. If your writes are inexpensive (inserting or updating single small rows) you'll probably never even notice the queue.
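A quick way to check the "single small rows are cheap" claim on your own hardware is to time them. This is a minimal sketch using Python's built-in `sqlite3` module; the `events` table and the row count are made up for illustration, and an in-memory database is used so the snippet is self-contained. A file on disk adds sync cost to every commit, so treat the number it prints as a lower bound rather than a benchmark.

```python
import sqlite3
import time

# In-memory database keeps the sketch self-contained (an assumption for
# illustration); a database file on disk pays a sync cost on every commit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

N = 1000  # hypothetical number of small writes to time
start = time.perf_counter()
for i in range(N):
    # One small row per transaction; "with conn" commits when the block exits.
    with conn:
        conn.execute("INSERT INTO events (payload) VALUES (?)", (f"event-{i}",))
elapsed = time.perf_counter() - start

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
avg_ms = elapsed / N * 1000
print(f"{row_count} rows, {avg_ms:.4f} ms per write on average")
```

Pointing `sqlite3.connect` at a real file and rerunning gives a more honest number for a durable setup.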

reply

upvote

by zarzavat7 hours ago|

[-]

Exactly, people confuse "doesn't scale" with "is a bottleneck". There are many applications where hitting the limits of SQLite is either a physical impossibility, or implies that the application has achieved success such that replacing SQLite is the least of anyone's problems.

I visited a piano store once that was running everything off MS Access. If only they had switched to HA technologies, they would be able to sell millions of pianos a day!

reply

upvote

by slashdave11 hours ago|

[-]

I mean... if you count flat files as "databases", there are a heck of a lot more

reply

upvote

by petcat21 hours ago|

[-]

sqlite is great for the contacts app on your phone, but that's it.

Hipp even said that it is not a replacement for a real multi-user, concurrent RDBMS. Its primary competitor is "fsync".

reply

upvote

by Rohansi12 hours ago|

[-]

SQLite is able to handle tens of thousands of write transactions per second on modern hardware. That is probably similar to or more than your real, multi-user, concurrent RDBMS.

reply

upvote

by arendtio7 hours ago|

[-]

> SQLite for everything

is just wrong, and I don't think that the SQLite fans are that crowd. Taking a database server for everything is probably possible, but often unnecessary. With experience, one can properly judge when SQLite is sufficient and when it is not.

So arguing that the SQLite crowd is inexperienced feels weird, because inexperienced people have a much harder time judging when to use what and can just use the database server all the time (even when it is overkill).

reply

upvote

by rpdillon21 hours ago|

[-]

Sqlite is good for lots of stuff, but you're probably focusing your days on high-scale webapps that want sharding with networked DBs. That's one domain, and an interesting one, but there are lots of others.

I'm a big fan of re-evaluating prior "best practices" in light of technology changes, especially in ways that improve simplicity. Running my family's social media site off a single sqlite DB on a VPS is great. ~15 users, almost zero maintenance. I run my FreshRSS instance off of sqlite, as well as my "now" page. At work, I used sqlite for all kinds of things over the past decades: as an ad hoc job queue, as a quick way to ingest and query lots of logs locally, and present/filter in realtime with simonw's excellent https://github.com/simonw/datasette.

I don't think it's ever "sqlite for everything" as much as it is "sqlite in lots of places you probably didn't think to apply it."

kentonv/Cloudflare's work on sqlite at the edge might have made this thinking a bit more popular, but it was always around. https://blog.cloudflare.com/sqlite-in-durable-objects/

I suspect being aware of all those little neat cases and wanting to leverage sqlite for them may be an indicator of experience, rather than the opposite.
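For anyone wondering what an "ad hoc job queue" on SQLite might look like, here is a rough sketch with Python's built-in `sqlite3` module. The `jobs` table, its columns, and the `enqueue`/`claim_next`/`finish` helpers are all hypothetical names for illustration, not taken from any of the projects mentioned. The one real trick is `BEGIN IMMEDIATE`, which takes the write lock before reading so two workers cannot claim the same row.

```python
import sqlite3

# isolation_level=None means autocommit; transactions are managed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'pending')"
)

def enqueue(payload):
    """Add a pending job (hypothetical helper)."""
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

def claim_next():
    """Atomically mark the oldest pending job as running and return it."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise

def finish(job_id):
    """Mark a claimed job as done."""
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))

enqueue("resize-image")
enqueue("send-email")
first = claim_next()
second = claim_next()
third = claim_next()  # queue is empty by now
finish(first[0])
done_count = conn.execute(
    "SELECT COUNT(*) FROM jobs WHERE status = 'done'"
).fetchone()[0]
```

A real queue would also want retries and a way to reclaim jobs from crashed workers; this only shows the claim step.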

reply

upvote

by droidjj20 hours ago|

[-]

> Running my family's social media site off a single sqlite DB on a VPS is great. ~15 users, almost zero maintenance.

Details, please!

reply

upvote

by jappgar4 hours ago|

[-]

It's touted by the people who use the word "just" a lot.

"Just use postgres" "Just use sqlite" "Just use a monolith" "Just use sftp" "Just use an ec2 instance"

Usually these people have flunked out of the school of (distributed system) hard knocks. They couldn't hack it and are retreating to familiar.

The funny part is when one of those people fluke themselves into senior management when their saas takes off.

Inevitably they have to suck it up and hire experts in the same technologies that "no one needs".

reply

upvote

by yesb45 minutes ago|

[-]

That may exist but the opposite type of irrationality is much more common.

Scalability = success. We need to be "scalable" because that means we're successful right? Scalability = real engineering. I'm a real engineer so I need to design everything to be "scalable" because I'm so smart

>The funny part is when one of those people fluke themselves into senior management when their saas takes off.

>Inevitably they have to suck it up and hire experts in the same technologies that "no one needs".

Sounds like they were the wise ones to build something simple that achieved a high level of success.

reply

upvote

by andersmurphy18 hours ago|

[-]

Thing is SQLite scales better than both those network databases [1] if you're prepared to stick with one big machine (+ a standby).

This is even more obvious when you start doing transaction processing, and row locks across the network limit you to 1-3k TPS that you cannot scale out of (the Pareto distribution is merciless).

[1] - https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...

reply

upvote

by lll-o-lll16 hours ago|

[-]

Seeing as I can get about 200K TPS from a networked DB in my environment, I have to question your setup here.

In the real world we are looking at things like RPO (recovery point objective) and RTO (recovery time objective). You need to consider HA and DR. It’s in these areas where SQLite does not scale.

That’s why I struggle to see the fit for SQLite in any sort of multi-user server environment. If you need the data to be durable, then the bigger DB’s have the tools. If you don’t need the data to be durable, just keep it in memory. I’m sure there are niches I am missing.

reply

upvote

by andersmurphy7 hours ago|

[-]

In this demo each T in TPS is two updates over a billion rows and most importantly skewing high on row lock contention. On a 5 year old macbook, using a dynamic language. Isolation level serializable and synchronous full (so max durability).

You can definitely go faster over less data doing single inserts on a better stack, with weaker guarantees.

RPO: Litestream, even in its default settings, gives you point-in-time streaming backups to the second, which is considerably better than the five minutes you get with RDS. So the funny thing is the durability guarantees are worse with the "bigger DBs".

RTO: again, you can have a standby that's warm with a copy of the data through Litestream. Regional sharding also becomes trivial.

It's a solid set up for a lot of products/apps. Postgres is still fine if you want things like roles and permissions etc. Or if you don't have experience getting the most out of sqlite.

reply

upvote

by evdubs11 hours ago|

[-]

Wow, what an apples and aliens comparison. You add a bunch of transaction delays to your postgresql case because you can access a database over a network, but you use transaction batching for sqlite? Maybe just compare a local postgresql with/without batching to a local sqlite with/without batching to be much less misleading.

reply

upvote

by andersmurphy9 hours ago|

[-]

Because local postgres is a bad time unless it's the only thing running on the server. Even then sqlite will smoke postgres (even with unix sockets).

The point is to survive the Pareto row locking problem you need to move away from a network database (if you want to still have interactive transactions). The network part is the main point of a network database, once you drop that there's not much point sticking with the added complexity unless there's another feature you really need.

reply

upvote

by slashdave11 hours ago|

[-]

You know you can host a database like Postgres on the same machine, right?

reply

upvote

by andersmurphy9 hours ago|

[-]

Yes, it's still slower on the same machine, even with unix domain sockets.

It doesn't play nice with other things running with it in practice. JVM and postgres on the same box is a textbook bad time.

reply

upvote

by lanstin21 hours ago|

[-]

I had very good results giving one SQLite DB to each goroutine, so the accesses were serialized up front, on a very high volume (130K requests/second) service. Exact transactionality was not a product goal, and the SQLite was just to back up the in-memory state. If we lost a little due to an abend or something, that was OK. For normal maintenance it caught SIGTERM, stopped the listen, waited for in-flight calls, and then flushed the remaining changes to SQLite; on startup it would read the SQLite into memory to populate before taking the listen. That gave persistent storage across container runs, and never both reads and writes to the same file at the same time. It also just closed the DB and opened a new one when it hit some limit of rows, so as not to fill the disk; the max size of the SQLite corresponded to the max size of the LRU map being served from memory, so it just flipped A/B between "a full memory worth of data stored" and "the currently updating state." A lot easier than having to write out protobufs to disk or whatever I would have done for transient (during restarts/maintenance) persistence.
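The A/B flip described above can be sketched roughly as follows. This is a single-threaded Python illustration with the built-in `sqlite3` module, not the original Go code; `RotatingStore`, the `kv` table, and the file names are all invented for the example, and the per-goroutine ownership, SIGTERM handling, and LRU map are left out.

```python
import os
import sqlite3
import tempfile

class RotatingStore:
    """Hypothetical A/B spill store: one file holds the previous batch,
    the other receives current writes."""

    def __init__(self, directory, max_rows):
        self.paths = [os.path.join(directory, "a.db"),
                      os.path.join(directory, "b.db")]
        self.max_rows = max_rows
        self.active = 0   # index of the file currently being written
        self.rows = 0     # writes made to the active file so far
        self.conn = self._open_fresh(self.paths[self.active])

    def _open_fresh(self, path):
        # Start each generation from an empty file so disk use stays bounded.
        if os.path.exists(path):
            os.remove(path)
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
        conn.commit()
        return conn

    def _flip(self):
        # Close the full file and start overwriting the other one.
        self.conn.close()
        self.active = 1 - self.active
        self.conn = self._open_fresh(self.paths[self.active])
        self.rows = 0

    def put(self, key, value):
        if self.rows >= self.max_rows:
            self._flip()
        with self.conn:  # commits on exit
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )
        self.rows += 1

    def load_all(self):
        """Read the older file first so values in the active file win."""
        merged = {}
        for idx in (1 - self.active, self.active):
            path = self.paths[idx]
            if not os.path.exists(path):
                continue
            reader = sqlite3.connect(path)
            merged.update(reader.execute("SELECT k, v FROM kv").fetchall())
            reader.close()
        return merged

store = RotatingStore(tempfile.mkdtemp(), max_rows=3)
for i in range(5):
    store.put(f"k{i}", f"v{i}")
snapshot = store.load_all()
```

With `max_rows=3`, the fourth write flips to the second file, so the snapshot is stitched together from both. A further flip would discard the oldest batch, which is the "don't fill the disk" trade-off from the comment.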

reply

upvote

by petcat20 hours ago|

[-]

Woof. That sounds very complicated. If you need that kind of write concurrency, use an unlogged table in postgres [0]. Then you don't have to invent a whole sharded thing yourself.

[0] https://www.postgresql.org/docs/current/sql-createtable.html...

reply

upvote

by nemothekid19 hours ago|

[-]

There are so many unfortunate footguns with unlogged tables, that I'd argue that the goroutine route is preferable.

reply

upvote

by petcat17 hours ago|

[-]

What are the "footguns" with unlogged tables in Postgres?

reply

upvote

by nemothekid25 minutes ago|

[-]

1. If postgres shuts down uncleanly, your entire table is truncated; you lose everything.

2. You should check if your backup method backs up unlogged tables. For example, RDS Snapshots on AWS do not back up unlogged tables.

These 2 are a double whammy where if you aren't aware of these tradeoffs you can find that a bad restart has deleted all your data, plus your unlogged tables were never backed up.

reply

upvote

by jeltz19 hours ago|

[-]

Such as?

reply

upvote

by glzone111 hours ago|

[-]

Running postgresql is an order of magnitude more complicated than sqlite.

130k tps even with unlogged is not always super easy especially if getting hit concurrently. Postgresql connection overhead alone can be pretty brutal if you are setting up and tearing down connections or have 1,000 writers etc.

Postgresql generally requires good network connectivity. Folks doing sqlite distributed tend to have everything independent, you literally don't need to worry about connection / security / firewall / permissioning / internode escape or data leaking etc, can even have problems in local side networking and services can still serve.

reply

upvote

by jklmnopqrstuvw6 hours ago|

[-]

even with wal, postgresql can easily reach 130k tps in pipeline mode.

reply

upvote

by lanstin1 hours ago|

[-]

That was per container, with 16 containers per data center, so it would have taken a lot of DBA tickets to get something that large; SQLite scaled with the horizontal scaling of the app. We also had a flaky network: something like one in 100,000 TCP connections would fail, and occasionally the whole network would just go away for a number of seconds. And the persistent container storage was managed by the same storage team that managed storage for the DB team, so base scalability and availability were high.

reply

upvote

by bambax6 hours ago|

[-]

> SQLite is an embedded database

Yes, but that's not its main selling point. An SQLite database is also a single file, which makes it incredibly easy to replicate, backup, transfer, restore, etc.

reply

upvote

by mxey6 hours ago|

[-]

SQLite in WAL mode which you want for server apps is multiple files.

Files which you cannot just copy while your application is running if you want a correct backup.
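The "multiple files" point is easy to see for yourself. A minimal sketch with Python's built-in `sqlite3` module, using a throwaway temp directory and a made-up `notes` table: after switching to WAL and writing a row, the directory holds the main file plus the `-wal` and `-shm` companions for as long as a connection is open.

```python
import os
import sqlite3
import tempfile

# Throwaway location for the demo database (names are arbitrary).
db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db_path)

# Switch to write-ahead logging; the pragma returns the resulting mode.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES ('hello')")
conn.commit()

# While the connection is open, recent commits may live only in the -wal
# file, which is why copying app.db alone can give an incomplete backup.
files = sorted(os.listdir(os.path.dirname(db_path)))
print(mode, files)
```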

reply

upvote

by bambax6 hours ago|

[-]

Vacuum into or .backup work perfectly with a running, WAL enabled db.
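Both approaches are reachable from Python's built-in `sqlite3` module, so here is a small sketch of each against a live WAL-mode database. The file names and the `items` table are placeholders; `VACUUM INTO` assumes the linked SQLite is 3.27 or newer, and `Connection.backup` assumes Python 3.7 or newer.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "live.db")

# A "running" database in WAL mode with a few rows (illustrative schema).
live = sqlite3.connect(live_path)
live.execute("PRAGMA journal_mode=WAL")
live.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
live.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])
live.commit()

# Option 1: SQLite's online backup API, exposed as Connection.backup().
api_copy_path = os.path.join(workdir, "backup_api.db")
target = sqlite3.connect(api_copy_path)
live.backup(target)
target.close()

# Option 2: VACUUM INTO writes a compacted copy to a new file.
# It must run outside a transaction, hence the commit above.
vacuum_copy_path = os.path.join(workdir, "backup_vacuum.db")
live.execute("VACUUM INTO ?", (vacuum_copy_path,))

def count_items(path):
    """Open a copy independently and count its rows."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()

api_count = count_items(api_copy_path)
vacuum_count = count_items(vacuum_copy_path)
```

Either copy is a single self-contained file that can then be shipped elsewhere, for example by the cron job that uploads it to object storage.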

reply

upvote

by mxey5 hours ago|

[-]

At which point there’s little difference from any other database’s backup commands.

reply

upvote

by bambax5 hours ago|

[-]

I don't think you can install "any other database" by pasting one file in a directory somewhere? Even if you can produce such a backup with the same command.

reply

upvote

by roryirvine39 minutes ago|

[-]

Pretty much every embedded database since about 1988 has worked like that.

You say that being an embedded database isn't the main selling point, being contained within a single file is. But that's a completely normal feature of an embedded db, to the point that the one implies the other.

reply

upvote

by zaptheimpaler12 hours ago|

[-]

Personally I like Postgres for this reason too. It's extremely easy to run with Docker, I can dump data from all kinds of apps in there and I know it's not going to take any rearchitecting as soon as I need multiple concurrent writers.

I think docker is still super underappreciated so setting up any kind of server is seen as a chore. In my eyes it makes running tons of services like this very easy, so I'll take the extra functionality, extensibility etc of postgres.

reply

upvote

by emehex18 hours ago|

[-]

I think you'd be surprised to learn how many real production apps are actually running on top of SQLite (by way of Cloudflare D1).

reply

upvote

by therealdrag018 hours ago|

[-]

Many DB servers are built upon embedded DB primitives (like RocksDB), that doesn’t mean the primitives are sufficient on their own.

reply

upvote

by emehex3 hours ago|

[-]

I'm not sure what this has to do with my comment? D1 is pretty much sufficient on its own...

reply

upvote

by therealdrag03 hours ago|

[-]

My point is D1 is not sqlite, it’s a serverized architecture of it, including building things like replication, etc.

Plus, D1 has a 10gb limit which is wild to call “sufficient”.

reply

upvote

by UltraSane30 minutes ago|

[-]

SQLite also gets really slow at around 50 million rows.

reply

upvote

by O3marchnative21 hours ago|

[-]

> This is a foundational principle of computer science

How exactly is this a foundational principle of computer science?

reply

upvote

by jpollock18 hours ago|

[-]

If your data is naturally sharded (users) with writes happening within a single shard, parallelism becomes easy. The request is routed to the shard hosting the user's data and reads/writes locally.

This makes scalability _much_ easier to reason about. It's cut-paste, cut-paste. Every N users needs another shard.

It does buy you a _different_ set of problems, like cross-shard querying (analytics) and how to do load leveling as users age out.

But it avoids the whole shared index scaling problems from inserts/updates with large user counts.

It becomes a hierarchical instead of a relational database.
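To make the routing idea concrete, here is a rough sketch using Python's built-in `sqlite3` module with one database file per shard. `ShardedStore`, the `posts` table, and the hash-modulo placement are assumptions for illustration; a real system that adds a shard for every N users would more likely keep a user-to-shard lookup table so existing users never move.

```python
import os
import sqlite3
import tempfile
import zlib

class ShardedStore:
    """Hypothetical per-user sharding over several SQLite files."""

    def __init__(self, directory, shard_count):
        self.directory = directory
        self.shard_count = shard_count
        self.conns = {}

    def shard_for(self, user_id):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(user_id.encode("utf-8")) % self.shard_count

    def _conn(self, shard):
        # Lazily open (and create) the shard's database file.
        if shard not in self.conns:
            path = os.path.join(self.directory, f"shard_{shard}.db")
            conn = sqlite3.connect(path)
            conn.execute(
                "CREATE TABLE IF NOT EXISTS posts (user_id TEXT, body TEXT)"
            )
            self.conns[shard] = conn
        return self.conns[shard]

    def add_post(self, user_id, body):
        # Reads and writes for one user only ever touch that user's shard.
        conn = self._conn(self.shard_for(user_id))
        with conn:
            conn.execute("INSERT INTO posts VALUES (?, ?)", (user_id, body))

    def posts_for(self, user_id):
        conn = self._conn(self.shard_for(user_id))
        rows = conn.execute(
            "SELECT body FROM posts WHERE user_id = ? ORDER BY rowid", (user_id,)
        )
        return [r[0] for r in rows]

    def total_posts(self):
        # The cross-shard cost: analytics has to visit every shard file.
        total = 0
        for shard in range(self.shard_count):
            total += self._conn(shard).execute(
                "SELECT COUNT(*) FROM posts"
            ).fetchone()[0]
        return total

store = ShardedStore(tempfile.mkdtemp(), shard_count=4)
store.add_post("alice", "hi")
store.add_post("alice", "again")
store.add_post("bob", "hello")
alice_posts = store.posts_for("alice")
total = store.total_posts()
```

`total_posts` is the small-scale version of the cross-shard querying problem mentioned above: per-user work is local, anything global fans out.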

reply

upvote

by 4 hours ago|

[-]

deleted

reply

upvote

by rsalus16 hours ago|

[-]

there is a difference between concurrency in a distributed environment and concurrency on a single machine across processes. SQLite is incredibly useful for the latter.

you seem like the inexperienced one to me..

reply

upvote

by zaptheimpaler12 hours ago|

[-]

SQLite does not support concurrent writes at all (on a single machine), a single writer process locks the entire database.
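The one-writer rule can be demonstrated in a few lines. A sketch with Python's built-in `sqlite3` module, two connections to the same file, and `timeout=0` so a lock conflict fails immediately instead of waiting; the table `t` and the file name are placeholders. Note this only shows writers excluding each other; whether readers are blocked as well depends on the journal mode (in WAL mode readers keep going).

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "app.db")

# isolation_level=None: autocommit, so BEGIN/COMMIT are issued explicitly.
# timeout=0: do not wait for a busy lock, raise right away.
writer_a = sqlite3.connect(db_path, timeout=0, isolation_level=None)
writer_b = sqlite3.connect(db_path, timeout=0, isolation_level=None)
writer_a.execute("CREATE TABLE t (x INTEGER)")

writer_a.execute("BEGIN IMMEDIATE")       # A takes the write lock
writer_a.execute("INSERT INTO t VALUES (1)")

error_message = None
try:
    writer_b.execute("BEGIN IMMEDIATE")   # B cannot take it while A holds it
    second_writer_blocked = False
except sqlite3.OperationalError as exc:
    second_writer_blocked = True
    error_message = str(exc)

writer_a.execute("COMMIT")                # A releases the lock

writer_b.execute("BEGIN IMMEDIATE")       # now B gets its turn
writer_b.execute("INSERT INTO t VALUES (2)")
writer_b.execute("COMMIT")

rows = writer_b.execute("SELECT x FROM t ORDER BY x").fetchall()
```

With a non-zero `timeout` (the default is 5 seconds), the second writer would simply wait for the first to commit, which is the "queue" other comments in this thread describe.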

reply

upvote

by slashdave11 hours ago|

[-]

> you seem like the inexperienced one to me

There is irony here

reply

upvote

by bborud21 hours ago|

[-]

Computer science no more gets its hands dirty with concrete software than physics is primarily about building bridges.

It is not «a foundational principle of computer science».

reply

upvote

by mxey6 hours ago|

[-]

I think the SQLite website itself says it best:

> SQLite does not compete with client/server databases. SQLite competes with fopen().

reply

upvote

by 4 hours ago|

[-]

deleted

reply

upvote

by meszmate11 hours ago|

[-]

Most apps do not actually need the concurrency capacity that Postgres or MySQL are designed for.

reply

upvote

by BoredPositron21 hours ago|

[-]

I worked on an app that had sqlite databases per user... it was fine.

reply

upvote

by sevenzero21 hours ago|

[-]

Isn't concurrency also limited by your machine's disk speed for writes, what difference does it make if you write sequentially vs concurrently? Why does concurrency even matter for databases?

reply

upvote

by malisper21 hours ago|

[-]

> Isn't concurrency also limited by your machine's disk speed for writes, what difference does it make if you write sequentially vs concurrently? Why does concurrency even matter for databases?

For a simplified example, having three processes reading blocks X, Y, Z in parallel is much faster than having a single process read block X, wait for the read to finish, read block Y, wait for the read to finish, read block Z and wait for the read to finish.

reply

upvote

by refulgentis21 hours ago|

[-]

> Isn't concurrency also limited by your machine's disk speed for writes

Yes, in theory: given a large enough database, and a disk that can only do one operation at a time, and a large enough operation that touches enough of the database. In practice, in a SQLite single tenant scenario? No, not at all.

> what difference does it make if you write sequentially vs concurrently. Why does concurrency even matter for databases?

As soon as your codebase involves reacting to events independently of a user taking action it becomes a practical concern. Generally, this is a broad question and has 1,000,000 answers.

EDIT: Originally I had "I think you understand generally, no?" appended but realized that's not helpful at all, if you did, you wouldn't be asking.

Something that may help is imagining what'd happen if a DB wasn't thread safe / didn't allow multiple writers. Ex. in SQLite's case, it allows multiple write operations to take place but there's a one-at-a-time queue. If we didn't have databases that were able to execute multiple writes simultaneously, you'd need a separate database for each concurrent writer you expect, and you'd effectively have a global lock. Orderly scaling would be ~impossible unless you did something crazy like have a single server per user

reply

upvote

by sevenzero21 hours ago|

[-]

I guess I need to dive deeper into this as I do not understand the implications you gave me, but I appreciate the attempt. Generally I understand why concurrency is good in many cases, I just don't get why it's important for database stuff too.

Edit: thanks for clarifying in the edit, makes a lot more sense.

reply

upvote

by strbean21 hours ago|

[-]

Imagine if every tweet had to go through a one-at-a-time queue before being persisted. There's about 6000 tweets per second, so you would have to be able to save them at <0.17ms per tweet or else you would become backlogged. If you are getting backlogged, you have to buffer those incoming tweets somewhere until they can be written, and eventually that buffer gets full and you start losing tweets.
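The arithmetic behind that budget, as a small Python sketch. The 6000/s figure is the one quoted above; the two per-write latencies are made-up inputs to show what happens on either side of the budget.

```python
TWEETS_PER_SECOND = 6000  # arrival rate quoted in the comment above

# A strictly serial writer must finish each write within this budget.
budget_ms = 1000 / TWEETS_PER_SECOND

def backlog_after(seconds, write_ms, arrivals_per_second=TWEETS_PER_SECOND):
    """Items still waiting after `seconds` if each write takes `write_ms` ms."""
    served_per_second = 1000 / write_ms
    return max(0.0, (arrivals_per_second - served_per_second) * seconds)

print(f"budget per write: {budget_ms:.3f} ms")
# Hypothetical latencies: one slower than the budget, one faster.
print(f"backlog after 60 s at 0.25 ms/write: {backlog_after(60, 0.25):.0f}")
print(f"backlog after 60 s at 0.10 ms/write: {backlog_after(60, 0.10):.0f}")
```

At 0.25 ms per write the writer serves 4000 per second, so the backlog grows by 2000 per second; at 0.10 ms it keeps up and the backlog stays at zero.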

reply

upvote

by goobatrooba19 hours ago|

[-]

Maybe that too is a naive question, but there's a large range between single user and 6000 tweets per second - most of our apps will never reach anything approaching even one save a second. So where to draw the line? I have so far gone the sqlite route for my hobby apps as it's so easy to handle and doesn't require setting up two docker containers for a single app. Am I painting myself into a corner in case my apps ever do become relevant?

reply

upvote

by refulgentis18 hours ago|

[-]

Excellent question, and I spent so many years asking myself this over and over. You asking it made me realize I just...don't anymore. So allow me to blather a bit / free associate because I won't be sure why myself until I've written it out.

TL;DR: whatever works for you is the right decision. (which isn't helpful, I heard this so many times and as the recipient, I thought "That's nice. Now how do I choose what works for me?")

I finally had to use Postgres a couple years ago after a career of only SQLite - startup founder & iOS app developer using SQLite, turned Googler on Android, turned doing-my-own-thing.

In retrospect, I have made only one bad decision:

I went way out of my way to make SQLite work at my 2009-iOS-startup. It was a restaurant point of sale system, and to allow a networked system, one of the iOS devices would act as a server. This was a really cool trick, even an advantage in marketing that was appreciated by users. It meant the restaurant could continue to operate if the internet went down. But it eventually became clear owners loved having internet-based access too, ex. to do reporting/financial analysis over the data. And I kept contorting, instead of moving past my fear of getting into things I didn’t know, I instead did some like rudimentary thing over port forwarding. The bad decision here was riding one horse for so long and letting it affect the product, having a real server database would have allowed for a lot more features, think, first party gift cards, and a 100 others.

After leaving Google I needed server-side storage and fought and fought to avoid it. Then it turned out Postgres is easy and, just like SQLite, 99.999% of the time I don’t even know I’m using it.

In retrospect, there’s ~0 switching cost to these, particularly in the age of LLMs. If you do need something more one day, it’ll be easy to do, and if you have to do it in a rush because you’re successful, you’re in Good Problem territory.

Hope that helped, after writing it out, dunno how convincing it is. Feel free to follow up, I appreciate the curiosity/framing because I had the same thought for so long.

reply

upvote

by password43217 hours ago|

[-]

Thank you for sharing a detailed anecdote from production; there's not many of those around here.

reply

upvote

by AlotOfReading18 hours ago|

[-]

If we imagine 1 tweet = 1 transaction, that's only 6k tps. 6k tps is completely achievable, dare I say even pedestrian for an optimized database. And most systems are operating far below the scale of Twitter/X.

reply

upvote

by Scaevolus18 hours ago|

[-]

Sqlite can quite easily do 5000+ insert+commits per second on typical NVMe drives.

Speed is rarely the constraint that makes it unsuitable for an application.

reply

upvote

by klabb315 hours ago|

[-]

Round trip time is actually much faster than Postgres, since there’s no need to touch the network. You can get massive single threaded throughput. In order to achieve comparable throughput in Postgres you need a large number of concurrent connections, since each conn spends most of its time passing messages, deserializing etc (with a much larger total amount of overhead). There are a surprising number of bottlenecks and misconfigurations that can tank performance of networked systems, particularly DBs.

Like you suggest, the reason for not picking SQLite is not reliability, speed, etc. Networked DBs allow decoupling between app and db servers, which have operationally different characteristics. But most importantly, you can have multiple apps access the same DB at the same time. Eg analytics, one off queries, any 3p app that interacts with your data directly.

reply

upvote

by sevenzero20 hours ago|

[-]

While I understand your point and like the explanation, I gotta make the joke that some Tweets should be lost

reply

upvote

by teaearlgraycold22 hours ago|

[-]

Well if you run a tiny single-threaded app then SQLite is a nice simplification over spinning up a separate machine for Postgres.

reply

upvote

by ai_fry_ur_brain21 hours ago|

[-]

I use Postgres for very simple apps. I have a Dockerfile I use in my boilerplate repo. It takes a single make command for me to build, start and run migrations. It's as simple as using SQLite.

reply

upvote

by tasuki19 hours ago|

[-]

But now you have another process to babysit. How do you keep it healthy? And you have to ensure the client-server communication won't break.

For me the main benefit of sqlite is that it's a library rather than an app.

reply

upvote

by not_kurt_godel17 hours ago|

[-]

> But now you have another process to babysit. How do you keep it healthy?

I've been assured by many HN users that running apps/sites on a single VPS requires near-zero maintenance or monitoring to achieve acceptable uptime 24/7/365 for years on end, sooooo...just pretend it will never fail like your main server process?

reply

upvote

by ai_fry_ur_brain16 hours ago|

[-]

I've been assured by many HN users that you must have 24/7/365 uptime for everything in case one of your 10 bi-monthly users decides to log on.

reply

upvote

by not_kurt_godel15 hours ago|

[-]

Call me old-fashioned and quaint, but I don't like to build software that doesn't work all the time if I can help it, whether it's for 10 users or 10 million.

reply

upvote

by bdangubic16 hours ago|

[-]

24/7/365 is needed (or achieved) just about never. our big tech is proving 90% will soon be utopia as well. being down has always been fine for 99.999975% of all projects on the planet.

reply

upvote

by not_kurt_godel15 hours ago|

[-]

Ok, now tell me the stat by percentage of overall market revenue rather than project count

reply

upvote

by ai_fry_ur_brain19 hours ago|

[-]

I have boilerplate for client-server communication that makes it pretty trivial to build on top of.

I'm not saying that SQLite isn't useful, I'm mostly saying that using Postgres doesn't have to be complicated.

reply

upvote

by turtlebits19 hours ago|

[-]

It's 2x the infra. You have to manage an additional process, auth, backups, logging, etc.

reply

upvote

by eterm20 hours ago|

[-]

Or you can run postgres on the same machine as the application, which lets you much more easily migrate if the time comes when you need to scale to multiple application servers.

There's a world between "local file" and "network DB server", running a DB server locally has lots of benefits from being able to easily query from outside if needed to forcing you to consider concurrency without the latency overhead of a network hop.

reply

upvote

by s_ting76520 hours ago|

[-]

This decision tree doesn't make much sense to me. Why would someone forgo performance today in favor of adding a completely unnecessary network layer to every DB query in order to "satisfy" future imaginary "scaling concerns"?

reply

upvote

by eterm17 hours ago|

[-]

Because you don't add a network layer by running a database locally.

reply

upvote

by eddd-ddde20 hours ago|

[-]

That's still orders of magnitude more complexity for no real benefit. A migration from sqlite to postgres, if really required, is not that hard.

reply

upvote

by teaearlgraycold18 hours ago|

[-]

Yes, postgres should support a superset of SQLite functionality.

reply

upvote

by wat1000018 hours ago|

[-]

Now you've added a substantial dependency, and annoying setup requirements. Good luck doing this for a native app on mobile or desktop.

reply

upvote

by ummonk12 hours ago|

[-]

If someone is talking about "spinning up a separate machine" for Postgres, they're not talking about a desktop or mobile app...

reply

upvote

by eterm17 hours ago|

[-]

Obviously SQLite is the best choice for a mobile or desktop app, that's not what's being discussed here.

reply

upvote

by onlyrealcuzzo22 hours ago|

[-]

It's almost as if Postgres isn't perfect, and one size shoe doesn't fit all.

Some people want some of the benefits you get from SQLite.

SQLite is obviously not perfect, but it's an incredible piece of software, and people regularly find good ways to make use of excellent pieces of software.

reply

upvote

by switchbak20 hours ago|

[-]

I mean - I agree for the typical multi-user, SaaS webapp. But I don't think that's what these folks are proposing. If they are - yeesh, count me out.

If on the other hand they're talking about single-user, software in the small - hell yeah. In fact, I'd also promote DuckDB in this regard (mostly for analytics) - with the power of a single machine these days, you can do a surprising amount and never have to worry about distribution. Unless you know you'll have to, in which case you're probably just digging yourself a hole?

reply

upvote

by nitwit00518 hours ago|

[-]

The reason the parent post is complaining that it doesn't make sense, is because people have indeed pushed the idea of using SQLite as an alternative for web apps like that.

reply

upvote

by lunar_mycroft20 hours ago|

[-]

The typical multi-user SaaS webapp doesn't have anywhere near enough users to overwhelm a single SQLite instance. Of the few that do succeed to the point where that's no longer true, a significant fraction can use techniques like sharding to stretch SQLite further.

reply

upvote

by faizshah19 hours ago|

[-]

Scale to zero is very useful.

reply

upvote

by fragmede21 hours ago|

[-]

So teach them. If you want to bring up computer science fundamentals, the question is where does SQLite sit with regards to the CAP theorem. Consistency, Availability, and Partition tolerance. SQLite isn't a distributed system, so there are no partitions to tolerate, so it's a CA system. Other databases make different tradeoffs. For systems that don't need concurrent writes, SQLite is pretty great! There are no users to manage, no permissions, no daemon to run, no server and port to mix up. Just open a file on disk using a library.

reply

upvote

by refulgentis21 hours ago|

[-]

Strawman, no? "run an Obelisk server with a SQLite database", now we're distributed.

SQLite is a nice local store. It's this server stuff that I don’t grok, well, yet. :)

reply

upvote

by 9rx20 hours ago|

[-]

In the beginning apps and SQL were co-mingled. Oracle eventually came along and noticed that people wanted SQL on the network so that many different apps, running on different computers, could all access the same data. But then people realized that clients really want rich, 'tree'-like data, not simple rows and columns, so people started sticking networked databases in front of networked databases to serve as a transformation system. And now people are realizing that the second networked database layer is redundant and never used beyond what is required for the client-facing network database, so they are moving the storage back into the first network database layer, just like Oracle did all those years ago. What is old is new again.

reply

upvote

by fragmede21 hours ago|

[-]

What changed is SSDs. SSDs means that local access is faster than hitting the network. An expensive SAN stopped making sense because of this in specific cases. So for read heavy, or even read only database loads, you copy the SQLite file to the node that's processing the file, and just update that file whenever the data does get changed.
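
A minimal sketch of that read-replica pattern with Python's built-in sqlite3 module: copy the file to the local node, then open the copy read-only through a `mode=ro` URI so stray writes fail loudly. The function name and paths are hypothetical, and the plain file copy assumes nobody is writing to the source at that moment.

```python
import shutil
import sqlite3
from pathlib import Path


def open_read_only_replica(source_path: str, replica_path: str) -> sqlite3.Connection:
    """Copy the database file locally and open the copy in read-only mode."""
    # Assumption: the source is quiescent during the copy (no writer mid-transaction).
    shutil.copyfile(source_path, replica_path)
    # as_uri() yields a file:/// URI; mode=ro asks SQLite to refuse all writes.
    uri = Path(replica_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Refreshing the replica is then just another copy followed by reopening the connection.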

reply

upvote

by dboreham20 hours ago|

[-]

And of course there are now several responses proving your point.

reply

upvote

by MagicMoonlight20 hours ago|

[-]

How many production apps do you think have enough users to justify these huge DB servers?

reply

upvote

by mxey19 hours ago|

[-]

Huge?

reply

upvote

by ihateolives9 hours ago|

[-]

Everything is huge compared to sqlite.

reply

upvote

by bastardoperator20 hours ago|

[-]

Are you one of my enterprise customers? What if your workload does not require write concurrency?

reply

upvote

by wat1000018 hours ago|

[-]

Someone with experience would know that concurrency isn't a universal requirement.

reply

upvote

by pstuart20 hours ago|

[-]

Sure, SQLite doesn't solve every problem -- but in many cases it solves the need at hand with the reward of one less piece of infra required to support it.

I see obsessions with tooling/solutions constantly from experienced devs who fall in love with the original solution and think it's the only way to do things -- so the experience part cuts both ways.

reply

upvote

by doctorpangloss21 hours ago|

[-]

sqlite is more like a file format than a database. it competes with .xlsx.

> "SQLite for everything" crowd is a little bit inexperienced.

every time i see it in a real application, it becomes a huge focus of issues (for example: jellyfin, hermes, openwebui, comfyui)

reply

upvote

by fragmede21 hours ago|

[-]

What kind of issues commonly arise?

reply

upvote

by doctorpangloss19 hours ago|

[-]

anything that requires more than 1 user or not being down all the time

reply

upvote

by refulgentis21 hours ago|

[-]

I absolutely 100% do not understand it either. At all. Every time I've tried over the last year or two, I've come away with the conclusion it's something that sounds cool (to me too!) but is guaranteed to cause more problems than more obvious solutions.

That being said I'd kill for someone who used it and benefited to explain it to me in a practical sense. (specifically where syncing is involved, and syncing a subset of the SQLite is necessary. If it's "just" a document store that's treated like a blob for syncing/backup, that's familiar. If it's all in one storage but only local, that's familiar.)

Re: TFA, I guess it would have helped if I knew what Obelisk was, which is on me, and a more in-depth explanation of how this ties into AI/agents, which is on the industry/writer.

reply

upvote

by wat1000018 hours ago|

[-]

It's very likely that you have multiple SQLite databases in your pocket right now. It's one of the most widely deployed pieces of software on the planet. If your conclusion is that it's guaranteed to cause more problems than other solutions, then that's on you.

reply

upvote

by refulgentis18 hours ago|

[-]

Correct! I'm not "worried" about it, I've been putting SQLites in your and my pocket for the last 17 years.

I don't want to be glib and leave it there, even though I'm slightly annoyed you missed several sigils in my post that I was well past that.

The point is, for the not in your pocket case, for the not a singular document store case, I'm curious what the use case is.

reply

upvote

by pluralmonad36 minutes ago|

[-]

I use it to keep infra spend low for some systems I built/maintain for a handful of volunteer orgs. These systems have multiple users, dozens to a couple hundred. I just serialize writes in app code. Backup the db files to blob storage every so often and don't think about it much more.
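
One way that setup could look, as a sketch only and not the actual code behind those systems: a single shared connection with a lock around every write, plus a snapshot made with the `Connection.backup` method from Python's sqlite3 module. The class and method names are hypothetical, and uploading the snapshot to blob storage is left out.

```python
import sqlite3
import threading


class SerializedWriter:
    """Funnel every write through one connection guarded by a lock."""

    def __init__(self, db_path: str) -> None:
        # check_same_thread=False lets several threads share the connection;
        # the lock is what actually keeps their writes from interleaving.
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._lock = threading.Lock()

    def write(self, sql: str, params: tuple = ()) -> None:
        with self._lock:
            with self._conn:  # commits on success, rolls back on an exception
                self._conn.execute(sql, params)

    def backup_to(self, dest_path: str) -> None:
        """Write a consistent snapshot of the database to dest_path."""
        with self._lock:
            dest = sqlite3.connect(dest_path)
            try:
                self._conn.backup(dest)
            finally:
                dest.close()

    def close(self) -> None:
        self._conn.close()
```

Reads can go through separate connections; only writes need to pass through the lock.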

reply

upvote

by faangguyindia15 hours ago|

[-]

I've replaced all of these with Go + SQLite:

1. Intercom 2. Zendesk 3. Email marketing 4. Kanban 5. Todo 6. Our billing stack 7. Our issue tracker 8. Our forum 9. Uptime monitor 10. PagerDuty (clone)

I have dozens of products I sell, so I thought: why not build everything ourselves?

All of these run on the same server and use very little memory. I replaced all the SaaS tools we used with these.

I also moved to dedicated servers and dropped costs to about 1/10th of what we were paying for managed cloud solutions, while maintaining the same HA and even achieving lower latency (partly because noisy neighbors on VPSes were increasing tail latency).

We used to spend a ton on this stuff. These have now been in production for four months and have only needed minor updates.

Deployment is dirt simple. No Docker, no Kubernetes—just a systemd service and a binary built on the dev machine and deployed.

We also used to pay for services like MaxMind and IPData. I ended up hand-rolling my own IP geolocation service, which, in my tests, outperforms most existing solutions.

It all started with replacing Uptime Robot. Then I got more confident and replaced PagerDuty. After that, I replaced Intercom.

Finally, I had always heard people say, "Don't build your own billing stack." But I said YOLO, let me make that mistake myself. So I studied our existing billing solution, developed my own, and rolled it out. So far, we've had zero issues with it.

Caddy in front.

I found that we only use maybe 1–5% of the features most SaaS products offer, while the features we actually need keep getting buried deeper and deeper inside these "enterprise-grade" platforms, making our workflows more difficult.

I won't show my commercial products because our partners and clients probably wouldn't appreciate knowing how cheap I am—but I call it being resourceful.

I can show my free app, though, which has 20,000+ users and was launched recently: https://macrocodex.app/

It only uses the Zendesk clone. Email is handled through Cloudflare routing, so we pay almost nothing to run the app.

reply

upvote

by Xeoncross1 hours ago|

[-]

Thank you for sharing. If you don't mind, can you share some Go+SQLite learnings as someone actually pulling this off?

1) How do you do backups? Do you use github.com/benbjohnson/litestream? CRON job backup with rsync?

2) Any issues with large databases and many clients? Is there a TPS or DB size where SQLite becomes problematic?

3) How do you deploy new binaries and safely shutdown the old instance? Caddy change to route to new binary + Go's HTTP server graceful-shutdown on old instance?

4) Do you use a pure-Go SQLite lib or one of the CGO libs?

reply

upvote

by password43217 hours ago|

[-]

This is the way, with "bus factor" as the one downside.

What happens when you get hit by a bus or someone with higher annual revenue tracks you down because of this comment and hires you away from this custom software stack with a bigger slice of the profits?

It sounds like there's also a bus factor for your server but I'm sure you're aware of that though it sounds like your clients aren't.

reply

upvote

by FR102 hours ago|

[-]

I'm also a Go+SQLitemaxxer. For my first SaaS I focused on keeping the list of external services very short, as I basically couldn't afford them. I only picked services that were a) vital and b) could be replaced with a homegrown/FOSS alternative.

Besides keeping the costs down, I find that this approach makes it way more enjoyable to build and easier to manage than having 10 services/subscriptions.

*PS: I think that you could add a small QR Code for the iOS/Apple installation on your app's website.

reply

upvote

by kukkeliskuu13 hours ago|

[-]

I am doing approximately the same for some of these as well, plus CRM. For me, CRM is just a list of contacts with a state and some other fields, plus record who was contacted and when. But I can directly pull information that is relevant to me. Integrating to an external CRM is more work and arguably more risky than rolling your own.
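
A sketch of what such a minimal CRM schema might look like in SQLite, via Python's sqlite3 module. All table and column names here are made up for illustration; the real fields would depend on what is relevant to you.

```python
import sqlite3

# Hypothetical schema: contacts with a state, plus a log of who contacted whom and when.
SCHEMA = """
CREATE TABLE contacts (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT,
    state TEXT NOT NULL DEFAULT 'new'
);
CREATE TABLE touches (
    id           INTEGER PRIMARY KEY,
    contact_id   INTEGER NOT NULL REFERENCES contacts(id),
    contacted_by TEXT NOT NULL,
    contacted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_crm(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the CRM database and make sure foreign keys are enforced."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def last_touch(conn: sqlite3.Connection, contact_id: int):
    """Return (contacted_by, contacted_at) for the most recent touch, or None."""
    return conn.execute(
        "SELECT contacted_by, contacted_at FROM touches "
        "WHERE contact_id = ? ORDER BY contacted_at DESC, id DESC LIMIT 1",
        (contact_id,),
    ).fetchone()
```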

reply

upvote

by jedimastert4 hours ago|

[-]

Open source it all, I double dog dare you

reply

upvote

by imhoguy8 hours ago|

[-]

Thanks for your words of wisdom. I share the same sentiment about simple apps. I self-learned coding through the 1990s and I am sick of herding distributed, overgrown software at work.

Quick question: how do you inspect the database contents of your deployed (micro)apps? I used SQLite in one production app, and what I didn't like was having to use the terminal sqlite client on the server or copy the DB to my laptop to query it with sqlitebrowser. With my Postgres server it is much simpler to just query with a rich GUI client over an SSH tunnel.

reply

upvote

by tonymet15 hours ago|

[-]

I rolled my own uptime monitor, but how do you vary region or do residential testing?

reply

upvote

by faangguyindia15 hours ago|

[-]

I actually don't.

I just have uptime service hosted outside of our main infra. It connects to my service called Siren, which alerts me on my phone with an alarm on full volume with SWAT cat intro.

It's good enough for what we do; we barely have any downtime. But it helped me figure out the 6s downtime we would experience when our spot instances get knocked out, so it helped me increase the health check frequency.

6s downtime is a lot when you are getting hammered at 100 RPS.

reply

upvote

by shukantpal22 hours ago|

[-]

SQLite is surprisingly performant for single node applications even when comparing to Postgres. Postgres consumes a lot more memory and requires IO to hop through IPC whereas you can keep everything in process in SQLite with a shared connection pool.

I've been testing different storage engines for my agent harness and I can get up to 7.5k concurrent sessions on a single vCPU with SQLite, whereas Postgres crashes or runs out of connections.

[0] https://github.com/impalasys/talon/pull/23#issuecomment-4577...

reply

upvote

by bob102921 hours ago|

[-]

When used properly, SQLite is effectively an in-process method invoke. If the only remaining things in the way are your runtime, kernel, file system and a local NVMe storage device, you may find it massively outperforms hosted alternatives.

Leaving the current thread is where you lose the game in terms of latency. SQLite can work on timescales measured in microseconds if you don't force interthread communication.

reply

upvote

by themafia18 hours ago|

[-]

> an in-process method invoke.

Pedantically, it's an in-process virtual machine for operating on structured data. Which is precisely where it shows its weakness, in my experience: when you end up with complicated table structures and complex join mechanics, you need to start thinking ahead of the query planner and VM code a bit in order to maintain reasonable performance.

There are more than a few unusual things worth knowing:

https://sqlite.org/optoverview.html

reply

upvote

by onlyrealcuzzo22 hours ago|

[-]

> SQLite is surprisingly performant for single node applications even when comparing to Postgres.

In the context of SQLite being understood to be a quite excellent piece of software - shouldn't we expect it to be?

In the context of a single-node, Postgres is overkill. It should not be expected to be competitive with SQLite.

This is almost like benchmarking an in-memory HashMap to Redis and being surprised that it performs well in ideal conditions.

reply

upvote

by shukantpal22 hours ago|

[-]

Yes, agreed on SQLite/Postgres. But I'm going to benchmark RocksDB next and see what the performance characteristics are. I suspect the LSM tree storage engine of RocksDB might perform better since agents are so write heavy when running highly concurrent workloads. After all, you are streaming LLM tokens into disk and fanning them out to subscribed clients.

reply

upvote

by onlyrealcuzzo22 hours ago|

[-]

You might want to start here: https://docs.cozodb.org/en/latest/releases/v0.3.html

reply

upvote

by password43216 hours ago|

[-]

Thank you for sharing a benchmark pretty much exactly like the parent comment is planning to do.

Also thanks for the incidental exposure to a DB I'd never heard of before... with a browser-based demo CozoDB may be a good way to start experimenting with Datalog.

reply

upvote

by andriy_koval21 hours ago|

[-]

That project has 0 commits for 2 years.

reply

upvote

by onlyrealcuzzo20 hours ago|

[-]

What does that have to do with their research on the exact topic OP was looking into?

reply

upvote

by andriy_koval20 hours ago|

[-]

Abandoned research of unknown quality is strong signal to downprioritize that direction

reply

upvote

by recursive21 hours ago|

[-]

Sounds pretty stable

reply

upvote

by password43217 hours ago|

[-]

v0.7 with the following disclaimer:

> Versions before 1.0 do not promise syntax/API stability or storage compatibility.

reply

upvote

by m2f220 hours ago|

[-]

There's a wide gap between files and multi-partition databases. Sorry, but running databases in a container is not for me whenever real production stuff is on the table.

Personally, lots of ETL can just be taken care of locally without involving enterprise databases. In such cases, DuckDB is 5x-10x better than SQLite and orders of magnitude simpler/faster than spinning up a dedicated Postgres database.

For general scripting, a 20-line awk script is no match for a much cleaner, more robust, more maintainable equivalent SQL script based on DuckDB.

I just hope MotherDuck don't need to pump/dump for IPO - it would be sad losing that tool for the usual corporate greed.

reply

upvote

by szarnyasg20 hours ago|

[-]

Hello, DuckDB devrel here. First, thanks for the kind words :)

Second, it's funny you should mention the 20-line awk script. I was making a very similar argument yesterday at the Ubuntu Summit: at some point, using shell scripts with GNU coreutils becomes impractical, while DuckDB SQL scripts scale better in terms of complexity and maintainability (and often also performance). My slides are here: https://blobs.duckdb.org/slides/duckdb-ubuntu-summit-2026.pd... (pages 32 to 36)

Third, MotherDuck develops a closed-source DBaaS on DuckDB. They build on DuckDB, and you connect to MotherDuck with DuckDB, but they are a separate VC-funded company headquartered in Seattle. DuckDB is developed by DuckDB Labs, a bootstrapped (revenue-funded) company in Amsterdam. And the IP of the project is in a third organization: a Dutch non-profit called the DuckDB Foundation. For details, see https://duckdb.org/faq#how-are-duckdb-the-duckdb-foundation-...

reply

upvote

by infinet3 hours ago|

[-]

I use DuckDB and like it. Many people in this post mentioned GB-level JSON, so they have large amounts of data. Being column-based, DuckDB uses more RAM as row count grows. That can be an advantage or a disadvantage depending on whether memory is constrained. A traditional row-based DB such as SQLite can deal with a large database using less memory.

reply

upvote

by Xeoncross1 hours ago|

[-]

So is https://github.com/obeli-sk/obelisk the Rust version of https://github.com/temporalio/temporal (Go)? Can you guys add a comparison between them on the site?

reply

upvote

by prmph19 hours ago|

[-]

> Postgres ... is the right choice when you need higher availability, broader shared scalability, or other deployment properties that are better served by a network database. It is also the better fit when asynchronous replication to object storage is not the durability model you want... Many workflow systems do not need that on day one and should not start with more infrastructure than their state actually demands.

------

I see this kind of YAGNI thinking a lot, but in my view, it must be balanced against the effort you'd put into resolving any edge cases and adapting current architecture to your use case.

Imagine you deploy SQLite and, though it's fine by itself, you keep running into unforeseen challenges with the use to which you are putting it. You'd need to sink valuable time and effort into addressing those. Then, when you have outgrown it, you'd need to spend additional valuable time doing the same with Postgres.

This is why, when it comes to architecture, I increasingly find myself over-engineering a bit. Assuming there is a good chance you might need to upgrade your architecture in the not-too-distant future, that approach is actually quite efficient. I find that I am able to uncover a lot of potential gotchas, which feeds back into what the simplified current architecture should be, and helps me understand the roadmap I'm facing very well. I also avoid wasting too much time going too deep in directions that make sense now, but need a lot of plumbing to get right, when I can see that I'd likely have to throw it all out in a few years. Going from A -> B -> C -> D, where each step is the optimal good-enough-for-now architecture but requires a lot of work to stabilize and iron out the kinks, is much less efficient than exploring D well enough to know whether you should build A, B, or C now.

Basically, some over-engineering, if done right, is not wasted. It cuts right to the heart of what you are dealing with, efficiently, and allows you to make (maybe) simpler but informed choices about how best to allocate your development resources now.

reply

upvote

by narnarpapadaddy18 hours ago|

[-]

My version of this is the “N+1” principle. Build for one more foreseeable use case than you currently have. The domain model will click in when you need to generalize a solution, and you’ll gain the ability to see your particular solution as one of several to the problem, and thus evaluate fit and tradeoffs more clearly.

Don’t do N+2. The goal isn’t to predict the future, nobody can do that. The goal is a durable understanding of the domain and the best fit implementation you can get with that current understanding and resources.

That said, SQLite passes that bar for me in most use cases.

reply

upvote

by bob10297 hours ago|

[-]

> This is especially attractive for AI agents and AI-generated workflows. Those systems are often bursty, experimental, and easier to reason about when each agent or tenant has a small self-contained unit of state.

I am finding that the most important thing is one big, consistent data warehouse that is updated with the state of the business as close to real time as we can get.

SQLite is not really great at this particular problem. Something like Postgres or SQL Server would be much more suitable for an OLAP data warehouse that can serve clients (AI agents) while simultaneously merging massive record sets from upstream business systems. These products also offer intricate permissions control. You can prove to an auditor that your AI solution will never see tables or rows it's not supposed to. SQLite doesn't even have a concept of a user, role or login.

> The compute can stay cheap and disposable.

Again, hosted sql is better aligned. The alternative is DIY hosted sql (SQLite + some other magic) which immediately violates this rule.

reply

upvote

by throwaway586701 hours ago|

[-]

Holy sticky header... It takes literally half the screen on mobile. Shit like this makes me wonder: do you even look at your website? At least sometimes?

reply

upvote

by psanford16 hours ago|

[-]

I wrote a library[0] to let you concurrently update a sqlite db in s3 safely. It uses the little known sqlite sessions extension plus s3 compare-and-swap on a small metadata file to make this work reasonably efficiently and safely. I have been enjoying it for a bunch of small projects where I want a lambda function to have a db for state but I don't want to pay for a full database instance.

[0]: https://github.com/psanford/s3db
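
For readers unfamiliar with the compare-and-swap part, here is a generic sketch of the optimistic-concurrency loop. It is not s3db's code or API: `VersionedStore` is a hypothetical in-memory stand-in for an object store with conditional writes, and the payload is a plain byte string rather than a metadata file or a SQLite changeset.

```python
import threading
from typing import Callable, Tuple


class VersionedStore:
    """Hypothetical in-memory stand-in for an object store with conditional writes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._version = 0
        self._value = b""

    def read(self) -> Tuple[int, bytes]:
        with self._lock:
            return self._version, self._value

    def compare_and_swap(self, expected_version: int, new_value: bytes) -> bool:
        """Write new_value only if nobody else has written since expected_version."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._version += 1
            self._value = new_value
            return True


def update_with_retry(
    store: VersionedStore,
    transform: Callable[[bytes], bytes],
    max_attempts: int = 100,
) -> bool:
    """Read, transform, and conditionally write; start over if another writer won."""
    for _ in range(max_attempts):
        version, value = store.read()
        if store.compare_and_swap(version, transform(value)):
            return True
    return False
```

The key property is that a losing writer never overwrites the winner's update; it rereads and reapplies its change on top of the new state.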

reply

upvote

by password43216 hours ago|

[-]

This is exactly what I had dreamed of except without being aware of the session extension... I put it off because I thought I would have to create a layer opening sequential partial sqlite dbs and somehow filtering to the most recent version of each record in-between compactions.

It sounds like sqlite sessions handles the hardest part and you've sanded down some of the rough edges while implementing the glue bits for s3 (generously licensed MIT); thanks for the heads-up!

reply

upvote

by stephenlf21 hours ago|

[-]

Can’t wait to see the next iteration of this idea with “Logs are all you need for durable workflows.”

reply

upvote

by mrkeen10 hours ago|

[-]

Yep. But we all know that one machine can and will fail (or be patched and restarted), so the log needs to be distributed.

Different workflows should probably go in different buckets or "topics" for clarity. Since it's distributed, the system must guarantee that the log items are stored in the same ordering ("offsets") among the nodes.

Not a bad way to do things.

reply

upvote

by _karie_15 hours ago|

[-]

Wait no further. It's already happening.

One reason why a "logs are all you need" solution may fail: untrusted-log-as-injection[1].

Check those SBOM, and don't forget to include their CICD pipelines[2].

[1] https://news.ycombinator.com/item?id=48315440

[2] https://github.com/jqwik-team/jqwik/issues/708#issuecomment-...

reply

upvote

by friendly_deer17 hours ago|

[-]

In all seriousness, I’d take a “s3 is all you need for durable workflows” and use it in data processing applications that move data from s3 -> s3 with no other dependencies.

reply

upvote

by gchamonlive21 hours ago|

[-]

Are logs all you need for durable workflows? I'm confused here. How would you persist and query nested or related data over logs? By logs I assume you mean something like Elasticsearch or Meilisearch?

reply

upvote

by wolttam20 hours ago|

[-]

Pretty much every durable system has an intent log of some sort. The log provides durability, the database system just integrates that log into a more queryable format.
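
A toy illustration of that split, assuming a JSON-lines file as the log and a plain dict as the "queryable format" (both are my choices for the sketch, not how any particular database does it): appends are fsynced before returning, and the state is rebuilt by replaying the log.

```python
import json
import os


def append_entry(log_path: str, key: str, value) -> None:
    """Append one record and force it to disk before returning."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())  # durability comes from here, not from the dict


def replay(log_path: str) -> dict:
    """Rebuild the queryable state by applying log records in order."""
    state: dict = {}
    if not os.path.exists(log_path):
        return state
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                break  # a torn final record from a crash is ignored
            state[entry["key"]] = entry["value"]
    return state
```

If the process dies, the dict is gone but `replay` brings it back; the log is the source of truth and the dict is just a faster view of it.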

reply

upvote

by gchamonlive18 hours ago|

[-]

I swear it didn't occur to me that that meant WAL, makes much more sense now LOL

reply

upvote

by deathanatos20 hours ago|

[-]

I assume they meant a log like a WAL. A WAL should be (quite literally?) all you need for durable workflows.

A distributed WAL (to survive a machine death) would also probably be something I'd want, and … something I'm not sure you're getting directly from SQLite.

reply

upvote

by gchamonlive18 hours ago|

[-]

Is it common to use logs as a proxy for write-ahead logs?

reply

upvote

by gchamonlive5 hours ago|

[-]

Folks, this is meant to be an honest question, not a snarky comment. I'm not a DBA, I'm DevOps/SRE, and logs for me have always meant execution logs. I'm just curious whether, among those working in the database domain, "logs" is used to refer to the WAL.

reply

upvote

by fourside20 hours ago|

[-]

I read the parent's comment as sarcasm and not a serious suggestion.

reply

upvote

by Rapzid19 hours ago|

[-]

Log as in the structure.

reply

upvote

by password43216 hours ago|

[-]

Pardon my ignorance trying to follow up on what is most likely sarcasm but is this not Kafka's claim to fame?

I am joining a new project and need to know to what extent Kafka is still a part of the future for new big data projects. It doesn't seem like there are alternatives at the high end but instead the question is when other technologies (that are easier to manage, require less compute, etc.) max out.

reply

upvote

by this_user20 hours ago|

[-]

Shortly followed by:

"Sockets are all you need for durable workflows" and then finally "Kernel primitives are all you need for durable workflows."

But seriously, part of being a professional is using the right tool for the job.

reply

upvote

by freakynit5 hours ago|

[-]

SQLite backed by RAID-10 NVMe disks and periodic backups to cloud storage is generally more than enough to run the majority of startups' production workloads.

Writes are single threaded, but, you can still easily do thousands per second.

DuckDB offers similar qualities, on the OLAP side.

This is not to say this is the best combination for everything, but when you consider the simplicity of setup, usage, operations, and backups, plus the cost element, it is indeed one of the best, if not the best.

reply

upvote

by golem1421 hours ago|

[-]

Litestream releases 5.9 and newer have a bug that causes instances to sync an insane amount of data. A DB with <10K of data in it and practically no writes/reads causes something like 10GB of daily replication traffic. For my toy project that got needlessly expensive.

reply

upvote

by Wilduck2 hours ago|

[-]

I've been following litestream for a while, and it seems like the project has been hijacked by a vibe coder. I wouldn't trust it for critical tasks anymore.

reply

upvote

by https4433 hours ago|

[-]

Is this bug logged?

reply

upvote

by irons1 hours ago|

[-]

Looks like https://github.com/benbjohnson/litestream/issues/1197. Still open as of now, with a potential cause noted in the latest comment.

reply

upvote

by Wilduck55 minutes ago|

[-]

This is one of a bunch of issues that have been popping up since a vibe coder took over the bulk of development on this project. There's a (probably also AI generated) list of a big portion of the issues here: https://github.com/benbjohnson/litestream/issues/1221. That proposal has been open for a few months, and it seems (from my POV) unlikely to be resolved any time soon.

reply

upvote

by PUSH_AX19 hours ago|

[-]

I went from using the various big-player Postgres clusters to SQLite. We have MAU in the 7 figures, all backed by SQLite durable objects. We have to think differently about the access patterns, but the benefits have been worth it.

reply

upvote

by Thaxll19 hours ago|

[-]

I started using SQLite for a home project after years of reading about it. Coming from Postgres, I was shocked at the poor type system. It is really inferior; not sure why it gets so much praise.

https://sqlite.org/datatype3.html

https://www.postgresql.org/docs/current/datatype.html

Working with date/time feels like using a 30-year-old database; nothing is enforced at insert. Really, someone needs to explain why so many people like it.

reply

upvote

by zimmi19 hours ago|

[-]

You can use strict tables: https://sqlite.org/stricttables.html

reply

upvote

by chrismorgan12 hours ago|

[-]

I don’t like strict tables, because they conflate two concerns, with one somewhat good and one distinctly bad effect (in my assessment).

The somewhat good: it gets rid of most of the weak typing. It still coerces, in line with other SQL databases, but at least a column will only store values of one type. Personally I’d prefer to opt out of the coercion. And I don’t think most ways of writing SQL (in applications especially, but also manually) will ever actually trigger the strict differences. So it doesn’t feel like it’s actually particularly useful.

The distinctly bad: you’re limited to six datatype names. You may well now want external documentation or load-bearing comments in your schema, and your application code may be hobbled, if it liked to infer types based on the datatype name. For example, in sqlx, SQLite datatype BOOLEAN can automatically map to Rust type bool <https://github.com/transact-rs/sqlx/blob/75bc0487eb661da811b...>. Without that, you have to resort to a variety of less-pleasant techniques, such as selecting `done as "done: bool"` or overriding things in sqlx.toml.

I really, really wish they’d implement some form of CREATE TYPE and let that work with strict tables. If I could `CREATE TYPE BOOLEAN FROM INTEGER` and such, I’d be all in on strict tables.

reply

upvote

by pseudalopex19 hours ago|

[-]

This could enforce that dates are strings. I thought they wanted to enforce that dates are dates.

reply

upvote

by simonw18 hours ago|

[-]

  create table events (
    id integer primary key,
    name text not null,
    event_date text not null check (
      -- YYYY-MM-DD
      event_date glob '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'
      and date(event_date) is not null
      and date(event_date) = event_date
    )
  );

In Python that raises this error if the date is invalid:

  sqlite3.IntegrityError: CHECK constraint failed:
    event_date glob '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'

reply

upvote

by pseudalopex16 hours ago|

[-]

I see. The strict tables page did not mention the date and time functions.

Python would show the 1st line always? Or the failed part?

This is unreasonable for a very common type I think.

reply

upvote

by zimmi18 hours ago|

[-]

Storing dates as INTEGER (year * 10000 + month * 100 + day, e.g. 20260530) is not so bad. Proper date / timestamp types would be great though.
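
A rough sketch of that encoding in Python, in case it helps. The function names `date_to_int` and `int_to_date` are invented for this example; the arithmetic is the formula above.

```python
from datetime import date


def date_to_int(d: date) -> int:
    """Encode a date as year * 10000 + month * 100 + day, e.g. 20260530."""
    return d.year * 10000 + d.month * 100 + d.day


def int_to_date(n: int) -> date:
    """Decode the integer form; date() raises ValueError for impossible dates."""
    year, rest = divmod(n, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


encoded = date_to_int(date(2026, 5, 30))
print(encoded, int_to_date(encoded))
```

One nice property is that the integers sort in chronological order, so range queries and ORDER BY work without any date functions.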

reply

upvote

by mollerhoj18 hours ago|

[-]

"feels like using a 30years old database"

reply

upvote

by RivieraKid19 hours ago|

[-]

Yes, this is basically my only issue with SQLite. SQLite with a strict type system would be great.

reply

upvote

by formerly_proven18 hours ago|

[-]

This is the fault/price of backwards compatibility. Most users of SQLite should just fire off a few pragmas on each connection:

    PRAGMA journal_mode = WAL;
    PRAGMA foreign_keys = ON;
    -- Something non-zero (milliseconds)
    PRAGMA busy_timeout = 1000;
    -- This is fine for most applications, but see the manual
    PRAGMA synchronous = NORMAL;
    -- If you use it as a file format
    PRAGMA trusted_schema = OFF;

You might need additional options, depending on the binding. E.g. Python applications should not use the defaults of the sqlite3 module, which are simply wrong (with no alternative except out-of-stdlib bindings pre-3.12): https://docs.python.org/3/library/sqlite3.html#transaction-c...

Also use strict tables. https://www.sqlite.org/stricttables.html

While it has bad ergonomics, you can also use CHECK constraints. For example, using sqlite's built in date support, it's possible but awkward:

    CHECK (
      date(my_date_col) IS NOT NULL
      AND my_date_col = date(my_date_col)
    )

The IS NOT NULL is needed because date returns NULL for invalid dates; the other check because it also accepts Julian days (date('2026') is sometime during year 4707 BC).
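
As a sketch of what "fire off a few pragmas on each connection" can look like from Python's stdlib sqlite3 module: the helper name `open_db` and the temp-file path are assumptions for the demo, and the values are simply the ones listed above, not tuned recommendations.

```python
import os
import sqlite3
import tempfile

# The per-connection settings from the list above.
PRAGMAS = (
    "PRAGMA journal_mode = WAL",
    "PRAGMA foreign_keys = ON",
    "PRAGMA busy_timeout = 1000",
    "PRAGMA synchronous = NORMAL",
    "PRAGMA trusted_schema = OFF",
)


def open_db(path: str) -> sqlite3.Connection:
    """Hypothetical helper: open a connection and apply the pragmas to it."""
    conn = sqlite3.connect(path)
    for pragma in PRAGMAS:
        conn.execute(pragma)
    return conn


# WAL needs a real file, so the demo uses a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = open_db(db_path)
print(conn.execute("PRAGMA journal_mode").fetchone()[0])
```

journal_mode = WAL is stored in the database file, while the other settings here apply only to the connection that ran them, which is why a helper like this has to run on every connect.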

reply

upvote

by pseudalopex16 hours ago|

[-]

The price of compatibility could be a pragma.

reply

upvote

by formerly_proven6 hours ago|

[-]

It literally is? Changing the defaults shown in the PRAGMAs above would break backwards compatibility. SQLite is strictly semantically versioned and does not break backwards compatibility.

https://sqlite.org/versionnumbers.html

reply

upvote

by pseudalopex27 minutes ago|

[-]

Their complaint was that SQLite's type system was poor. You said this was the price of compatibility. And the documentation of your recommended pragmas said nothing of types. They seemed like unrelated, if helpful, advice.

New types would break forward compatibility in SQLite's terms. Their example of a forward compatibility break was 3.7.0 adding WAL mode.[1] A 3.y.0 release could add a better type system mode.

[1] https://sqlite.org/formatchng.html

reply

upvote

by ThatMedicIsASpy19 hours ago|

[-]

it's a single file.

reply

upvote

by IshKebab19 hours ago|

[-]

It gets praise because of stuff other than the type system.

I agree it is disappointing, especially before strict tables.

You should check out DuckDB which is basically SQLite but with proper types. Although it is also OLAP (struct of arrays) rather than OLTP (array of structs) which may have worse performance for typical SQLite loads. In practice I doubt it matters if you have an application where you're considering either.

reply

upvote

by nickpeterson4 hours ago|

[-]

A lot of OLTP databases have modeling conventions for making read-only reporting tables. Do tabular DBs have an inverse for transaction-heavy data that later gets batched into a read-optimized structure? I kind of think most databases (even OLTP workloads) really are read dominated. I feel like DuckDB is really close to working as the ‘main db’ for such systems, but my lack of knowledge of how to handle quick mutations bothers me. It feels like some form of temporal data modeling would solve it, but I don’t know.

reply

upvote

by grodes19 hours ago|

[-]

Read their docs

reply

upvote

by jessmartin1 hours ago|

[-]

Waiting for “JSONL is all you need for durable workflows.”

reply

upvote

by teravor19 hours ago|

[-]

if you have an application that needs to maintain state in a non-critical section or if you discover that using SQL is actually a good idea for some tasks (even in critical sections), SQLite is not only a good choice but it will save you a lot of time coming up with a brittle custom solution.

maintain an in-memory SQLite db and work it with SQL commands, and if you also want to preserve state across application restarts you can routinely save to disk or load from it: <https://www.sqlite.org/backup.html#example_1_loading_and_sav...>
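
a minimal sketch of that load/save pattern using Python's stdlib binding, which exposes the same backup API as `Connection.backup`. the function names `save_to_disk` and `load_from_disk`, the `kv` table, and the file name are made up for illustration.

```python
import os
import sqlite3
import tempfile


def save_to_disk(mem: sqlite3.Connection, path: str) -> None:
    """Copy the whole in-memory database into the file at path."""
    mem.commit()  # make sure pending writes are committed before the copy
    disk = sqlite3.connect(path)
    try:
        mem.backup(disk)
    finally:
        disk.close()


def load_from_disk(path: str) -> sqlite3.Connection:
    """Return a fresh in-memory database filled from the file at path."""
    mem = sqlite3.connect(":memory:")
    disk = sqlite3.connect(path)
    try:
        disk.backup(mem)
    finally:
        disk.close()
    return mem


# Demo: keep state in memory, persist it, then restore it after a "restart".
state = sqlite3.connect(":memory:")
state.execute("create table kv (k text primary key, v text)")
state.execute("insert into kv values ('last_step', 'render')")

state_path = os.path.join(tempfile.mkdtemp(), "state.db")
save_to_disk(state, state_path)
restored = load_from_disk(state_path)
print(restored.execute("select v from kv where k = 'last_step'").fetchone()[0])
```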

this also happens to be the most convenient file-format (aka. application-format) I ever worked with.

reply

upvote

by kubik36922 hours ago|

[-]

Meta comment: This is a domain under my country's TLD (Slovakia), and it is one of the handful of words that form a real word together with the TLD in my language and (coincidentally) also in English. Every now and then, I check a retrograde dictionary for domains that have this property, and the root of this particular domain had a Roundcube email server on it (can be checked on archive.org). After further checking, the local company actually named themselves Obeli s.r.o. (s.r.o. is Ltd), presumably so that they could use a domain that is a real word when said together with the TLD. (EDIT:) Forgot to write the thing I wanted to mention in the first place: it appears the domain must have lapsed and/or the author bought it from the company that was using it.

Another fascinating fact: our country's TLD was stolen Ocean's 11 style (I am not kidding). After Czechoslovakia split into the Czech Republic and the Slovak Republic, the newly created Slovak .sk TLD was under the care of people from the local university. The university also had some offices that it was leasing out. Someone leased this office space (EDIT: this is important, as it means they had the same physical address) and created a company that had the same name as the NGO that was taking care of the domain, so e.g. the NGO was named "My Company o.z." and the perpetrator created a "My Company s.r.o." (our country's version of the American Ltd). This person then wrote to ICANN to change the address to that of "My Company s.r.o.", presumably under the pretense that this was just an administrative error, and from that point they had functionally taken custody of the TLD. I was not able to find out how they did it technically, but I presume they persuaded ICANN to then point to their servers instead of the real ones. After this happened, it seems that no one noticed for some time. When they noticed, they tried taking it back, but they weren't able to. For some inexplicable reason, the government during that time (Šuster era, early 2000s) gave the new company a contract that was functionally uncancellable from the government side. Later governments made this even more uncancellable, and in 2017 the then Minister of IT (and as of this day president!) Pellegrini made the contract literally uncancellable. As a result of this, we have one of the most expensive domains around (€18/year, rising each year for no good reason). (EDIT:) The company running our country's TLD is now a foreign entity that the whole thing has been sold to (multiple owners over time), and we as a country have no control over it, if I understand it correctly.

I might have gotten some details wrong as I am writing this from my memory of researching it a couple of years back, but you get the idea, crazy stuff. Here is an article in Czech [0] that tells the story a bit better, but you have to translate it.

[0] https://www.root.cz/clanky/pribeh-domeny-sk-aneb-kradez-za-b...

// EDIT: I have found that the article actually links the movement to return the TLD back [1]. It also has a story tab [2], so they have something much more precise than the paraphrasing I wrote.

[1] https://www.nasadomena.sk/

[2] https://www.nasadomena.sk/historia/

reply

upvote

by ymolodtsov18 hours ago|

[-]

That's a crazy story. National TLD is a weird business from the beginning.

reply

upvote

by halamadrid8 hours ago|

[-]

Operators of Unmeshed here, which is basically a rewrite of Netflix Conductor. In this orchestrator we heavily use a uniquely scaled version of SQLite and also offer “managed” SQLite instances for managing user data. By combining the durable executions of Unmeshed with workflow primitives like sleep, workers, etc., you can actually build complex systems with a lot less code than ever.

Check it out here: https://unmeshed.io

reply

upvote

by jackzhuo4 hours ago|

[-]

100% this. I used to default to Postgres for everything. But seeing SQLite handle concurrency so well now—plus having built-in BM25 search and vector support—it really is all you need for these kinds of architecture.

reply

upvote

by vixalien3 hours ago|

[-]

Has anyone actually used PGLite[0]?

[0]: https://pglite.dev/

reply

upvote

by sgloutnikov22 hours ago|

[-]

It's close enough that DBOS does support SQLite. [0] SQLite is the default for prototyping, but sure, you could run it in production if you wanted.

Obligatory list of workflow engines and libraries because it's such a common need that a lot have rolled their own. [1]

[0] https://docs.dbos.dev/python/tutorials/database-connection

[1] https://github.com/meirwah/awesome-workflow-engines

reply

upvote

by oulipo237 minutes ago|

[-]

What would be the main differences between DBOS and Obelisk?

reply

upvote

by tomasol21 minutes ago|

[-]

Hi, I wrote a comparison blog post between Obelisk and the Java version of DBOS a couple of months ago: https://obeli.sk/blog/comparing-dbos-part-1/

reply

upvote

by Xcelerate22 hours ago|

[-]

Haha, I just started doing this on my own. Found it helps the agents preserve state better. I typically ask them to design a DAG first based on a set of specifications and then execute it (each step stores something in a SQLite DB). Iteration is pretty simple then because I just ask for a tweak to one or two steps of the DAG, and then to re-run.

Funny how people are independently converging on similar patterns of "what works" here. Still feels like we're in the wild west with all these ad-hoc patterns of agent orchestration that people are coming up with.
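
For what it's worth, the "each step stores something in a SQLite DB" part can be sketched in a few lines. This is a toy under my own assumptions, not what any particular engine does: the table name `step_results`, the helpers `open_store`, `run_step`, and `invalidate`, and the JSON encoding are all invented for illustration.

```python
import json
import sqlite3
from typing import Any, Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that remembers finished steps."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS step_results ("
        "step TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def run_step(conn: sqlite3.Connection, step: str, fn: Callable[[], Any]) -> Any:
    """Run fn once per step name; later calls return the stored result."""
    row = conn.execute(
        "SELECT result FROM step_results WHERE step = ?", (step,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    result = fn()
    with conn:  # commit the result so a re-run can skip this step
        conn.execute(
            "INSERT INTO step_results (step, result) VALUES (?, ?)",
            (step, json.dumps(result)),
        )
    return result


def invalidate(conn: sqlite3.Connection, step: str) -> None:
    """Forget one step so the next run executes it again."""
    with conn:
        conn.execute("DELETE FROM step_results WHERE step = ?", (step,))
```

Tweaking one step of the DAG then amounts to calling `invalidate` for that step (and anything downstream of it) and re-running; everything else is read back from the table.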

reply

upvote

by zrail21 hours ago|

[-]

Same. The prompt was essentially: every checkbox in this PLAN.md should be a task in SQLite.

reply

upvote

by oulipo241 minutes ago|

[-]

Obelisk also supports Postgres. When you're using it with Postgres, what are the differences with DBOS? Are there any that would be significant?

(I'm already using Postgres, so I don't really need a sqlite-based durable workflow engine, so looking to know how to choose between DBOS and Obelisk)

reply

upvote

by tomasol19 minutes ago|

[-]

Check out https://obeli.sk/blog/comparing-dbos-part-1/

reply

upvote

by yokoprime21 hours ago|

[-]

If you're just doing workflows from a single node, I guess it can be OK as long as there's a single writer. But for scaling across multiple servers, it clearly is not all you need.

reply

upvote

by dev_l1x_be9 hours ago|

[-]

I'm starting to think SQLite is all I need to store data. When there is a chance of non-coordinated writes that I can distribute among servers (or even a range-based ID), SQLite is my first idea. With durable storage backups this works amazingly well.

reply

upvote

by aykutseker10 hours ago|

[-]

Storage never ended up being the thing we worried about. The painful bits started once a workflow could touch external systems. Replaying state is one thing. Replaying a charge or an email is another. How are you dealing with that?

reply

upvote

by localhoster22 hours ago|

[-]

Idk if this article was vibe written or the author just "got adjusted", but it clearly is, and it's unreadable. Man, this is becoming annoying.

reply

upvote

by mburaksayici20 hours ago|

[-]

Agreed on the point. I needed a NoSQL version for similar uses, so I used TinyDB: https://mburaksayici.com/blog/2024/09/21/easy-to-use-nosql-p...

reply

upvote

by skybrian20 hours ago|

[-]

Instead of "just use Litestream," I'd like to see a review of different object stores one could use and which ones work well with Litestream. Is there a nice object store I could run in another Linux VM? As a hobbyist, which services providing an S3-like API make the most sense?

reply

upvote

by chrsstrm20 hours ago|

[-]

Litestream is just the replication layer. It works with any S3-compatible storage, as well as everything listed in their guides: https://litestream.io/guides/#replica-guides

reply

upvote

by skybrian18 hours ago|

[-]

No, I know that already. That's not a recommendation. Some storage layers have to be better than others on price, reliability, and so on?

Think Wirecutter, not install guide.

reply

upvote

by emodendroket10 hours ago|

[-]

SQLite is an underrated tool for how powerful it really is and probably people don't think of it often enough.

reply

upvote

by vkaku15 hours ago|

[-]

Back in the day, I wrote a simple job queue with SQLite

https://github.com/guilt/squeue

It did the job, was fairly easy to use.

reply

upvote

by vultour17 hours ago|

[-]

The GitHub statistics for the project this website represents are insane. It has a sole author that has averaged approximately 20,000 lines of code every week in the past month. How do you even maintain that alone?

reply

upvote

by gunnarmorling19 hours ago|

[-]

Related piece I wrote some time ago: https://www.morling.dev/blog/building-durable-execution-engi...

reply

upvote

by sharts10 hours ago|

[-]

For something so widely deployed you’d think it’d be included in Claude/Codex/etc for

reply

upvote

by dannypdx15 hours ago|

[-]

All this SQLite hate from big vector db... leave SQLite alone!

reply

upvote

by flying_sheep18 hours ago|

[-]

Cloudflare Durable Objects are implemented with SQLite (or some variant of it).

reply

upvote

by orliesaurus20 hours ago|

[-]

Surprised no one has mentioned Turbopuffer yet [1] which natively supports dense vector similarity and BM25 keyword indexes out of the box

[1]. https://turbopuffer.com/

reply

upvote

by simplestates15 hours ago|

[-]

Good framing. SQLite is often enough when the main problem is making workflow state durable, inspectable and easy to recover.

reply

upvote

by 0x5921 hours ago|

[-]

Big complex data model with ambiguous query patterns? Postgres

Small, well defined, data model with known query patterns? Bespoke model

There probably is a place for SQLite, and my project space so far just hasn't aligned well with it.

reply

upvote

by asdff21 hours ago|

[-]

Probably going to get some winces for this but I do everything with flat files. Maybe my data aren't massive enough, but I mean I can do the relational thing by just having these metadata in some column, and returning rows that contain my desired information in these columns. Even if the file were too big to fit into memory one could just subset chunks of it and chew through. All this can be done with no dependencies, just base libraries of a lot of languages.

reply

upvote

by ryanisnan10 hours ago|

[-]

Sweet I get to tell my team we can move off of dapr workflows

reply

upvote

by fathermarz18 hours ago|

[-]

Excellent write up and inspired me for our next IA design run. After reading Fly’s Litestream work it makes me think this is a solid option.

reply

upvote

by delduca16 hours ago|

[-]

No, mmap is all you need.

reply

upvote

by netik21 hours ago|

[-]

Until you scale past one machine…

reply

upvote

by bze1221 hours ago|

[-]

Isn’t this very similar to cloudflare durable objects & workflows?

reply

upvote

by dnnddidiej14 hours ago|

[-]

Butchering the Beatles song again.

reply

upvote

by tomasol16 minutes ago|

[-]

sorry about that, I do love Beatles!

reply

upvote

by nodesocket19 hours ago|

[-]

The biggest annoyance about SQLite for me is no ability to:

    ALTER TABLE users MODIFY COLUMN…

    ALTER TABLE users ALTER COLUMN…

    ALTER TABLE users ADD CONSTRAINT…

You have to create a new temporary table with correct schema, copy data into this new table, drop the old table, and then rename the temporary table.
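
A rough sketch of that dance from Python, for the case of adding NOT NULL to a column. The `users` columns and the function name `add_not_null_to_email` are made up for the example, and it assumes no foreign keys, indexes, triggers, or views reference the table (those need extra care during a rebuild).

```python
import sqlite3


def add_not_null_to_email(conn: sqlite3.Connection) -> None:
    """Rebuild users so that email becomes NOT NULL, inside one transaction."""
    conn.execute("BEGIN")
    try:
        # 1. New table with the schema we actually want.
        conn.execute(
            "CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
        )
        # 2. Copy the data across.
        conn.execute("INSERT INTO users_new (id, email) SELECT id, email FROM users")
        # 3. Drop the old table and move the new one into its place.
        conn.execute("DROP TABLE users")
        conn.execute("ALTER TABLE users_new RENAME TO users")
        conn.execute("COMMIT")
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


# isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
add_not_null_to_email(conn)
```

Because DDL is transactional in SQLite, a failed copy (say, an existing NULL email) rolls back and leaves the original table untouched.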

reply

upvote

by simonw18 hours ago|

[-]

They've been improving that recently:

2026-04-09 (3.53.0) - "Enhance ALTER TABLE to permit adding and removing NOT NULL and CHECK constraints"

I use my own sqlite-utils CLI/Python library to work around these limitations: https://sqlite-utils.datasette.io/en/stable/python-api.html#...

reply

upvote

by nodesocket18 hours ago|

[-]

Ahh that's very nice. Unfortunately the default version of sqlite3 provided by Debian Trixie is 3.46.1.

reply

upvote

by shevy-java9 hours ago|

[-]

Hmmm. SQLite is great, but I remember years ago, at a university cluster, I had to populate a SQL database via tons of INSERT statements from genomic/meta-genomic workflows. PostgreSQL was so much faster at just that particular action (inserting data) that it convinced me that SQLite may be useful for many, many applications, but for "big data"(sets), PostgreSQL is simply better.

reply

upvote

by 3dedb728-3f7718 hours ago|

[-]

Is this just an AWS ad?

reply

upvote

by EGreg22 hours ago|

[-]

Files is all you need.

https://xkcd.com/378/

reply

upvote

by tclancy22 hours ago|

[-]

Post It Notes will do if you have a good system.

reply

upvote

by contingencies22 hours ago|

[-]

Those who don't understand Unix are condemned to reinvent it, poorly. - Henry Spencer .. via https://github.com/globalcitizen/taoup

reply

upvote

by lvl15518 hours ago|

[-]

And all you need is pen and paper to do calculations.

reply

upvote

by ChrisArchitect21 hours ago|

[-]

Related:

Building durable workflows on Postgres

https://news.ycombinator.com/item?id=48313530

reply

upvote

by momojo19 hours ago|

[-]

Surfacing for the thread:

Armin Ronacher's "Absurd Workflows: Durable Execution With Just Postgres" https://lucumr.pocoo.org/2025/11/3/absurd-workflows/

reply

upvote

by vatsachak18 hours ago|

[-]

Postgres doesn't cost any extra lol

reply

upvote

by faangguyindia14 hours ago|

[-]

it does cost network stack

reply

upvote

by unnouinceput8 hours ago|

[-]

Quote 1: "DBOS recently argued that Postgres is all you need for durable execution...SQLite is all you need."

Quote 2: "SQLite State backed up to S3". Yeah buddy, if you think S3 isn't Postgres I'll eat my foot.

reply

upvote

by orf22 hours ago|

[-]

> The caveat is that Litestream replication is asynchronous. A restore can miss the newest local writes if the SQLite volume disappears before they are copied. That is fine for many AI and experimentation workflows

In short: SQLite is not all you need, unless you’re just experimenting and don’t actually care about durability, in which case you also need Litestream + object storage.

Right.

reply

upvote

by gwking22 hours ago|

[-]

The suitability of Litestream for production disaster recovery is also an open question in my mind. I used 0.3.x for several years and when I tried to upgrade to the 0.5.x series there were runaway disk usage problems that would have caused downtime had they made it to prod. As far as I can tell these have not been entirely addressed, although recent bug reports suggest that they might be getting closer.

I want to love it, and I don't take open source projects like this for granted. But during my last production upgrade I chose to decommission Litestream in favor of a dumber, less granular solution using sqlite3_rsync and nightly backups because there is no point in using a backup system that is not rock solid.

reply

upvote

by 0cf8612b2e1e22 hours ago|

[-]

Postgres also does not synchronously replicate for free. You can set up both to confirm the replicated write if you require that durability.

reply

upvote

by orf22 hours ago|

[-]

> postgresql also does not synchronously replicate

By default. Generally your primary database is in a completely different failure category than a kubernetes node running an ephemeral workflow pod.

reply

upvote

by 0cf8612b2e1e22 hours ago|

[-]

Either you have durable storage or you do not. SQLite and Postgres can both ensure local durability of commits. If you want distributed durability, you need to ship that data elsewhere. Whether that is another Postgres node, an object store, or whatever, it’s still an external dependency.

reply

upvote

by paulddraper22 hours ago|

[-]

Not for free, but without needing additional software.

  synchronous_commit = on

reply

upvote

by 0cf8612b2e1e22 hours ago|

[-]

That’s about the local transaction, not replication. SQLite WAL also gives you strict durability.

  PRAGMA synchronous = full

reply

upvote

by bootsmann22 hours ago|

[-]

S3 is strongly consistent. If you need it anyway, you can just use S3 keys to deconflict and store the workflow state.

reply

upvote

by orf22 hours ago|

[-]

Yes, but directly using s3 as a key-value database is completely different from using SQLite + litestream.

reply

upvote

by paulddraper22 hours ago|

[-]

"Durable workflows without the durability"

That's distributed workflows :)

reply

upvote

by faangguyindia14 hours ago|

[-]

try setting up replication/failover in postgres, it's much more work.

reply

upvote

by dilyevsky21 hours ago|

[-]

i mean it's durable as long as nothing crashes or litestream has a data corruption bug which only happens every other release...

reply

upvote

by ai_slop_hater17 hours ago|

[-]

> if you already trust your database, you do not need a separate orchestration tier

Wow. Really? I thought I needed to use an overpriced cloud SaaS for everything.

reply

upvote

by sgt7 hours ago|

[-]

Next one: Berkeley DB is all you need for durable workflows /s

reply
