upvote
I was about to ask where you work that you’re implementing new CRDTs and then I noticed your username! Thanks for all that you do!

I work on the live collab at my company, and using AI while coding has into recently sort of “clicked” for me. We use an (I’m pretty sure) unheard of algorithm for collaborative editing, and I’ve had a long term goal of turning it into an implementation of EG Walker, but our document model is very complex and most out of the box CRDTs don’t quite fit. Maybe Fable will be what gets me over the hump.

reply
Long shot here because I'm not knowledgeable enough about CRDTs but maybe something like DSON would help? I saw a talk about it a while ago and it might be useful.

https://blog.helsing.ai/posts/dson-a-delta-state-crdt-for-re...

https://www.youtube.com/watch?v=4QkLD7JhD_I&pp=ygUJZHNvbiBjc...

reply
Ty, checking this out!
reply
I’d be fascinated to hear more if you’re willing to share. What is special about your document model which makes existing tools like automerge a bad fit?
reply
We have cross-field invariants that merging at the data structure level can't ensure (in an obvious way, at least), and "lose the semantic meaning of a conflict". The main idea behind their approach is that certain parts of the model can have custom "mergers" that are able to run business logic to maintain these invariants.

Worth noting, the decision to eschew CRDTs predates my time here, and I've pushed for a CRDT rewrite quite a bit since I believe it could be done. The other main concern they had was memory usage, but it seems like EG Walker would solve that. Our system uses a "Commit DAG", (an Event DAG by another name), and does a three-way merge using a common ancestor of the diverged documents, and so a lot of the bones of EG Walker are there, and I'm exploring ways in which we could gradually move to it.

reply
Hello joseph,

I saw scanning the comments and saw you mentioned CRDT. Just wanted to mention that I implemented a CRDT-flavoured sync engine for the product I'm working on a while ago, I think it was with Opus 4.6 if I'm not mistaken (or earlier) so it's not something new to Fable 5, just fyi.

reply
Yeah, you've certainly been able to get Opus to write a CRDT. It just needs a lot of hand-holding to make it correct. Opus always seems pretty bad at coming up with invariants and using them to make a piece of software correct. Without invariants, you end up with lots of hacky workarounds to avoidable problems.

So far at least - and its been less than a day - Fable seems better at this.

I think I also do my CRDTs differently from others. I've grown to like the pure-oplog approach after making eg-walker. LLMs are much worse at this!

reply
[flagged]
reply
> wrote a fuzzer for its prototype which explored and verified that its reasoning was correct. It absolutely nailed it.

For such a data structure, "nailing it" means a formal proof of correctness. Fuzzing, as useful as it is, is merely throwing dirt at the wall and seeing if anything sticks.

reply
I’ll ask it for a formal proof when I get home and see how it goes.

I’ve read plenty of papers with “formal proofs of correctness” that turned out to have huge flaws. Machine verifiable proofs I trust. But I’ve personally found more bugs with fuzzing than I have via proofs.

reply
Oh I actually mean machine checked. Indeed, formal pen-and-paper proofs can have flaws, since they are essentially code without test coverage.
reply
In the real world, many of us don't have the time to create formal proofs. But our instinct in testing where edge cases may exist in code that we wrote is a type of refactoring that happens in our brains during the coding process. Hand the coding off to a machine and you have no idea where to start looking for the flaws.
reply
> Hand the coding off to a machine and you have no idea where to start looking for the flaws.

I have found this quickly becomes false. I have learned I cannot review llm generated code as if it is written by a trusted senior developer (where I often just do a quick look, see nothing obvious and hit approve). Once you start reading the code in depth with the goal of understanding you quickly see the places where flaws are likely. Sure I start with no clue where to look, but it doesn't take long to see things.

reply
Yes but it takes much longer to trace them. Because the LLM code almost always gravitates toward data blobs and highly dynamic objects and spaghetti that takes a ton of cognitive load to understand what their failure modes are. Even when it does document them.
reply
> In the real world, many of us don't have the time to create formal proofs

Of course not. That's why they are so rare. But I thought we live in an AI era now where this kind of stuff can be done by a machine.

reply