Related. Others?
Show HN: Pg_CRDT – CRDTs in Postgres Using Automerge - https://news.ycombinator.com/item?id=43655920 - April 2025 (4 comments)
Automerge: A library of data structures for building collaborative applications - https://news.ycombinator.com/item?id=40976731 - July 2024 (58 comments)
Automerge-Repo: A "batteries-included" toolkit for local-first applications - https://news.ycombinator.com/item?id=38193640 - Nov 2023 (43 comments)
Automerge 2.0 - https://news.ycombinator.com/item?id=34586433 - Jan 2023 (89 comments)
Automerge CRDT – Build local-first software - https://news.ycombinator.com/item?id=30881016 - April 2022 (8 comments)
Automerge: A JSON-like data structure (a CRDT) that can be modified concurrently - https://news.ycombinator.com/item?id=30412550 - Feb 2022 (69 comments)
Automerge: a new foundation for collaboration software [video] - https://news.ycombinator.com/item?id=29501465 - Dec 2021 (29 comments)
Automerge: A library [..] for building collaborative applications in JavaScript - https://news.ycombinator.com/item?id=24791713 - Oct 2020 (1 comment)
Automerge: JSON-like data structure for building collaborative apps - https://news.ycombinator.com/item?id=16309533 - Feb 2018 (98 comments)
Surprised how few comments this post has; this is an insane improvement.
I've been using Electric SQL, but Automerge 3.0 seems like the holy grail, combining a local-first approach with CRDTs?
Wondering if I should ditch Electric SQL and switch to this instead. I'm just not sure what kind of hardware I'd need to run a sync server for Automerge, or how many users and reads/writes it can support.
ElectricSQL is pretty good too, but it's still not quite there, and going local-first means some rollback-related features are harder to implement.
I'm still very new to this overall, but that 10x memory improvement is welcome; with very large documents, the lag used to be very noticeable.
It really depends on your use case. If you want people collaborating on a rich text document, Automerge or yjs are probably great.
If you want to have local first application data where a server is the authority, ElectricSQL is probably going to serve you best.
That said, there are so many approaches out there right now; they're all promising, but tricky.
> there are so many approaches out there right now

I'm almost at the point where I'll need one of these solutions; I'm fleshing out the corner cases now. I'd appreciate it if you could mention some of the solutions I should be looking at, and their trade-offs. I'd also appreciate any non-obvious pitfalls.

The use case is a voice note aggregation system: the notes are stored on S3 and cached locally on desktop and mobile applications. There are transcriptions, AI summaries, user annotations, and structured metadata associated with each voice note. The application will be used by a single person, but he might not always remember to sync, or even have an internet connection when he wants to.
Thank you!
The performance improvements are impressive:
> In Automerge 3.0, we've rearchitected the library so that it also uses the compressed representation at runtime. This has achieved huge memory savings. For example, pasting Moby Dick into an Automerge 2 document consumes 700MB of memory; in Automerge 3 it only consumes 1.3MB!
> Finally, for documents with large histories load times can be much much faster (we recently had an example of a document which hadn't loaded after 17 hours loading in 9 seconds!).
I wonder if this is accomplished using controlled buffers in AsyncIterators. I recently built a tool for processing massive CSV files and was able to get the memory usage remarkably low, and control/scale it almost linearly because of how the workers (async iterators) are spawned and their workloads are managed. It kind of blew me away that I could get such fine-tuned control that I'd normally expect from Go or Rust (I'm using Deno for this project).
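Roughly the pattern I mean, as a minimal sketch (the worker count and helper names are illustrative, not from my actual tool): N workers pull from one shared async iterator, so at most N lines are in flight at any moment and memory stays flat.

```typescript
// Deno/TypeScript sketch: a shared async generator feeding a small worker pool.
async function* lines(path: string): AsyncGenerator<string> {
  const file = await Deno.open(path);
  const chunks = file.readable.pipeThrough(new TextDecoderStream());
  let buf = "";
  for await (const chunk of chunks) {
    buf += chunk;
    const parts = buf.split("\n");
    buf = parts.pop() ?? ""; // keep the trailing partial line for the next chunk
    yield* parts;
  }
  if (buf) yield buf;
}

async function run(
  path: string,
  handle: (line: string) => Promise<void>,
  workers = 4,
) {
  const source = lines(path); // one shared generator; concurrent next() calls are queued
  const worker = async () => {
    for await (const line of source) {
      await handle(line); // each worker holds only its current line in memory
    }
  };
  await Promise.all(Array.from({ length: workers }, worker));
}
```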
I'm well above 1.3MB, and although I could get it down there, performance would suffer. I'm curious how fast they can sync this data with such tiny memory usage. And when the resources were available, was the old 700MB representation at least faster?
These people are definitely smarter than I am, so maybe their solution is a lot more clever than what I'm doing.
edit: Oh, they did this part in Rust; I thought it was written in JS. I still wonder: how did they get memory usage this low, and did it impact speed much? I'll have to dig into it.
> I recently built a tool for processing massive CSV files and was able to get the memory usage remarkably low
Is it OSS? I'd like to benchmark it against my CSV parser :)
They say: "In Automerge 3.0, we've rearchitected the library so that it also uses the compressed representation at runtime. This has achieved huge memory savings."
Right, this didn't click at first, but now I understand. I can actually gain similar benefits in my project by switching to storing the data as Parquet/DuckDB files; I had no idea the potential gains from compressed representations were so significant, which is why I'd been holding off on testing that out. Thanks for the nudge on that detail!
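For anyone curious, this is roughly the switch I have in mind; a minimal sketch using the duckdb npm package (the file names are made up):

```typescript
import duckdb from "duckdb"; // npm package (npm:duckdb under Deno)

const db = new duckdb.Database(":memory:");

// DuckDB streams the CSV and writes a compressed, columnar Parquet file
// without materializing the whole dataset in memory.
db.run(
  `COPY (SELECT * FROM read_csv_auto('events.csv'))
     TO 'events.parquet' (FORMAT PARQUET, COMPRESSION ZSTD)`,
  (err) => {
    if (err) throw err;
    // Queries now scan compressed column chunks instead of raw rows.
    db.all(
      `SELECT count(*) AS n FROM read_parquet('events.parquet')`,
      (err2, rows) => {
        if (err2) throw err2;
        console.log(rows);
      },
    );
  },
);
```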
A high upvote-to-comment ratio is a sign of a quality post, honestly. Sometimes all you can do is upvote.
A few questions:
1. I can see there's an example of using it with React and ProseMirror. What's the gap to using it with Tiptap (for those who don't know, Tiptap is an abstraction on top of ProseMirror that aims to streamline building editors)?
2. Is there any prior art, or room in the design, for supporting permissioned blocks of content _within_ a document, i.e. things which some users aren't allowed to view (or edit)?
1. You can use Tiptap with it: you just have to wrap your existing schema with Automerge attributes. Undo/redo would also swap out.
Is there info anywhere on the structure of the semi-lattice they are using for their CRDT?
Is the map based on a multi-value register or a last-writer-wins register?
See the docs: https://automerge.org/docs/reference/documents/conflicts/
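In short: concurrent writes to the same map key behave like a multi-value register; Automerge deterministically picks one winner, but the conflicting values stay queryable. A quick sketch with the JS API, per the docs above (the conflict keys shown are illustrative):

```typescript
import * as Automerge from "@automerge/automerge";

let doc1 = Automerge.from({ title: "untitled" });
let doc2 = Automerge.clone(doc1);

// Two peers change the same key concurrently.
doc1 = Automerge.change(doc1, (d) => { d.title = "from doc1"; });
doc2 = Automerge.change(doc2, (d) => { d.title = "from doc2"; });

const merged = Automerge.merge(doc1, doc2);
console.log(merged.title); // one winner, chosen deterministically on every peer

// ...but both concurrent values are preserved and queryable:
console.log(Automerge.getConflicts(merged, "title"));
// e.g. { "1@<actorA>": "from doc1", "1@<actorB>": "from doc2" }
```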
What sort of applications is this used for? I'm a technical writer, and my team is facing versioning challenges for sections of documents. I'm wondering if this could be useful.
Can you elaborate on what versioning issues you're facing?
Is this JavaScript only?
It's written in Rust, but JavaScript is the primary user-facing interface. https://github.com/automerge/automerge
There is also a C API wrapper; I'm not sure of its state with respect to this latest release.
A number of these sync engines have been growing popular, most notably Convex and Zero (although both, of course, are very different from Automerge). This one's Rust/C API makes it more interesting; I wonder if an implementation for terminal UIs could be possible?
Needs benchmarks against yjs.
If you are after performance, see jsonjoy.
Are move operations for trees implemented now?
IIRC, Kleppmann built a prototype for it but it’s not included in Automerge yet.