The port had been done in a weekend just to see if we could use Python in production. The C++ code had taken a few months to write. The port was pretty direct, function for function. It was even line for line where language and library differences didn't offer an easier way.
A couple of us worked together for a day to find the reason for the speedup. Just looking at the code didn't give us any clues, so we started profiling both versions. We found out that the port had accidentally fixed a previously unknown bug in some code that built and compared cache keys. After identifying the small misbehaving function, we had to study the C++ code pretty hard to even understand what the problem was. I don't remember the exact nature of the bug, but I do remember thinking that particular type of bug would be hard to express in Python, and that's exactly why it was accidentally fixed.
We immediately started moving the rest of our back end to Python. Most things were slower, but not by much, because most of our back end was I/O bound. We soon found out that we could make algorithmic improvements much more quickly, so a lot of the slowest things got a lot faster than they had ever been. And, most importantly, we (the software developers) got quite a bit faster.
This was particularly true for one of the projects I worked on in the past, where Python was chosen as the main language for a monitoring service.
In short, it proved to be a disaster: just the Python process collecting and parsing the metrics of all the programs consumed 30-40% of the processing power of the lower-end boxes.
In the end, the project carried on for a while longer, and we had to apply all sorts of mitigations to make the performance impact less of an issue.
We did consider replacing it all with a few open-source tools written in C plus some glue code; the initial prototype used a few MB of memory instead of dozens (or even hundreds) of MB, while barely registering any CPU load, but in the end it was deemed a waste of time when the whole project was terminated.
The main lesson of the story: just pick Python and move fast, kids. It doesn’t matter how fast your software is if nobody uses it.
Pure speculation, but I would guess this has something to do with a copy constructor getting invoked in a place you wouldn't guess, one that ends up on a critical path.
Not because they are brilliant, but because they are pretty good at throwing pretty much all known techniques at a problem. And they also don't tire of profiling and running experiments.
Recently I tried Codex/GPT-5 on updating a Bluetooth library for batteries, and it was able to start capturing Bluetooth packets and comparing them against the library's other models. It was indefatigable. I didn't even know it was so easy to capture BLE packets.
Crazy how many stories like this I’ve heard, where doing performance work helped people uncover bugs and/or hidden assumptions about their systems.
They found that they had fewer bugs in Python, so they continued with it.
Meanwhile my experience has been that whenever there has been a performance issue severe enough to actually matter, it's often been the result of some kind of performance bug, not so much language, runtime, or even algorithm choices for that matter.
Hence whenever the topic of how to improve performance comes up, I always, always insist that we profile first.
But, of course, profiling is always step one.
I suspect it’s more likely to be something like passing std::string by value not realising that would copy the string every time, especially with the statement that the mistake would be hard to express in Python.
Would be kind of cool if, e.g., Python or Ruby could be as fast as C or C++.
I wonder if this could be possible, assuming we could modify both to achieve that outcome, but without ending up with a language that is just like C or C++. Right now there is a strange divide between "scripting" languages and compiled ones.
I hit the flag button on the comment and suggest others do too.
I was not actually sure this one was a bot, despite LLM-isms and, sadly, being new. But you can look at the comment history and see.
It looks like neither is the "real win": both the language and the algorithm made a big difference, as you can see in the first column of the last table - going to wasm was a big speedup, and improving the algorithm on top of that was another big speedup.
Edit: it wasn't Astral, but here's the blog post I was thinking of: https://nesbitt.io/2025/12/26/how-uv-got-so-fast.html
That said, your point is very much correct. If you watch or read the Jane Street tech talk Astral gave, you can see how they really leveraged Rust for performance, like turning Python version identifiers into u64s.
It's also worth noting that unsafe Rust != C, and you are still battling these rules. With enough experience you gain an understanding of these patterns and it goes away, and you also have really solid tools like Miri for finding undefined behavior, but it can be a bit of a hassle.
Anyway, dubious claim since a Python interpreter will take 10s of milliseconds just to print out its version.
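That's easy to sanity-check from Node, for what it's worth. A rough sketch (not from the thread; the numbers depend on the machine and on whether the interpreter is already in the page cache):

    // Rough sketch: time `python3 --version` from Node/TypeScript.
    // This includes process-spawn overhead as well as interpreter startup.
    import { execSync } from "node:child_process";

    const t = performance.now();
    execSync("python3 --version");
    console.log(`python3 --version took ${(performance.now() - t).toFixed(1)} ms`);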
Do you have any evidence? I can point at TechEmpower benchmarks showing I/O-bound tasks are still 10-100x faster in native languages vs Python/JS.
That is assuming Rust is 100x faster than Python btw, 49ms of I/O, 1ms of Rust, 100ms of Python.
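Spelled out, under exactly those assumptions:

    Rust:   49 ms I/O + 1 ms compute   = 50 ms total
    Python: 49 ms I/O + 100 ms compute = 149 ms total

That is roughly a 3x end-to-end difference, not 100x, even granting the 100x compute gap.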
> uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.
So the claim is not well supported at all by the article, as you stated; in fact, the claim is literally disproven by the article.
> uv is fast because of what it doesn’t do, not because of what language it’s written in.
The fact that the language had a small effect ("a bit") does not invalidate the statement that algorithmic improvements are the reason for the relative speed. In fact, there's no reason to believe that Rust without the algorithmic improvements would be notably faster at all. Sure, "all" is an exaggeration, but the point still stands in the form that most readers would understand it: algorithmic improvements are the important difference between the systems.
The specific claim I was responding to was that all of uv’s performance improvements come from algorithms rather than the language. My point was just that this is a stronger claim than what the article supports: the article itself says Rust contributes “a bit” to the speed, so it’s not purely algorithmic.
I do agree with the broader point that algorithmic and architectural choices are the main reason uv is fast, and I tried to acknowledge that, apparently unsuccessfully, in my very first comment (“I don't doubt that a lot of uv's benefits are algo. But everything?”).
Thanks for cutting through the clickbait. The post is interesting, but I'm so tired of being unnecessarily clickbaited into reading articles.
One thing I noticed was that they time each call and then use a median. Sigh. In a browser. :/ With timing-attack defenses built into the JS engine.
You still do get some latency from the event loop, because postMessage gets queued as a macrotask, which is probably on the order of 10μs. But this is the price you have to pay if you want to run some code in a non-blocking way.
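A minimal sketch of what measuring that looks like (it assumes a hypothetical ./echo-worker.js that just posts every message straight back, and the numbers will be coarsened by the timer defenses mentioned above):

    // Measure postMessage round-trip latency to a worker and report the median.
    const worker = new Worker("./echo-worker.js");

    function roundTrip(): Promise<number> {
      return new Promise((resolve) => {
        const start = performance.now();
        worker.onmessage = () => resolve(performance.now() - start);
        worker.postMessage("ping");
      });
    }

    async function measure(iterations = 1000): Promise<number> {
      const samples: number[] = [];
      for (let i = 0; i < iterations; i++) {
        samples.push(await roundTrip());
      }
      samples.sort((a, b) => a - b);
      return samples[Math.floor(samples.length / 2)]; // median
    }

    measure().then((ms) => console.log(`median round trip: ${ms.toFixed(3)} ms`));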
So this holds even for L = M. The speedup is not in the language, but in the rewriting and rethinking.
They say they measured that cost, and it was most of the runtime in the old version (though they don't give exact numbers). That cost does not exist at all in the new version, simply because of the language.
If they used raw byte structures and implemented the caching improvements on the wasm side, the copies might not be as bad.
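For instance (a hedged sketch, not the article's code), transferring the buffer moves ownership across the boundary instead of copying it:

    // In the worker: post raw bytes without a structured-clone copy by
    // listing the buffer as a transferable. Assumes the wasm side already
    // wrote its output into `bytes`; decoding on the main thread is omitted.
    const bytes = new Uint8Array(1024 * 1024);
    self.postMessage(bytes.buffer, [bytes.buffer]);
    // After the transfer, bytes.buffer is detached here (byteLength === 0).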
But they still have an issue with a multi-language stack: complexity also has a cost.
The Python/C combo does not have this issue, because you can work with Python types natively in C; otherwise, this is a cross-language conversion issue, and not a Rust issue at all.
Edit: fixed phone typos
This new company chose a very confusing name that has been used by the Open UI W3C Community Group for over 5 years.
Open UI is the standards group responsible for HTML having popovers, customizable select, invoker commands, and accordions. They're doing great work.
Anyway, JavaScript is no stranger to breaking changes. Compare Chromium 47 to today. Just add actual integers as another breaking change, and WASM becomes almost unnecessary.
I don't mind reading articles that are not about how Rust is great in theory (and maybe in practice).
That said, Rust does have real problems. Manual memory management sucks. People think GC is expensive? Well, keep in mind that malloc() and free() take locks! People just have totally bogus mental models of what drives performance. These models lead them to technical nonsense.
The most obvious approach would be to let LLMs generate code and render it, but that introduces problems like safety, UI consistency, and speed. OpenUI solves those problems and provides a safe, consistent, and token-optimized runtime for LLMs to render live UI.
That final summary benchmark means nothing. It mentions a 'baseline' value for the 'Full-stream total' of the Rust implementation, and then says `serde-wasm-bindgen` is '+9-29% slower', but it never gives us the baseline value, because clearly the only benchmark it ran against the Rust codebase was the per-call one.
Then it mentions: "End result: 2.2-4.6x faster per call and 2.6-3.3x lower total streaming cost."
But the "2.6-3.3x" is by their own definition a comparison against the naive TS implementation.
I really think the guy just prompted Claude to "get this shit fast and then publish a blog post".
I understand your frustration with AI writing though. We are a small team, and given our roadmap it was either use LLMs to help collate all the internal benchmark result files into a blog post or never write it, so we chose the former. This was a genuinely surprising and counterintuitive result for us, which is why we wanted to share it. Happy to clarify any of the numbers if helpful.
> converts internal AST into the public OutputNode format consumed by the React renderer
Why not just have the LLM emit the JSON for OutputNode? Why is a custom "language" and parser needed at all? And yes, there is a cost to marshaling data, so you should avoid doing it where possible, and do it in large chunks when it's not possible to avoid. This is not an unknown phenomenon.
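A tiny sketch of the chunking idea (the names are made up, not the article's actual API): buffer nodes on the worker side and cross the boundary once per task instead of once per node:

    // Hypothetical OutputNode shape; the real one is whatever the renderer consumes.
    type OutputNode = { type: string; text?: string; children?: OutputNode[] };

    const pending: OutputNode[] = [];
    let flushScheduled = false;

    function enqueue(node: OutputNode): void {
      pending.push(node);
      if (!flushScheduled) {
        flushScheduled = true;
        // Everything enqueued during the current task is batched into one message.
        queueMicrotask(flush);
      }
    }

    function flush(): void {
      flushScheduled = false;
      self.postMessage(pending.splice(0, pending.length));
    }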
Claude tells me this is https://www.fumadocs.dev/
Rust.
WASM.
TypeScript.
I am slowly beginning to understand why WASM did not really succeed.
So you're reinventing JSON, but binary? V8's JSON nowadays is highly optimized [1] and can process gigabytes per second [2]; I doubt it is a bottleneck here.
[1] https://v8.dev/blog/json-stringify [2] https://github.com/simdjson/simdjson
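For reference, a back-of-envelope sketch (not from the article) of what JSON costs on a payload of ~100k small nodes:

    // Round-trip ~100k small objects through JSON and time both directions.
    const payload = Array.from({ length: 100_000 }, (_, i) => ({
      type: "text",
      id: i,
      text: "hello world hello world",
    }));

    let t = performance.now();
    const json = JSON.stringify(payload);
    console.log(`stringify: ${(performance.now() - t).toFixed(1)} ms, ${(json.length / 1e6).toFixed(1)} MB`);

    t = performance.now();
    JSON.parse(json);
    console.log(`parse: ${(performance.now() - t).toFixed(1)} ms`);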
I don't think that's actually out yet, and more importantly, it doesn't change anything at runtime -- your code still runs in a JS engine (V8, JSC etc).
You can use it today.