So I know these are just benchmarks, but apparently Elixir is one of the best languages to use with AI, despite having a smaller training dataset: https://www.youtube.com/watch?v=iV1EcfZSdCM and https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/tree/ma...

Furthermore, it's actually kind of annoying that the LLMs are not better than us, and still benefit from having code properly typed, well-architected, and split into modules/files. I was lamenting this the other day: the only reason we moved away from Assembly and BASIC with GOTOs in a single huge file was that we humans needed the organization to help us maintain context. Turns out, because of how they're trained, so do the LLMs.

So TypeScript types and tests actually do help a lot, simply because they're deterministic guardrails the LLM can use to check its work and be steered toward producing code that actually works.
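
The same holds on the Elixir side of this thread: a plain ExUnit test is exactly that kind of deterministic guardrail, something the agent can rerun after every edit. A minimal sketch, assuming a made-up MyApp.Slug module:

    # test/slug_test.exs -- a deterministic check an agent can rerun in a loop
    defmodule SlugTest do
      use ExUnit.Case

      test "slugify/1 lowercases and dash-separates words" do
        # MyApp.Slug is hypothetical, standing in for the code under test
        assert MyApp.Slug.slugify("Hello World") == "hello-world"
      end
    end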

reply
I don't think LLMs benefit from having code properly typed (at the definition). It's costly to have to open a possibly remote file just to check a type. The LLM should be able to intuit what the types are at the callsite, and Elixir has ~strong conventions that LLMs probably take advantage of.
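
For example (a made-up module; a sketch of the convention, not any particular codebase), the struct pattern in the function head acts as a local type indicator, so nobody has to open a remote file:

    defmodule Accounts do
      # The %User{} pattern documents the expected type right in the head --
      # readable at a glance by humans and LLMs alike.
      def display_name(%User{first: first, last: last}) do
        "#{first} #{last}"
      end
    end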
reply
LLMs benefit greatly from feedback, and type errors are one of the fastest and easiest forms of feedback to give an LLM.
reply
Think about Fitts's law: the fastest place to click under a cursor is the location of the cursor. For an LLM, the least context-expensive feedback is no feedback at all.

I think strongly typed codebases sometimes have bad habits that "you can get away with" because of the typing and the feedback loops, and the LLM has learned those habits.

https://x.com/neogoose_btw/status/2023902379440304452?s=61

reply
This is well put. If the LLM gets the type wrong, then we're already discussing a failure scenario with a feedback loop involving back-and-forth changes.

LLMs are not really good at this. The idea that LLMs benefit from TypeScript is a case of people anthropomorphizing AI.

The kinds of mistakes AI makes are very different. It's WAY better than humans at copying stuff verbatim accurately and nailing the 'form' of the logic. What it struggles with is 'substance' because it doesn't have a complete worldview so it doesn't fully understand what we mean or what we want.

LLMs struggle more with requirements engineering and architecture, since architecture ties into anticipating changes in requirements.

reply
> I don't see the point of Elixir now. LLMs work better with mainstream languages which make up a bigger portion of their training set.

I can't say whether it works better with other languages, but I can definitely say both Opus and Codex work really well with Elixir. I work on a fairly large application, and they consistently produce well-structured, working code, and are able to review existing code to find issues that are very easy to miss.

The LLM needs guidance on general patterns, e.g. "Let's use a state machine to implement this functionality," but it writes code that uses language idioms, leverages immutability and concurrency, and generally speaking is much better than any first pass I would do manually.
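
For instance, "let's use a state machine" tends to come back as something like this (illustrative states and module name, not my actual domain):

    # Transitions are exhaustive pattern matches; anything unexpected
    # falls through to an explicit error instead of failing silently.
    defmodule Order.StateMachine do
      def transition(:pending, :pay), do: {:ok, :paid}
      def transition(:paid, :ship), do: {:ok, :shipped}
      def transition(:shipped, :deliver), do: {:ok, :delivered}
      def transition(state, event), do: {:error, {:invalid_transition, state, event}}
    end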

I have my ethical concerns, but it would be foolish of me to state that it works poorly - if anything it makes me question my own abilities and focus in comparison (which is a whole different topic).

reply
> Succinctness, functionality and popularity of the language are now much more important factors.

Not my experience at all. The most important factors are simplicity and clarity. If an LLM can find the pattern, it can replicate that pattern.

Language matters to the extent that it encourages or forces clear patterns. More examples, shorter tokens, popularity, etc. don't matter at all if the codebase is a mess.

Functional languages like Elixir make it very easy to build highly structured applications. Each fn takes in a thing and returns another. Side effects? What side effects? LLMs can follow this function composition pattern all day long. There's less complexity, objectively.
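
A hypothetical pipeline to illustrate the shape (module and function names made up): every step takes a value in and hands a new value out, so extending it is mechanical:

    defmodule Text do
      # Pure composition: data in, data out, no hidden state anywhere.
      def normalize(text) do
        text
        |> String.trim()
        |> String.downcase()
        |> String.split(~r/\s+/)
        |> Enum.join(" ")
      end
    end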

But take languages that are less disciplined. Throw in arbitrary side effects, hidden control flow, and mutable state, and the LLM will fail to find an obviously correct pattern and will guess wildly. In practice, this makes logical bugs much more likely. Millions of examples don't help if your codebase is a swamp. And languages without said discipline often end up in a swamp.

reply
LLMs work great with Elixir. Running tsc in a loop while generating code still catches type errors introduced by an LLM, and it's faster than generating additional tests. Elixir is also succinct and highly functional. If you can't find a specific library, it's easier than ever to build out the barebones functionality you need yourself, or to use NIFs, ports, etc.
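
E.g., a barebones port is only a few lines. A sketch, with "cat" standing in for whatever external tool you'd actually wrap:

    # Minimal port: talk to an external OS process over stdin/stdout.
    port = Port.open({:spawn, "cat"}, [:binary])
    send(port, {self(), {:command, "hello\n"}})

    receive do
      {^port, {:data, data}} -> IO.puts("echoed back: #{data}")
    end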

https://dashbit.co/blog/why-elixir-best-language-for-ai

reply
> Succinctness, functionality and popularity of the language are now much more important factors.

No. I would argue that popularity per se is irrelevant: if there are a billion examples of crap code, the LLMs learn crap code. Conversely, we know that as few as 250 documents can poison an LLM regardless of model size (per Anthropic's data-poisoning paper).

The most important thing is to conserve context. Succinctness is not really what you want, because most context is burned on thinking and tool calls (I think), not on codegen.

Here is what I think is not important: strong typing; it requires a tool call anyway to fetch the type.

Here is what I think is important:

- fewer footguns
- great testing (and great testing examples)
- strong language conventions (local indicators for types, argument order conventions, etc.; see the sketch below)
- no weird shit like __init__.py that could do literally anything invisible to the standard code flow
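
A sketch of the conventions point (made-up module): subject-first argument order keeps pipes predictable, and the struct pattern names the type right in the function head:

    defmodule Cart do
      defstruct items: []

      # Subject-first argument order (pipe-friendly), and %Cart{} in the
      # head is a local type indicator -- no tool call needed to fetch it.
      def add_item(%Cart{items: items} = cart, item) do
        %{cart | items: [item | items]}
      end
    end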

reply
Your code doesn't run anywhere? Running on the BEAM is extremely helpful for a lot of things. Also, I review my LLM output, and I want that experience to be enjoyable.
reply
I'm starting to see a new genre of post here in the AI bubble, where people go to topics that aren't about AI at all, and comment something like, "this doesn't matter because it's not AI". This is the third I've seen in a week.
reply