That is clearly false. I’m only familiar with Opus, but it quite regularly tells me that, and/or decides it needs to do research before answering.
If I instruct it to answer regardless, it generally turns out that it indeed didn’t know.
> Please carefully review (whatever it is) and list out the parts that have the most risk and uncertainty. Also, for each major claim or assumption can you list a few questions that come to mind? Rank those questions and ambiguities as: minor, moderate, or critical.
> Afterwards, review the (plan / design / document / implementation) again thoroughly under this new light and present your analysis as well as your confidence about each aspect.
There's a million variations on patterns like this. It can work surprisingly well.
You can also inject 1-2 key insights to guide the process. E.g. "I don't think X is completely correct because of A and B. We need to look into that and also see how it affects the rest of (whatever you are working on)."
"Ok let's look at these issues 1 at a time. Can you walk me through each one and help me think through how to address it"
And then it will usually give a few options for what to do for each one as well as a recommendation. The recommendation is often fairly decent, in which case I can just say "sounds good". Or maybe provide a small bit of color like: "sounds good but make sure to consider X".
Often we will have a side discussion about that particular issue until I'm satisfied. This happens more when I'm doing design / architectural / planning sessions with the AI. It can be as short or as long as it needs. And then we move on to the next one.
My main goal with these strategies is to help the AI get the relevant knowledge and expertise from my brain with as little effort as possible on my part. :D
A few other tactics:
- You can address multiple at once: "Item 3, 4, and 7 sound good, but let's work through the others together."
- Defer a discussion or issue until later: "Let's come back to item 2 or possibly save that for a later session".
- Save the review notes / analysis / design sketch to a markdown doc to use in a future session. Or just as a reference to remember why something was done a certain way when I'm coming back to it. Can be useful to give to the AI for future related work as well.
- Send the content to a sub-agent for a detailed review and then discuss with the main agent.
The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about what goal you're optimizing for.
See Karpathy's work on the autoresearch agent and how it carries out experiments; it might be useful for what you're doing.
Man, I wish this was true. I know a bunch of non-tech people who just trust random shit that ChatGPT made up.
I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)
We had politicians share LLM crap, researchers doing papers with hallucinated citations..
It's not just tech people.
It took a lot of back-and-forths with her to convince her that the numbers she uses every day are "Arabic numerals". Even the author of the spec could barely convince her -- it took a meeting with the Arabic translators (several different ones) to finally do it. Think about that for a minute. People won't believe subject matter experts over an LLM.
We're cooked.
It would help if you briefly specified the AI you are using here. There are wildly different results between using, say, an 8B open-weights LLM and Claude Opus 4.6.
You want low deterministic latency with sharp tails.
If all you care about is throughput then deep pipelines + lots of threads will get you there at the cost of latency.
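To make the "tails" part concrete, here is a rough sketch of how per-call latency percentiles could be measured. `LatencyStats`, `measure`, and `percentile` are made-up names for illustration, and timing with `System.nanoTime()` in a plain loop is only a crude approximation (no warm-up, no correction for coordinated omission).

```java
import java.util.Arrays;

final class LatencyStats {
    /** Nearest-rank percentile over an already sorted array of samples. */
    static long percentile(long[] sorted, double pct) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Times each call of the task and returns the per-call nanoseconds, sorted ascending. */
    static long[] measure(Runnable task, int iterations) {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the code path you actually care about.
        long[] sorted = measure(() -> Math.sqrt(System.nanoTime()), 100_000);
        System.out.printf("p50=%dns p99=%dns p99.9=%dns max=%dns%n",
                percentile(sorted, 50), percentile(sorted, 99),
                percentile(sorted, 99.9), sorted[sorted.length - 1]);
    }
}
```

A throughput-oriented design can show a good p50 while p99.9 and max are far worse, which is why the high percentiles are the numbers to watch here.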
You have to optimize your memory usage patterns to fit in CPU cache as much as possible, which is something typical Java developers don't consider. I have a background in assembly and C.
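One common way to get cache-friendly access in Java is to keep hot data in parallel primitive arrays instead of an array of objects, so consecutive elements sit next to each other in memory. A minimal sketch follows; `QuoteColumns` and its fields are hypothetical names, not something from this thread.

```java
final class QuoteColumns {
    // One primitive array per field: no per-element object headers
    // and no pointer chasing while scanning.
    private final long[] priceTicks;
    private final int[] quantity;
    private int size;

    QuoteColumns(int capacity) {
        priceTicks = new long[capacity];
        quantity = new int[capacity];
    }

    /** Appends one quote; throws ArrayIndexOutOfBoundsException when full. */
    void add(long price, int qty) {
        priceTicks[size] = price;
        quantity[size] = qty;
        size++;
    }

    /** Sequential scan over both arrays, which is friendly to the hardware prefetcher. */
    long notional() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += priceTicks[i] * quantity[i];
        }
        return total;
    }
}
```

Whether this actually beats an object-per-quote design depends on the access pattern, so it is worth measuring rather than assuming.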
I'd say it's slightly harder since there is a little bit of abstraction, but most of the time the JIT will produce code as good as C compilers. It's also a niche that often considers any application running on a general-purpose CPU to be slow. If you want industry-leading speed you start building custom FPGAs.
How exactly are you passing data? You can pass some primitives without allocating them on the heap. You can use some tiny subset of Java plus the standard library to write high-performance code, but why would you do this instead of using Rust or C++?
Strangely this is one of the areas where I want to use Project Panama, so I might re-implement some of the ring buffer constructs.
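For readers who haven't seen one, a single-producer/single-consumer ring buffer can be sketched roughly like this. This is a plain on-heap version using `AtomicLong`, not a Panama-based one and not the commenter's implementation; `SpscLongRing` is a made-up name.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SpscLongRing {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next index to read, written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // next index to write, written only by the producer

    SpscLongRing(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1; // lets us wrap with a bitwise AND instead of a modulo
    }

    /** Producer thread only. Returns false when the ring is full. */
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.setRelease(t + 1); // publish the slot write to the consumer
        return true;
    }

    /** Consumer thread only. Returns emptyValue when there is nothing to read. */
    long poll(long emptyValue) {
        long h = head.get();
        if (h == tail.get()) {
            return emptyValue;
        }
        long value = slots[(int) (h & mask)];
        head.setRelease(h + 1); // hand the slot back to the producer
        return value;
    }
}
```

A production version would typically also pad `head` and `tail` onto separate cache lines to avoid false sharing; that is left out here to keep the sketch short.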
You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.
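A minimal sketch of what that can look like, assuming JDK 22 or later (where the `java.lang.foreign` API is final and the `VarHandle` from `MemoryLayout.varHandle` takes a base offset as its second coordinate). The `OffHeapQuotes` class and the `price`/`quantity` fields are invented for illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

final class OffHeapQuotes {
    // Hypothetical C-like struct: { long price; long quantity; }
    static final MemoryLayout QUOTE = MemoryLayout.structLayout(
            ValueLayout.JAVA_LONG.withName("price"),
            ValueLayout.JAVA_LONG.withName("quantity"));

    // Field accessors; coordinates are (MemorySegment, long baseOffset) on JDK 22+.
    static final VarHandle PRICE = QUOTE.varHandle(PathElement.groupElement("price"));
    static final VarHandle QUANTITY = QUOTE.varHandle(PathElement.groupElement("quantity"));

    /** Fills `count` quotes off heap, then sums price * quantity over them. */
    static long notional(int count) {
        // A confined arena frees the native memory when the try block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment quotes =
                    arena.allocate(QUOTE.byteSize() * count, QUOTE.byteAlignment());
            for (int i = 0; i < count; i++) {
                long base = i * QUOTE.byteSize(); // byte offset of the i-th struct
                PRICE.set(quotes, base, 100L + i);
                QUANTITY.set(quotes, base, 2L);
            }
            long total = 0;
            for (int i = 0; i < count; i++) {
                long base = i * QUOTE.byteSize();
                long price = (long) PRICE.get(quotes, base);
                long qty = (long) QUANTITY.get(quotes, base);
                total += price * qty;
            }
            return total;
        }
    }
}
```

On JDK 21 the same API was still in preview and the handle did not take the offset argument, so this will not compile there as written.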
I answered "why" in another post in this thread.
Then there are things like the JIT doing runtime profiling and adaptation by default.
In terms of speed, memory usage, runtime characteristics... sure, there are better options. But if Java is good enough, or can be made good enough by writing the code correctly, why add another toolchain?
"Writing code correctly" here means stripping 95% of the language's capabilities and writing in some other language that looks like C without structs (because they will be heap-allocated, with cross-thread synchronization and GC overhead) and without the standard lib.
It's good enough for some tiny algo, but not good enough for anything serious.
Those have a low bar for performance. Also, they mostly became popular because of investment from the Java hype, and Rust didn't exist or had a weak ecosystem at that time.
It wasn't a matter of choosing Java for HFT, it was a matter of selecting a project that was a good fit for Java and my personal knowledge. I was a Java instructor for Sun for over a decade, I authored a chunk of their Java curriculum. I wrote many of the concurrency questions in the certification exams. It's in my wheelhouse :)
My C and assembly are rusty at this point, so I believe I can hit my performance goals with Java sooner than if I developed in more bare-metal languages.
I've worked at places where ~5us was considered the fast path and tails were acceptable.
In my current role it's less than a microsecond packet in, packet out (excluding time to cross the bus to the NIC).
But arguably it's not true HFT today unless you're using FPGA or ASIC somewhere in your stack.
So yeah, there's really no HFT anymore, it's just order execution, and some algo trades need lower latency than others, which merits varying levels of technical effort to squeeze latency out of systems.
I don't work for a firm so don't get to play with FPGAs. I'm also not co-located in an exchange and using microwave towers for networking. I might never even have access to kernel networking bypass hardware (still hopeful about this one). Hardware optimization in my case will likely top out at CPU isolation for the hot path thread and a hosting provider in close proximity to the exchanges.
The real goal is a combination of eliminating as much slippage as possible, making some lower-timeframe strategies possible, and having best-in-class backtesting performance for parameter grid searching and strategy discovery. I expect to sit between industry-leading firms and typical retail systematic traders.