undefined

points

[-]

Wow, that's such a drastic different experience than mine. May I ask what toolset are you using? Are you limited to using your home grown "AcmeCode" or have full access to Claude Code / Cursor with the latest and greatest models, 1M context size, full repo access?

I see it generating between 50% to 90% accuracy in both small and large tasks, as in the PRs it generates range between being 50% usable code that a human can tweak, to 90% solution (with the occasional 100% wow, it actually did it, no comments, let's merge)

I also found it to be a skillset, some engineers seem to find it easier to articulate what they want and some have it easier to think while writing code.

by ramraj071 hours ago|

parent|

[-]

I used to think that the people who keep saying (in March 2026) that AI does not generate good code are just not smart and ask stupid prompts.

I think I've amended that thought. They are not necessarily lacking in intelligence. I hypothesize that LLMs pick up on optimism and pessimism among other sentiments in the incoming prompt: someone prompting with no hope that the result will be useful end up with useless garbage output and vice versa.

by oh_my_goodness33 minutes ago|

parent|

[-]

Exactly. You have to manifest at a high vibrational frequency.

by jplusequalt20 minutes ago|

parent|

[-]

Thanks for the laugh.

by bitwize1 hours ago|

parent|

prev|

[-]

This is kinda like that thing about how psychic mediums supposedly can't medium if there's a skeptic in the room. Goes to show that AI really is a modern-day ouija board.

by ctrust1 hours ago|

parent|

[-]

The accurate inferences that can be drawn from subtle linguistic attributes should freak you out more than they do.

by wg043 minutes ago|

prev|

[-]

This checks out logical speaking.

The FANG code basis are very large and date back years might not necessarily be using open source frameworks rather in house libraries and frameworks none of which are certainly available to Anthropic or OpenAI hence these models have zero visibility into them.

Therefore combined with the fact that these are not reasoning or thinking machines rather probabilistic (image/text) generators, they can't generate what they haven't seen.

by goalieca2 hours ago|

prev|

[-]

I can second this. I’ve never had a problem writing short scripts and glue code in stuff ive mastered. In places I actually need help, I’m finding it slows me down.

by wombat-man1 hours ago|

prev|

[-]

Also at FAANG. I think I am using the tools more than my peers based on my conversations. The first few times I tried our AI tooling, it was extremely hit and miss. But right around December the tooling improved a lot, and is a lot more effective. I am able to make prototypes very quickly. They are seldom check-in ready, but I can validate assumptions and ideas. I also had a very positive experience where the LLM pointed out a key flaw in an API I had been designing, and I was able to adjust it before going further into the process.

Once the plan is set, using the agentic coder to create smaller CLs has been the best avenue for me. You don't want to generate code faster than you and your reviewers can comprehend it. It'll feel slow, but check ins actually move faster.

I will say it's not all magic and success. I have had the AI lead me down some dark corners, assuring me one design would work when actually it is a bit outdated or not quite the right fit for the system we are building for because of reasons. So, I wouldn't really say that it's a 10x multiplier or anything, but I'm definitely getting things done faster than I could on my own. Expertise on the part of the user is still crucial.

One classic issue I used to run into, is doing a small refactor and then having to manually fix a bunch of tests. It is so much simpler to ask the LLM to move X from A to B and fix any test failures. Then I circle back in a few minutes to review what was done and fix any issues.

The other thing is, it has visibility for the wider code base, including some of our infrastructure that we're dependent on. There have been a couple times in the past quarter where our build is busted by an external team, and I am able to ask the LLM given the timeframe and a description of the issue, the exact external failure that caused it. I don't really know how long it would have taken to resolve the issue otherwise, since the issues were missed by their testing. That said, I gotta wonder if those breakages were introduced by LLM use.

My job hasn't been this fun in a long, long time and I am a little uneasy about what these tools are going to mean for my personal job security, but I don't know how we can put the genie back into the bottle at this point.

by asdff3 hours ago|

prev|

[-]

Can you elaborate on the shortcomings you find in professional setting that aren't coming up on personal projects? With it handling greenfield tasks are you perhaps referring to the usual sort of boilerplate code/file structure setup that is step 0 with using a lot of libraries?

by jatins2 hours ago|

prev|

[-]

Experience depends on which FAANG it is. Amazon for example doesn't allow Claude Code or Codex so you are stuck with whatever internal tool they have

Meta, despite competing with these, is open to let their devs use better off the shelf tools.

by konaraddi1 hours ago|

parent|

[-]

I work at aws and generally use Claude Opus 4.6 1M with Kiro (aws’s public competitor to Claude Code). My experience is positive. Kiro writes most of my code. My complaints:

1. Degraded quality over longer context window usage. I have to think about managing context and agents instead of focusing solely on the task.

2. It’s slow (when it’s “thinking”). Especially when it’s tasked with something simple (e.g., I could ask Claude Opus to commit code and submit for review but it’s just faster if I run the commands myself and I don’t want to have to think about conditionally switching to Haiku / faster models mid task execution).

3. It often requires a lot of upfront planning and feedback loop set up to the extent that sometimes I wonder if it would’ve been faster if I did it myself.

A smarter model would be great but there are bigger productivity gains to be had with a good set up, a faster model, and abstracting away the need to think about agents or context usage. I’m still figuring out a good set up. Something with the speed of Haiku with the reasoning of Opus without the overhead of having to think about the management of agents or context would be sweet.

by David-Brug-Ai47 minutes ago|

parent|

[-]

The context degradation problem gets much worse when you have multiple agents or models touching the same project. One agent compacts, loses what it knew, and now the human is the only source of truth for what actually happened vs what was reported done. If that human isn't a coder, they can't verify by reading the source either.

I've been working on this and landed on a pattern I call a "mechanical ledger", basically a structured state file that sits outside any context window and gets updated as a side effect of work, not as a step anyone remembers to do. Every commit writes to it, every failed patch writes to it, every test run writes to it. When a session starts (or an agent compacts), it reads the ledger and rebuilds context from ground truth instead of from memory.

Its not a novel idea really, its basically what ops teams do with runbooks and state files, but applied to the AI agent handoff problem. The interesting bit is making the updates mechanical so no agent can forget to do it.

by dvfjsdhgfv54 minutes ago|

parent|

prev|

[-]

> A smarter model would be great but there are bigger productivity gains to be had with a good set up, a faster model, and abstracting away the need to think about agents or context usage. I’m still figuring out a good set up. Something with the speed of Haiku with the reasoning of Opus without the overhead of having to think about the management of agents or context would be sweet.

I was thinking about this recently. This kind of setup is a Holy Grail everyone is searching for. Make the damn tool produce the right output more of the time. And yet, despite testing the methods provided by the people who claim they get excellent results, I still come to the point where the it gets off rails. Nevertheless, since practically everybody works on resolving this particular issue, and huge amounts of money have been poured into getting it right, I hope in the next year or so we will finally have something we can reliably use.

by j3k32 hours ago|

parent|

prev|

[-]

Meta is doing something healthy - signalling that it is behind with its LLM efforts. Nothing wrong with that.

by svara3 hours ago|

prev|

[-]

Could you say more on how the tasks where it works vs. doesn't work differ? Just the fact that it's both small and greenfield in the one case and presumably neither in the other?