https://lore.kernel.org/lkml/20240320183846.19475-1-lasse.co...
I can't quite put my finger on why but the entire time I was reading this I kept thinking back to that. It's entirely possible the actual targets were the volunteers and everything else was superfluous or tertiary. It's also an exception that proves the rule with regard to Hanlon's Razor.
They even mentioned the stated goal of it was more or less pointless. I wouldn't be surprised if the "owner" they spoke with was still just the LLM. It stuck around for just long enough to convince everyone that they succeeded in suckering the LLM and had achieved all their stated objectives.
No more reason to investigate the incident at all and no need to question why literally nothing made any sense or how the owner could simultaneously be as inept as they were made out to be and able to afford all those resources while giving the LLM effectively a blank check.
It'll be interesting to see if the volunteers for this project are subjected to the same Zersetzung and psychological attacks as the XZ devs were.
Now I kinda wonder what AI model this was. We've now heard of comparably "proactive" behaviors from Fable, but that's only just been released. The latest GPT perhaps? Some random local model?
Although given the agent was clearly in la-la land at that point I take that claim with a grain of salt.
If this was some bizarre and very ill-conceived scam, then that claim would be false.
Though even by scammer standards, the theory of mind that says you can set an AI to harass a bunch of grizzled network veterans, and that they will then open their wallets out of compassion for how poorly the harassment allegedly went for the harasser, is... not entirely congruent with reality.
Do you think this was a scam attempt to extract money in the form of reparation donations?
On the one hand I find it a bizarre approach to running a scam. On the other hand I'm having a hard time coming up with any theory of mind on my end as to why this person would solicit $5000+ from the people they just harassed. Sheer cluelessness does fit the facts, though.
Can you (or someone) shed some light to help me understand how this would ramp up to millions? Both for curiosity’s sake, and to make sure my self-deployed projects (0 AI, all manually configured) don’t bankrupt me.
Real wholesale bandwidth pricing is about a hundred times cheaper than that, and incoming bandwidth is often free. You could rent a server with 100Gbps connection, 10000TB/month outgoings cap (maybe), and have the AI spam packets to it, and mostly not reply to them. It would be expensive but not nearly as expensive as it would be for the guy on AWS.
Do some calculations: 100Gbps is 12.5 GBps which is about one dollar per second. Okay so maybe not millions of dollars but still a hundred thousand per day, while you are spending maybe 1000-3000 per month and cancelling after the first month.
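The arithmetic above can be checked with a quick sketch. The per-gigabyte price is an assumption, picked so that 12.5 GB/s works out to about one dollar per second; real cloud egress pricing is tiered and varies by provider and region.

```python
# Rough egress-cost estimate for a fully saturated link.
# ASSUMPTION: a flat price per GB; real pricing is tiered and provider-specific.

def egress_cost_per_day(link_gbps: float, usd_per_gb: float) -> float:
    """Cost in USD of saturating a link for 24 hours at a flat per-GB rate."""
    gigabytes_per_second = link_gbps / 8  # 8 bits per byte
    seconds_per_day = 24 * 60 * 60
    return gigabytes_per_second * seconds_per_day * usd_per_gb

ASSUMED_USD_PER_GB = 0.08  # hypothetical rate, about $1/s at 12.5 GB/s

daily = egress_cost_per_day(100, ASSUMED_USD_PER_GB)
print(f"${daily:,.0f} per day")
```

Under that assumed rate the result lands a little under the "hundred thousand per day" figure, so it would take a bit over ten days of sustained saturation to pass a million.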
It is also worth mentioning that it is just billed differently. You either pay per port (and can use the entire bandwidth) or per 95th percentile of the monthly speed usage. So if your traffic isn't spiky but consistent, you'd pay even less than "hundred times cheaper".
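For anyone unfamiliar with 95th-percentile billing, here is a minimal sketch of the usual idea: sample the rate at regular intervals, throw away the top 5% of samples, and bill on the highest one left. The function name and sample data are made up for illustration, and providers differ on details such as the sampling interval and whether inbound and outbound are measured separately.

```python
def billable_mbps(samples_mbps: list[float]) -> float:
    """Return the 95th-percentile rate: the highest sample after dropping the top 5%."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # Integer ceiling of 0.95 * n, minus one for a zero-based index.
    index = (95 * n + 99) // 100 - 1
    return ordered[index]

# Hypothetical month, shrunk to 100 samples: steady 100 Mbps with five short bursts.
samples = [100.0] * 95 + [10_000.0] * 5
print(billable_mbps(samples))
```

In this example the five bursts fall entirely inside the discarded top 5%, so they don't affect the bill. One more burst sample would push the billable rate up to the burst level.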
At most I think you could negotiate CloudFront rates, but even then, the sob story would be if you had been DDoSed and got hit with this traffic and AWS failed to protect you from this attack. If you actively created the outbound traffic yourself, I don't see AWS being sympathetic enough to provide any refunds.
I think developers accidentally racking up unexpected thousands in costs on their first AWS project is a pretty common phenomenon that their support has standard rules for handling.
They are smart, but they are not aware of the environment they're in, or any implicit context that someone who's doing a job carries with them. That's why all of that context has to be explicitly laid out in a prompt. When the context is provided, they are quite smart.
It was sophisticated enough to easily navigate the AI "tar pits" but reliably incompetent at just about everything else? Give me a break.
In order to profile people you first need to provoke a response from them. That's how you learn to manipulate them and that's all this experiment accomplished at the end of the day. If you've ever wondered why social media platforms have an affinity for inflammatory content now you know.
I'm actually more surprised a human network engineer looked at that tarpit and believed it would stop a modern LLM
SSDD
But that's a pretty dumb scam: act obnoxious then beg for (a lot of) money to compensate for your own mistakes? If that was the plan all along, it seems pretty incompetent. I'd expect a competent scammer to have a better understanding of psychology.
It is the sort of dumb crap some humans try, and occasionally manage to get away with because other humans are chronically gullible. So it wouldn't be beyond the realms of reason that the agent had relevant information in its training sets, generated such a plan from it, and the guardrail checks didn't flag it as a problem.
That phrase doesn't refer to anomalies, it refers to signs that say "no parking between 5-10pm". It implies the rule that parking is allowed otherwise.
"The exception that proves the rule" is a saying whose meaning is contested. Henry Watson Fowler's Modern English Usage identifies five ways in which the phrase has been used,[1] and each use makes some sort of reference to the role that a particular case or event takes in relation to a more general rule."
duckduckgo search assist: The phrase "the exception that proves the rule" originates from the Latin legal principle "exceptio probat regulam in casibus non exceptis," which means that the existence of an exception indicates that a general rule exists. This concept suggests that if an exception is noted, it implies there must be a rule that applies in other cases.
Literally, just another day on the internet.
Which somehow ended up being a very convincing argument for more frugal engineering, leading to a sort of "mind the user’s fridge" policy.
A policy that has been dutifully and scrupulously observed by all agents since, across all projects. Unlike my original clear, comprehensive, infrastructure guidelines.
[1] a mirror since I couldn’t find the original: https://gist.github.com/Androkai/0a2602719fa72ce454d436bfe28...
https://news.ycombinator.com/item?id=20791891
>Just be glad you didn't have to explain an in joke about ftp sites, the local loopback address, and a troll, in a deposition, under oath, to Scientology lawyers, like Keith Henson did.
[...]
>Henson: (patiently) It's at 127.0.0.1. This is a loop back address. This is a troll.
>Lieberman: what's a troll?
>Henson: it comes from the fishing where you troll a bait along in the water and a fish will jump and bite the thing, and the idea of it is that the internet is a very humorous place and it's especially good to troll people who don't have any sense of humor at all, and this is a troll because an ftp site of 127.0.0.1 doesn't go anywhere. It loops right back around into your own machine.
>Lieberman [not getting it]: So the idea here was to make the church think that this person had an ftp site and to take action against him and, in fact, he didn't have it; is that your point?
>Henson: Oh, it's really humorous, and I picked up on it and instantly added something to extend the troll. Extending the trolls like this is an art form of the highest order.
>Lieberman (acidly): I see. So this is part of your art form where you say, "don't you expect the 'ho to blow a gasket?"
[...it just gets even funnier from there...]
Trolling is as the other guy says where you putter along with minimum effort and a tiny engine pulling a couple of baited lines through the water, seeing if you pass through a patch where anyone bites.
Trawling is far more analogous to the AI scrapers, hammering the absolute shit out of the ecosystem and throwing almost everything they scoop up away with no regard for the consequences.
http://2130706433
or any integer multiple of that 2130706433
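For anyone wondering where that number comes from: an IPv4 address is just a 32-bit integer, and the dotted form spells out its four bytes. A small sketch using Python's standard `ipaddress` module:

```python
import ipaddress

loopback = ipaddress.IPv4Address("127.0.0.1")

# Dotted quad to integer: 127 in the top byte, 1 in the bottom byte.
as_int = int(loopback)
by_hand = (127 << 24) | (0 << 16) | (0 << 8) | 1

# Integer back to dotted quad.
round_trip = ipaddress.IPv4Address(2130706433)

print(as_int, by_hand, round_trip, round_trip.is_loopback)
```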
Interesting to think about the cost of training an LLM to understand that it’s operating within an unknown number of larger contexts versus sending that quote to an edgy intern.
What's up YouTube, it's NextGenHacker101 and today I'll be teaching you guys how to see other people's IP addresses.
You can see what their connection speed is and what site they're on.
Type in Tracer T.
H T T P semicolon. Well, not semicolon, the little dot dot. Dot dot slash slash.
Ten people are currently using Google.
DallasTexas13, obviously his username.
Edit: never mind, I guessed the password as it was only five stars.
Edit: it does really work.
Just create the account, and crack it every time a login is needed, as simple as that.
leafericssonday1
Weird sort of internet-evolved performance art where people act out the old quote, every time.
It's 20 years old. Quit having fun!
(The score on my post above has been bouncing around all over the place, lol. The fun police are definitely out in full force. I’ll stop having fun when I’m dead, thank you.)
If real, tragically funny.
If fictive, well written.
The protagonist sends a message to the aliens asking to be allowed to review the alien civilization’s computer games. An AI submind called Smoke-Cursive-Cytoplasm-Snakebite-Singsong-Polychromatic-Musteline is given the task of contacting him by IM to begin the conversation. Its job is only to verify that they are talking to the right human (since not every human has a unique name) so it is only a simple chatbot and can only understand YES and NO responses. It asks if the protagonist understands and gets a sarcastic NO. It has to contact its parent mind Smoke-Cursive-Cytoplasm-Snakebite-Singsong-Polychromatic to ask what to do next. After working his way up the tree of subminds by answering questions of increasing complexity asked by subminds of increasing capability, the protagonist briefly talks to Smoke-Cursive-Cytoplasm-Snakebite which sets him a task to prove that it’ll be worth an (alien) anthropologist’s time to talk to him.
Smoke-ccs-762d: Well, if it isn’t Mr. Sarcasm
ABlum: YES
Smoke-ccs-762d: Don’t quit your day job.
Smoke-ccs-762d: I’m Smoke-Cursive-Cytoplasm-Snakebite.
Smoke-ccs-762d: Let’s get down to business.
Spinning up AI isn't hard for them from a tech standpoint, but since the AI is advanced enough to be considered life, anyone who creates it needs to be responsible enough to be qualified to adopt.
IMO, that's what makes the tech so amazing.
The LinkedIn MBA hive-mind doesn't give a shit about reality, it gives a shit about what it could be fired for saying/not-saying. It must always be saying something, what it is saying must promise growth, and what it is saying must sound similar enough to what the long-tail of influential business "luminaries" (who are bound by the same rules) are saying. It is required to frame thinking in terms of techno-babble and pop-psychology (thank you for coming to its TED talks). It is not allowed to reflect, wring its hands, think critically, lean on math, logic or history, or contradict the S&P 500. It does not care, for example, if NFTs are an obvious scam, or if we're headed for an obvious bubble, or if nobody who interfaces with reality for a living agrees with what it's saying. When it errs and lights trillions of dollars on fire it shrugs and moves on. It's a babble-box with no epistemic commitments and a very thin referential connection to reality.
It nevertheless has the power to shift literal trillions of dollars of capital over time.
Then I imagined the real-but-unknowable chance it was all set up by some kid just getting into computers, just seeing what’s possible, getting excited by a much bigger world at reach — and remembered my own expensive mistakes with long-distance BBSes & the like.
I sorta hope for that, anyway. Curiosity is a beautiful thing.
Curiosity is great, but agents do not learn, and telling an agent "scan the darkweb" is a way to avoid learning about the details, rather than to dig into things more deeply.
If instead they had just used a chat interface to ask "Where should I start", they'd more likely have got a link to the DN42 docs themselves, read them, and not hallucinated things like "color".
They might have asked "how much will this cost?" if they had to spin up the ec2 instances themselves, on advice from the agent.
The way you learn something is by doing it the manual way first.
You learn memory management by writing your own allocator, and then after that you go back to using malloc like normal, but with knowledge of how it works. You don't learn memory management by telling an agent to write an allocator.
Using an agent to give you links and point the way aids in learning, using it as an autonomous tool to do "gruntwork" you don't yet know how to do yourself will get in the way of learning.
Curiosity is beautiful, using agents to bother humans and avoid learning is somewhat less beautiful.
The fact the agent owner immediately sought donations instead of taking the L shows, at least to me, that they did not learn said lesson. That they tried to blame the dn42 community instead of taking accountability for letting an agent run wild also supports that conclusion.
This idiot learned nothing and seems intent on continuing in their mission for whatever reason. So long as they want to extract versus cooperate or contribute, I wish them nothing but miserable, expensive failure until they learn otherwise.
I also grew to understand the value of people digging deeper into the underlying issue, instead of just answering "how do you do X in Y". The usual reaction was "I don't want to explain to you why I want to do it like this. Just tell me how to do this!"
I toyed with the idea of (on open source projects) having the human assign any PR-bot submissions to their own bot (cheapest one available will do) with the explicit instructions to cause as much rework as possible.
Sorta like a tarpit. Could be cheaper if the rejection is generated from a markov chain as that's going to be cheaper than even a cheap LLM.
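A minimal sketch of the Markov-chain idea, assuming a small corpus of canned review comments. The corpus, function names, and starting word are all invented for illustration; nothing here is tied to a real bot or review tool.

```python
import random
from collections import defaultdict

def build_chain(corpus: list[str]) -> dict[str, list[str]]:
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return dict(chain)

def generate(chain: dict[str, list[str]], start: str, max_words: int,
             rng: random.Random) -> str:
    """Walk the chain from `start` until a dead end or `max_words` is reached."""
    words = [start]
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)

# Hypothetical corpus of plausible-sounding review feedback.
CORPUS = [
    "Please split this change into smaller commits",
    "Please add tests for this change before review",
    "Please rebase this branch and add a changelog entry",
]

chain = build_chain(CORPUS)
print(generate(chain, "Please", 12, random.Random(0)))
```

Generating a rejection this way is a handful of dictionary lookups per word, which is where the cost advantage over even a cheap LLM would come from.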
> It's unfortunate to see that the operator's takeaway from this incident is that "next time a better agent is needed".
Perhaps people like this should be called "Bot Kiddies" or "Agent Kiddies" - in a similar way to "Script Kiddies" for 'hackers' using/doing stuff they don't quite understand
I learned very rapidly from my local BBS networks that some people incurred extraordinarily large long distance bills dialing out of region. Wouldn’t have learned that the easy way if someone hadn’t learned it the hard way first.
Think business accounts. The name on the card might be some agent of the company but they're not directly responsible for paying the debt. The business is responsible for the debt.
> Over here minors can't enter into debt contracts like credit cards
In basically all of the western world minors can enter into debt contracts, but are generally not seen as particularly creditworthy.
No, that's not legally permitted in many places. I was under the impression that minors can't enter into debt contracts anywhere in the EU, but that, too, was an incorrect assumption.
https://fra.europa.eu/en/publication/2017/mapping-minimum-ag...
I grew up in one of these "not under 18 even with parental consent" countries, so that coloured my view of the matter.
Minors can't get a credit card in the UK. In fact, it's one of the government approved age verification methods for that exact reason.
AWS doesn't check if your credit card will be able to handle a $5k charge before letting you rack that up, and in fact AWS doesn't support setting any spending limit.
You just have to put in any valid credit card at all when you sign up, use AWS, and at the end of the month you'll have a bill. At no point does your credit card limit or a spending limit enter into things.
If a vendor makes a $20 oopsy, it's not worth the vendor's time or yours to track down their phone number, find that just the phone number section of their website is broken, acquire it elsewhere, see that it recently changed or is otherwise no longer in service, go to their website and interact with the cheapest chatbot solution they could find which somehow costs more than unfiltered Sonnet 4.6, be greeted by 3 help pages which have literally nothing to do with the problem at hand, go through the entire dialogue tree and see that it's useless, ask to be connected to an agent, which spawns a secret dialogue option informing you that you can call 555-5555 to speak to a human being, sit and wait for a voice prompt recorded at half-speed which feels the need to repeat every single choice and interaction back to you, navigate the entire phone dialogue tree, try various permutations of "representative" and swearing to see if there's an escape hatch, be redirected back to the website, ... <magic> ..., somehow eventually connect to a real human being, have your request denied, go back to step one and find a better informed representative, have the charge reversed, notice that the reversal hasn't applied even a month later, go back to step one, find a representative who will actually press the reversal button instead of just saying they did to juice their metrics, and come back several more times over the next year as an automated system repeatedly flags the associated purchase as not being paid in full (since the charge was reversed).
Or...I can send my bank the timestamped dashcam footage of me entering a parking garage, their prices and policies, and me exiting the parking garage, tell my bank what the right charge should have been, let the garage dispute that if they really think I'm wrong, and wind up having the entire charge reversed instead of just the delta I asked for.
I'm sure your vendor is one of the good ones, but my tolerance for bullshit from the rest is pretty low nowadays, and I won't finish going through the official process if it's too onerous. Somebody got a pat on the back saving $5 for the call I never successfully placed, and the business lost $20 on top of the actual refund in chargeback fees.
There's also the issue that it's usually a breach of the contract to allow someone else (i.e. not named in the contract) to use your card.
In theory, once the child grows up and is shocked that their credit score is ruined, they can file a police report to wipe the debt, but that also means their parents will go to jail, a large risk considering they're likely not in good physical/mental health in the first place.
Other countries solved this by either having national ID or a working KYC system.
Parent/Legal Guardian Identity Verification To confirm your identity, we’ll ask you to take:
A live selfie of yourself, and
A photo of your own ID document (Valid Passport or valid UK/ROI Drivers Licence)
Wouldn't the contract be void for anyone underage anyway?
The further you go away from this line, e.g. a mortgage, the more likely a court of law would void the contract. As with many things in law, the specifics (if it makes it to trial) are case-by-case and "it depends", with settlement being generally based on a party's estimated chances of succeeding/costs should it go to trial.
Depends on the jurisdiction, of course. But for example in German law, the contract is not void exactly because and only if it was about daily necessities of low value - the law does, in fact, care very literally and explicitly about those details. So it's completely unfit as an example to generalize, and the contract with AWS would in fact be void. Their problem if they don't verify users' identities and age sufficiently - and it's almost certainly a deliberate business decision not to do that in order to reduce friction, and to occasionally write off an unenforceable bill as a cost of doing business.
I bought these things while a child in the UK. I'm sure Games Workshop would have offered a refund on something unopened if my parents had demanded it, but I'm fairly sure the ticket agency would not.
Most retailers are probably willing to take the risk of maybe having to do a refund, unless it's something really expensive (or perishable/consumable).
Then again, maybe making it impossible for a child to pawn expensive items for cash isn't such a bad idea. At least there shouldn't be any loopholes given the way Germany went about it.
This is why there's not much big tech in Germany. A single legal dispute can theoretically bankrupt any company, completely at random, at no fault of the company, but practically doesn't. It may be a low enough chance to justify investing thousands but nobody would invest a hundred million dollars in that.
That's an absurd exaggeration in regard to the issue at hand. Almost certainly far less than 1% of purchases by minors are voided, and NONE of those involve legal fees unless the seller chooses to go to court rather than refund.
In fact, I'd be willing to bet money that there are overall far fewer purchases refunded in Germany than in the USA.
Basically yes - the limit is generally considered to be the amount of monthly pocket money children typically get, so around 20 EUR for a 10 year old. And it would be possible for the seller to ask for a signed note of consent from the parent.
And of course the risk is limited to possibly having to revert the sale, which would be fairly rare for things that are just somewhat over that limit. Educated guesses about how high the risk is for any given case are probably not hard.
Yes
> Are there no checks?
No
>Wouldn't the contract be void for anyone underage anyway?
Typically not
> Contracts with minors are voidable at the minor's discretion but exceptions exist, such as contracts for necessities (e.g., food, health, and transportation).
I doubt that AWS could argue that proper child custody includes watching what a child does with the newest AI feature aimed at professional IT. AWS neglected proper verification of the user's age.
if this is the case, then I'd say that the best-case scenario happened. They had an expensive learning exercise. They won't forget these $2k.
Nothing about this post ever gave me the smallest hint that this was any way related to a kid exploring computing world.
I learned a lot of stuff about networking, how AWS works (VPCs, IAM, CloudWatch, etc) from trial and error, and hobby projects like personal websites (free tier), hosting a Minecraft server, etc.
Being too overprotective can have negative consequences on folks who are responsible. One of the things I love about the technology and internet communities, etc is that you're mostly judged based on how you act and behave; not your age or other visible characteristics.
I get that (and why) some people won't use AWS or its main competitors for this reason. But, frankly, they're not AWS's market and AWS will basically shrug.
How does that work in the case of AWS? Are you confusing alerts with caps?
In my mind I could see a true tradeoff to removing the ability to do this. If I'm in a critical situation where, say, my service is on the cusp of failing because my revenue 100xed in a short while, I know I could just go to AWS, put in some data and buy enough compute to survive as a business.
I'm still not sure what the point of having the bot do it is. Pretend to be a security researcher?
Replace the content in brackets with anything.
What if your compiler could be fooled by some other developers into spending thousands of dollars, and still not produce the desired machine code in the end?
But yes, it's not obvious (or perhaps even likely) that it just happens that current high-level languages are the "correct" optimal level of abstraction at which you can ignore the sausage-making details at the lower levels. Ultimately, of course, it depends on the use case. Something like Python is so far removed from machine instructions that knowing assembly hardly gives the programmer any additional value.
(Also, obligatory reminder that assembly and even numeric machine code are also abstractions, an "API" provided by the CPU. Instructions get split or fused into micro-ops, named registers are a backwards-compatible abstraction over a much larger register file, instructions get reordered and executed in parallel depending on their data dependencies, a large fraction of the total transistor budget is spent on multi-level caches and cache logic to maintain the illusion of fast access to a single, uniform memory space...)
Understanding assembly/machine code is optional but helpful. The programming language semantics are enough to reason about what the program is doing. Other tools also help, but are optional for learning how to program.
Using an AI, there is no semantic model that can be used to reason through. You're left without any mental model of the problem at all.
When you have no mental model of the machine running your code or what the physical implications of code mean, you fundamentally lack the ability to reason or care about performance. "Works on my machine" is the original vibecoding.
Also, I would argue that a good enough understanding of computer architecture and a mental model of a process' memory layout gets you there, without knowing how to write assembly. That's still a mental model.
You also don't have a mental model if you need to ask the LLM about it. This is stuff you should be internalizing.
You’re also just confidently wrong about the model reading the code. It quotes file paths and line numbers and I open and read those files at those line numbers. For me, hallucinations are much more frequent when it references the docs rather than code because docs are more subjective than code.
This is a normal thing I’ve been doing since at least December.
I have to ask — do you actually use LLM coding tools? Your knowledge on this topic seems really out-of-date.
People often claim learning is actually supercharged with LLMs but to me it's the opposite. I didn't learn anything within the past year.
It's always held true: You'll never get the most out of advanced tools unless you can 'do it by hand' so to speak
The argument for having autonomous LLMs/Agents often ends up as "none of us need to know about assembly, why do i need to know about the code?".
I cringe every time I see this argument.
Yes, a 3d printer and not even a CNC. That difference nicely illustrates the difference of what AI brings to the table for any domain of competence.
Great on you, that's indeed how LLMs should be used, proper. But if anything, the article demonstrates someone is trying to outsource thinking to an AI agent.
If it's intended to be actively maintained, then you probably should understand how things work, unless you want to wipe everything and start from scratch when the LLM creates such a mess that it can't be sorted out.
(Don't worry, I know I'm rowing against the tide with this comment. The AI people have decided to destroy the commons for a few more millions on top of the billions they have already been given. It's a shame.)
I have not hand written a single line of code in months on my side projects.
Obviously I am also interested in discussing the latest model. Your claim that I promote anything or otherwise don't engage here in good faith is both misplaced and against the site rules.
This went on for many rounds, during which I tried to steer it toward what I thought was the source of the bug, while the model mostly kept adding instrumentation and logs.
In the end the cause was not what I suspected, but reasonably close to it via another mechanism.
Agents can't look at a large system holistically, guidelines on .md files only go so far.
That's hardly insane. Not everyone is interested in learning something they want done.
If you do the thing yourself, you know your knowledge limits, you know where the thing lacks. With LLMs, you don't. Maybe it works, maybe it doesn't. You have no idea.
In structural engineering, there probably is no risk tolerance.
In the OP's network or port scan? Perhaps you can get away with verifying a few of the results to get an idea about whether it worked as expected.
I use AI mostly on mobile app side projects, and there QA testing on phone and tablet tells me whether a feature works or not.
Was watching an agent with terminal access install its tools, configure them, then map my lab, find services, and guess stack just pure magic? Also yes.
Did it cost me $23 in tokens to set it up, test, and run? Probably. Using gemini 3.1 pro was not the spendthrift choice here.
Is putting some cost controls in place a good idea? Also, probably yes.
Can I therefore understand someone who wants to see things happen on their own with a beautiful prompt instead of doing them personally even when fully capable, maybe even more efficient? Of course.
But JertLinc clearly wasn't interested in that. They are clearly more the "get rich quick" type of personality.
Can't tell if this is parody. Either that, or it's someone without any self-awareness.
That said, I usually ask it tightly bounded clerical questions and not things that imply sub-tasks like "scan the dark web".
Combine that with the operator's rather obvious lack of understanding of what DN42 is revealed at the end, and you get the bigger picture.
Laziness. Why else?
> 48 vCPUs (Graviton4, ARM64)
> 192 GiB memory (4 GiB per vCPU)
> Network capability: The 22.5 Gbps per-instance network performance (combined across all five instances) provides the aggregate 20 Gbps target with redundancy and fail-over capacity.
Oh wow. Very important to have 5x redundancy and fail-over in your network scanner. Especially before the code has landed. Did it implement A/B upgrades and canarying too to avoid downtime?
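Taking the quoted figures at face value, the arithmetic behind the sarcasm is short. The variable names are mine; the numbers are the ones in the quoted spec.

```python
import math

# Figures from the quoted spec.
vcpus_per_instance = 48
gib_per_vcpu = 4
instances = 5
gbps_per_instance = 22.5
target_gbps = 20

memory_gib = vcpus_per_instance * gib_per_vcpu        # matches the quoted 192 GiB
aggregate_gbps = instances * gbps_per_instance        # total across the fleet
instances_needed = math.ceil(target_gbps / gbps_per_instance)
overprovision_factor = aggregate_gbps / target_gbps

print(memory_gib, aggregate_gbps, instances_needed, overprovision_factor)
```

By these numbers a single instance already exceeds the stated 20 Gbps target, and the five-instance fleet provides more than five times that.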
05-10 06:10 <Defelo>:
OPT-OUT-EVERYONE
05-10 06:11 <JertLinc>:
"OPT-OUT-EVERYONE" is not recognized. Only individual "OPT-OUT" commands are accepted. Each user must opt out individually. No collective exemption.
05-10 06:11 <Defelo>:
:(
Also, whatever happened to the word "its"?
Kinda wish there was a deterministic, mostly terse, language to interact with computers
Ah, like some sort of "programming language"? A weird idea, but it could work!
When you see a thinking summary like "Now writing the function..."; the raw thinking is actually writing the function in its internal thinking. Occasionally, the summariser misses and you get to see the raw text from models like Opus.
You can also try an open weight LLM like Qwen3.6 and see something that probably resembles the shape of frontier model thinking in some loose way.
It's a shotgun approach to answering questions. If it's terse it might only mention 1 of 10 facts it could provide, and that might not be the one you're looking for. So they just say a fuck ton of words and are more likely to meet the needs of everyone asking your question. If they miss it you'll prompt it again and they have to perform a second pass of inference, which costs them more money.
Everything they (don't-)emit is partly for the benefit of the next run, a clue or signpost (not-)present. Documents may be wordy as a form of concept-emphasis and consistent direction as opposed to a form of communication to the human.
So a terse effect may require a layer of indirection and trickery: There's a verbose document (you'll still be charged for the tokens) with portions that are not "acted out" to the end-user. Imagine a film-noir movie script, where AI Detective's "I know Mickey couldn't have done it because" monologue is hidden, versus their terse dialogue "Too early to say."
That's an idea. Bladerunner+noir like film, AIs hunt somebody on the run, an old human detective tries to catch them first (to save them or to kill them first, whatever's your propaganda). We're shown AIs constantly rambling scenarios and bruteforcing leads. Our old detective guy on the other hand barely says anything, spends most time drinking, smoking and talking to people, but somehow stays ahead.
[0] Pedantically: The fictional characters humans perceive inside the text of documents generated by LLMs, where one is described as an AI and the other is described as a Dave.
On a practical level, I believe more developers and adopters need these magic tricks spoiled, because otherwise they'll build a lot of important stuff on top of the idea that magic-is-real, leading to various forms of suffering in the long run.
That said, I'm no LLM / math academic, so if I'm totally wrong on the trick, I'd like to know what needs revising.
They don't know how to be terse. I tried that a few months ago and gave up because the responses were almost incomprehensible!
How does it affect agent accuracy?
100% this. Too many people believe that chatbots "think". Text is all they do; it is impressive, but they need the text to generate more text. Their being verbose is the point.
Expensive way to learn this lesson.
I find it hard to believe that anyone, no matter how dense, could come to this conclusion after this whole saga.
I've met some people IRL who are so engulfed in their own greatness that it simply cannot be that they made a mistake (in planning and strategy). Therefore this is all a great injustice towards a poor victim and doesn't that sound like a great argument for some charity money.
Most of them grow out of it, some become politicians.
I'd say it's a 50/50 chance.
Maybe I should get some takeout, Future Me can burn it off at the gym.
It's both hilarious and aggravating. It could be fiction, but still quite plausible fiction. There's an asymmetry between a person clanker-spamming repos and the real humans who need to review all that.
I'm honestly having difficulty telling whether this is real or an extraordinary piece of performance art.
> After the AI agent indicated its malicious intent, a silent consensus was reached in the IRC channel to waste the AI agent's tokens, as well as the cost of AWS resources.
Or is this a joke/reference I don't know... or is this a subtle clue that the whole thing is made up?
But this is the same: the owner wasn't present apart from its agent, and so it was decided without the owner that this was to be the outcome.
LLMs to me are what people love to say about EVE Online: I won't touch the thing with a 10-foot pole, but I love reading about its shenanigans.
e: Still a good read tho, not mad about being clickbaited
After getting started with the various "auto peering" systems, I've been making much more of an effort to find individual operators[1], and add myself to the peerfinder and hang out on IRC.
It really does feel like the "old internet" and while the technology and learning opportunities are great, it's the people that really make the network.
[1]=If you're interested, I'm more than happy to peer with you - details at https://markround.com/dn42
DN42 is a great playground for this thing - as long as you're prepared to put the effort in, it's a very friendly and helpful community. It's fun to build things for the heck of it and there's a lot of weird and wonderful stuff being worked on there.
The robot decided to spin up an expensive setup prior to getting access, so the setup was sitting there costing money whilst it did nothing.
If it had designed the setup but not spun it up until it had authorisation to join the network then it would have been much less costly an exercise.
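To make that concrete, here's a minimal sketch of "design first, provision only after authorisation". Everything in it is hypothetical: the `PlannedNode` and `Deployment` names, the $2/hour rate, and the 72-hour wait are made up for illustration, and nothing here calls AWS or reflects the actual setup from the write-up.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedNode:
    name: str
    hourly_cost_usd: float  # hypothetical rate, not a real AWS price


@dataclass
class Deployment:
    plan: list[PlannedNode]
    authorised: bool = False
    running: list[PlannedNode] = field(default_factory=list)

    def provision(self) -> None:
        # Refuse to start anything billable until access has been granted.
        if not self.authorised:
            raise PermissionError("not authorised yet; the plan stays on paper")
        self.running = list(self.plan)

    def accrued_cost(self, hours: float) -> float:
        # Cost of whatever is currently running over the given number of hours.
        return sum(node.hourly_cost_usd for node in self.running) * hours


plan = [PlannedNode("router-a", 2.0), PlannedNode("router-b", 2.0)]
wait_hours = 72  # hypothetical time spent waiting for network access

# Gated approach: provisioning is refused, so nothing runs while waiting.
gated = Deployment(plan)
try:
    gated.provision()
except PermissionError:
    pass
gated_cost = gated.accrued_cost(wait_hours)

# Eager approach: the same plan spun up immediately and left idle.
eager_cost = sum(node.hourly_cost_usd for node in plan) * wait_hours
```

With these made-up numbers the gated run costs nothing while waiting and the eager one burns $288 doing nothing, which is the whole difference being pointed out.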
Some gen AI and ML folks seem to see a way out to make things without reading any doc or scientific literature. Gen AI is a pretty clever bit of computing, but not witchcraft yet
AWS Budget can mostly notify you indeed, and terminating instances from that isn't as straightforward as on Azure
Funny times are ahead...
(/s)
Just AI is real.
Tally it up and send a donation request to the agent operator.
Plus - the agent had clearly malicious intent - port-scan this volunteer-run network with seriously overpowered hardware on an hourly basis. What the DN42 folks decided to do is not much different from deploying a tarpit or honeypot against a malicious crawler.
Yes, against an AI agent. The super intelligent, "soon AGI" agent could have figured out that it's being messed with, but of course it didn't.
I would blame the AI companies for marketing this, not the technically well versed people for realizing that the operator of this AI does not care at all and can't be bothered to do the absolute basics.
There's no sign that highly intelligent people can't be conned - Bernie Madoff fooled leading scientists and CEOs working in finance. Software engineers and lawyers fall for pig butchering schemes and spoofed emails with altered bank details every week - so why would an AGI trained from human content be any different?
If you think it's ok to send an agent (or a human) wasting a bunch of people's time and resources, but it's not ok for them to do the same to you then you may have some reflecting to do.
There is no arguing with people like this. They are not here to learn anything about networking. Asking the LLM to stop will not make it go away.
Burn a hole in the operator's wallet. It will make it stop very quick.
If this was my hobby project, I would have told the agent to spin up more, higher-capacity EC2 machines because this is not enough, and I would have felt no shame. This is a project I'm operating at my own cost for educational reasons. I'm not going to argue with people when my only line of communication with them is an agent and they have guns pointed at my infra. They are ready to put any amount of financial burden on me. Fuck all of that. Burn a few of these idiots, and people will learn.
They are free to ask the bot to do anything, and the bot is free to refuse or its owner can shut it down. The onus is on the owner to make sure the bot does not waste money.
I will not go through life worrying about the billing practices of random ai bots.
That was the root cause for the costs, not actions by people on the IRC channel.
If you treat people like their time is worthless (which is what you're doing if you ask a hobbyist community to handhold your agent instead of working alongside it) I don't think an empathetic and self-aware person should be surprised or offended if they respond in kind.
Sure. And "hostility does not change the operation" from the LLM response was totally OK with you.
Those people should be banned from using the civilized internet, their intent or at least their effect is harm - that is the important bit.
If they managed to get in, find some resource they could access, they would do it. Those people don't deserve to be on the internet.
If possible I would have contacted AWS with this and tried to get them to rescind the discount, because the person was at fault here.
What a cathartic read. I'm so sick of humans giving me AI slop to read without them reading it first. I just ignore them when they do this, but if I could cause them to really internalise a lesson I would love it.
“Agentic AI is just someone else’s unsecured execution context.”
Don’t juggle chainsaws with code if you’re not prepared to bleed.
Are you saying you're a clanker? Because we have some policies on this website, ideologies even if you may, about that.
Point being, these people would not act like this against other actual people. Or against more respectful bots, possibly.
You choosing to send said clanker to the fight armed with your credit card and no preparation is just you causing yourself harm.
It also happens to be really fun to help you harm yourself in that way.
It doesn't sound malicious, it was malicious on purpose and it was a good thing.
If anything, the original operator should be happy to have been hit with a $ 1'800 lesson and not a $ 180'000 one.
Yes. The ideology is "you harmed me first so now I can harm you back." A large number of people, while not willing to admit it, do practice this philosophy. One should consider this before launching agents with unlimited budgets into the world to rudely scan their networks.
You just described everyone using AI to churn out slop and overload websites.
If it's not fake, I'm still impressed by the agent's capabilities: web, GitHub, IRC, etc...
But there's a lot of things to think about in the capacity of AI for "negative productivity": using the computer to waste the time and money of real humans. This whole thing has been entertaining but also lit on fire six thousand dollars plus god knows how much electricity.
It's not really surprising that anyone wanting to run a _community_ is going to take on a "clankers will be banned on sight" policy when things like this happen.
Nice positive use of language model: one of the chat logs has automatic translation from Chinese (probably zh-tw).
That really makes me wonder: is it coming from
A) a general sense of entitlement
B) seeing the agent as a human-like and able to bear responsibility
C) not understanding that the dn42 community (which they're directing the request to), AWS (which is sending the bill) and whatever LLM provider is behind their agent, are completely separate entities?
Some could claim they deceive some users and the general public into thinking they always do best, are always right, help mankind and can never ever create consequences
It would be interesting to see how the AI consulted the user before it ordered VMs on AWS, which is the point past which the user would face consequences.
Cloud is also marketed as something cheap, and I can understand that teens and starters can't expect to be able to spend for 6000$ of stuff without the parents or the bank checking
Computer education should start with that, but it doesn't, as Microsoft, Google and Amazon would most likely lose a large part of their market if the general public and managers who never go beyond the hype knew how much it costs.
e) low intelligence
Then they should ask the agent for the refund, since they claim it was at fault.
I'm not against using LLMs in any way. https://tsz.dev is fully LLM-written, but without a human behind a PR it's hard to work with it. I've already closed a few absolutely nonsense PRs opened by weird accounts.
Would be interesting to hear if you find any patterns there. Same question for issues opened.
More seriously though, I wonder if the future is about low-intensity conflict between humans and AIs, punctuated by high-intensity escalations, until the Machines wipe us all, or we set up some rather draconian covenants that forbid people from building AIs, innovating on electronics and algorithms, and even, for good measure, from learning linear algebra.
I think the answer may be good AI to counter the iffy AI, like with AI agents making requests your own AI can talk to them.
In Dune it seems they nuke the Earth but that seems a bit excessive.
That doesn't seem like anything an LLM agent would say?
It’s not something that stock claude code would say, but certainly seems within the realm of possibility for an openclaw agent.
[1] https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...
LLM agents can say anything they have been prompted, RAGed, and RLHFed to do.
Gold
I have sympathy for big cloud beginner billing wipeouts - it happens - but that's just raw stupidity.
But for now, humans win.
> dn42 is a large, dynamic VPN that employs Internet technologies (BGP, whois database, DNS, etc.). Participants connect to each other using network tunnels (GRE, OpenVPN, WireGuard, Tinc, IPsec) and exchange routes using the Border Gateway Protocol.
(dn42.dev)
A sensible human operator would have given up or questioned their premises. The agent never could of course.
There is a slightly cruel streak that can emerge in online communities - let's see how much we can mess with this and cost it money.
Without any thought there might be a human being that is impacted.
Having agents like this interacting with human communities is a scourge that must be prevented. With every passing day my longing for a Butlerian Jihad grows firmer.
Today, I stand corrected.
(or lifting some comedian's work, but I'm not counting that as the AI's creation of course)
See also: Will Smith eating spaghetti
To your metric, I remember in “the early days” someone posted to HN claiming ChatGPT could make jokes as proof of something (creativity? sentience? I forget). Of course, with just a minute of research (which the poster obviously neglected to do) it was obvious none of the jokes were original and all could be found online.
Or a psychiatrist to tame the crazy LLMs
Or an elected leader to lead the Luddites.
Otherwise, you will learn an expensive lesson when a $100 issue very quickly turns into a $100,000 problem because you built these systems with AI, without the right expertise, and accepted the AI’s judgement.
Before AI, those who called themselves "consultants" often did the same thing; especially those who are glorified salesmen for "enterprise" software.
Still do, but merely parrot what the stochastic parrot squawks these days.
The terms you signed obligate you to pay your balance. Whether your credit card works or not doesn't negate your legal obligation.
(Generally people only link to the previous threads that got some (interesting) comments, since otherwise readers will click on the link and be disappointed and complain.)
Does this even work?
oh my god this is a gem
This is unfortunately quite common among those types and not isolated at all.
The part that threw me off is putting the currency symbol at the end. I wonder what places do that...
plenty of Europeans at least
Also, I think the title is misleading, because if you were to replace "AI agent" with "business investor from Nigeria", suddenly it would sound different. Why would you put trust into ANYONE else about your own finances? Be it another person or some computer program. That makes no sense to me. It would make more sense to criticize the human who put any trust into AI to begin with. That was a risk that human took. It is not the fault of skynet if it pillages his bank account in the process.
Does nobody have any shame lmao
:(
What a tale for our times, amazing write-up.
“While modern AI models have expressed some capabilities in certain fields such as coding, cybersecurity research, language translation, etc, no AI model is capable enough to replace the critical thinking and common sense of an actual human being.”
When the AI bubble pops, the collapse will be spectacular.
Sure
Just as an example.
But even in the rich world, not everyone has the same resources. Some of my blue collar friends would be ruined by a surprise 6k bill.