That's what they aim Claude Cowork at. Every executive/leader I've shown Claude Cowork to has gone from 'what is AI' to 'vibecoding whole apps' in weeks. Then when Claude is down for an hour, they get visibly angry and don't remember how to do anything pre-Claude :)
I understand the impulse to provide a UI to manage codebases, etc. But my observation is that these people just ask Claude to do whatever it is they need done. Codebase needs managing? They just ask Claude to do it. No idea how to deploy an app? They just ask Claude to do it.
Any app built on top of this stack to 'make it easier' is competing with 'I don't care what's happening, just ask Claude to do it'.
I've been guilty of this and gotten pushback from my manager: "this feels like homework, cut these options down to 100 words each, max".
Curation and refinement are even more important when you can have genAI generate reams of text.
Seeking outside signals is even more important, like talking to customers, looking at real usage data, and more. It's too easy to believe what Claude tells you, even if you say "please argue against this idea", which you always should.
The word "more" there is doing a lot of work.
What is the "more"? Is it:
- more documents and text or more understanding
- more code or more valuable features
- more things to throw against the wall or more considered experiments
It's way easier to do the first things instead of the second.
Adding is always easier than reducing to the essence.
I find it interesting how herding agents has so much in common with being a team lead. Constant struggle between too detailed and too loose instructions. One difference though is that the team learns from you, but with agents it's only you who adapts. Saying that, because I don't count instructions or anything in the context window as adapting.
Someone will compete and take your place. Now you're expected to produce more, or step aside for someone who will.
Who decides value? I'm sure you can figure that out.
It matches the pattern of LLMs being very good at simulating the form of work output, which is an issue with code but it seems quite exacerbated with anything non-verifiable, like written communication.
I'm using Claude to write large files too, but it's a very iterative process and involves a lot of reading and correcting.
to be fair, i've been guilty of this with code. Ask claude to generate a python script that takes X as input and produces Y as output, run it, pipe to more, output looks ok but i don't check everything, write it to a file, send it on.
The drug is scary when everyone is depending on it. I wonder what the future will be like.
I do agree quality will be missed, and shadow IT will be again a big issue like at the end of the 80s and early 90s.
I don't think so. Back then, the pool of people doing such a thing basically self-selected for intelligent, motivated types who were capable of learning on their own. The new "programmers" "programming" via Claude Code are going to be very different from those hobbyists you're talking about.
Why are people making things with Claude Code if not because they’re motivated?
But I think the same applies to not just AI but various tools that have abstracted away the complexity of things over the years.
For example, I would imagine the average person deploying some sort of web app or API today knows far less about networking and infrastructure than someone doing it 10 to 20 years ago.
Compare that to say 30ish years ago. If you wanted to do something as simple as play a computer game you had to know how to navigate a command line, know about device drivers, make a boot disk, etc. Users were a whole lot closer to the realities of what makes computing work. And no internet, at least as we know it now. You really had to have a certain mindset to be a developer.
It's a far cry from "hey Claude make an app."
Once they hit a wall, that is where you find out whether they are motivated or not
Yep. That has to happen first.
Planes falling out of the sky, trains crashing into each other, pacemakers downloading updates and freezing
didn’t we see this with crowdstrike
Then regulators will take things seriously.
Then why did I just spend the last few days releasing a slew of documents for a handful of trivial changes to an ancient medical device?
Someone should have told me that I could just ignore regulations!
gotta get a PE to stamp your LLM-generated code in sensitive environments.
I’ve heard the same from the best devs, and some who thought themselves to be the best, I’ve known long before LLMs were ever a thing.
I’m sure others heard the same when JavaScript and Python became near ubiquitous. When PHP emerged. When C supplanted Fortran and COBOL. When these two took over from Assembly. When punch cards went the way of the dodo.
There’s always someone for whom shitty is becoming the new normal. If that makes it a rule, what do we make of that rule?
Also we went from compilers with an IDE that had a debugger, profiler, built-in help and would fit on a 3.5" disk and would load on machines with 640KiB RAM (Turbo Pascal) to chat apps or password managers that are hundreds of megabytes and regularly gobble up more than a gigabyte of memory because they ship with their own browser.
Something is lost along the way.
Definitely.
My point is more that it seems to have been the way of the world for the past few decades. I’m arguing it’s nothing new, basically a trope at this point.
And once that is said, if we care about it, what do we do about it? Besides just repeating it.
You heard right! Most JavaScript and PHP in the world _is_ profoundly shitty. It's taken 20 years of intense research to make JavaScript compilers that are almost good enough to mostly optimize away the design foibles of the language.
Progress!
(I’m half kidding)
Coding per se is not hard. Proper engineering is. I do hope this change brings a change in focus (people train in algorithms, efficiency, solid development patterns) but I am afraid it won’t be the case.
And for those who might care about these things, they’ll probably just be facing constant pressure to deliver more, faster, perhaps with less.
One of the issues with literacy in algorithms, data modeling, efficiency, development patterns, systems designs… is that people either aren’t aware of them (and in the best case scenario reinvent the wheel), don’t care about them, or feel they aren’t given the time to invest in learning them (or worse, might be penalized for it).
Enlighten us.
Sounds like job prospects to me.
Probably "don't do anything to upset AI companies or you will effectively become a handicapped person"
Not that different from life in China: "don't do anything to upset Tencent and AliPay or you will become an outcast"
Or life in the US if you're a content creator: "don't do anything to upset Meta or Youtube or you will not be able to pay your rent"
The future: ToS basically becomes law, and you will be stripped of your own second brain if you violate it or say anything they deem "sensitive"
Like Slack or GitHub or AWS or whatever. It’s almost always a net positive to wait vs do it yourself.
What could possibly go wrong.
Also, would bet money that the derived data from the meeting-summarizers is being sold to hedge-funds, to give them a bit of an edge.
And if it isn't already, you can bet that they're probably about to start.
All those "difficult to program but easy-if-time-consuming-for-human" tasks, will 1000% be farmed out to models at unprecedented scales.
The incentives reward this kind of behavior. I wonder then how to operate in a world that is low on moral values and ethics - does it mean I have to do so to have a fair shot? I'd like to think not.
However, the temptation of productivity gains is strong, and few of the customers look into relaxing these rules.
When the electricity goes out, (most) people get similarly upset. No electricity means no internet, and all of a sudden everything that people had planned to do can’t be done until the power returns.
I can't wait for a Hollywood blockbuster that'll pretty much be science non-fiction.
Yesterday one asked another "how much of this deck did Claude do"? and the response was "50%". "What 50% did you do?" => "I chose the font and colors".
* ransomware attack, fire in the server room, database HDD crash, car accident takes out the internet connection, ...
Do you, and those executives, own the risks associated with that practice? Are those risks actually indemnified?
It's neat that 'anyone can do anything' but if they don't actually know what the risk to the business or 3rd parties is, why is this a good thing, especially in the enterprise where there are actors who are explicitly looking for this type of environment to exploit?
I've been working in tech since the late 90s. This is the biggest and most sudden change in company behavior I've ever seen. The only thing that comes close was the web 1.0 world in the 90s where everything suddenly became websites.
That creates tons of risks and opportunities. Good and bad. Maybe a great time to start a security company. But maybe a terrible time to be a small time web app developer when your clients can get 'good enough' in minutes for dollars on their own.
Your comments read like reddit clickbait. How many of these executives/senior/coffee bean/whatever ppl do you even know, and why are you the one enlightening them with claude cowork? "Every X i know" sounds like a large sample size. Make ridiculous claims by prefixing "every X i know".
I feel so angry at this linkedin speak. so infuriating. Hate that we've accepted these ppl without any pushback.
I say this as someone who deals with sales/CRO/CFO functions quite regularly, I have to tell everyone that uploading contracts to Claude and/or ChatGPT does not hold confidentiality because files are not covered under enterprise ZDRs. [0] [1]
It comes down to 'everyone else is doing it' without an understanding of why, then past that, the what of how that applies to the specific business to find the unique value of AI to an organization that does not touch external networks.
Please give your GC the links below, let them look over your contracts and obligations to ensure you aren't exposing risk for no real reason other than saving a couple seconds for something that a SDR/BDR level employee could do.
[0] https://code.claude.com/docs/en/zero-data-retention#what-zdr...
[1] https://developers.openai.com/api/docs/guides/your-data#zero...
It’s an interesting time.
where do you see this going/any interesting theories?
If it's so obvious that everyone is doing it then you don't need "every executive i know takes a shit".
every interaction is now laced with ulterior motives like op trying to pitch himself as ai expert to sell his courses or whatever. He is apparently going around blowing executives minds with claude cowork. so ridiculous.
With all due respect, I have no idea what you are talking about. I'm saying that I've observed friends and associates (who are executives, because I'm old and work in business) pick up and adopt a specific tool at rates faster than any other tool I can think of, which seems interesting to me. Do with that information whatever you want. It's just an anecdote from a random person on the internet. I'm sorry that this observation makes you angry.
I'm not selling or pitching you (or anyone else) anything. I haven't taught any programming courses since the 2010s (pre-ChatGPT).
> Every executive/leader I've shown Claude Cowork
now you are saying you were merely "observing" ?
Wait, you exposed people to a technology, taught them how to use it, then you are not going to own the implications of that action without teaching them about the risks or telling them how they need to ensure they don't shoot themselves in the face or violate their duty of care?
Do you understand what you are saying and the implications of that in the real world relative to the insurance contracts that they have?
Your company is associated with HIPAA, you should have a much higher standard than this.
For big corps - this is different. But modulo hipaa - this is why they are gung ho about binding arbitration - they are trying to match velocity to some degree - and mostly failing…
From what I have seen - most executives would rather shut down the business and quit than accept the possibility of personal liability - and just avoid the regions of the world in which they do have it.
I think this is where we have the issue in my tone and approach to my comments. My response was based off of the OP stating that the people they were introducing it to were 'executives/leaders' and not 'friends', which has a very different connotation when it comes to information security, liability, responsibility, accountability, and ownership. It was only in their response to my question about risk ownership that they described the persons as friends.
If they had said 'friends' from the very beginning, instead of 'executive/leader', I would not have had the reaction that I did. The reason why I brought up HIPAA was because of 'executive/leader', since the idea of duty of care extends to leadership within any organization, especially those who are involved with healthcare, which they know based off of their company.
>"I’m a CFO and network regularly with other executives, board members who also are board members at other companies, investors, people who see a combined large population of companies"
The call to HIPAA wasn't about PII, it was about how knowledge of standards and regulations such as HIPAA, when it comes to application/information/network security, is just baked in. Which is why the passivity around the statement made no sense given the risks/obligations/liability associated with vibe coding applications at the executive level, which someone whose company deals with HIPAA should understand and appreciate.
Never have I said, and please quote me word-for-word otherwise, that what I said applied to "every executive/leader at my place of business who does nothing except work with PII data all day"; that is a windmill you created yourself.
You can keep tilting at the windmill.
[0] https://news.ycombinator.com/threads?id=Ucalegon#48133230
But I appreciate you trying to police the expression of my deeply held beliefs, but, like, nope!
If you care about data privacy, especially your own protected health information, that sentence should give you a lot of comfort.
In a HIPAA environment, people who are sufficiently trained on how to develop regulated software securely are called "software engineers".
In my opinion, agents will replace the majority of the rest of businesses before they are good enough at agentic engineering to be able to autonomously develop software that safely and reliably can manage PHI without a single mistake.
It goes without saying: never trust your PHI to any company who is vibe coding in production.
'Adding value' is a very interesting statement and way to judge the worth of something. Adding value to whom? And if that value add also causes massive harms, how do we reconcile that? So you build a brand new app which does all of the things that all of your total addressable market wants, but it also exposes all of the IP of your existing clients, does that mean you will be able to achieve that TAM?
Corp IT does not exist in a vacuum. Understanding the why of that isn't a 'you should just accept this' but more 'how can we make this better and avoid mistakes already made by others'. I will always point to aviation and 'bold text is written in blood' as a great model to understand all of this not as a blocker but, instead, as a building block.
In general, safe businesses can only exist with government support or government prohibition of all other businesses globally - and that is a very hard bar to clear.
In a properly structured organization, of which there are many and which are required to be so by regulations and/or best practices, senior executives tend to have need/role-based access to information, just like everyone else in the organization. So they may have access to strategic business information, but not patient records or payroll. They may have access to planning data, but not the financial records of individuals or clients. Etc. etc.
Smaller or newer orgs may not have this compartmentalization, but in general I think the principle holds true for orgs over a certain number of folks in size.
Generally, when it comes to 'privileged' information within an executive's inbox it is business information or trust relationships and not specific PII/PHI of a user. It was me being terrible at trying to impart that even the most benign-seeming access may have major consequences even if it is not a total compromise of everything, given the massive scope of 'what could happen' with executives vibe coding applications, like something managing their inbox past their EA, or something trivial-seeming.
These are 'proper' (sometimes) access controls, but can still be abused. Not from email...but you get the idea.
Compliance is due to the legal obligations thanks to local regulations and obligations that are defined through contracts with 3rd parties.
Saying 'found the Microsoft person' expresses a lack of understanding of the domain.
This is how IT acts in my enterprise orgs. There is absolutely a need for compliance and governance but unfortunately the people in these roles are typically not technically minded and have low incentives to innovate so you get these folks only really arguing for their jobs.
Do you think the MSFT sales person, or anyone who has the financial incentive to innovate, doesn't want you to innovate? They want you on Azure and O365 regardless, they don't care.
Hell, Microsoft will give you 150k [0] of credits to do so.
But keep talking as if you have some magical, unique, special insight that escapes contracts and the law, compared to the people who, sadly, have to deal with reality.
Risk is always nonzero but you can already today get pretty comfortable with most of these orgs with some customization in the contracts.
We are talking about vibe coded applications by executives and the risks that are associated with that, nothing within a DPA covers that. Please, be my guest, link an Anthropic DPA which includes indemnity for damages associated with the code produced.
Again, you keep showing your lack of understanding of the domain in some really fundamental ways, which shows that you haven't negotiated B2B contracts nor have you held a position of responsibility where you hold liability.
But keep responding because this feels more like therapy for you, and your feelings about people like me, rather than the realities of the exposure that come from vibe coded applications for executives.
Each entity and group has to consider the risks. I don’t think anything you’re trying to point at though is really useful for the discussion at hand. There is absolutely a use case for Claude code/cowork/codex and related tools to be used by non-technical folks. There is also a lot of figuring out in each of these groups. Unfortunately IT in most orgs, from what I have seen, has ignored the art of what’s possible for the last 3 years and, now that we have hit this inflection point, is scrambling to catch up. But sadly the incentives are usually not aligned, so they are really only incentivized to not take any risks.
You went further than "a joke."
You continued making aggressive, non-substantive remarks that were out of line.[0]
#1 > you have no idea about the details.
#2 > i don’t think you have a grasp what’s going on around you.
#3 > What is your deal about contract law? It’s not some mystical thing.
You wasted everyone's time.
There are significant reasons why an organization would not want to use Cowork, because it does not fall under Anthropic's ZDR [0], which is a huge issue for... anyone dealing with anything sensitive.
What I think this comes down to is that you value velocity regardless of whatever the costs. We will get to see how that solves itself, there are going to be a lot of billable hours that are going to figure that out.
But none of this means that you have any idea what you are talking about nor do you understand why individuals or organizations act the way that they do.
You are free to do it better. Please do.
[0] https://code.claude.com/docs/en/zero-data-retention#what-zdr...
I am sorry you feel this way, it does not change the facts of what's being discussed, it's just that you disagree and you lacked the initial courage or intellectual capabilities to express that constructively, so you had to obfuscate through providing nothing of value to the discussion via low value comments. I get that YOU don't think something, but just because YOU feel something doesn't make it valid, grounded in reason, or something that should be listened to.
Have a great rest of your day and weekend!
But you are totally free to build a company where there is no oppressive corporate IT, where there is always an incentive to innovate and grow, you can build that future.
The reason why that will not happen might be contained within the first ten words of the first sentence of my first paragraph, but you can prove me wrong. Let me be your motivation! Your dream should be your reality!
Not sure why you keep thinking I have anything to prove to you. My point stands. Governance and risk are a very valuable discussion and it’s going to change between industries and the trust level of each group.
Unfortunately most IT is short sighted and trying to play catchup. We had 3 years of thinking about how these tools are going to impact the workplace and are now rushing to catch up while also being insistent that Copilot is a worthwhile alternative. I generally disagree with that. I am not advocating that IT is oppressive but that unfortunately most IT leaders are not technical and it shows.
My point has been consistent. You jumped to specific conclusions from a 30-second post that adds little to the parent discussion.
IMHO,
1. Dismissing attorney client privilege is reckless
2. and the vast majority of users aren't aware of what "customization in the contracts" is needed to enable autonomous agents or if it's already contractually allowed.
This is still a fair question:
> Do you, and those executives, own the risks associated with that practice? Are those risks actually indemnified?
Reading the first part, I was going to say they don’t even care about whether or not there’s a codebase. It doesn’t matter; it could be all gremlins and hamsters in wheels for all they care, and for all they should care. All that matters is the functionality, the value it gives them.
We’re even getting disposable code now. Entire single-use ephemeral web apps, built on the go to enable, visualise, or simplify a specific thing, then thrown away.
Will it all lead to some trouble? Definitely. So did computers, and so did the internet.
Weird times. Fun times.
I would get called in to rewrite it, using a proper database, documented rules and ensure it stayed scalable - and everyone would be happy.
These Access "apps" were abominations from a technical point of view - but they got the job done without having to spend a load of money on off-the-shelf or bespoke software. And the "tech guy" made a valuable contribution to the company. It's only at a certain point that Access started to struggle.
I foresee the exact same thing happening in the near future - except we won't be building the replacement apps ourselves - we'll just know how to give the coding agents well-specified prompts and tell them when they're making a mistake.
What is different on this one vs the others is I have Claude to help me data dive and write the boring CRUD parts. I am able to spend so much more time with users testing and getting feedback and just thinking deeply about how to structure things. The quality of what I’m building now has never been higher and I think it’s just because I have more time to spend with it.
My experience with AI has been almost wholly positive and I wonder if Rails is part of the reason. With such well-established patterns and structure, the agent one-shots most things and I spend most of my time wrangling view code based on my preferences.
I think what a lot of us are concerned about is that the vibe-coded stuff bloats fast. It's so verbose and all over the place that picking that thing apart will be a huge job, and relying on an AI to pick apart work that an AI already failed to maintain seems like wishful thinking.
It's literally "The AI is failing! Don't worry I'll just use AI to fix the AI!".
And even if they somehow weren’t, we’d just do what we used to do with documents to turn them into "chewable bites": chunking, extracting, summarising,…
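For what it's worth, the "chunking" step mentioned above can be sketched in a few lines. This is only an illustration under my own assumptions: the function name `chunk_text`, the character-based sizing, and the overlap parameter are hypothetical choices, not anything a specific tool is known to do.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks of at most max_chars characters.

    Hypothetical helper for illustration: real pipelines often split on
    tokens, sentences, or file boundaries instead of raw characters.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")

    chunks = []
    step = max_chars - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window already reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", max_chars=4, overlap=1):
        print(piece)
```

The overlap is there so that something cut at a chunk boundary still appears whole in at least one chunk; each chunk could then be extracted from or summarised separately.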
What I needed to do was sit with a user (not a manager/the person buying my services) and ask them to show me the different things they did with the software. Then I could write a spec for the actual _feature_ and would only need to look at the existing codebase if they needed data transferring across[1]. I don't see why our new LLM-based future would be any different
[1] Of course this meant I would leave out edge-cases and/or weird quirks of the system - often this was actually a bonus as they were either no longer relevant or worked that way because that was the only way they knew how to do it
It's not a good experience, esp. the "debugger" and its traits - but a good tool that just does its job :-)
Isn't it the uber model? Isn't that likely where the future is to go with this new uncertain technology that will surely create new unthought of verticals?
To put it another way, the customers of these frontier models are implicitly being competed against by the model itself.
Withdrawal symptoms. We've all been there.
We are now measuring productivity in lines-of-code. This is not going to end well, not to mention introduce massive amounts of burnout.
That would be a capable 'personal assistant', or 'executive assistant', or 'chief of staff'.
Why? Because the point is, just like in real life, to abstract away the complexity, irrespective of domain.
"Average user" implies someone not skilled or savvy in the domain you're thinking of. For a medical doctor, the 'average user' is not-a-doctor. For a technologist, the average user is not-a-technologist. For an insurance specialist, an average user is not-an-insurance-specialist. Etc. etc.
The personal assistant, exec assistant or chief of staff are themselves not necessarily experts in any domain, but they do rely on specialists to get stuff done.
So the UI for this killer app is basically voice input, keyboard input, camera input (mirrors of human output) in the user's language with natural language interaction, and the output is voice and monitor/screen, and possibly a robotic arm/hand/body (mirrors of human input). Anything more complex than that would require tailoring it to a domain/domains.
If you doubt this analysis, think of all those folks for whom the IE/Chrome icon was/is "The Internet". Sure, you can go one level deeper with having them put in URLs, or operate email through the aol/gmail bookmark or desktop icon, maybe open documents/files from 'My Documents', but are they going to go any deeper than that, for the 'average user'?
You mean UX? Isn't Claude Cowork supposed to be 'Claude but for normies'? As for Claude Code / OpenAI Codex for non-programmers, believe Replit, Loveable, & others are trying & succeeding.
WhatsApp comes to mind in how its sole focus on replacing SMS (rather than Skype/AOL/MSN Messenger/YChat/GChat) meant it had no (user-facing) password/username, no elaborate signup, no login, no chat/friend requests, no sync etc. & became the biggest social network right under the nose of well resourced competitors with worldwide distribution, like Google & Facebook.
Probably phone operators were not impacted either: SMSes bundled with flat plans are still flat plans, and Europe-style unlimited calls + 100 SMS per month plans are still there, and those SMSes are still mostly unused.
So we could have a killer app and yet nothing changes in the flow of money around it.
UX wise, WhatsApp is a big improvement over SMS. Vocal messages, I'm not a fan of them. A waste of my time.
Mobile network operators lost the profits (at prices that were pretty much pure margin) they had on pay as you go messages, and messages not included in flat plans (e.g. overseas SMS's). They also lost a huge amount on highly profitable overseas calls. Those of us with family in other countries save a lot of money by using Whatsapp and similar instead of phone calls.
Net neutrality was triggered by their attempts to block VOIP and messenger apps.
I knew one telco who made €3Bn clear profit a year from 2 Dell servers and a team of five to keep SMS messages flowing. Their billing infrastructure was bigger, much bigger than the SMS servers.
It'll just be power users. We're moving toward a world of significantly fewer analysts and more into "Super SMEs" that can actually learn tools like Claude and manage enormous complexity with them.
Just giving average users these tools will produce garbage. This example from Claude is so contrived and any business analyst can see how a process that requires uploading additional data will fail. You can't expect users that don't even know their own data to be able to make this thing work.
There will be no "average" user in the future. It'll be multi-disciplinary SMEs that are extremely creative and knowledgeable about their businesses.
I think you’re underestimating “average users”. If we talk about the median, then probably you’re right, but if we talk about “the group of people clustered around the average” I think there’s a lot of untapped potential, especially in people who assumed data and programming were unknowable/impossible and have therefore been held back by “good” tools like excel
We're obviously going to be holding ourselves back in terms of scale and in terms of not being a "true" SaaS with this approach, but my thesis is that we get much higher quality results and higher compliance/activation and can charge more for the bespoke model backed by our own platform.
The power of Excel is not what it was. Nor is the power of ordinary thought.
These narrow integrations with specific software suites seem like a dead end.
If you look closely, people were already creating databases and doing computation. But on paper. Spreadsheet software moved the medium to digital and with that brought a lot of convenience. Same with email, instant chat, and shopping on the web. The killer app is not about bringing something new, but making an old problem easy to solve.
The issue with LLMs is that they make errors. Uncontrollably. And even if you can spot the obvious ones, there are always some you won’t be able to catch unless you’re a subject expert. I’ve never seen random people willing to monitor a piece of tech.
If they can build an integrated AI assistant (what Siri should be) that can spin up and call agents it will be big (or it will flop but my money is on big if it’s the easiest way to use agents in your daily life)
I haven't tried it, or know a lot about it, but isn't this the whole claw thing?
ChatGPT/Claude's web ui is much more like something for average user, tbh.
I really thought Airtable would take off because it was even more of a "database that a normal person could actually use".
Learning how to type commands and use a terminal is not something people cannot already learn right now. And that was the way before.
I think the real killer app is making marketing and other non-development (non-analytical) work better. In the case of marketing, we have tried many AI tools, and so far they mostly make campaigns more generic, less exciting, and often worse. They help a little but you need to be careful that they do not make it worse.
This is probably fine as long as the code is acting on local resources. The moment you have vibe coded software interacting with shared state or database the risk increases exponentially and all it takes to have a bad day is a poorly worded prompt from one of those users.
Some oversight by humans or automated guardrails will probably reduce those instances.
/s
A Figma-like dashboard for turning Claude Code, Gemini CLI, and Codex into an OpenClaw but with security measures to break the lethal trifecta while running on a VM.
But it's not quite there in terms of usability. I agree that is the hardest part of the equation. It's something I'm constantly experimenting with and haven't found the solution to it yet. Open to feedback!
excel isnt used because it's a database, it is because you can do things in it in relatively unstructured ways and reference things youve already done with a click. the future of databasing is bringing more spreadsheet UI to the database, not bringing more users away from spreadsheets. with AI i agree there could be some sort of UI that could pop off that leverages it well, but im not sure its going to be to bring users closer to coding. I think it is going to look more like a project management tool than anything else. i mean shit, it might even just be an excel add-on because excel is still where the data is
It's targeted for creatives atm. For the few in private testing, it's been amazing what they're able to do with the little tooling I've given them. It is a legitimate change in their daily drive.
I don't know anyone not building a product in that space
I have a vision for what will be the next household ChatGPT:
1. An actually frictionless way of keeping the human in the loop. My product is primarily targeting that: Your tools should feel like an extension of you, not replacing you.
2. Juggling work. I feel like what I'm making here is the secret sauce, so keeping a hush on it :)
3. Keeping all your work in one place. Drawing, sketching, developing, emailing, planning, writing; there is no reason to depend on other apps if you have one place that does it all, and it's the best offering among them.
Edit with some follow up thoughts -- I think what I'm trying to make is best summarized as claude code for non-developers (that's what I put in my YC application), but I think what I'm trying to make doesn't quite even have a developer equivalent.
There's not an environment you can go into right now and say "after this builds every single time, deploy to this machine" and it actually seamlessly does that. The tech is there but making it a whole Factorio-esque operation is still very manual -- and that's what I'm solving.
Good for your feelings, but I feel the same for my work ..
The main problem is still that agents are not reliable, and what normal (and dev) people really want is for them to be reliable. Or, well, tools to manage unreliable agents in a clearer way.
(It is a big market I think)
Or, miscommunication.
You don't normally get raked over the coals for having an idea you're still carving out -- an idea that's validated in production, even.
I'm currently doing something like this in the internal model-independent LLM chat app I work on at a F100, specifically targeted at our everyday users. <input type="file" webkitdirectory> lets the user give the model read and write access to a local folder (and OPFS lets us reuse the same fs tools we give the model for files manually attached to the chat, or for files that tools want to create if the user hasn't granted folder access).
Every time we used to release a new version it was "still can't handle the 6MB Excel file I drop into it" when that was being extracted to CSV and added to context - now it can poke about in the big Excel file directly with SheetJS to pull the sheets/headers and inspect the shape of the data, and use locally sandboxed code execution to write code against either extracted data or the spreadsheet itself via SheetJS for pivot tables and such (all locally - none of which need go into the context).
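To make the "inspect the shape of the data" step a bit more concrete, here is a minimal Python sketch. It is not the app's code: the setup described above runs in the browser with SheetJS, whereas this assumes the sheet has already been read into a list of rows. The function names `infer_type` and `summarize_sheet` and the output fields are hypothetical. The point is only that a small summary like this can go to the model instead of the whole file.

```python
from collections import Counter


def infer_type(value):
    """Classify a single cell value into a coarse type label."""
    if value is None or value == "":
        return "empty"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    try:
        float(value)  # numeric strings such as "5" count as numbers
        return "number"
    except (TypeError, ValueError):
        return "text"


def summarize_sheet(rows, sample_size=100):
    """Return headers, row count, and the dominant type per column.

    `rows` is a list of lists whose first row holds the headers
    (a hypothetical input shape, chosen for illustration).
    """
    if not rows:
        return {"headers": [], "row_count": 0, "column_types": {}}
    headers = [str(h) for h in rows[0]]
    data = rows[1:]
    column_types = {}
    for index, header in enumerate(headers):
        counts = Counter(
            infer_type(row[index] if index < len(row) else None)
            for row in data[:sample_size]
        )
        counts.pop("empty", None)  # ignore blanks when picking a type
        column_types[header] = counts.most_common(1)[0][0] if counts else "empty"
    return {
        "headers": headers,
        "row_count": len(data),
        "column_types": column_types,
    }


# Made-up example rows, not real data.
rows = [
    ["user_id", "email", "seats"],
    [1, "a@example.com", "5"],
    [2, "b@example.com", 12],
    [3, "", None],
]
summary = summarize_sheet(rows)
```

Only the first `sample_size` rows are used for type inference, so the cost stays flat even on a very large sheet, while `row_count` still reflects the full data.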
The base models are good enough at tool calling (I really mean Claude, though, the GPTs just go on a tear calling tools with no context for the user) that they're already decent at automating stuff for the user without a dedicated harness (our default system prompt is still "You are a helpful AI assistant", lol). Add tools for Graph API stuff, and now it can pull the nightly batch file from a support inbox, unzip the spreadsheet within, diff it against yesterday's and generate an import file for new users and draft an email to welcome them, something that used to be a daily support task (which I'd already automated most of - but now you don't need a dev for this kind of thing). Or go find the big 450,000+ row spreadsheet that's being automated somewhere on SharePoint, pull it down in 150,000 row chunks (Graph Excel REST API limit) and write code to go figure out whatever the user is asking.
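For a sense of what the model ends up writing in those two cases, here is a rough Python sketch of the chunking and the day-over-day diff. Everything here is illustrative: `chunk_ranges` and `new_users` are hypothetical helper names, the `email` key column is an assumption, and the 150,000 figure is simply the limit quoted above. No Graph API calls are shown, just the arithmetic and the set difference.

```python
def chunk_ranges(total_rows, chunk_size=150_000):
    """Yield (first_row, last_row) pairs, 1-indexed and inclusive."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(1, total_rows + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_rows)


def new_users(yesterday, today, key="email"):
    """Return rows from `today` whose key is absent from `yesterday`.

    Rows are dicts; keys are compared case-insensitively with
    surrounding whitespace stripped. Input order is preserved.
    """
    seen = {row[key].strip().lower() for row in yesterday}
    fresh = []
    for row in today:
        normalized = row[key].strip().lower()
        if normalized not in seen:
            seen.add(normalized)  # also drops duplicates within today's file
            fresh.append(row)
    return fresh


# A 450,000 row sheet splits into three requests.
ranges = list(chunk_ranges(450_000))
# ranges == [(1, 150000), (150001, 300000), (300001, 450000)]

# Made-up example rows, not real data.
yesterday = [{"email": "a@example.com"}, {"email": "b@example.com"}]
today = [
    {"email": "A@example.com "},
    {"email": "c@example.com"},
    {"email": "c@example.com"},
]
to_import = new_users(yesterday, today)
# to_import == [{"email": "c@example.com"}]
```

Normalizing the key before comparing is a design choice, not something the original task necessarily did; without it, a change in casing or a stray trailing space in the nightly file would show up as a "new" user.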
Having implemented and used it, I like this setup so much it kinda ruined Claude.ai and ChatGPT.com for me, so I've hooked up similar access for them using a browser extension to add the folder picker input, with the extension talking to a local server to tell it which folder to give access to, and Claude/ChatGPT talking to the same server over MCP via a CloudFlare Tunnel to work with the selected folder.
Think the movie Her (2013). OS1, it's called.
You should check out Cursor 3 :)
Isn’t that literally Claude’s web UI?
Super early stage but I am really happy to read your comment.
Claude can write code pretty well, but there are just a few tasks that I need to do to orchestrate everything. If it could do those tasks well even some of the time it would be about 10x more useful.
It's called Zenning AI - we're a small team in London, testing it with a few companies at the moment!
Honestly though we are finding that a little FDE to set up pre-bake stuff that’s sufficiently specific to the customer is needed. Otherwise people are like, “I don’t need to close the books, I need to do a per-working-day profitability analysis for 10 EU countries with different public holidays”, and they get stuck there.