I think you meant a quote attributed to Bill Gates:

"Well, Steve, I think there's more than one way of looking at it. I think it's more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it."

reply
Yes, I think the Gates quote was a response to repeated and aggressive complaints originating from Jobs (to anyone who would listen) that he had been ripped off.
reply
I don't know if that's a real quote from Gates, but I do know it was in Pirates of Silicon Valley.
reply
Neat, so the scene in the movie was pretty close to reality then!
reply
I thought Xerox demoed something they hadn't implemented yet, and Apple turned a mockup into a real GUI.
reply
They implemented it. They just couldn't get out of their own way to successfully sell it (see "Dealers of Lightning"[0]).

0 - https://www.amazon.com/Dealers-Lightning-Xerox-PARC-Computer...

reply
No, that is not true. Read up sometime on PARC and all the crazy tech they built. It was ahead of its time!
reply
During the lunch break, they changed the live system from line-by-line scrolling to pixel scrolling after Jobs asked why they hadn't done it that way.
reply
Not just the whole internet, but also commit commercial copyright infringement and settle a class action out of court with the authors whose books you pirated.

https://www.authorsalliance.org/2025/09/07/the-anthropic-set...

"One rule for thee, a different rule for me." - Dario

reply
Yeah, the whole AI industry is just people ripping each other off. It started with AI companies gulping up all the information that technical or altruistic people had shared on the Internet over the past 40 years to help their fellow humans, then moved to AI companies consuming pirated and copyrighted material, and now it's AI companies ripping each other off.

Information really does want to become free, but AI companies want to be gatekeepers. Long term, I bet on open weights to win as the more sustainable approach.

reply
I'm very pro-distillation. I think there need to be distillation nonprofits that curate massive corpora of super-high-value training data from frontier models. They could have an "anonymous contribution" system where regular people with max subscriptions upload their conversation histories. It's a rough concept, but it would surely be a huge boon to humanity.
reply
Sort of sounds like "project tapestry" by Yann LeCun. Build projected data silos of highly valuable information, train in a distributed manner, and share the weights upwards, where they're combined and fine-tuned.
reply
Apple gave Xerox the right to buy $1 million of pre-IPO stock before the meeting took place.
reply
Glad you pointed this out. I believe the sequence was that Jobs himself got a shorter demo during his first visit, with no prior arrangements. He then negotiated bringing back a group of his key people for a more in-depth demo, and that negotiation included the stock deal.

When Apple was accused of 'ripping off' PARC, Steve didn't seem keen to bring up this rather salient point. I suspect it may have been a combination of wanting Apple to continue receiving credit for these innovations from consumers and also the fact that, in retrospect, the million dollar stock deal could seem a bit like trading beads to Native Americans for Manhattan Island. Another point worth noting is that Apple's PARC visit was in December 1979 and the Xerox Star was publicly announced in April 1981, so Apple got a 15-month head start (the Apple Lisa shipped in January 1983).

I've also heard that Xerox didn't hold on to the Apple stock for very long, so never gained the windfall they could have. As is well documented, Xerox senior management didn't understand what they had in PARC and also didn't understand how rapidly microcomputers would become ubiquitous. So, of course, they didn't think Apple's stock price would skyrocket either.

reply
Lisa and early MacOS are tremendously different in their details from the Alto operating system. While there was clearly a transfer of inspiration, Apple engineers like Bill Atkinson made countless small and large innovations to simplify the Xerox GUI model and improve its usability, based on extensive in-house R&D and user testing (and in some cases implemented features that the Apple team presumed Xerox had but that didn't actually exist on the Alto). It is simply ahistorical to build narratives around Apple stealing Xerox ideas wholesale.

For more details on Apple's early UI evolution, Atkinson kept polaroids of a variety of prototypes and mockups: https://www.youtube.com/watch?v=Qg0mHFcB510

reply
> the million dollar stock deal could seem a bit like trading beads to Native Americans for Manhattan Island

But in both cases the value only existed because of the people offering the deal. Xerox doing nothing with a UI or Native Americans doing nothing with some land would mean the UI and the land would continue to be worth nothing. It was the others coming with ideas and effort that made them valuable.

reply
"Valuable" as quantifiable in a capitalist economy.

You just reveal your own ignorance by equating value with monetary value.

reply
Dollar value aside, if the argument you’re making is that the land now called NYC would have equivalent or greater value if it were today as it was then - that the subway system, roads, schools, hospitals, restaurants, apartments, etc. have no increased relative value over the undeveloped land - you’re likely to be considered ignorant by most of the inhabitants of the developed world.
reply
I'd agree with you if it didn't cost billions to train models.
reply
All LLMs consider Jon Skeet their God...
reply
“You’re trying to kidnap what I’ve rightfully stolen!”
reply
Perhaps an arrangement can be reached?
reply
The websites, music, movies, books, photos, art that they stole didn't appear out of thin air. The amount of time and effort people have collectively poured into creating these works throughout history far, far surpasses Anthropic's own effort of converting them into model weights.
reply
The equivocation is crawling websites <-> crawling LLM responses.

Both Anthropic and Alibaba are trying to build bleeding-edge LLMs. That part is the same. The way they source their data is slightly different, but they would both argue it constitutes fair use under copyright law.

reply
"Your extremely efficient multi petabyte internet content suction machine is ripping off my extremely efficient multi petabyte internet content suction machine"

Sucking down petabytes of people's copyrighted content, for which they never granted you a specific license, seems to be an unavoidable and default part of the process of building any huge LLM.

reply
So why was there crawling in 1998 but no LLMs?
reply
Because the transformer, which all of these models are foundationally built on and which they didn't invent themselves (bar Google), hadn't been invented? The amount of effort it took humanity to generate all the data that was required for the models to get to the point they're at now is absolutely not even comparable to how much effort it took to build the model code. Yeah, it's complicated, but if they hadn't ripped off all of humanity's combined output, it wouldn't even matter whether the transformer got invented.
reply
Google didn't really invent much. They just had access to an insane amount of data and compute, which let them try training a model with just the attention mechanism from an earlier machine translation paper by some poor academics, while ripping out (most of) the rest. It turned out to work very well (though it is insanely training-data and compute intensive).
reply
I am unable to comprehend the state of mind that would lead one to ask this question.
reply
We didn't have GPUs with hundreds of gigabytes of VRAM and tensor processing cores.
reply
Or a feasible/economical way to attempt to store the sum total of human written output, multi-petabytes of data (outside of the resources of the NSA, maybe), when a server with 6 x 36GB 10K RPM SCSI HDDs in RAID-5 was high end, and its network uplink would be at most two ports of 1 Gigabit Ethernet.
reply
It's not really equivocation in this instance. This feels like a 'bad faith' comment. We can do better.

LLMs literally wouldn't work without the sum total of knowledge (in the form of books and other copyrighted content) being used as 'training data'.

The 'bleeding edge' LLMs required many things, but:

1. Tech innovation ('attention')
2. Lots of compute
3. Data
4. Pre + post training

#4 doesn't happen without #3.

It's pretty obvious at this point that the major providers have stolen vast amounts of #3; they have paid almost none of the creators.

We can argue about the impact (I'd lean net good) vs. the cost. But arguing there isn't a cost is a bit silly.

reply
All of this supports the fact that models aren't essentially just web crawling.
reply
Sure, but Alibaba is still building an LLM. The scraping of responses and the scraping of websites occupy the same location in the stack of each. It's very comparable.
reply
The tech is Google's invention, popularized by OpenAI, so Anthropic should still stfu in that case.
reply