Problem is that it's been heavily contaminated with people speculating about who the author is. It would probably be difficult to get an unbiased answer out of it (although who knows - it's crazy that it can do this at all).
reply
So train on pre-2009 mailing list archives. Surely someone must be doing this.
reply
This is very clever. You should pass the idea along to the guys at https://talkie-lm.com/introducing-talkie
reply
Much better: train on the cypherpunk mailing list archive, or on anyone discussing e-cash on crypto forums or Usenet from the '80s to the early 2010s.
reply
It's a hard stylometric challenge, just because of its format. The forum posts are probably better for comparison. What I wish people would do, and don't see, is compare what the different Satoshi suspects have written since the forum posts and whitepaper.

Everybody's going to get more similar in terms of topic. Bitcoin actually exists now. There's more to say about it than there was at launch. But does anyone still sound like Satoshi? Or sound more like Satoshi than they did before?

The slight wrench in the works is that it's hard to do this with my personal favorite Satoshi candidate. He stopped writing altogether in 2014, and he had been losing capacity since shortly after the whitepaper came out, to the point that he was writing with his eyes by the time he had his head frozen.

He's also the only candidate who seems more likely to me over time, though. The longer things go, the less likely a living person stays tight-lipped.
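
For anyone who wants to try the before/after comparison, here is a minimal sketch of one common stylometric approach: comparing function-word frequencies with cosine similarity. The word list, the function names, and the sample texts are all hypothetical placeholders, and a real study would use far more features and far more text. A high score only means two samples use small words in similar proportions; it says nothing definitive about authorship.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small function-word list (an assumption for
# illustration; real stylometry uses hundreds of features).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is",
                  "it", "for", "but", "as", "with", "by", "not"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(reference, samples):
    """Sort (name, score) pairs by similarity to the reference, highest first."""
    ref = profile(reference)
    scores = {name: cosine(ref, profile(text)) for name, text in samples.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

You would run `rank_candidates` twice per candidate, once with their writing from before the forum posts and once with what they've written since, and see whether anyone's score moves toward the reference.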

reply
The whitepaper states the author, so…
reply
Pseudonymously
reply
That doesn’t matter. The LLM will still answer based on what it knows about Satoshi Nakamoto, rather than just based on the writing style.
reply
welcome to the internet. you must be new.
reply
You missed the point. The fact that the whitepaper states an author will heavily affect the LLM's answer when you ask it about the likely author of any correlatable portion of the text. It will answer based on its knowledge of Satoshi Nakamoto.
reply