Yeah, I suppose, but how do I get sufficiently high-quality synthetic data without sending the original data to OpenAI/Anthropic? The alternative is local models, and none of them seem strong enough to generate that "sufficiently high-quality synthetic data" in the first place.
You could rent GPU time yourself and use it to run a higher-quality local model (e.g. one of the Chinese "close to frontier" ones). That's not guaranteed to preserve privacy, of course, since the data still sits on someone else's hardware, but it at least avoids sending the data directly to OpenAI/Anthropic.