Hacker News
by choppaface 9 hours ago
by londons_explore 8 hours ago
If it actually cost that much RAM, they would almost certainly add options to the API to manage cache lifetime, e.g. a 'please cache this for X minutes' flag, or a setting for a single re-use cache (the most common use case).
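To make the two suggested options concrete, here is a minimal sketch of a toy in-memory cache with a per-entry lifetime and an optional single re-use mode. Everything in it (`PrefixCache`, `ttl_seconds`, `single_use`) is a hypothetical name chosen for illustration; it is not any provider's actual API and says nothing about how a real prompt cache is implemented.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class PrefixCache:
    """Toy cache: each entry has a lifetime and may be limited to one read.

    Hypothetical illustration only; not modeled on any real service.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        # key -> (value, expiry timestamp, single_use flag)
        self._entries: Dict[str, Tuple[str, float, bool]] = {}

    def put(self, key: str, value: str, ttl_seconds: float,
            single_use: bool = False) -> None:
        """Store a value: the 'please cache this for X seconds' request."""
        expires_at = self._clock() + ttl_seconds
        self._entries[key] = (value, expires_at, single_use)

    def get(self, key: str) -> Optional[str]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at, single_use = entry
        if self._clock() >= expires_at:
            # Lifetime is over: free the memory and report a miss.
            del self._entries[key]
            return None
        if single_use:
            # Single re-use: evict as soon as the entry has been read once.
            del self._entries[key]
        return value
```

With flags like these, the caller rather than the service decides how long memory stays occupied, which is the kind of control the comment argues would exist if caching were expensive.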
by cyanydeez 7 hours ago | parent
https://platform.claude.com/docs/en/build-with-claude/prompt...
suggests they can cache outside the GPU.