Take a look at https://chatjimmy.ai/ -- it's running on Taalas' "hardcore" silicon model, i.e., a dedicated, ASIC-like chip.
Wow -- actually pretty astonishing how fast their inference is. So fast it feels fake?
Yeah, with inference that fast, it almost feels like the answer arrives before you hit return. Now imagine it running locally, with no server round-trip.
Groq was my preview of the broadband era of LLMs. I remember asking a question on the demo site, and the answer text showed up nearly instantly, far faster than I could read. This was ~1 year ago, pre-acquisition.