3mo ago

A possible hardware solution for ultra-fast (73x faster than H200) self-hosted small models that does not depend on RAM

The approach hardwires model weights into transistors and uses an older 6nm process. They are targeting 70B model sizes (presumably 16-bit) by year end. It should cost much less than a 140GB card, but I don't know the details.
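
For anyone wondering where the 140GB figure comes from, here is a rough back-of-envelope sketch. It assumes the number is just parameter count times bytes per weight, ignoring KV cache and activations; the function name is made up for illustration and has nothing to do with the vendor.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9  # decimal GB, as card capacities are usually quoted


# Assumption: 70B parameters; 16-bit is the guess from the post above.
for bits in (16, 8, 4):
    print(f"70B @ {bits}-bit: {weight_footprint_gb(70e9, bits):.0f} GB")
```

At 16-bit this works out to 140 GB, which matches the card size mentioned above. Lower-precision weights would shrink that proportionally.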