Tenstorrent Wormhole Dev Kits and Workstations Power High-End AI Development

Tenstorrent's name might not yet carry the same cachet as NVIDIA's when it comes to AI acceleration, but the company's unique approach has a lot of benefits. Between a royalty-free RISC-V processor ISA and highly scalable AI accelerator designs, the company promises it can lower the total cost of ownership of AI deployments in the cloud or on premises. The Grayskull family of accelerator cards ushered in dev kits for the company, but the new second-generation Wormhole design brings performance to a new level.

For the uninitiated, Tenstorrent's approach to AI development is to use standard Python and C++ libraries, from PyTorch to ONNX. In between the software and the hardware sit the TT-Buda general AI compiler and the bare-metal software stack dubbed TT-Metallium. Tenstorrent's software, much like plenty of other AI compilers and execution providers, works to optimize a given model for the hardware on which it's running.

While Tenstorrent isn't a household name just yet, HotHardware readers are no doubt familiar with some of the company's leadership. That group includes AMD Zen lead designer and Tenstorrent CEO Jim Keller. With context established, let's talk about what's new. 

First up, Tenstorrent's Wormhole accelerator cards have been in the pipeline for a bit, but they are officially available for purchase starting today. The Wormhole n150 includes a single Wormhole processor sporting 72 of the company's new Tensix AI acceleration cores and 108 MB of SRAM, all of which is clocked at 1 GHz. That chip is rated to provide 262 TFLOPs of compute power with FP8 data. There's also 12 GB of GDDR6 with 288 GB/sec of bandwidth. On its own, the n150 should be pretty speedy for a developer kit. With a total board power of 160 Watts, it's pretty power efficient, as well. 

Tenstorrent Wormhole n300

The Wormhole n300 pushes performance even higher by pairing two chips, each trimmed down to 64 cores. With a total of 128 cores, the n300 is rated for 488 TFLOPs of compute power with a total board power of 300 Watts. That's basically a linear increase of TFLOPs compared to power draw. There's also now 24 GB of GDDR6 with 576 GB/sec of bandwidth.
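To put the "basically linear" claim in numbers, here is a minimal sketch that divides each card's rated FP8 throughput by its total board power, using only the figures quoted above. The dictionary layout and the function name are our own illustration, not anything from Tenstorrent's software.

```python
# Figures quoted in the article: (rated FP8 TFLOPs, total board power in watts).
cards = {
    "n150": (262, 160),
    "n300": (488, 300),
}


def tflops_per_watt(tflops, watts):
    """Rated FP8 throughput divided by total board power."""
    return tflops / watts


efficiency = {name: tflops_per_watt(*spec) for name, spec in cards.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} TFLOPs per watt")

# Ratio of n300 to n150 efficiency; a value near 1.0 means rated
# throughput scaled almost in step with board power.
ratio = efficiency["n300"] / efficiency["n150"]
print(f"n300 / n150 efficiency ratio: {ratio:.3f}")
```

Both cards land at roughly 1.6 rated TFLOPs per watt, so the n300 gives up very little efficiency on paper. These are rated peak numbers, not measured results.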

Both cards come in a 3/4-length PCI Express 4.0 x16 form factor that occupies two slots in a chassis. Pricing is pretty attractive given the large pool of memory, power efficiency, and performance. The Wormhole n150 will retail for $1,000, while the souped-up n300 will sell for $1,400. The existing Grayskull e75 and e150 cards remain available at $600 and $800, respectively, until stock dries up.

Tenstorrent Is Launching Its Own Workstations For Developers

The workstation setup is where things get really exciting, though. Tenstorrent has the TT-LoudBox and the TT-QuietBox on tap, where the names indicate the relative noise levels produced by each. They're both available in a rack mount or regular tower configuration, include 4 TB of NVMe storage, 512 GB of memory, and feature 2x 10 Gigabit Ethernet ports. That's where the similarities end, though.

Get ready for TT-LoudBox (left) and TT-QuietBox (right)

The $12,000 TT-LoudBox features a pair of Intel Xeon 4309Y octo-core processors that hit up to 3.6 GHz, 512 GB of memory, and four Wormhole n300 cards (eight processors total) in a 2x4 mesh configuration. Everything is air cooled, and by Tenstorrent's own description, the TT-LoudBox is more "louder PC" than "deafening server," so sitting in the same room should be an option, though we don't have actual sound pressure data to share.

Developers who want to park one of these behemoths in an office right beside them can opt for the $15,000 TT-QuietBox. This machine has a single AMD EPYC 8124P 16-core processor with a maximum boost clock of 3 GHz. The same quad array of Wormhole n300 cards is present, but it's under liquid cooling, which should bring the noise down considerably.

The TT-LoudBox and TT-QuietBox can be set up in several different configurations. Tenstorrent says that a single workstation can be set up as one enormous AI processor, but it can also be split up to serve four developers at once, each with their own direct access to a developer card. Not only that, but each Wormhole AI accelerator ASIC can be dedicated to a different model with its own 12 GB of dedicated memory.
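As a rough illustration of those splits, the sketch below divides a workstation's eight Wormhole processors evenly among a number of users or models and totals the memory and rated FP8 throughput for each share. The constants come from the specs quoted in this article. The `partition` helper is hypothetical, and simply adding up rated per-chip figures is our simplifying assumption, not a description of how Tenstorrent's software allocates hardware or of real-world scaling.

```python
# Specs quoted in the article.
CARDS_PER_WORKSTATION = 4   # four Wormhole n300 cards per box
ASICS_PER_N300 = 2          # two Wormhole processors per n300
GDDR6_PER_ASIC_GB = 12      # 24 GB per n300, split across two processors
N300_TFLOPS_FP8 = 488       # rated FP8 throughput per n300


def partition(num_groups):
    """Hypothetical helper: split the workstation evenly into num_groups.

    Assumes rated memory and throughput simply add up per processor,
    which is a back-of-the-envelope simplification.
    """
    total_asics = CARDS_PER_WORKSTATION * ASICS_PER_N300
    if num_groups < 1 or total_asics % num_groups != 0:
        raise ValueError("processors must divide evenly among groups")
    asics = total_asics // num_groups
    return {
        "asics": asics,
        "memory_gb": asics * GDDR6_PER_ASIC_GB,
        "tflops_fp8": asics * N300_TFLOPS_FP8 / ASICS_PER_N300,
    }


# One big accelerator, four developers with a card each, one model per chip.
for groups in (1, 4, 8):
    print(groups, partition(groups))
```

Under these assumptions, the single-accelerator setup pools 96 GB of GDDR6 across eight processors, the four-developer split gives each person one n300 with 24 GB, and the one-model-per-chip layout matches the 12 GB per processor mentioned above.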

Tenstorrent Has Much More In The Pipeline

Tenstorrent's roadmap shows Blackhole is up next, which is now coming this year

Tenstorrent's dev kits use the same core designs as its datacenter hardware, so moving from a developer machine to a cloud deployment should be a seamless experience. The key, according to Tenstorrent, is getting the hardware into developers' hands. The workstations are relatively expensive, but much of that cost goes toward all of that AI acceleration horsepower.

Wormhole is the company's first processor commercialized for the datacenter, so having eight of those processors in a single box makes for a flexible and powerful machine, and an awfully convenient one when it comes time to deploy. As shown on the roadmap, the Blackhole architecture will come next, and the company says it's coming later this year. The next generation after that will focus on chiplet designs, which will no doubt come in handy for the company's automotive AI endeavors.

These developer kits and machines are up for order on Tenstorrent's website today. The TT-LoudBox is available for immediate delivery, as are the Wormhole n150 and n300. Meanwhile, the TT-QuietBox is available for preorder, with delivery coming later in the summer.