Cerebras claims record in molecular dynamics simulations, says…

The Wafer Scale Engine measures 8 inches by 8 inches, considerably larger than a 1-inch to 1.5-inch GPU. Whereas a GPU has about 5,000 cores, the WSE has 850,000 cores and 40 GB of on-chip SRAM, which is 10 times faster than the HBM used in GPUs. That translates to 20 PB/sec of memory bandwidth, 6.25 petaflops of processing power on dense matrices, and 62.5 petaflops on sparse matrices.

In another benchmark, this one running Meta's Llama 3.1-405B large language model, Cerebras produced 969 tokens per second, far outpacing the number two performer, SambaNova, which generated 164 tokens per second. That makes Cerebras’s throughput 12 times faster than AWS’s AI instance and roughly six times faster than its closest competitor, SambaNova.
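The "six times" figure follows directly from the two throughput numbers quoted above, and a few lines of Python make the arithmetic explicit. This is only an illustrative sketch: the `speedup` helper is a hypothetical name, and the AWS value is back-calculated from the "12 times" claim, not a measurement reported in the article.

```python
# Throughput figures quoted in the article (tokens per second).
cerebras_tps = 969
sambanova_tps = 164


def speedup(fast: float, slow: float) -> float:
    """Return how many times higher `fast` is than `slow`."""
    return fast / slow


ratio = speedup(cerebras_tps, sambanova_tps)
print(f"Cerebras vs. SambaNova: {ratio:.1f}x")  # prints 5.9x, i.e. roughly six times

# ASSUMPTION: the article gives no AWS throughput number. This value is only
# what the "12 times faster" claim would imply if taken literally.
implied_aws_tps = cerebras_tps / 12
print(f"Implied AWS throughput: {implied_aws_tps:.2f} tokens/sec")
```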

Cerebras isn’t shy about the secret to its success. According to James Wang, director of product marketing at Cerebras, it’s the giant Wafer Scale Engine with its 850,000 cores that can all talk to each other at high speeds.

“Supercomputers today are great for weak scaling,” said Wang. “You can do more work, more volume of work, but you can’t make the same work go faster. Typically it tapers out at the max number of GPUs you have per node, which is around eight or 16, depending on configuration. Beyond that, you can do more volume, but you can’t go faster. And we don’t have this problem. We literally, because our chip itself is so large, move the strong scaling curve up by one or two orders of magnitude.”
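The distinction Wang draws between weak and strong scaling can be illustrated with two textbook models: Amdahl's law for a fixed-size problem (strong scaling) and Gustafson's law for a problem that grows with the machine (weak scaling). The sketch below is not Cerebras's own analysis, and the 95% parallel fraction is a hypothetical value chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction: float, n: int) -> float:
    """Strong scaling: speedup on a FIXED problem size using n processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


def gustafson_speedup(parallel_fraction: float, n: int) -> float:
    """Weak scaling: scaled speedup when the problem GROWS with n processors."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * n


# ASSUMPTION: 95% of the work parallelizes; this number is illustrative only.
PARALLEL_FRACTION = 0.95

for n in (1, 8, 16, 128, 1024):
    strong = amdahl_speedup(PARALLEL_FRACTION, n)
    weak = gustafson_speedup(PARALLEL_FRACTION, n)
    print(f"n={n:5d}  strong-scaling speedup={strong:7.2f}  weak-scaling speedup={weak:8.2f}")
```

Under this model the strong-scaling speedup can never exceed 1 / (1 - 0.95) = 20 no matter how many processors are added, while the weak-scaling figure keeps growing. That matches the pattern Wang describes: more processors handle more volume, but the same job stops getting faster.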

Inside a single server with eight GPUs, the GPUs use NVLink to share data and communicate, so they can be programmed to look roughly like a single processor, Wang adds. But once a system goes beyond eight GPUs, in any supercomputer configuration, the interconnect changes from NVLink to InfiniBand or Ethernet, and at that point, “they can’t be programmed like a single unit,” Wang says.

Earlier this month, Cerebras announced that Sandia National Laboratories is deploying a Cerebras CS-3 testbed for AI workloads.

Source link

Cerebras claims record in molecular dynamics simulations, says it’s 748x faster than Frontier supercomputer
