Tachyum FP8 Super-Sparsity Is Showing Path to Efficient Generative AI

LAS VEGAS–(BUSINESS WIRE)–Tachyum® announced today the release of a new research paper addressing how Prodigy®, the world’s first Universal Processor, will transform the quality, efficiency, and economics of generative AI (GenAI).


“Unprecedented Scale and Efficiency in Generative AI with FP8 8:3 Super-Sparsity” offers technical detail on how Prodigy can more effectively meet the computation and scale requirements of generative AI, which trains on massive data sets to create original results rather than identifying or analyzing known data. In general, the larger the training data set, the better and more accurate the output of GenAI. GPT-3.5, a quintessential example of a generative AI model, has 175 billion trainable parameters, and GPT-4 increases this by a factor of 10 to a reported 1.76 trillion parameters, with another 10x increase possible in the near future.

Language models like ChatGPT, vision models, and other GenAI tools have improved dramatically through successful scale-up, resulting in impressive few-shot capabilities close to those of a human being. These growing parameter counts require corresponding increases in the computational resources used to train AI systems: high memory capacity, high processing performance, and high memory bandwidth to optimize the efficiency of large, dense models. Today, the scale of the largest AI computations is doubling every six months, outpacing Moore’s Law by 7x, and moving from generative AI to cognitive AI is expected to require 100-1000x more capacity.

To address memory and energy consumption, quantization reduces the precision of parameters as a means of compressing deep neural networks (DNNs). Similarly, pruning removes redundant or insensitive parameters to reduce density. While density is often necessary to train a model successfully, once training is complete, many parameters can be removed without any quality degradation.
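As a rough illustration of these two compression steps, the following NumPy sketch applies a toy symmetric quantizer (standing in for a real FP8 format) and magnitude-based pruning to a random weight matrix. The function names and the 50% keep fraction are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8)).astype(np.float32)  # stand-in DNN layer

# Quantization: snap FP32 weights onto a coarse grid (a toy 8-level
# symmetric quantizer standing in for a hardware FP8 format).
def quantize(w, levels=8):
    scale = np.abs(w).max() / (levels / 2)
    return (np.round(w / scale) * scale).astype(w.dtype)

# Pruning: keep only the largest-magnitude parameters and zero the
# rest, since small weights contribute least to the trained model.
def prune(w, keep_fraction=0.5):
    k = int(w.size * keep_fraction)
    mask = np.zeros(w.size, dtype=bool)
    mask[np.argsort(np.abs(w).ravel())[-k:]] = True
    return np.where(mask.reshape(w.shape), w, 0.0).astype(w.dtype)

compressed = prune(quantize(weights))
print(np.count_nonzero(compressed), "of", weights.size, "weights survive")
```

In a real training pipeline these steps would use hardware-native FP8 arithmetic and structured sparsity rather than this dense NumPy emulation.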

In this paper, Tachyum shows how Prodigy overcomes the hardware inefficiencies that make GenAI cost-prohibitive and energy-intensive. Prodigy enables quantization using 8-bit floating point (FP8) combined with 8:3 block pruning, improving performance, power, and memory bandwidth to enable enormous model sizes. Tachyum’s recommendations significantly increase training speed and reduce the memory footprint of the model after training. FP8 8:3 super-sparsity greatly reduces model sizes, which is important for language models, as well as power and area, which matter for edge and IoT applications.
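This release does not spell out the exact 8:3 pruning scheme (the white paper does). As a minimal sketch, assuming 8:3 block pruning means keeping the 3 largest-magnitude weights in each block of 8, the structured pattern could look like the following; the function name `block_prune_8_3` is illustrative, not Tachyum’s API:

```python
import numpy as np

def block_prune_8_3(w):
    """Keep the 3 largest-magnitude values in each block of 8 weights.

    Illustrative 8:3 structured sparsity: hardware can skip the zeroed
    5/8 of every block and store only the 3 survivors plus small
    per-block position indices, cutting both compute and memory.
    """
    blocks = w.reshape(-1, 8)                          # one row per 8-weight block
    top3 = np.argsort(np.abs(blocks), axis=1)[:, -3:]  # indices of the 3 largest
    mask = np.zeros(blocks.shape, dtype=bool)
    np.put_along_axis(mask, top3, True, axis=1)
    return np.where(mask, blocks, 0.0).reshape(w.shape).astype(w.dtype)

rng = np.random.default_rng(1)
w = rng.normal(size=(16,)).astype(np.float32)  # two 8-weight blocks
sparse = block_prune_8_3(w)
print(np.count_nonzero(sparse), "of", w.size)  # 6 of 16: exactly 3 kept per block
```

The fixed block structure is what makes this hardware-friendly: unlike unstructured pruning, every block has the same number of survivors, so memory layout and compute skipping are fully predictable.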

“GenAI is a truly transformational technology, but its value cannot be realized, nor can it be widely adopted, without solving the hardware challenges of running such large models,” said Dr. Radoslav Danilak, founder and CEO of Tachyum. “With Prodigy poised to become the mainstream cost-efficient high performance processor in 2024, these compression approaches, together with hardware support, will enable even small to midsized enterprises and academic users to work with large, dense deep learning models.”

Because Prodigy offers far more memory than currently available AI processors (2TB of low-cost DRAM and up to 32TB of TSV DDR5 DRAM per socket, so a 4-socket Prodigy platform supports 8TB of low-cost memory and up to 128TB in total), a single Prodigy chip can replace more than 10 competitor units, delivering unprecedented performance, scalability, and efficiency.

FP8 8:3 models must be trained on Tachyum chips to achieve the proper computational efficiency. FP8 8:3 inference and generative AI IP is available now to partners and customers; a license includes all necessary software, which is process-independent.

As a Universal Processor offering utility for all workloads, Prodigy-powered data center servers can seamlessly and dynamically switch between computational domains (such as AI/ML, HPC, and cloud) on a single architecture. By eliminating the need for expensive dedicated AI hardware and dramatically increasing server utilization, Prodigy reduces CAPEX and OPEX significantly while delivering unprecedented data center performance, power, and economics. Prodigy integrates 192 high-performance custom-designed 64-bit compute cores to deliver up to 4.5x the performance of the highest-performing x86 processors for cloud workloads, up to 3x that of the highest-performing GPU for HPC, and up to 6x for AI applications.

“Unprecedented Scale and Efficiency in Generative AI with FP8 8:3 Super-Sparsity” can be found at: https://www.tachyum.com/resources/whitepapers/2023/08/22/unprecedented-scale-and-efficiency-in-generative-ai-with-fp8-83-super-sparsity/.

Follow Tachyum

https://twitter.com/tachyum
https://www.linkedin.com/company/tachyum
https://www.facebook.com/Tachyum/

About Tachyum

Tachyum is transforming the economics of AI, HPC, public and private cloud workloads with Prodigy, the world’s first Universal Processor. Prodigy unifies the functionality of a CPU, a GPGPU, and a TPU in a single processor that delivers industry-leading performance, cost, and power efficiency for both specialty and general-purpose computing. When hyperscale data centers are provisioned with Prodigy, all AI, HPC, and general-purpose applications can run on the same infrastructure, saving companies billions of dollars in hardware, footprint, and operational expenses. As global data centers contribute to a changing climate and consume more than four percent of the world’s electricity—projected to reach 10 percent by 2030—the ultra-low power Prodigy Universal Processor is a potential breakthrough for satisfying the world’s appetite for computing at a lower environmental cost. Prodigy, now in its final stages of testing and integration before volume manufacturing, is being adopted in prototype form by a rapidly growing customer base, and robust purchase orders signal a likely IPO in late 2024. Tachyum has offices in the United States and Slovakia. For more information, visit https://www.tachyum.com/.

Contacts

Mark Smith

JPR Communications

818-398-1424

marks@jprcom.com