Yet another tech startup wants to topple Nvidia with ‘orders of magnitude’ better energy efficiency; Sagence AI bets on analog in-memory compute to deliver 666K tokens/s on Llama2-70B

Posted by:
Emma Walker
Fri, 27 Dec

Sagence AI has introduced an analog in-memory compute architecture for AI inference that targets the power consumption, cost, and scalability problems of current AI deployments. By computing directly inside analog memory cells, Sagence claims energy efficiency and cost advantages while delivering performance comparable to top-tier GPU and CPU systems.

In particular, Sagence highlights efficiency and performance benefits when serving large language models such as Llama2-70B. Compared to leading GPU-based solutions, the company claims 10 times lower power consumption, 20 times lower cost, and 20 times less rack space when both systems are normalized to a throughput of 666,000 tokens per second. This emphasis on inference over training aligns with the evolving focus of AI computing within data centers.
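To see what "normalized to 666,000 tokens per second" means in practice, here is a minimal sketch of that kind of normalization. The article gives only the claimed ratios (10x power, 20x cost, 20x rack space), not absolute figures, so the GPU baseline numbers below are hypothetical placeholders:

```python
# Normalize per-unit metrics to a common throughput target so that
# systems of different sizes can be compared on equal footing.

TARGET_TOKENS_PER_S = 666_000  # throughput both systems are scaled to

def normalized_metric(metric_per_unit, tokens_per_s_per_unit):
    """Scale a per-unit metric by the number of units needed to hit the target."""
    units_needed = TARGET_TOKENS_PER_S / tokens_per_s_per_unit
    return metric_per_unit * units_needed

# Hypothetical GPU baseline: 10 kW per rack delivering 50k tokens/s per rack.
gpu_power_w = normalized_metric(10_000, 50_000)  # total watts at 666k tokens/s

# Applying the article's claimed 10x power ratio gives the analog system's budget.
analog_power_w = gpu_power_w / 10
```

The same scaling applies to cost and rack space; only the claimed ratio (20x rather than 10x) changes.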

At the core of Sagence’s design is its analog in-memory computing technology, which merges storage and computation within the same memory cells. This approach simplifies chip design, reduces cost, and improves power efficiency by eliminating the separate storage arrays and scheduled multiply-accumulate circuits of conventional digital designs. Sagence also employs deep subthreshold computing in multi-level memory cells, which it describes as an industry first, to reach the efficiency needed for scalable AI inference.
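The principle behind in-memory multiply-accumulate can be illustrated with a toy numerical model: weights are stored as cell conductances, inputs are applied as voltages, and the summed bitline current physically computes the dot product. This is a generic sketch of analog compute-in-memory, not Sagence's actual circuit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights stored in place as conductances (one row per output bitline).
weights = rng.uniform(-1.0, 1.0, size=(4, 8))
# Inputs applied simultaneously as voltages on the word lines.
inputs = rng.uniform(0.0, 1.0, size=8)

# Kirchhoff's current law sums the products on each bitline:
# I_j = sum_i G_ji * V_i  -- the matrix-vector product happens in the array.
currents = weights @ inputs

# A digital system computes the same result with explicit loads and MAC steps.
reference = np.array([sum(w * v for w, v in zip(row, inputs)) for row in weights])
assert np.allclose(currents, reference)
```

The energy argument is that the analog array performs all the multiplies and the accumulation in one physical step, with no data movement between separate memory and compute units.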

Unlike traditional CPU- and GPU-based systems that rely on complex dynamic scheduling, Sagence’s statically scheduled architecture fixes the execution order in advance, in a manner the company likens to biological neural networks. The system also integrates with popular AI development frameworks such as PyTorch, ONNX, and TensorFlow, eliminating the need for additional GPU-based processing after training and thus streamlining deployment while cutting costs.
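The contrast with dynamic scheduling can be sketched with a toy statically scheduled dataflow: the operation order is decided once, ahead of time, so execution is just a walk over a precomputed list with no runtime dispatcher. The op names and schedule here are invented purely for illustration:

```python
# A static schedule: execution order fixed at compile time.
STATIC_SCHEDULE = ["load_x", "scale", "bias", "activation"]

# Each op reads and writes a shared state dict; no runtime decisions are made.
OPS = {
    "load_x":     lambda s: s.update(x=[1.0, 2.0]),
    "scale":      lambda s: s.update(y=[2.0 * v for v in s["x"]]),
    "bias":       lambda s: s.update(y=[v + 1.0 for v in s["y"]]),
    "activation": lambda s: s.update(y=[max(0.0, v) for v in s["y"]]),
}

def run_static(schedule):
    """Execute ops in the precomputed order -- no dispatcher, no reordering."""
    state = {}
    for op in schedule:
        OPS[op](state)
    return state["y"]

print(run_static(STATIC_SCHEDULE))  # [3.0, 5.0]
```

A dynamically scheduled system would instead decide at runtime which operation to issue next, based on operand readiness and resource availability; removing that machinery is where the claimed simplification comes from.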

Vishal Sarin, CEO and Founder of Sagence AI, emphasized the importance of advancing AI inference hardware to sustain the future of AI. The company’s stated mission is to overcome the performance and economic limitations of current computing devices, making high-performance AI inference both economically viable and environmentally sustainable.

Source: IEEE Spectrum
