Groq

Helping real-time AI applications come to life today

CALIFORNIA, USA

 

More about Groq

Groq is on a mission to set the standard for GenAI inference speed, helping real-time AI applications come to life today. The LPU Inference Engine (LPU stands for Language Processing Unit™) is a new type of end-to-end processing unit system that provides the fastest inference for computationally intensive applications with a sequential component, such as AI language applications (LLMs). The LPU is designed to overcome the two LLM bottlenecks: compute density and memory bandwidth. For LLM workloads, an LPU has greater compute capacity than a GPU or CPU. This reduces the time needed to compute each word, allowing sequences of text to be generated much faster. Additionally, eliminating external memory bottlenecks enables the LPU Inference Engine to deliver orders of magnitude better performance on LLMs compared to GPUs.

Groq in the news