Researchers at the Korea Advanced Institute of Science and Technology (KAIST) have developed Slim-Llama, an ASIC designed to cut the power demands of large language models (LLMs). The chip relies on binary/ternary quantization, which represents model weights with just one or two bits (values restricted to roughly {-1, +1} or {-1, 0, +1}) instead of full precision, sharply reducing memory traffic and compute cost. By pairing a Sparsity-aware Look-up Table with output reuse schemes, Slim-Llama skips redundant operations and streamlines data flow. KAIST reports a 4.59x improvement in energy efficiency over prior solutions, with power consumption ranging from 4.69mW at 25MHz to 82.07mW at 200MHz. Supporting models of up to 3 billion parameters at a reported latency of 489ms, Slim-Llama sets a new standard for power-efficient AI hardware. The work is slated for presentation at the 2025 IEEE International Solid-State Circuits Conference, signaling a shift toward more sustainable and accessible AI solutions.
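
To make the core idea concrete, the sketch below shows one common form of ternary quantization in Python. It maps float weights to {-1, 0, +1} with a per-tensor scale, and shows how the resulting matrix-vector product needs only additions and subtractions, with zero weights skipped outright, which is the kind of sparsity a sparsity-aware lookup table can exploit. The threshold heuristic and function names here are illustrative assumptions; KAIST's announcement does not detail Slim-Llama's exact quantization scheme.

```python
import numpy as np

def ternary_quantize(weights: np.ndarray, threshold_ratio: float = 0.7):
    """Quantize a float weight matrix to {-1, 0, +1} plus a per-tensor scale.

    threshold_ratio is a common heuristic from the ternary-network literature;
    it is an assumption here, not Slim-Llama's published method.
    """
    # Weights below this magnitude are zeroed out. The zeros create the
    # sparsity that hardware like a sparsity-aware LUT can skip over.
    delta = threshold_ratio * np.mean(np.abs(weights))
    ternary = np.zeros_like(weights, dtype=np.int8)
    ternary[weights > delta] = 1
    ternary[weights < -delta] = -1
    # Scale factor: mean magnitude of the surviving (non-zero) weights.
    nonzero = np.abs(weights)[ternary != 0]
    scale = float(nonzero.mean()) if nonzero.size else 1.0
    return ternary, scale

def ternary_matvec(ternary: np.ndarray, scale: float, x: np.ndarray) -> np.ndarray:
    """Matrix-vector product with ternary weights: no multiplies needed.

    +1 entries add the input element, -1 entries subtract it, and 0 entries
    contribute nothing, so they can be skipped entirely.
    """
    return scale * ((x * (ternary == 1)).sum(axis=1)
                    - (x * (ternary == -1)).sum(axis=1))

# Example: quantize a small random weight matrix and compare outputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)).astype(np.float32)
x = rng.normal(size=8).astype(np.float32)
Wq, s = ternary_quantize(W)
print("full precision:", W @ x)
print("ternary approx:", ternary_matvec(Wq, s, x))
```

Because each weight collapses to a sign (or zero) plus a shared scale, the hardware can replace multiply-accumulate units with simple add/subtract logic and skip zero weights, which is where the reported energy savings come from.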