October 10, 2023
LeapMind's New AI Chip Paves the Way for Unprecedented Cost-Effective AI Computing
Tokyo, Japan: October 10, 2023 - LeapMind Inc. has announced that it has begun developing a new AI chip that accelerates the training and inference of large-scale AI models.
The cost of training advanced AI models, including large-scale language models (LLMs), has increased significantly over the past decade due to larger model sizes and greater computational complexity. This rising cost has become a major bottleneck in AI progress.
To create high-quality AI models, a substantial number of processors are needed for parallel computing, which requires a sizable budget. However, when cost-effective processors are available, better AI models can be developed on the same budget. Demand for AI training processors is therefore shifting from sheer performance to cost-effectiveness.
In response to this situation, we have begun developing a new AI training and inference processor, referred to here as the "AI chip." Leveraging our expertise from AI accelerator development for edge devices, this new AI chip targets a computing performance of 2 PFLOPS (petaflops) while aiming for a cost performance 10 times higher than that of an equivalent GPU. We anticipate that this product will be ready for shipment by the end of 2025 at the latest.
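As a rough illustration of what a tenfold cost-performance target would mean at a fixed budget, the sketch below treats cost performance as throughput divided by price. The function names and the normalized price and throughput figures are hypothetical placeholders chosen for the arithmetic, not LeapMind or GPU specifications.

```python
# Illustrative arithmetic only: all figures are normalized placeholders,
# not actual prices or measured performance.

def cost_performance(throughput: float, price: float) -> float:
    """Throughput delivered per unit of cost."""
    return throughput / price


def throughput_for_budget(budget: float, throughput: float, price: float) -> float:
    """Aggregate throughput obtainable when the whole budget buys identical chips."""
    return budget * cost_performance(throughput, price)


# Hypothetical baseline: a GPU normalized to throughput 1.0 at price 1.0.
gpu_cp = cost_performance(throughput=1.0, price=1.0)

# Hypothetical chip with the same throughput at one tenth of the price,
# i.e. ten times the cost performance of the baseline.
chip_cp = cost_performance(throughput=1.0, price=0.1)

budget = 100.0
gpu_total = throughput_for_budget(budget, 1.0, 1.0)
chip_total = throughput_for_budget(budget, 1.0, 0.1)

print(f"cost-performance ratio: {chip_cp / gpu_cp:.1f}x")
print(f"aggregate throughput at equal budget: {chip_total / gpu_total:.1f}x")
```

Under these assumptions, the same budget buys ten times the aggregate throughput, which is the sense in which cost-effectiveness can matter more than per-chip performance.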
Soichi Matsuda, Chief Executive Officer of LeapMind, says, "We have achieved a high level of success and a proven track record in the development of edge AI inference accelerators. On the server side, we will accelerate the evolution of next-generation AI by developing new AI chips that leverage our accumulated technological expertise to speed up the computation of AI models."
◾️New AI Chip overview
The new AI chip has three major characteristics: 1) designed for AI model training and inference, 2) low-bit representation, and 3) open-source drivers and compilers.
Designed for AI model training and inference
Considering AI model training and inference as computational tasks, the following characteristics emerge:
- Matrix multiplication stands out as a computational bottleneck.
- These tasks can be easily executed in parallel.
- There are very few conditional branches involved.
Our design approach leverages the features above, creating AI chips specialized for AI model training and inference rather than pursuing performance improvements as a general-purpose computing machine. For instance, we minimize transistors by eliminating the branch prediction unit due to the scarcity of conditional branches in the program.
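The three characteristics listed above can be seen in a small matrix multiplication sketch. This is a plain Python illustration written for this text, not LeapMind's design or software; the function names are hypothetical. Threads are used only to show that output rows are independent, not to demonstrate a speedup.

```python
# Sketch: each output row of C = A x B depends only on one row of A and all of B,
# so rows can be computed independently and in any order.
from concurrent.futures import ThreadPoolExecutor


def matmul_row(a_row, b):
    """One output row: a fixed sequence of multiply-adds with no data-dependent branches."""
    n_cols = len(b[0])
    return [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(n_cols)]


def matmul_serial(a, b):
    """Compute every output row one after another."""
    return [matmul_row(row, b) for row in a]


def matmul_parallel(a, b, workers=4):
    """Hand each output row to an independent worker; rows never need to synchronize."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: matmul_row(row, b), a))


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12]]
c_serial = matmul_serial(a, b)
c_parallel = matmul_parallel(a, b)
print(c_serial == c_parallel)
```

The control flow here is the same for any input values, which is why hardware for this workload gains little from branch prediction.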
The primary bottleneck in AI model calculations is matrix multiplication, which involves an extensive number of multiplications and additions. Multipliers typically require large circuits, but we reduce the number of necessary transistors by adopting lower bit-width data types, such as fp8. In addition, reducing the size of the data being processed makes more effective use of DRAM bandwidth, which has been a bottleneck in recent years.
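To make the scale concrete, the following sketch counts the multiplications and additions in a dense matrix product and the memory needed to hold its operands at different bit widths. The matrix dimensions are arbitrary examples, the helper names are hypothetical, and storing the output at the same width as the inputs is a simplifying assumption.

```python
# Bit widths are the standard sizes of these formats; the dimensions below are arbitrary.
BITS = {"fp32": 32, "fp16": 16, "fp8": 8}


def matmul_op_counts(m, k, n):
    """Multiplications and additions for an (m x k) by (k x n) product."""
    multiplications = m * k * n
    additions = m * (k - 1) * n
    return multiplications, additions


def operand_bytes(m, k, n, dtype):
    """Bytes needed to hold both inputs and the output at the given bit width."""
    elements = m * k + k * n + m * n
    return elements * BITS[dtype] // 8


m = k = n = 4096
muls, adds = matmul_op_counts(m, k, n)
print(f"multiplications: {muls:,}  additions: {adds:,}")
for dtype in BITS:
    print(f"{dtype}: {operand_bytes(m, k, n, dtype) / 2**20:.0f} MiB")
```

For the same matrices, fp8 operands occupy a quarter of the memory of fp32 operands, so a quarter as many bytes cross the DRAM interface for the same number of operations.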
Open-source drivers and compilers
Developing advanced software stacks is essential for AI model development, and no single company can provide all the required components. An open-source software ecosystem involving multiple companies already exists, and to be a part of this ecosystem, it's crucial to engage as an open-source software community member.
We will release comprehensive hardware specifications and software, including drivers and compilers, under OSI-compliant licenses to contribute to the open-source community.
This new AI chip supports training and inference for a wide range of neural networks, including generative AI models such as diffusion models, as well as training for large-scale language models.
Specific performance values, benchmarks, and other details about the new AI chip will be updated on our tech blog.
[LeapMind Tech Blog](https://leapmind.io/tech-blog/)
About LeapMind Inc.
Founded in 2012, LeapMind is a Japan-based technology company that develops and provides artificial intelligence (AI) accelerators and their software, with a mission to revolutionize business and society by contributing to the advancement of AI technology. With its expertise and ability to provide AI solutions, LeapMind is committed to developing both the software and hardware necessary to meet customer needs and drive innovation toward a sustainable future (as of October 2023).
Head office: Shibuya Dogenzaka Sky Building 3F, 28-1 Maruyama-cho, Shibuya-ku, Tokyo 150-0044
Representative: Soichi Matsuda, CEO
Established: December 2012
*LeapMind, Efficiera, and the logo are trademarks or registered trademarks of LeapMind Inc.
Marketing and Communication Group, LeapMind Inc.