Efficiera comes with a comprehensive AI software suite that allows you to compile AI models and implement AI applications in production environments
Software Development Kit Optimized for Quantized Models, Bringing AI to Practical Use
Software is a crucial part of using Efficiera: how the neural network is quantized strongly affects both accuracy and performance. Most neural network frameworks, such as TensorFlow, do not support extremely low-bit quantization out of the box, so a dedicated quantizer must be implemented. LeapMind provides a software suite that handles this quantization and optimization automatically.
Model Building & Training
Operator library for extremely low-bit quantized models
TensorFlow and PyTorch support
Model Conversion & Optimization
Converts models to Efficiera's executable instruction format
Memory bandwidth optimization
Parallelization of data transfers and operations
Provides performance profiling tools
Efficiera Runtime Library
Deploy and Run
Runtime API (Python, C++)
Run-state management of multiple models
Provides a simulator environment
The only environment for developing extremely low-bit quantized deep learning models
Libraries for deep learning frameworks that define the operators executable by Efficiera and the Runtime Library
Models can be built without special attention to the operators required for learning the quantization and scaling coefficients that characterize extremely low-bit quantization.
Use TensorFlow or PyTorch functionality, such as visualization with TensorBoard, without modification
Export training results in ONNX format
Ubuntu 18.04 or 20.04 environment with a CUDA-enabled GPU
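To illustrate what "learning quantization and scaling coefficients" means in extremely low-bit quantization, here is a minimal NumPy sketch of XNOR-Net-style 1-bit weight quantization, where weights are binarized to {-1, +1} and rescaled by a per-tensor coefficient. The function name and the mean-of-absolute-values rule are illustrative assumptions, not the Efficiera NDK's actual quantizer, whose operators the SDK supplies for you.

```python
import numpy as np

def quantize_weights_1bit(w):
    """Binarize a weight tensor to {-1, +1} with a per-tensor scaling
    coefficient (mean of absolute values). Illustrative sketch only --
    the NDK provides its own trainable quantization operators."""
    alpha = np.abs(w).mean()      # scaling coefficient
    w_q = np.sign(w)
    w_q[w_q == 0] = 1.0           # map exact zeros to +1
    return alpha * w_q, alpha

w = np.array([[0.3, -0.7],
              [0.1, -0.2]])
w_hat, alpha = quantize_weights_1bit(w)
# w_hat stores only a sign per weight plus one float (alpha),
# which is what makes the model cheap for Efficiera to execute.
```

In quantization-aware training, such an operator sits in the forward pass while gradients update the underlying full-precision weights, which is why the NDK lets you train these models without hand-writing the operators yourself.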
The only tool capable of converting an extremely low-bit quantized deep learning model into instructions for Efficiera
Converts a model trained and exported in ONNX format with the Efficiera NDK into a sequence of instructions that the Efficiera IP can execute
Advanced optimization to meet resource and memory bandwidth constraints for specified Efficiera configurations
Higher speed through parallel execution of data transfers and arithmetic operations
Lower power consumption through reduced memory access
Estimates the number of execution cycles, memory bandwidth, and memory usage for a given Efficiera configuration
Efficiera configurations can be selected to suit the size and structure of the model
Ubuntu 18.04 environment for Intel 64/AMD64 processors
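The speedup from parallelizing data transfers and operations can be seen with a toy double-buffering cost model: while tile i is being computed, tile i+1 is transferred, so each stage costs the maximum of the two rather than their sum. This is a simplified sketch of the general technique, not the Converter's actual scheduler or cycle estimator.

```python
def serial_time(transfers, computes):
    """Cost when each tile is transferred, then computed, one after another."""
    return sum(transfers) + sum(computes)

def pipelined_time(transfers, computes):
    """Double-buffered cost: transfer of tile i+1 overlaps compute of tile i.
    Toy model only -- real schedules depend on buffer sizes and bandwidth."""
    total = transfers[0]                       # first transfer cannot overlap
    for i in range(len(computes)):
        nxt = transfers[i + 1] if i + 1 < len(transfers) else 0.0
        total += max(computes[i], nxt)         # overlapped stage cost
    return total

# Four equal tiles: 4 time units to transfer, 6 to compute each.
transfers = [4.0, 4.0, 4.0, 4.0]
computes = [6.0, 6.0, 6.0, 6.0]
t_serial = serial_time(transfers, computes)      # 40.0
t_pipe = pipelined_time(transfers, computes)     # 28.0
```

When compute time dominates, the pipelined schedule hides almost all transfer time behind the arithmetic, which is the effect the Converter's memory-bandwidth optimization targets.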
Runtime library for applications that maximize the performance of Efficiera-equipped hardware
Provides an API to execute, on the target, the instruction sequences output by the Efficiera Converter
Manages the mapping between multiple deep learning models, Efficiera instances, and CPU threads
Accelerated by CPU SIMD instructions when available
Defines APIs for C++ and Python
A simulator for the host environment is also provided, allowing application development even before the target hardware is complete.
Target (actual device): Ubuntu environment for Armv8-A AArch64 and Armv7-A processors
Host: Ubuntu 18.04 environment for Intel 64/AMD64 processors
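The idea of managing the mapping between multiple models, Efficiera instances, and CPU threads can be sketched with stand-in classes. All names here (`EfficieraInstanceStub`, `infer`) are hypothetical placeholders, not the real Runtime Library API; the sketch only shows the dispatch pattern of one model per hardware instance with one CPU thread each.

```python
from concurrent.futures import ThreadPoolExecutor

class EfficieraInstanceStub:
    """Stand-in for one Efficiera IP instance. A real runtime would submit
    the Converter's instruction sequence to the hardware; this stub just
    echoes what it was asked to run."""
    def __init__(self, instance_id):
        self.instance_id = instance_id

    def run(self, model_name, input_data):
        return (self.instance_id, model_name, len(input_data))

# Map each deep learning model to a dedicated Efficiera instance.
instances = {
    "detector": EfficieraInstanceStub(0),
    "classifier": EfficieraInstanceStub(1),
}

def infer(model_name, data):
    return instances[model_name].run(model_name, data)

# One CPU thread per model, so both instances are driven concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(infer, "detector", [1, 2, 3]),
               pool.submit(infer, "classifier", [4, 5])]
    results = [f.result() for f in futures]
```

Because the host simulator exposes the same interface as the target, an application structured this way can be developed and tested on an Intel 64/AMD64 machine before the Efficiera-equipped device is available.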