Overview
Neurenix provides support for Neural Processing Units (NPUs), specialized hardware accelerators designed for efficient AI inference on edge devices. NPUs offer:
- Power efficiency - Optimized for low-power operation
- Low latency - Dedicated hardware for neural network operations
- Quantization support - INT8/INT16 operations for efficiency
- Edge deployment - Designed for mobile and embedded systems
- Real-time inference - Deterministic performance for edge applications
Supported NPUs
Mobile NPUs
- Apple Neural Engine (A-series, M-series)
- Qualcomm Hexagon DSP/NPU (Snapdragon)
- MediaTek APU (Dimensity)
- Samsung NPU (Exynos)
- Google Edge TPU
Embedded NPUs
- ARM Ethos-U55, U65
- Intel Movidius Myriad X
- NVIDIA Deep Learning Accelerator (DLA)
- Hailo-8, Hailo-15
- Kneron KL series
Installation
Device Management
Check NPU Availability
Get Device Properties
Set Current Device
Model Compilation
Compile for NPU
Supported Operations
Quantization
Overview
NPUs typically require quantized models for optimal performance.
Post-Training Quantization
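As a rough illustration of what post-training quantization does to a tensor, the sketch below maps floating-point values to INT8 with an affine (scale and zero-point) scheme and back again. It is plain Python and does not use the Neurenix API; the function names `compute_qparams`, `quantize`, and `dequantize` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def compute_qparams(values: List[float], qmin: int = -128, qmax: int = 127) -> Tuple[float, int]:
    """Derive an affine scale and zero point from the observed value range."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # All values are zero; any positive scale works.
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> List[int]:
    """Round each value to the nearest step and clamp to the integer range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map quantized integers back to approximate floating-point values."""
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
    scale, zero_point = compute_qparams(weights)
    q = quantize(weights, scale, zero_point)
    print("scale:", scale, "zero_point:", zero_point)
    print("quantized:", q)
    print("restored:", dequantize(q, scale, zero_point))
```

The round-trip error for in-range values stays within about one quantization step, which is the accuracy cost traded for smaller, faster integer arithmetic. Real toolchains usually calibrate the range on representative input data rather than on a single tensor.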
Quantization-Aware Training
Memory Management
Allocate Memory
Memory Transfer
Edge TPU
Setup
Compile Model
Apple Neural Engine
Core ML Conversion
Neural Engine Optimization
Qualcomm Hexagon NPU
SNPE Integration
Performance Optimization
Batch Processing
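Models compiled for an accelerator are often built for a fixed input shape, so a stream of inputs may need to be grouped into batches of a constant size, with the final batch padded. The sketch below shows one way to do that in plain Python. It assumes a fixed batch size is required and does not use the Neurenix API; `fixed_size_batches` is a hypothetical helper.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def fixed_size_batches(items: Sequence[T], batch_size: int,
                       pad_value: T) -> Iterator[Tuple[List[T], int]]:
    """Yield (batch, valid_count) pairs, padding the last batch if needed.

    valid_count tells the caller how many leading entries are real inputs,
    so results for padded entries can be discarded after inference.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        chunk = list(items[start:start + batch_size])
        valid = len(chunk)
        chunk.extend([pad_value] * (batch_size - valid))
        yield chunk, valid


if __name__ == "__main__":
    for batch, valid in fixed_size_batches([1, 2, 3, 4, 5], batch_size=2, pad_value=0):
        print(batch, "valid:", valid)
```

Returning the valid count alongside each batch keeps the padding logic out of the inference call itself.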
Model Optimization
Operator Fusion
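One common example of operator fusion is folding a batch-normalization layer into the linear (or convolution) layer that precedes it, so that inference runs a single operation instead of two. The sketch below demonstrates the arithmetic in plain Python. It is not the Neurenix implementation, and `linear`, `batch_norm`, and `fold_batch_norm` are hypothetical names used only for illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Compute y = W x + b for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def batch_norm(y: Vector, gamma: Vector, beta: Vector,
               mean: Vector, var: Vector, eps: float = 1e-5) -> Vector:
    """Apply inference-time batch normalization per output channel."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]


def fold_batch_norm(weights: Matrix, bias: Vector, gamma: Vector, beta: Vector,
                    mean: Vector, var: Vector, eps: float = 1e-5) -> Tuple[Matrix, Vector]:
    """Return fused (weights, bias) equivalent to linear followed by batch_norm."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        k = g / math.sqrt(s + eps)          # per-channel scaling factor
        fused_w.append([w * k for w in row])
        fused_b.append((b - m) * k + bt)
    return fused_w, fused_b


if __name__ == "__main__":
    W = [[0.5, -1.0], [2.0, 0.25]]
    b = [0.1, -0.2]
    gamma, beta = [1.5, 0.8], [0.0, 0.3]
    mean, var = [0.2, -0.1], [0.9, 1.6]
    x = [1.0, 2.0]

    unfused = batch_norm(linear(W, b, x), gamma, beta, mean, var)
    fused_W, fused_b = fold_batch_norm(W, b, gamma, beta, mean, var)
    print("unfused:", unfused)
    print("fused:  ", linear(fused_W, fused_b, x))
```

Because the fold happens once, ahead of time, the fused layer produces the same outputs (up to floating-point rounding) with one fewer pass over the activations.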
Profiling and Debugging
NPU Profiling
Benchmark
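A minimal latency benchmark can be sketched with the standard library alone. The harness below times any zero-argument callable after a warm-up phase and reports summary statistics in milliseconds. It does not use the Neurenix API; `benchmark` is a hypothetical helper, and the callable passed to it stands in for whatever runs a single inference.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 10,
              iterations: int = 100) -> Dict[str, float]:
    """Time fn() repeatedly and return latency statistics in milliseconds."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    # Warm-up runs let caches, lazy initialization, and clocks settle.
    for _ in range(warmup):
        fn()
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "min_ms": samples_ms[0],
        "max_ms": samples_ms[-1],
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call that runs one inference.
    stats = benchmark(lambda: sum(range(10_000)), warmup=5, iterations=50)
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

If the device executes work asynchronously, the timed callable should block until the result is available; otherwise the measurement covers only the time to enqueue the work.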
Deployment
Export Model
Mobile Integration
Environment Variables
Common Use Cases
Real-Time Object Detection
Edge Classification
See Also
- ARM Acceleration - ARM CPU and Ethos-U NPU
- Model Quantization
- Mobile Deployment
- Edge Deployment