Overview
Neurenix provides extensive hardware acceleration support across multiple device types, enabling optimal performance for AI workloads on diverse hardware platforms. The framework automatically detects available hardware and provides a unified API for device management.
Supported Hardware
Neurenix supports the following hardware acceleration platforms:
- CUDA: NVIDIA GPU acceleration with Tensor Cores support
- ROCm: AMD GPU acceleration via HIP/ROCm
- ARM: ARM processors with NEON, SVE, and Ethos-U
- FPGA: FPGA acceleration with OpenCL, Vitis, OpenVINO
- NPU: Neural Processing Units for edge devices
- CPU: Optimized CPU operations with SIMD
Device Management
Device Types
The framework supports multiple device types through the Device class.
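As an illustration only, the sketch below models device types as an enum paired with a small value class. The `DeviceType` members, the `Device` fields, and the `"cuda:0"` string form are assumptions made for this example; they are not confirmed to match the Neurenix `Device` class.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceType(Enum):
    """Hypothetical device categories mirroring the supported platforms."""
    CPU = "cpu"
    CUDA = "cuda"
    ROCM = "rocm"
    ARM = "arm"
    FPGA = "fpga"
    NPU = "npu"


@dataclass(frozen=True)
class Device:
    """Stand-in device handle: a device type plus an index (assumed layout)."""
    type: DeviceType
    index: int = 0

    def __str__(self) -> str:
        # Render as "<type>:<index>"; this naming convention is an assumption.
        return f"{self.type.value}:{self.index}"


gpu = Device(DeviceType.CUDA, 1)
cpu = Device(DeviceType.CPU)
print(gpu, cpu)
```

Making the class frozen keeps device handles hashable and comparable by value, so two handles for the same type and index compare equal.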
Device Detection
The framework can automatically detect available hardware.
Device Properties
Device capabilities and properties can be queried at runtime.
Setting Current Device
The active device for subsequent operations can be set explicitly.
Device Selection Strategy
Neurenix automatically selects the best available device based on:
- Explicit specification - User-specified device takes precedence
- GPU availability - CUDA/ROCm GPUs preferred for large workloads
- Specialized hardware - NPUs for edge inference, FPGAs for specific workloads
- CPU fallback - Always available as fallback
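The strategy above can be sketched as a small priority lookup. This is a simplified illustration, not Neurenix's implementation: the function name `select_device`, the string device names, and the fixed `PREFERENCE` order are all assumptions, and the workload-dependent choices (GPUs for large workloads, NPUs for edge inference) are reduced to a single static ordering.

```python
from typing import Iterable, Optional

# Assumed static priority: GPUs first, then specialized hardware, then CPU.
PREFERENCE = ("cuda", "rocm", "npu", "fpga", "cpu")


def select_device(available: Iterable[str], requested: Optional[str] = None) -> str:
    """Pick a device name following the selection strategy (hypothetical helper)."""
    # The CPU is always treated as available, matching the fallback rule.
    present = set(available) | {"cpu"}

    if requested is not None:
        # Explicit specification takes precedence, but the device must exist.
        if requested not in present:
            raise ValueError(f"requested device {requested!r} is not available")
        return requested

    # Otherwise return the first preferred device that is present.
    for name in PREFERENCE:
        if name in present:
            return name
    return "cpu"


print(select_device(["rocm", "npu"]))
print(select_device([]))
print(select_device(["cuda", "npu"], requested="npu"))
```

Raising an error for an unavailable explicit request, rather than silently falling back, is a design choice of this sketch; a real framework might instead warn and fall back to the CPU.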
Memory Management
Unified Memory API
Neurenix provides a unified memory API across all device types.
Memory Transfer
The API supports efficient data transfer between host and device.
Performance Optimization
Device Synchronization
Stream Management
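As a conceptual sketch only, built on Python's standard library rather than any Neurenix API, a stream can be modeled as an in-order queue of asynchronous work, and synchronization as blocking until everything queued so far has finished. The `Stream` class and its `submit`, `synchronize`, and `close` methods are hypothetical names chosen for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Callable, List


class Stream:
    """Conceptual stream: tasks run asynchronously, in submission order."""

    def __init__(self) -> None:
        # A single worker thread guarantees in-order execution per stream.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: List[Future] = []

    def submit(self, fn: Callable, *args) -> Future:
        """Queue a task and return immediately with a future."""
        future = self._pool.submit(fn, *args)
        self._pending.append(future)
        return future

    def synchronize(self) -> None:
        """Block until every task submitted so far has finished."""
        wait(self._pending)
        self._pending.clear()

    def close(self) -> None:
        """Release the worker thread after finishing queued work."""
        self._pool.shutdown(wait=True)


results: List[int] = []
a, b = Stream(), Stream()

# Work on stream `a` runs in order; stream `b` proceeds independently.
a.submit(results.append, 1)
a.submit(results.append, 2)
fut = b.submit(sum, [10, 20])

# Results are only safe to read after synchronizing.
a.synchronize()
b.synchronize()
a.close()
b.close()
print(results, fut.result())
```

The point the model captures is ordering: work within one stream is serialized, work on different streams may overlap, and a synchronization call is the moment at which the host can rely on results being complete.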
Device-Specific Features
Each hardware platform provides specialized features:
- CUDA: Tensor Cores, TensorRT optimization, cuDNN acceleration
- ROCm: MIOpen, rocBLAS, mixed precision training
- ARM: NEON SIMD, SVE vectorization, Arm Compute Library
- FPGA: Custom bitstreams, OpenCL kernels, Vitis HLS
- NPU: Quantized inference, model compilation, power efficiency
Cross-Platform Compatibility
Code written against the unified API is intended to run on any supported device without modification.
Environment Variables
Hardware behavior can be controlled through environment variables.
See Also
- CUDA Support - NVIDIA GPU acceleration
- ROCm Support - AMD GPU acceleration
- ARM Acceleration - ARM processors
- FPGA Support - FPGA acceleration
- NPU Support - Neural Processing Units