Overview
Neurenix provides built-in API servers for serving models through multiple protocols:
- RESTful API: Standard HTTP-based inference via FastAPI
- WebSocket: Real-time bidirectional communication
- gRPC: High-performance RPC for production systems
Quick Start
Simple REST Server
Making Predictions
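As a minimal sketch of a prediction call over HTTP, the following client uses only the Python standard library. The server address, the `/models/<name>/predict` route, and the `inputs` payload key are assumptions made for illustration; check the routes your running server actually exposes before relying on them.

```python
import json
import urllib.request

# Assumptions (not confirmed by the Neurenix documentation):
# the server address, the route layout, and the "inputs" payload key.
BASE_URL = "http://localhost:8000"


def build_predict_request(model_name, inputs, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for one prediction."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/models/{model_name}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(model_name, inputs, timeout=10.0):
    """Send the request and decode the JSON response body."""
    request = build_predict_request(model_name, inputs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running server; `predict("demo", [[1.0, 2.0]])` would then perform the actual call.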
APIManager
The APIManager class provides centralized management of API servers.
Initialization
Creating Servers
Managing Models
RESTful API Server
Setup
API Endpoints
Root Endpoint:
Example with Python Client
CORS Configuration
The REST server enables CORS automatically with permissive defaults. For production, tighten these settings in the source code at neurenix/api_support.py:159-165.
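One way to tighten the defaults is to replace wildcard origins with an explicit allowlist. The sketch below builds keyword arguments in the form accepted by FastAPI's `CORSMiddleware`. Whether Neurenix wires CORS through that middleware is an assumption here, and the origin, methods, and headers shown are placeholders.

```python
# Hypothetical allowlist; replace with the origins your front ends are served from.
ALLOWED_ORIGINS = ["https://app.example.com"]


def build_cors_settings(origins):
    """Return keyword arguments suitable for FastAPI's CORSMiddleware."""
    if "*" in origins:
        raise ValueError("wildcard origins are not appropriate for production")
    return {
        "allow_origins": list(origins),
        "allow_credentials": True,
        "allow_methods": ["GET", "POST"],
        "allow_headers": ["Content-Type", "Authorization"],
    }


cors_settings = build_cors_settings(ALLOWED_ORIGINS)

# With a FastAPI application object `app`, the settings would be applied as:
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware, **cors_settings)
```

Rejecting `"*"` at configuration time keeps a permissive default from slipping into a production deployment unnoticed.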
WebSocket Server
Setup
WebSocket Protocol
Messages are JSON-formatted with an action field:
List Models:
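A request of this kind can be sketched as a small pair of helpers that serialize and validate messages. The only detail taken from the protocol description is the top-level `action` field; the action names `list_models` and `predict` and the `model` and `inputs` fields are hypothetical.

```python
import json


def make_message(action, **fields):
    """Serialize a request as JSON text with a top-level "action" field."""
    return json.dumps({"action": action, **fields})


def parse_message(raw):
    """Decode a JSON message and check that it names an action."""
    message = json.loads(raw)
    if not isinstance(message, dict) or "action" not in message:
        raise ValueError("message must be a JSON object with an 'action' field")
    return message


# Hypothetical action and field names, used for illustration only.
list_request = make_message("list_models")
predict_request = make_message("predict", model="demo", inputs=[[1.0, 2.0]])
```

The resulting strings are what a client would pass to its WebSocket `send` call, and `parse_message` can be applied to each frame received in reply.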
Python WebSocket Client
JavaScript WebSocket Client
gRPC Server
Setup
Protocol Definition
The gRPC server automatically generates a proto file at neurenix/proto/neurenix_api.proto:
Python gRPC Client
Multi-Protocol Serving
Serve a model on all three protocols simultaneously:
Advanced Configuration
Custom Preprocessing and Postprocessing
Extend the server classes to customize input/output handling:
Production Deployment
Using Uvicorn (REST)
For production REST deployments:
Health Check Endpoints
Add health checks for production monitoring:
Best Practices
- Use gRPC for High Throughput: Better performance than REST for high-volume inference
- WebSocket for Real-Time: Ideal for streaming predictions or continuous inference
- Enable Authentication: Add auth middleware for production deployments
- Monitor Performance: Track latency, throughput, and error rates
- Load Balancing: Deploy multiple server instances behind a load balancer
- Input Validation: Validate input shapes and types before inference
- Error Handling: Return meaningful error messages for debugging
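The input-validation and error-handling practices above can be sketched with two small helpers that run before any model is invoked. The expected input layout (a batch of fixed-width numeric rows) and the shape of the error body are assumptions for illustration, not the format Neurenix itself uses.

```python
def validate_inputs(inputs, expected_features):
    """Check that `inputs` is a non-empty batch of numeric rows of fixed width.

    Returns (True, None) on success or (False, message) on failure.
    """
    if not isinstance(inputs, list) or not inputs:
        return False, "inputs must be a non-empty list of rows"
    for index, row in enumerate(inputs):
        if not isinstance(row, list):
            return False, f"row {index} is not a list"
        if len(row) != expected_features:
            return False, (
                f"row {index} has {len(row)} values, expected {expected_features}"
            )
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in row
        ):
            return False, f"row {index} contains a non-numeric value"
    return True, None


def error_response(message, status=400):
    """Build a JSON-serializable error body (the shape is an assumption)."""
    return {"status": status, "error": message}
```

A handler would call `validate_inputs` first and return `error_response(message)` on failure, so that clients see which row was rejected and why instead of an opaque server error.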