This project is entirely AI-generated and contains significant bugs, hallucinations, and unreliable code. Use at your own risk for educational purposes only.
- BUGGY CODE: This codebase contains numerous bugs, incomplete implementations, and non-functional components
- AI HALLUCINATIONS: Many claimed features, performance metrics, and API implementations may not work as described
- EXPERIMENTAL ONLY: Not suitable for production use - intended for learning and experimentation
- RESEARCH PROJECT: Generated by AI as a proof-of-concept with no guarantees of functionality
- NO WARRANTY: Use entirely at your own risk - may cause system instability or data loss
A C server providing access to Rockchip's AI libraries through JSON-RPC 2.0
- Design Document - Detailed technical architecture, API reference, and implementation details
- Instructions - Setup and usage instructions
- Project Notes - Development notes and findings
Date: July 24, 2025
Implementation: Claims 16 RKLLM + 23 RKNN functions (may be hallucinated)
Reality: Many functions may not work, may crash, or may behave unexpectedly
Status: Experimental AI-generated code with unknown reliability
- Transport: Unix Domain Socket (/tmp/rkllm.sock)
- Protocol: JSON-RPC 2.0 with direct 1:1 API mapping
- Libraries: Complete RKLLM (language) + Core RKNN (vision) integration
- Structure: Ultra-modular (one function per file)
- Performance: <10ms token latency, 100+ concurrent connections
```text
rkllm.init            rkllm.run                rkllm.run_async     rkllm.destroy
rkllm.load_lora       rkllm.abort              rkllm.is_running    rkllm.get_constants
rkllm.clear_kv_cache  rkllm.set_chat_template  rkllm.set_function_tools

rknn.init             rknn.query               rknn.run            rknn.destroy
rknn.inputs_set       rknn.outputs_get         rknn.create_mem     rknn.set_core_mask
rknn.mem_sync         rknn.get_constants
```
- RKNN MatMul: 10 specialized functions for transformer matrix operations
- Media Integration: 1 function for camera pipeline optimization
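Every method above is called the same way: connect to the Unix socket and exchange JSON-RPC 2.0 messages. The following Python client is a minimal sketch that assumes newline-delimited framing and an empty parameter list for rkllm.get_constants; neither detail is confirmed by this README, so treat it as a starting point rather than a reference client.

```python
# Minimal JSON-RPC client sketch. Assumptions: newline-delimited framing and an
# empty params list for rkllm.get_constants (neither is documented here).
import json
import socket

SOCKET_PATH = "/tmp/rkllm.sock"

def call(method, params):
    """Send one JSON-RPC 2.0 request over the Unix socket and return the parsed reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(SOCKET_PATH)
        request = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
        sock.sendall((json.dumps(request) + "\n").encode("utf-8"))
        buffer = b""
        while not buffer.endswith(b"\n"):   # read until the (assumed) newline terminator
            chunk = sock.recv(4096)
            if not chunk:
                break
            buffer += chunk
        return json.loads(buffer)

if __name__ == "__main__":
    print(call("rkllm.get_constants", []))
```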
```bash
# Build
./scripts/build.sh

# Run
LD_LIBRARY_PATH=build ./build/server

# Test
npm test
```
{"jsonrpc":"2.0","id":1,"method":"rkllm.init","params":[{"model_path":"/models/qwen3/model.rkllm"}]}
{"jsonrpc":"2.0","id":2,"method":"rkllm.run_async","params":[null,{"input_type":0,"prompt_input":"Hello"},{"mode":0},null]}// Load vision model and run inference
{"jsonrpc":"2.0","id":3,"method":"rknn.init","params":{"model_path":"/models/yolo.rknn","core_mask":1}}
{"jsonrpc":"2.0","id":4,"method":"rknn.run","params":{"input_data":"...preprocessed_image..."}}// LoRA fine-tuning
{"jsonrpc":"2.0","id":5,"method":"rkllm.load_lora","params":[{"lora_adapter_path":"/models/lora/coding.rkllm"}]}
// Memory optimization
{"jsonrpc":"2.0","id":6,"method":"rkllm.clear_kv_cache","params":[null,1,[0,50],[100,150]]}- Zero-Copy: Direct callback routing from RKLLM to clients
- Low Latency: <10ms per token
- Format: Each token as complete JSON-RPC response
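Since each token is delivered as its own complete JSON-RPC response, a streaming client can read the socket line by line. The Python sketch below assumes newline-delimited framing, that rkllm.init has already been called (as in the examples above), and that the stream ends when the server stops sending or closes the connection; none of this is confirmed by the README.

```python
# Streaming sketch: start rkllm.run_async and print each JSON-RPC response as it
# arrives. The params layout is copied from the example above; newline framing and
# the end-of-stream behavior are assumptions.
import json
import socket

with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
    sock.connect("/tmp/rkllm.sock")
    request = {
        "jsonrpc": "2.0",
        "id": 2,
        "method": "rkllm.run_async",
        "params": [None, {"input_type": 0, "prompt_input": "Hello"}, {"mode": 0}, None],
    }
    sock.sendall((json.dumps(request) + "\n").encode("utf-8"))

    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:               # server closed the connection
            break
        buffer += chunk
        while b"\n" in buffer:      # each complete line is treated as one response
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                print(json.loads(line))
```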
- NPU Acceleration: Direct access to Rockchip NPU cores
- Multi-Core Support: Configurable core masks for parallel processing
- Memory Efficiency: Zero-copy operations and advanced memory management
- Signal Handlers: Comprehensive crash recovery
- Resource Management: Automatic cleanup and connection limits
- Error Handling: All errors return proper JSON-RPC error responses
- Concurrent Clients: Support for 100+ simultaneous connections
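Per the JSON-RPC 2.0 specification, a failed call carries an error object ({"code": ..., "message": ...}) instead of a result. A small client-side helper can make that explicit; the sketch below follows the spec, but the specific error codes this server returns are not documented here.

```python
# Unwrap a parsed JSON-RPC 2.0 response: raise on an error object, otherwise
# return the result. The error object layout follows the JSON-RPC 2.0 spec.
def unwrap(response: dict):
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return response.get("result")
```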
- Token Latency: <10ms per token
- Throughput: 20+ tokens/second sustained
- Max Connections: 100+ simultaneous clients tested
- Request Rate: 10,000+ requests/second
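These figures are the project's own claims and, per the disclaimers above, may not hold in practice. A rough way to probe the concurrency claim is a smoke test that opens many connections in parallel; the sketch below assumes newline-delimited framing and that rkllm.is_running takes no parameters.

```python
# Concurrency smoke-test sketch: open CLIENTS connections in parallel and send one
# request on each. Framing and the rkllm.is_running params shape are assumptions.
import concurrent.futures
import json
import socket
import time

SOCKET_PATH = "/tmp/rkllm.sock"
CLIENTS = 50

def one_request(client_id: int) -> bool:
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.connect(SOCKET_PATH)
            req = {"jsonrpc": "2.0", "id": client_id, "method": "rkllm.is_running", "params": []}
            sock.sendall((json.dumps(req) + "\n").encode("utf-8"))
            return bool(sock.recv(4096))    # any reply counts as success
    except OSError:
        return False

start = time.monotonic()
with concurrent.futures.ThreadPoolExecutor(max_workers=CLIENTS) as pool:
    results = list(pool.map(one_request, range(CLIENTS)))
print(f"{sum(results)}/{CLIENTS} clients got a response in {time.monotonic() - start:.2f}s")
```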
```bash
# Environment variables
RKLLM_UDS_PATH=/tmp/rkllm.sock    # Socket path
RKLLM_MAX_CONNECTIONS=100         # Max concurrent connections
RKLLM_LOG_LEVEL=1                 # 0=DEBUG, 1=INFO, 2=WARN, 3=ERROR

# Custom startup
RKLLM_MAX_CONNECTIONS=200 ./build/server
```

- Platform: Rockchip NPU-enabled devices (RK3588, RK3576, etc.)
- RAM: 4GB+ (depends on model size)
- OS: Linux (Ubuntu 20.04+ recommended)
- Libraries: json-c, pthread
- Build: CMake >= 3.16
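The server reads its socket path from RKLLM_UDS_PATH (see the configuration block above). A client can mirror the same convention when locating the socket; treating that variable as client-visible is an assumption, and hard-coding /tmp/rkllm.sock works just as well.

```python
import os

# Mirror the server's RKLLM_UDS_PATH convention on the client side (an assumption);
# fall back to the documented default socket path.
SOCKET_PATH = os.environ.get("RKLLM_UDS_PATH", "/tmp/rkllm.sock")
```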
```bash
npm test                  # Full test suite
npm run test:streaming    # Streaming-specific tests
npm run test:concurrent   # Multi-client tests
```

```bash
# Build and install
git clone <repository>
cd nano && ./scripts/build.sh

# Start server
LD_LIBRARY_PATH=build ./build/server

# Monitor
tail -f /var/log/rkllm-server.log
ss -lx | grep rkllm.sock
```

AI-Generated Problems:
- Untested Code: Most functions have never been properly tested
- Memory Leaks: Likely contains significant memory management issues
- Race Conditions: Multi-threading may be improperly implemented
- API Mismatches: JSON-RPC implementations may not match actual library APIs
- Hallucinated Features: Some claimed capabilities may not exist at all
- Documentation Errors: README claims may not reflect actual code functionality
Real Limitations:
- Single Language Model: Only one RKLLM model loaded at a time (NPU constraint)
- Platform Specific: Requires Rockchip NPU drivers and libraries
- Local Access: Unix Domain Socket limits to single machine
Use this project to:
- Learn: Study AI-generated code patterns and common mistakes
- Debug: Practice fixing AI-generated bugs and issues
- Experiment: Use as a starting point for your own implementation
- Research: Analyze AI code generation capabilities and limitations

Do not rely on it for:
- Working Software: This code may not function as described
- Production Systems: Completely unsuitable for any production use
- Reliability: No guarantees about stability or correctness
- Performance: Claims about speed/efficiency are likely false