EdgeAI Documentation

Real ExecuTorch + QNN Integration for Llama3.2-1B inference on Android

Real AI Inference

EdgeAI runs actual Llama3.2-1B model inference on your Android device, powered by ExecuTorch and Qualcomm QNN acceleration.

Hardware Accelerated

Leverages Qualcomm's AI Engine Direct (QNN) with v79 context binaries for hardware-accelerated performance on Snapdragon processors.

On-Device Processing

Processes AI requests entirely on device, with no internet connection required, preserving privacy and reducing latency.