Add Run Android LlamaDemo with QNN backend (#16011)

luffy-yu · Copilot · web-flow · commit 5b00d919f2b0 · 2025-12-03T18:25:36.000-08:00
### Motivation - The current Android LlamaDemo only supports XNNPACK. - The QNN-backend Android Demo is missing, the `qnn_llama_runner` approach is provided though. ### Summary - Add Guidance for building and running Android LlamaDemo with QNN-Backend - It's verified on Samsung S23 (SoC SM8550). ### Test plan - This PR has been tested on [LlamaDemo-Executorch-QNN](https://github.com/luffy-yu/LlamaDemo-Executorch-QNN). > The APK file and the pre-built model can be found in its README. - This PR primarily borrows doc from [QNN_ANDROID_FIX_SUMMARY](https://github.com/luffy-yu/LlamaDemo-Executorch-QNN/blob/master/QNN_ANDROID_FIX_SUMMARY.md). --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
@@ -294,6 +294,145 @@ After the above command, pre-processed inputs and outputs are put in `$EXECUTORC
 The command-line arguments are written in [utils.py](https://github.com/pytorch/executorch/blob/main/examples/qualcomm/utils.py#L139).
 The model, inputs, and output location are passed to `qnn_executorch_runner` by `--model_path`, `--input_list_path`, and `--output_folder_path`.
 
+### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
+
+`$DEMO_APP` refers to the root of the executorch android demo, i.e., the directory containing `build.gradle.kts`.
+
+***Step 1***: Rebuild ExecuTorch AAR
+
+```bash
+# Build the AAR
+cd $EXECUTORCH_ROOT
+export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
+./scripts/build_android_library.sh
+```
+
+***Step 2***: Copy AAR to Android Project
+
+```bash
+cp $EXECUTORCH_ROOT/aar-out/executorch.aar \
+   $DEMO_APP/app/libs/executorch.aar
+```
+
+***Step 3***: Build Android APK
+
+```bash
+cd $DEMO_APP
+./gradlew clean assembleDebug -PuseLocalAar=true
+```
+
+***Step 4***: Install on Device
+
+```bash
+adb install -r app/build/outputs/apk/debug/app-debug.apk
+```
+
+***Step 5***: Push model
+
+```bash
+adb shell mkdir -p /data/local/tmp/llama
+adb push model.pte /data/local/tmp/llama
+adb push tokenizer.bin /data/local/tmp/llama
+```
+
+***Step 6***: Run the Llama Demo
+
+- Open the App on Android
+- Select `QUALCOMM` backend
+- Select `model.pte` Model
+- Select `tokenizer.bin` Tokenizer
+- Select Model Type
+- Click LOAD MODEL
+- It should show `Successfully loaded model.`
+
+
+#### Verification Steps
+
+***Step 1***. Verify AAR Contains Your Changes
+
+```bash
+# Check for debug strings in the AAR
+unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
+  strings | grep "QNN"   # Replace "QNN" with your actual debug string if needed
+```
+
+If found, your changes are in the AAR.
+
+***Step 2***. Verify APK Contains Correct Libraries
+
+```bash
+# Check QNN library version in APK
+cd $DEMO_APP
+unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
+```
+
+Expected size for QNN 2.37.0: ~2,465,440 bytes
+
+***Step 3***. Monitor Logs During Model Loading
+
+```bash
+adb logcat -c
+adb logcat | grep -E "ExecuTorch"
+```
+
+#### Common Issues and Solutions
+
+##### Issue 1: Error 18 (InvalidArgument)
+
+- **Cause**: Wrong parameter order in Runner constructor or missing QNN config
+
+- **Solution**: Check `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/llama/runner/runner.h` for the correct constructor signature.
+
+##### Issue 2: Error 1 (Internal) with QNN API Version Mismatch
+
+- **Symptoms**:
+
+    ```
+    W [Qnn ExecuTorch]: Qnn API version 2.33.0 is mismatched
+    E [Qnn ExecuTorch]: Using newer context binary on old SDK
+    E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
+    ```
+
+- **Cause**: Model compiled with QNN SDK version X but APK uses QNN runtime version Y
+
+- **Solution**:
+    - Update `build.gradle.kts` with matching QNN runtime version
+
+    > **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
+
+    **Before**:
+    ```kotlin
+    implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+    ```
+    
+    **After**:
+    ```kotlin
+    implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+    ```
+
+    - Or recompile model with matching QNN SDK version
+
+##### Issue 3: Native Code Changes Not Applied
+
+- **Symptoms**:
+    - Debug logs don't appear
+    - Behavior doesn't change
+
+- **Cause**:
+    - Gradle using Maven dependency instead of local AAR
+
+- **Solution**:
+    - Always build with `-PuseLocalAar=true` flag
+
+##### Issue 4: Logs Not Appearing
+
+- **Cause**: Wrong logging tag filter
+
+- **Solution**: QNN uses "ExecuTorch" tag:
+
+    ```bash
+    adb logcat | grep "ExecuTorch"
+    ```
 
 ## Supported model list