
Running an LLM with llama.cpp on the Qualcomm NPU, with trace visualization #18

@Ethan-a2

Description


Result screenshot

(image: trace visualization)

environment

  • Xiaomi 15, Snapdragon® 8 Elite Mobile Platform
  • llama.cpp
  • LLM model: LFM2 (GGUF download sketched below)
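The GGUF model files are fetched on the host before being pushed to the device. A minimal sketch, assuming the LiquidAI LFM2 GGUF repository ids and filenames match the ones used below (verify the exact repo ids and filenames before running):

huggingface-cli download LiquidAI/LFM2-1.2B-GGUF LFM2-1.2B-Q4_0.gguf --local-dir .
huggingface-cli download LiquidAI/LFM2-350M-GGUF LFM2-350M-Q4_0.gguf --local-dir .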

build

Push the necessary files to the phone:

adb shell mkdir -p /data/local/tmp/gguf
adb push LFM2-1.2B-Q4_0.gguf /data/local/tmp/gguf
adb push pkg-adb/llama.cpp /data/local/tmp/
adb push surfing.txt /data/local/tmp/llama.cpp
adb push LFM2-350M-Q4_0.gguf /data/local/tmp/gguf
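To confirm the files landed where the run script expects them, a quick listing on the device can help (plain adb commands against the paths used above):

adb shell ls -l /data/local/tmp/gguf
adb shell ls -l /data/local/tmp/llama.cpp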

surfing.txt:

1+1=?
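If the prompt file does not exist yet, it can be created locally before the push above, for example:

printf '1+1=?\n' > surfing.txt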

itrace

cmake --build build-snapdragon --target ggml-hexagon --verbose
adb push ./build-snapdragon/bin/libggml-hexagon.so /data/local/tmp/llama.cpp/lib/libggml-hexagon.so
TRACE=1 M=LFM2-1.2B-Q4_0.gguf D=HTP0 ./scripts/snapdragon/adb/run-cli.sh -no-cnv -p "1+1=?"
adb pull /data/local/tmp/itrace_results
View the trace in Perfetto:
  • Open https://ui.perfetto.dev/ in a browser
  • On the left, select: Open trace file
  • Select: itrace_results/json/itrace_output.json
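If Perfetto rejects the file, a quick check that the pulled JSON is well-formed can help narrow things down; a minimal check, assuming python3 is available on the host:

python3 -m json.tool itrace_results/json/itrace_output.json > /dev/null && echo "trace JSON is well-formed"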

PR

Full trace and logs

itrace_results.zip

cli.log

reference
