feat: Add Inference Device and SEV GPU Attestation Support #580
Status: Open. jax-cn wants to merge 41 commits into permaweb:edge from apuslabs:PR/dcv_inference.
Conversation
Commits (titles truncated by the page):
- … update related configurations
- …t process monitoring
- Squashed commits:
  - 97e92aa (jax <jax@apus.network>, Thu Jul 24 03:18:30 2025 +0000): optimize code and add files for TC
  - 9a2c4dd (jax <jax@apus.network>, Wed Jul 23 13:21:16 2025 +0000): add more comments
  - 626d356 (jax <jax@apus.network>, Wed Jul 23 11:40:24 2025 +0000): add dev_sev_gpu for GPU attestation generation
- … command execution
- … management from inference module
- …request body reading
- …verification via the `nvat` SDK.
- …ing to error level.
- …on_ctx` for a streamlined process, include EAT data in the result, and correct a Makefile build flag.
- …tion from evidence JSON.
- …tion result field from `valid` to `verified`, and update Erlang tests for stricter NIF error handling.
- … the response's attestation field.
This PR introduces a new Inference Device (`dev_inference`) that provides an OpenAI-compatible API for local LLM inference. It also adds a SEV GPU Device (`dev_sev_gpu`) that supports NVIDIA GPU TEE attestation, ensuring secure and verifiable inference workloads. To support these features, the core HTTP handling logic has been extended with Server-Sent Events (SSE) for streaming responses.
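Because the streaming path speaks SSE, the response body is a sequence of `data: <json>` events terminated by `data: [DONE]`. The sketch below shows that wire format under the assumption of OpenAI-style chunk objects; it is an illustration rather than this PR's actual code, and `sse_chunk/1` / `sse_done/0` are made-up names (the stdlib `json` module requires OTP 27+).

```erlang
%% Illustrative sketch only: encode one OpenAI-style streaming chunk as
%% an SSE event. Function names are hypothetical, not HyperBEAM's API.
sse_chunk(Token) ->
    Payload = json:encode(#{
        object => <<"chat.completion.chunk">>,
        choices => [#{delta => #{content => Token}}]
    }),
    %% Each SSE event is "data: <payload>" followed by a blank line.
    [<<"data: ">>, Payload, <<"\n\n">>].

%% OpenAI-compatible streams end with a literal [DONE] sentinel.
sse_done() ->
    <<"data: [DONE]\n\n">>.
```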
Key Changes
🚀 New Features
- Inference Device (`dev_inference`): exposes an OpenAI-compatible API for local LLM inference (`/v1/chat/completions`).
- SEV GPU Device (`dev_sev_gpu`): provides NVIDIA GPU TEE attestation for verifiable inference workloads.

🛠 Core Modifications
- Extended `reply/5` to handle `stream_generator`, enabling real-time token streaming for LLM responses.
- Registered the `inference@1.0` and `sev_gpu@1.0` devices.
- Routed `/v1/.*` to the local inference server.
- Added `inference_opts` for model configuration (hash, name, size).
- Added a `max_readers` configuration to optimize LMDB for high-concurrency read scenarios (see the configuration sketch below).
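To make the two configuration knobs above concrete, here is a hedged sketch of what a node options map could look like. Only `inference_opts` (with hash, name, size) and `max_readers` are named in this PR; the exact key layout and the example values are assumptions for illustration.

```erlang
%% Hypothetical configuration sketch; the key layout is assumed, not
%% taken from the PR. Only inference_opts (hash/name/size) and
%% max_readers appear in the change description.
Opts = #{
    inference_opts => #{
        hash => <<"...">>,            %% content hash of the model file
        name => <<"llama-3-8b">>,     %% illustrative model name
        size => 8                     %% illustrative model-size field
    },
    %% LMDB keeps a fixed-size reader-slot table; raising it avoids
    %% reader-slot exhaustion under high read concurrency.
    max_readers => 512
}.
```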
🧹 Maintenance

- Updated `hb_app:stop/1` to ensure the inference server is gracefully shut down (a minimal sketch follows).
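A shutdown hook of this kind typically just asks the server to stop before the application exits. In this minimal sketch, `inference_server` and its `stop/0` are illustrative names, not necessarily the module this PR touches.

```erlang
%% Illustrative application stop callback; inference_server:stop/0 is
%% a hypothetical name. Wrapping the call in catch lets shutdown
%% proceed even if the server is already down.
stop(_State) ->
    _ = (catch inference_server:stop()),
    ok.
```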
Testing

1. Start the node with `HB_PRINT=inference rebar3 as inference shell`.
2. Exercise the inference endpoints:
   2.1 Check the `/health` endpoint.
   2.2 Test a completion request to `/v1/chat/completions` (both streaming and non-streaming).
   2.3 Verify TEE attestation if running on supported hardware.
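For step 2.2, a non-streaming request can be issued from the Erlang shell itself. The sketch below assumes the node listens on localhost port 10000 and that the body follows the standard OpenAI chat-completions schema; the port, model name, and `test_completion/0` are assumptions, not values from the PR.

```erlang
%% Hypothetical smoke test for the completion endpoint; adjust the
%% port and model name to your node configuration.
test_completion() ->
    {ok, _} = application:ensure_all_started(inets),
    Body = <<"{\"model\":\"llama-3-8b\",\"stream\":false,"
             "\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}">>,
    {ok, {{_Vsn, 200, _Reason}, _Headers, Response}} =
        httpc:request(post,
                      {"http://localhost:10000/v1/chat/completions",
                       [], "application/json", Body},
                      [], []),
    %% Print the raw JSON response; a streaming test would instead set
    %% "stream":true and read the SSE events incrementally.
    io:format("~s~n", [Response]).
```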