Voice Streaming #183

isabelle-cedar · 2025-09-27T19:24:35Z

No description provided.

greptile-apps

Greptile Overview

Summary

This PR implements comprehensive voice streaming functionality for the Cedar OS product roadmap example. The implementation enables end-to-end voice interaction: audio recording → transcription → LLM processing → text-to-speech synthesis → streaming audio response.

Key Changes:

Backend Integration: Added @mastra/voice-openai dependency and created voice stream handler with SSE-based real-time communication
Frontend Voice State: Enhanced voice slice with streaming support, proper error handling, and audio playback capabilities
Provider Updates: Extended Mastra provider with voice streaming endpoints and event parsing
Workflow Enhancement: Modified chat workflow to accumulate text for voice synthesis instead of streaming individual chunks
Configuration: Updated provider config to include voice routing and enabled streaming voice settings
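
The text accumulation described under Workflow Enhancement can be pictured with a small buffer that collects text deltas and hands back the full string once, when synthesis is about to start. This is only a sketch: the `StreamChunk` shape and the `PendingTextBuffer` class are hypothetical names for illustration, not the types used in chatWorkflow.ts.

```typescript
// Hypothetical chunk shape; the workflow's real chunk types are not shown in this PR.
type StreamChunk =
  | { type: 'text-delta'; text: string }
  | { type: 'finish' };

// Collects text deltas so that TTS can be called once with the full response.
class PendingTextBuffer {
  private parts: string[] = [];

  // Keep only text deltas; other chunk types are ignored.
  push(chunk: StreamChunk): void {
    if (chunk.type === 'text-delta') this.parts.push(chunk.text);
  }

  // Return the accumulated text and reset the buffer.
  flush(): string {
    const text = this.parts.join('');
    this.parts = [];
    return text;
  }
}
```

In voice mode the caller would push every chunk from the LLM stream and pass the result of `flush()` to the speech provider after the loop ends.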

Technical Implementation:

Uses WebRTC for audio capture, OpenAI Whisper for transcription, and OpenAI TTS for synthesis
Implements proper stream handling with both Node.js Readable and Web ReadableStream compatibility
Includes comprehensive error handling and resource cleanup
Supports both streaming and non-streaming voice modes

Confidence Score: 3/5

This PR is moderately safe to merge with some implementation concerns that should be addressed
The implementation is architecturally sound, with proper separation of concerns, but has several technical issues: missing environment variable validation that could cause runtime errors, a hardcoded audio format assumption, and fragile stream type detection logic that could fail with certain stream implementations
Pay special attention to voiceStreamHandler.ts for environment variable validation and streamUtils.ts for stream compatibility detection

Important Files Changed

File Analysis

Filename	Score	Overview
examples-backend/product-roadmap-backend/src/mastra/voiceStreamHandler.ts	3/5	New voice streaming handler with transcription and LLM integration. Has some potential null handling issues and hardcoded transcription format.
examples-backend/product-roadmap-backend/src/utils/streamUtils.ts	3/5	Enhanced streaming utilities with voice support. Buffer handling logic may have edge cases with stream compatibility detection.
examples-backend/product-roadmap-backend/src/mastra/workflows/chatWorkflow.ts	4/5	Updated workflow to support voice mode with text accumulation for TTS synthesis. Clean integration of voice handling logic.
packages/cedar-os/src/store/voice/voiceSlice.ts	4/5	Comprehensive voice state management with streaming support. Well-structured with proper error handling and resource cleanup.
packages/cedar-os/src/store/agentConnection/providers/mastra.ts	4/5	Enhanced Mastra provider with voice streaming capabilities. Robust event parsing and proper stream handling.

Sequence Diagram

sequenceDiagram
    participant User
    participant CedarOS as Cedar OS (Frontend)
    participant MastraProvider as Mastra Provider
    participant VoiceHandler as Voice Stream Handler
    participant OpenAIVoice as @mastra/voice-openai
    participant ChatWorkflow as Chat Workflow
    participant LLM as OpenAI LLM

    User->>CedarOS: Record audio and submit
    CedarOS->>MastraProvider: voiceStreamLLM(audioData, settings)
    MastraProvider->>VoiceHandler: POST /voice/stream
    
    VoiceHandler->>OpenAIVoice: listen(audioBuffer, {filetype: 'webm'})
    OpenAIVoice->>VoiceHandler: transcription text
    VoiceHandler->>CedarOS: SSE: {type: 'transcription', transcription}
    
    VoiceHandler->>ChatWorkflow: start workflow with transcription
    ChatWorkflow->>LLM: streamVNext(transcription + context)
    
    loop Text streaming chunks
        LLM->>ChatWorkflow: text-delta chunks
        ChatWorkflow->>ChatWorkflow: accumulate pendingText (for voice mode)
    end
    
    ChatWorkflow->>OpenAIVoice: speak(pendingText)
    OpenAIVoice->>ChatWorkflow: audio stream
    ChatWorkflow->>VoiceHandler: audio data
    VoiceHandler->>CedarOS: SSE: {type: 'audio', audioData, content}
    
    VoiceHandler->>CedarOS: SSE: {type: 'done'}
    CedarOS->>User: Play audio response & show text
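
The three SSE payloads in the diagram above (`transcription`, `audio`, `done`) could be consumed on the frontend roughly as follows. This is a minimal sketch, not the parser in mastra.ts: `VoiceStreamEvent` and `parseSseFrames` are hypothetical names, the field types (for example `audioData` as a base64 string) are assumptions, and frames are assumed to be separated by a blank line with `\n` line endings.

```typescript
// Event shapes taken from the sequence diagram; field types are assumptions.
type VoiceStreamEvent =
  | { type: 'transcription'; transcription: string }
  | { type: 'audio'; audioData: string; content: string }
  | { type: 'done' };

// Splits buffered SSE text into complete events plus the unfinished remainder.
function parseSseFrames(buffer: string): {
  events: VoiceStreamEvent[];
  rest: string;
} {
  const frames = buffer.split('\n\n');
  // The last piece is either empty or an incomplete frame; keep it for later.
  const rest = frames.pop() ?? '';
  const events: VoiceStreamEvent[] = [];
  for (const frame of frames) {
    const data = frame
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, ''))
      .join('\n');
    if (data === '') continue; // comment or keep-alive frame
    events.push(JSON.parse(data) as VoiceStreamEvent);
  }
  return { events, rest };
}
```

A reader loop would append each decoded network chunk to `rest`, call `parseSseFrames` again, and stop once an event of type `done` arrives.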

12 files reviewed, 3 comments

greptile-apps · 2025-09-27T19:27:48Z

examples-backend/product-roadmap-backend/src/mastra/voiceStreamHandler.ts

+		const transcription = await voiceProvider.listen(Readable.from(buf), {
+			filetype: 'webm',
+		});


style: hardcoded filetype: 'webm' assumes input format - consider making dynamic

Suggested change

-		const transcription = await voiceProvider.listen(Readable.from(buf), {
-			filetype: 'webm',
-		});
+		const transcription = await voiceProvider.listen(Readable.from(buf), {
+			filetype: audioFile.type.includes('webm') ? 'webm' : 'wav',
+		});

Prompt To Fix With AI

This is a comment left during a code review.
Path: examples-backend/product-roadmap-backend/src/mastra/voiceStreamHandler.ts
Line: 85:87
Comment: style: hardcoded `filetype: 'webm'` assumes input format - consider making dynamic
```suggestion
const transcription = await voiceProvider.listen(Readable.from(buf), {
	filetype: audioFile.type.includes('webm') ? 'webm' : 'wav',
});
```
How can I resolve this? If you propose a fix, please make it concise.
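
The suggested ternary treats every non-webm upload as wav. One possible alternative, sketched here under assumptions, is a lookup keyed on the upload's MIME type. `filetypeFromMime` and the table are hypothetical, and the set of `filetype` values that `listen` accepts is not confirmed by this PR, so the entries should be checked against the provider before use.

```typescript
// Hypothetical MIME-to-filetype table; verify the values against the voice provider.
const MIME_TO_FILETYPE: Record<string, string> = {
  'audio/webm': 'webm',
  'audio/wav': 'wav',
  'audio/x-wav': 'wav',
  'audio/mpeg': 'mp3',
  'audio/mp4': 'mp4',
};

// Maps a MIME type such as "audio/webm;codecs=opus" to a filetype hint.
function filetypeFromMime(mimeType: string, fallback = 'webm'): string {
  // Drop parameters like ";codecs=opus" and normalize case.
  const [base = ''] = mimeType.split(';');
  return MIME_TO_FILETYPE[base.trim().toLowerCase()] ?? fallback;
}
```

The fallback keeps the current behavior (`'webm'`) for unknown types instead of guessing wav.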

greptile-apps · 2025-09-27T19:27:49Z

examples-backend/product-roadmap-backend/src/utils/streamUtils.ts

+	if (typeof (speechStream as ReadableStream).getReader === 'function') {
+		// Web ReadableStream


logic: stream type detection is fragile - checking for getReader method may miss other ReadableStream-like objects

Suggested change

-	if (typeof (speechStream as ReadableStream).getReader === 'function') {
-		// Web ReadableStream
+	if ('getReader' in speechStream && typeof speechStream.getReader === 'function') {

Prompt To Fix With AI

This is a comment left during a code review.
Path: examples-backend/product-roadmap-backend/src/utils/streamUtils.ts
Line: 118:119
Comment: logic: stream type detection is fragile - checking for `getReader` method may miss other ReadableStream-like objects
```suggestion
if ('getReader' in speechStream && typeof speechStream.getReader === 'function') {
```
How can I resolve this? If you propose a fix, please make it concise.
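
For context, the check under discussion can be wrapped in a type guard so that the cast disappears and both stream kinds are drained through one helper. This is a sketch assuming Node.js 18 or later (global `ReadableStream`); `isWebReadableStream` and `streamToBuffer` are hypothetical names, not the functions in streamUtils.ts.

```typescript
import { Readable } from 'node:stream';

// Type guard: true when the value exposes a Web ReadableStream-style getReader().
function isWebReadableStream(
  value: unknown,
): value is ReadableStream<Uint8Array> {
  return (
    typeof value === 'object' &&
    value !== null &&
    'getReader' in value &&
    typeof (value as { getReader: unknown }).getReader === 'function'
  );
}

// Drains either stream kind into a single Buffer.
async function streamToBuffer(
  stream: ReadableStream<Uint8Array> | Readable,
): Promise<Buffer> {
  const chunks: Buffer[] = [];
  if (isWebReadableStream(stream)) {
    const reader = stream.getReader();
    try {
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        chunks.push(Buffer.from(value));
      }
    } finally {
      reader.releaseLock();
    }
  } else {
    // Node.js Readable streams are async iterable.
    for await (const chunk of stream) {
      chunks.push(
        typeof chunk === 'string'
          ? Buffer.from(chunk)
          : Buffer.from(chunk as Uint8Array),
      );
    }
  }
  return Buffer.concat(chunks);
}
```

Anything without a callable `getReader` falls through to the async-iterator branch, so an object that is neither stream kind would fail there rather than being silently misread.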

greptile-apps · 2025-09-27T19:27:50Z

examples-backend/product-roadmap-backend/src/mastra/voiceStreamHandler.ts

+	speechModel: { apiKey: process.env.OPENAI_API_KEY!, name: 'tts-1' },
+	listeningModel: {
+		apiKey: process.env.OPENAI_API_KEY!,
+		name: 'whisper-1',
+	},


logic: missing environment variable validation will cause runtime errors if API key is not set

Prompt To Fix With AI

This is a comment left during a code review.
Path: examples-backend/product-roadmap-backend/src/mastra/voiceStreamHandler.ts
Line: 8:12
Comment: logic: missing environment variable validation will cause runtime errors if API key is not set
How can I resolve this? If you propose a fix, please make it concise.
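
One way to address this comment is to fail fast at startup instead of relying on the non-null assertion (`!`). The helper below is a hypothetical sketch; `requireEnv` is not part of the PR, and the `env` parameter exists only to make the function easy to test.

```typescript
// Returns the variable's value, or throws a descriptive error when it is unset or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage in the handler (commented out so the sketch runs without a key):
// const apiKey = requireEnv('OPENAI_API_KEY');
// speechModel: { apiKey, name: 'tts-1' },
// listeningModel: { apiKey, name: 'whisper-1' },
```

With this in place a missing key surfaces as a clear error when the module loads rather than as an authentication failure on the first voice request.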

isabelle-cedar added 3 commits September 27, 2025 15:23

feat: voice streaming

3195159

feat: example of voice in product-roadmap

38fa6a4

Merge remote-tracking branch 'origin/main' into feat/voice_streaming

527b7d2

greptile-apps bot reviewed Sep 27, 2025

isabelle-cedar merged commit a12d6e7 into main Sep 27, 2025
4 of 6 checks passed

isabelle-cedar deleted the feat/voice_streaming branch September 27, 2025 19:31
