Skip to content

Conversation

@42atomys
Copy link

@42atomys 42atomys commented Dec 26, 2025

This pull request enhances the offline audio sync workflow by ensuring that developer webhooks are triggered with audio bytes even for files uploaded and processed offline.

I also add a new query params to send the timestamp of the file to provide the right "when is recorded" to webhooks

It introduces utility functions to read WAV files as PCM data and schedules asynchronous webhook notifications for each synced audio file.

Resolve #3846

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request successfully adds the functionality to trigger webhooks for offline audio sync. The new utility functions are well-defined. However, there is a significant performance issue: the asynchronous webhook function uses a synchronous HTTP library (requests), which will block the event loop. My review includes a comment with details on how to fix this by using an asynchronous library like httpx, which was conveniently added in this PR.

@42atomys 42atomys force-pushed the fix/audio-bytes-on-offline-sync branch from d569556 to 511fce6 Compare December 26, 2025 17:12
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the offline audio sync workflow by ensuring that developer webhooks are triggered with audio bytes even for files uploaded and processed offline. The changes correctly replace the synchronous requests library with httpx for asynchronous webhook calls. However, a critical performance issue is identified due to performing blocking file I/O within an async endpoint, which will block the server's event loop. Suggestions have been provided to refactor this using asyncio.to_thread to prevent performance degradation, aligning with best practices for asynchronous applications. Additionally, a suggestion to narrow a broad exception catch has been included to improve error handling.

42atomys and others added 2 commits December 26, 2025 18:17
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the offline audio sync workflow by triggering webhooks with audio bytes. The implementation is solid, correctly using asyncio.to_thread for blocking I/O and httpx for asynchronous network requests. I've provided one suggestion in backend/routers/sync.py to centralize error handling, which will improve code clarity and maintainability. After addressing this, the PR should be ready to merge.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request successfully adds the functionality to send audio bytes via webhook for offline-synced files. It correctly refactors a synchronous network call to be asynchronous using httpx, which is a crucial improvement. My main feedback is to enhance performance by parallelizing the file reading operations. The current implementation reads files one by one, which can be slow. By processing them concurrently, the overall sync time can be significantly reduced.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds a valuable feature to send audio-bytes webhooks for offline-synced files. The implementation is well-structured, using asyncio for concurrent processing. My review includes a few suggestions to improve performance and memory usage. Specifically, I recommend changing the webhook data type from bytearray to bytes to avoid unnecessary data copying, and running the webhook triggers as a background task to prevent blocking the main API response. These changes will make the new feature more efficient and robust.

42atomys and others added 3 commits December 26, 2025 18:54
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the offline audio sync workflow by ensuring that developer webhooks are triggered with audio bytes for offline-processed files. It correctly introduces a timestamp parameter to provide better context for these webhooks. A key improvement is the switch to httpx for asynchronous HTTP requests in send_audio_bytes_developer_webhook, which prevents blocking the event loop and aligns with best practices for async applications. The documentation has also been clearly updated. I have one high-severity suggestion to prevent a potential runtime error.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request successfully enhances the offline audio sync workflow by triggering developer webhooks with audio bytes for offline-processed files. The addition of a timestamp query parameter is a valuable enhancement for providing accurate recording times. The implementation correctly leverages asyncio for concurrent processing of audio files and non-blocking webhook notifications. A significant improvement is the migration from the synchronous requests library to the asynchronous httpx for sending webhooks, which is crucial for the performance of an async application. My review includes a suggestion to extend this asynchronous pattern to another webhook function within the same file to ensure consistency and further improve performance.

Comment on lines +127 to +151
async def send_audio_bytes_developer_webhook(
uid: str, sample_rate: int, data: bytes, timestamp: int | None = None
):
print("send_audio_bytes_developer_webhook", uid)
# TODO: add a lock, send shorter segments, validate regex.
toggled = user_webhook_status_db(uid, WebhookType.audio_bytes)
if toggled:
webhook_url = get_user_webhook_db(uid, WebhookType.audio_bytes)
webhook_url = webhook_url.split(',')[0]
if not webhook_url:
return
webhook_url += f'?sample_rate={sample_rate}&uid={uid}'
try:
response = requests.post(
webhook_url, data=data, headers={'Content-Type': 'application/octet-stream'}, timeout=15
)
print('send_audio_bytes_developer_webhook:', webhook_url, response.status_code)
except Exception as e:
print(f"Error sending audio bytes to developer webhook: {e}")
else:
if not toggled:
return

webhook_url = get_user_webhook_db(uid, WebhookType.audio_bytes)
webhook_url = (webhook_url or '').split(',')[0]
if not webhook_url:
return

webhook_url += f'?sample_rate={sample_rate}&uid={uid}'
if timestamp is not None:
webhook_url += f'&timestamp={timestamp}'
try:
async with httpx.AsyncClient() as client:
response = await client.post(
webhook_url, content=data, headers={'Content-Type': 'application/octet-stream'}, timeout=15
)
print('send_audio_bytes_developer_webhook:', webhook_url, response.status_code)
except httpx.RequestError as e:
print(f"Error sending audio bytes to developer webhook: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This is a great improvement, changing requests to httpx to avoid blocking the event loop in this async function. To maintain consistency and improve performance across the module, consider applying the same pattern to the realtime_transcript_webhook function in this file, which currently uses a synchronous requests.post call and can block the event loop.

Here's how you could refactor realtime_transcript_webhook to use httpx:

async def realtime_transcript_webhook(uid, segments: List[dict]):
    print("realtime_transcript_webhook", uid)
    toggled = user_webhook_status_db(uid, WebhookType.realtime_transcript)

    if toggled:
        webhook_url = get_user_webhook_db(uid, WebhookType.realtime_transcript)
        if not webhook_url:
            return
        webhook_url += f'?uid={uid}'
        try:
            async with httpx.AsyncClient() as client:
                response = await client.post(
                    webhook_url,
                    json={'segments': segments, 'session_id': uid},
                    headers={'Content-Type': 'application/json'},
                    timeout=15,
                )
            print('realtime_transcript_webhook:', webhook_url, response.status_code)
            if response.status_code == 200:
                response_data = response.json()
                if not response_data:
                    return
                message = response_data.get('message', '')
                if len(message) > 5:
                    send_webhook_notification(uid, message)
        except httpx.RequestError as e:
            print(f"Error sending realtime transcript to developer webhook: {e}")
    else:
        return
References
  1. In an async application, use asynchronous HTTP clients like httpx instead of synchronous ones like requests to avoid blocking the event loop and degrading performance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be for another pull request honnestly

@smian1 smian1 requested review from beastoin and mdmohsin7 December 27, 2025 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: offline sync dont trigger any webhooks

1 participant