
Conversation

@ataha322
Contributor

@ataha322 ataha322 commented Dec 11, 2025

  • instruments OpenAI().audio.transcriptions.create()
  • captures audio duration
  • captures model name and other standard request/response parameters
  • Capturing audio duration depends on mutagen, which is included as an optional dependency.
  • Users who want to capture audio duration should install the instrumentation with the ["audio"] extra so that mutagen is included.
  • Still WIP; tests need to be finished, and the semconv choice and overall approach need maintainer review.

Hello maintainers!
This is still a draft, but I would still like to ask for a review because it touches a lot of code and I'm not very familiar with this code base. Mainly, these questions:

  1. There doesn't yet exist a semconv attribute for OpenAI transcription audio length (OpenAI prices transcription by audio length). What should we do in such a case? I left a few comments in the code where I came up with an attribute name.
  2. Does the overall implementation look right? I can't tell whether I'm doing things idiomatically for your code style, but I think I am.
  3. I know I'm missing cassettes for the tests; I will record them later.
  4. Any other problems with this code, please let me know.

Here's how the attributes look in the OTEL collector:

ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope opentelemetry.instrumentation.openai.v1 0.49.7
Span #0
    Trace ID       : 59ebb8e41b43f1f75571bfa5e696ec70
    Parent ID      : 0347c5affa2dfcfd
    ID             : f767b4268f4a51fb
    Name           : paid.trace.openai.audio.transcriptions
    Kind           : Client
    Start time     : 2025-12-11 18:23:45.946191 +0000 UTC
    End time       : 2025-12-11 18:23:53.637282 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> external_customer_id: Str(circus)
     -> external_agent_id: Str(clown)
     -> gen_ai.system: Str(openai)
     -> gen_ai.request.model: Str(whisper-1)
     -> llm.headers: Str(None)
     -> llm.is_streaming: Bool(false)
     -> gen_ai.request.structured_output_schema: Str("text")
     -> gen_ai.openai.api_base: Str(https://api.openai.com/v1/)
     -> gen_ai.openai.audio.input.duration_seconds: Double(124.70857142857143)
     -> environment: Str(production)

Important

Adds OpenAI audio transcription instrumentation that captures audio duration and other parameters, with an optional mutagen dependency and tests.

  • Instrumentation:
    • Adds transcription_wrapper and atranscription_wrapper in audio_wrappers.py to instrument Transcriptions.create and AsyncTranscriptions.create.
    • Captures audio duration using mutagen, model name, and other request/response parameters.
    • Adds mutagen as an optional dependency in pyproject.toml.
  • Configuration:
    • Users can install the instrumentation with the ["audio"] extra to include mutagen.
  • Testing:
    • Adds test_audio_transcription.py with tests for both sync and async transcription using mock audio files.
  • Misc:
    • Updates _instrument and _uninstrument methods in __init__.py to wrap and unwrap transcription functions.
    • Raises questions about semantic conventions for audio length and code style.
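The wrap/unwrap pattern described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's actual implementation: the span name matches the collector output shown earlier, but the tracer, histogram, and counter objects are placeholders for the OpenTelemetry APIs the real wrappers use.

```python
# Simplified sketch of the sync transcription wrapper pattern.
# The tracer/histogram/counter arguments are stand-ins for OpenTelemetry's
# tracer and metrics objects; the signature mirrors wrapt's convention.
import time


def transcription_wrapper(tracer, duration_histogram, exception_counter,
                          wrapped, instance, args, kwargs):
    # Start timing before the call so the except path can always
    # compute a duration.
    start_time = time.time()
    with tracer.start_as_current_span("openai.audio.transcriptions") as span:
        span.set_attribute("gen_ai.request.model",
                           kwargs.get("model", "unknown"))
        try:
            response = wrapped(*args, **kwargs)
        except Exception:
            if exception_counter is not None:
                exception_counter.add(1)
            duration_histogram.record(time.time() - start_time)
            raise  # re-raise so user code still sees the error
        duration_histogram.record(time.time() - start_time)
        return response
```

An async variant (`atranscription_wrapper` in the PR) would follow the same shape with `await wrapped(*args, **kwargs)`.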

This description was created by Ellipsis for b6db65e.

Summary by CodeRabbit

  • New Features

    • Added OpenTelemetry instrumentation for OpenAI audio transcription API calls, capturing comprehensive traces with request/response metadata, client information, and duration performance metrics.
  • Tests

    • Added test coverage for audio transcription tracing to validate telemetry collection and reporting functionality.


@coderabbitai

coderabbitai bot commented Dec 11, 2025

Walkthrough

The changes introduce OpenTelemetry instrumentation for OpenAI audio transcription calls. A new audio_wrappers module provides synchronous and asynchronous wrapper functions for transcription methods, capturing span attributes, duration metrics, and error details. Supporting utilities and integration hooks are added to the instrumentation layer, with optional mutagen dependency for extracting audio duration metadata.

Changes

Cohort / File(s) Summary
Audio instrumentation
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py
New module with sync/async transcription wrappers, request/response handlers, and metrics collection. Includes audio duration extraction via file path and Mutagen, span attribute recording, exception handling, and histogram duration tracking.
Instrumentation utilities
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py
New wrapper factory _with_audio_telemetry_wrapper that adapts audio telemetry wrapper functions to wrapt-compatible format, mirroring existing embeddings telemetry pattern.
V1 integration
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
Adds audio transcription support: imports audio wrappers, registers exception counter metric, patches Transcriptions.create and AsyncTranscriptions.create methods with new wrappers, adds corresponding unwraps.
Dependencies
packages/opentelemetry-instrumentation-openai/pyproject.toml
Adds optional dependency mutagen (^1.47.0) and declares new audio extras group.
Tests
packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py
New test module with sync and async transcription tests, verifying span name "openai.audio.transcriptions" and attributes (model, API base). Uses VCR for HTTP mocking.
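For the dependency change summarized above, the Poetry wiring for an optional dependency plus an extras group would look roughly like this (versions and names mirror the PR summary; the actual pyproject.toml may differ):

```toml
# Sketch of an optional dependency exposed through an extras group.
[tool.poetry.dependencies]
mutagen = { version = "^1.47.0", optional = true }

[tool.poetry.extras]
audio = ["mutagen"]
```

With this in place, users would opt in via something like `pip install "opentelemetry-instrumentation-openai[audio]"`.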

Sequence Diagram

sequenceDiagram
    participant Client
    participant Wrapper as Wrapper Layer
    participant Tracer
    participant Span
    participant Histogram as Metrics<br/>(Histogram)
    participant Counter as Metrics<br/>(Counter)
    participant OpenAI as OpenAI API
    
    Client->>Wrapper: call transcription_wrapper()
    activate Wrapper
    Note over Wrapper: Start timing
    Wrapper->>Tracer: start_as_current_span("openai.audio.transcriptions")
    activate Tracer
    Tracer->>Span: create span
    activate Span
    Wrapper->>Span: set request attributes<br/>(model, duration, api base)
    Wrapper->>OpenAI: execute transcription call
    activate OpenAI
    OpenAI-->>Wrapper: response
    deactivate OpenAI
    Note over Wrapper: End timing
    Wrapper->>Span: set response attributes
    Wrapper->>Histogram: record duration
    activate Histogram
    Histogram-->>Wrapper: metric recorded
    deactivate Histogram
    Wrapper->>Span: mark success
    Span-->>Wrapper: span ends
    deactivate Span
    deactivate Tracer
    Wrapper-->>Client: return response
    deactivate Wrapper
    
    rect rgb(200, 150, 255)
    Note over Wrapper,Counter: Error Path (on exception)
    Wrapper->>Counter: increment exception_counter
    activate Counter
    Counter-->>Wrapper: counted
    deactivate Counter
    Wrapper->>Histogram: record duration
    Wrapper->>Span: set error status + exception info
    Wrapper-->>Client: re-raise exception
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring attention:

  • Audio duration extraction logic in _get_audio_duration(): verify Mutagen integration handles missing/malformed audio files gracefully and edge cases (e.g., file path without audio extension)
  • Span attribute handling in _handle_request(): check fallback mechanism for audio duration attribute key when semconv attribute is unavailable
  • Error handling paths: ensure exception types are correctly captured and recorded in both sync and async wrappers
  • Metrics recording in _set_transcription_metrics(): verify histogram attributes (server address, model) are consistently populated across different OpenAI API versions
  • Integration points in v1/__init__.py: confirm patch/unpatch order is correct and counter initialization handles disabled metrics case (None)
  • Test coverage: verify VCR cassettes exist and async test properly awaits the coroutine

Poem

🐰 Whispers now traced in digital streams,
Audio wrappers fulfill our dreams,
Spans and metrics dance in sync,
Capturing whisper-1's every blink,
OpenTelemetry hops on by! 🎵

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title contains a typo ('instrumentaiton' instead of 'instrumentation') and is only partially related to the changeset, as it describes adding audio transcription support but omits key implementation details like span creation, metrics, and wrapper functions. Fix the typo to 'feat(instrumentation): OpenAI audio transcription' for clarity and consistency with semantic versioning conventions.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

@ataha322
Contributor Author

Hi @nirga
Not sure who'd be the right person, but since you reviewed my last PR, can you help me with this one too? Or refer me to anyone else who might?

It's still a draft, but I need some guidance on the questions described in the PR description.

@nirga
Member

nirga commented Dec 11, 2025

Hey @ataha322 - can you mark the PR as ready for review and fix the PR issues?

@ataha322 ataha322 marked this pull request as ready for review December 11, 2025 20:32
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed everything up to b6db65e in 2 minutes and 27 seconds. Click for details.
  • Reviewed 434 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py:195
  • Draft comment:
    Remove or update the commented-out span attribute code once the semconv is published.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The code already has a TODO comment on line 198 that says "TODO(Ata): come back here later when semconv is published". The automated tool's comment essentially says the same thing: "Remove or update the commented-out span attribute code once the semconv is published." This is redundant with the existing TODO. According to the rules, I should not keep comments that are obvious or unimportant, and this comment doesn't add any new information beyond what the developer already documented. The developer is clearly aware of this issue and has already marked it for future work. Perhaps the automated comment is more visible than an inline TODO and could serve as a reminder in the PR review process. Also, having it as a PR comment might ensure it gets tracked in a review system rather than just being buried in code. While PR comments can be more visible, the developer has already explicitly documented this with a TODO comment that includes their name and a clear explanation. The PR review process should catch obvious TODOs in new code if they're a concern. Adding a redundant comment doesn't provide additional value and violates the rule about not making obvious comments. The comment should be deleted because it's redundant with the existing TODO comment in the code (line 198). The developer has already documented this issue clearly, and the automated comment doesn't add any new information or actionable guidance.
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py:99
  • Draft comment:
    There's a typographical error in the comment on this line. It reads "if there are legal duration, record it". Consider revising it to something like "if there is a valid duration, record it".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a comment about fixing a typo/grammatical error in a code comment. While it's technically correct that "legal duration" is awkward phrasing and "valid duration" would be better, this falls under the category of "obvious or unimportant" comments. The meaning is clear enough despite the grammatical issue. The rules specifically state "Do NOT make comments that are obvious or unimportant." A minor grammatical improvement to a comment is not a critical code change. Additionally, the word "legal" might be intentional (though awkward) to mean "valid/acceptable". This is a very minor style/grammar issue that doesn't affect functionality. The comment does point out a legitimate grammatical issue that would improve code readability. "Legal duration" is indeed awkward English, and "valid duration" is clearer. Code comments should be clear and well-written. This is an actionable suggestion that improves code quality. While the suggestion does improve clarity, the rules explicitly state not to make comments that are "obvious or unimportant." A minor grammatical improvement to a comment (not even to code) falls into this category. The current comment is understandable despite the awkward phrasing, and this is not a critical issue that requires attention. This comment should be deleted. It's pointing out a minor grammatical issue in a code comment that doesn't affect functionality or significantly impact readability. This falls under "obvious or unimportant" comments that should not be made according to the rules.
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:22
  • Draft comment:
    It appears that the name atranscription_wrapper might be a typographical error. If this is intended to handle audio transcription, consider renaming it to something like audio_transcription_wrapper for clarity.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The comment is clearly incorrect. The codebase has a consistent pattern of prefixing async wrappers with 'a' - achat_wrapper, acompletion_wrapper, aembeddings_wrapper, and now atranscription_wrapper. This is not a typo but a deliberate naming convention. The comment shows a lack of understanding of the codebase's naming patterns. This is exactly the type of comment that should be deleted - it's suggesting a change that would actually break the consistency of the codebase. Could there be a reason why this particular wrapper should deviate from the established pattern? Maybe the 'a' prefix is confusing in this specific context? No, there's no reason to deviate from the pattern. The 'a' prefix for async is used consistently throughout the file for all async wrappers. Changing this one would make it inconsistent with the rest of the codebase. The comment is objectively wrong. This comment should be deleted. It incorrectly identifies a deliberate naming convention as a typo. The 'a' prefix for async wrappers is used consistently throughout the codebase (achat_wrapper, acompletion_wrapper, aembeddings_wrapper), and atranscription_wrapper follows this same pattern.
4. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:277
  • Draft comment:
    The function name 'atranscription_wrapper' looks like it might be a typographical error. If its purpose is to wrap asynchronous transcription calls, consider renaming it to 'async_transcription_wrapper' for clarity and consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% This comment is clearly incorrect. The codebase has an established naming convention where async wrappers are prefixed with 'a' rather than 'async_'. Examples include achat_wrapper, acompletion_wrapper, and aembeddings_wrapper. The new atranscription_wrapper follows this exact same pattern and is therefore consistent with the codebase style. The comment suggests changing it to async_transcription_wrapper, which would actually make it inconsistent with the rest of the codebase. This is a code quality suggestion that fails to recognize the existing conventions. Could there be a reason why the author might want to change the naming convention for this particular wrapper? Perhaps the 'a' prefix is considered unclear and they want to start using more explicit names going forward? Even if there were a desire to change naming conventions, that should be done consistently across the entire codebase, not just for one new function. The comment doesn't suggest refactoring all the other wrappers, so it's asking for inconsistency. This is clearly not understanding the existing pattern. The comment should be deleted because it incorrectly suggests that atranscription_wrapper is a typo when it actually follows the established naming convention in the codebase (e.g., achat_wrapper, acompletion_wrapper, aembeddings_wrapper). The suggestion would make the code inconsistent.

Workflow ID: wflow_OSgCpYeNswuAXV6E


span, 'gen_ai.openai.audio.input.duration_seconds', audio_duration
)
else:
print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")

Remove the debug print statement and use the logger at an appropriate level instead.

Suggested change
print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")
logger.debug("COULD NOT READ AUDIO FILE WITH MUTAGEN")

end_time = time.time()
except Exception as e: # pylint: disable=broad-except
end_time = time.time()
duration = end_time - start_time if "start_time" in locals() else 0

Consider initializing start_time before the try block to avoid checking if "start_time" in locals() in the exception handler.
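The suggestion can be sketched like this (a generic timing helper, not code from the PR): binding `start_time` before the `try` block guarantees it exists in the exception handler, so the awkward `"start_time" in locals()` check disappears.

```python
# Sketch: initialize start_time before the try block so the except
# handler can always compute a duration without inspecting locals().
import time


def timed_call(fn, *args, **kwargs):
    start_time = time.time()  # always bound, even if fn raises immediately
    try:
        result = fn(*args, **kwargs)
    except Exception:
        duration = time.time() - start_time  # safe: start_time always exists
        # a duration metric would be recorded here before re-raising
        raise
    duration = time.time() - start_time
    return result, duration
```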

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py (1)

33-64: Consider supporting string/bytes paths in _get_audio_duration

Right now, any file that is a str/bytes path returns None without attempting duration extraction:

elif isinstance(file, (str, bytes)):
    # If it's a path string or bytes
    return None

If you expect callers to pass a path (e.g., file="path/to/file.mp3"), you could let mutagen handle those as well:

-    try:
-        # First check if it's a file-like object with a name attribute
-        if hasattr(file, "name"):
-            file_path = file.name
-        elif isinstance(file, (str, bytes)):
-            # If it's a path string or bytes
-            return None
-        else:
-            # If it's a file-like object without name, we can't easily determine duration
-            return None
+    try:
+        # Resolve a filesystem path from the input
+        if hasattr(file, "name"):
+            file_path = file.name
+        elif isinstance(file, (str, bytes)):
+            file_path = file
+        else:
+            # If it's a file-like object without name, we can't easily determine duration
+            return None

The existing broad exception handling already ensures failures here don’t impact user code.

If you adopt this, please verify it behaves correctly for both open file objects and plain path strings on your supported platforms.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 5da12d8 and b6db65e.

📒 Files selected for processing (5)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py (1 hunks)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (1 hunks)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (3 hunks)
  • packages/opentelemetry-instrumentation-openai/pyproject.toml (2 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py
🧠 Learnings (2)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.

Applied to files:

  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
  • packages/opentelemetry-instrumentation-openai/pyproject.toml
📚 Learning: 2025-08-22T14:41:26.962Z
Learnt from: prane-eth
Repo: traceloop/openllmetry PR: 3336
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:8-8
Timestamp: 2025-08-22T14:41:26.962Z
Learning: In the openllmetry project, the `packaging` library is available as a transitive dependency through other packages (visible in poetry.lock) and doesn't need to be explicitly declared in pyproject.toml dependencies.

Applied to files:

  • packages/opentelemetry-instrumentation-openai/pyproject.toml
🧬 Code graph analysis (3)
packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py (2)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
  • SpanAttributes (64-245)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
  • get_finished_spans (40-43)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py (2)
  • atranscription_wrapper (126-182)
  • transcription_wrapper (67-122)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (1)
  • is_metrics_enabled (39-40)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (2)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
  • wrapper (421-434)
packages/opentelemetry-instrumentation-llamaindex/opentelemetry/instrumentation/llamaindex/utils.py (2)
  • wrapper (18-19)
  • wrapper (48-58)
🪛 Flake8 (7.3.0)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py

[error] 22-22: 'opentelemetry.semconv_ai.SpanAttributes' imported but unused

(F401)

🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py

10-10: Unused function argument: instrument_legacy

(ARG001)


37-37: Unused function argument: instrument_legacy

(ARG001)

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py

57-58: try-except-pass detected, consider logging the exception

(S110)


57-57: Do not catch blind exception: Exception

(BLE001)


60-60: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (3)
packages/opentelemetry-instrumentation-openai/pyproject.toml (1)

31-31: Optional mutagen dependency and audio extra wiring look correct

Defining mutagen as optional = true and exposing it via the audio extra is aligned with how Poetry handles optional features.

Please confirm ^1.47.0 matches the minimum version you want to support for duration extraction across your supported Python/OpenAI versions.

Also applies to: 51-54

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (1)

86-106: Audio telemetry wrapper pattern matches existing utilities

The _with_audio_telemetry_wrapper helper is consistent with the embeddings/chat wrappers and correctly threads tracer, duration_histogram, and exception_counter into the inner wrapper; this looks idiomatic for this codebase.

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (1)

21-24: Audio transcription wrapping is wired correctly; metric naming ready for future semconv

The sync/async Transcriptions.create hooks and corresponding unwrap calls are symmetrical and follow the same pattern as chat/embeddings/image instrumentation, reusing duration_histogram and an exception counter, which looks correct.

For the counter:

audio_transcription_exception_counter = meter.create_counter(
    # name=Meters.LLM_AUDIO_TRANSCRIPTIONS_EXCEPTIONS,  # TODO
    name="llm.openai.audio.transcriptions.exceptions",
    ...
)

using a provisional vendor-specific name is reasonable; just ensure you align it with the eventual Meters.LLM_AUDIO_TRANSCRIPTIONS_EXCEPTIONS (or equivalent) once the semconv metric is published to avoid a breaking rename later.

Please double‑check the intended final metric name in opentelemetry-semconv-ai (or its design doc) so the provisional string matches or can be aliased cleanly when the constant lands.

Also applies to: 254-282, 375-376

Comment on lines +22 to +25
from opentelemetry.semconv_ai import (
SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY,
SpanAttributes,
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the audio_wrappers.py file
find . -type f -name "audio_wrappers.py" | head -20

Repository: traceloop/openllmetry

Length of output: 174


🏁 Script executed:

cat -n ./packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py

Repository: traceloop/openllmetry

Length of output: 8977


Remove unused SpanAttributes import to satisfy Flake8

SpanAttributes is imported but never used in this file—it only appears in commented-out code (line 196), which will trigger a Flake8 F401 violation. Remove it from the import:

 from opentelemetry.semconv_ai import (
     SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY,
-    SpanAttributes,
 )

Additionally, remove the debug print statement on line 203:

-            print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")

Re-introduce SpanAttributes once the audio duration attribute is published in semconv_ai.

📝 Committable suggestion


Suggested change
from opentelemetry.semconv_ai import (
SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY,
SpanAttributes,
)
from opentelemetry.semconv_ai import (
SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY,
)
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 22-22: 'opentelemetry.semconv_ai.SpanAttributes' imported but unused

(F401)

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py
around lines 22–25, remove the unused SpanAttributes import from
opentelemetry.semconv_ai to fix the Flake8 F401 violation and delete the debug
print statement at line 203; keep the rest of the imports intact and re-add
SpanAttributes later if/when the audio duration attribute is published in
semconv_ai.

Comment on lines +185 to +204
@dont_throw
def _handle_request(span, kwargs, instance):
_set_request_attributes(span, kwargs, instance)
_set_client_attributes(span, instance)

# Extract and set audio duration
file_param = kwargs.get("file")
if file_param:
audio_duration = _get_audio_duration(file_param)
if audio_duration is not None:
# _set_span_attribute(
# span, SpanAttributes.LLM_OPENAI_AUDIO_INPUT_DURATION_SECONDS, audio_duration
# )
# TODO(Ata): come back here later when semconv is published
_set_span_attribute(
span, 'gen_ai.openai.audio.input.duration_seconds', audio_duration
)
else:
print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py | sed -n '185,210p'

Repository: traceloop/openllmetry

Length of output: 1110


🏁 Script executed:

# Check if the file exists and explore structure
fd audio_wrappers.py

Repository: traceloop/openllmetry

Length of output: 172


🏁 Script executed:

# Search for SpanAttributes to understand naming conventions
rg "SpanAttributes" packages/opentelemetry-instrumentation-openai --type py -A 3 -B 1

Repository: traceloop/openllmetry

Length of output: 50378


Remove debug print statement and refactor audio duration attribute naming

Two issues require attention:

1. **Debug output on line 203**

   The `print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")` statement will spam stdout whenever the audio duration cannot be read. Remove it entirely or replace it with a structured log:

   ```diff
            else:
   -            print("REMOVE ME : ATA-DBG : COULD NOT READ AUDIO FILE WITH MUTAGEN")
   ```

2. **Hardcoded attribute key on line 200**

   Replace the hardcoded string `'gen_ai.openai.audio.input.duration_seconds'` with a module-level constant. This will simplify the migration to `SpanAttributes.LLM_OPENAI_AUDIO_INPUT_DURATION_SECONDS` once it is published:

   ```python
   # At module top
   AUDIO_INPUT_DURATION_ATTR = "gen_ai.openai.audio.input.duration_seconds"

   # Then use
   _set_span_attribute(span, AUDIO_INPUT_DURATION_ATTR, audio_duration)
   ```

This approach avoids hardcoding magic strings and makes future refactoring to the official semconv attribute straightforward.
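A minimal sketch combining both suggestions (the attribute string and helper name mirror the PR; `record_audio_duration` and the dict-backed span are illustrative stand-ins, not the instrumentation's real API):

```python
import logging

logger = logging.getLogger(__name__)

# Module-level constant, per the review suggestion; the string is
# provisional until the official semconv attribute is published.
AUDIO_INPUT_DURATION_ATTR = "gen_ai.openai.audio.input.duration_seconds"


def _set_span_attribute(span, key, value):
    # Stand-in for the instrumentation helper: skip None values.
    if value is not None:
        span[key] = value


def record_audio_duration(span, audio_duration):
    if audio_duration is not None:
        _set_span_attribute(span, AUDIO_INPUT_DURATION_ATTR, audio_duration)
    else:
        # Structured log instead of the debug print(); debug level
        # keeps stdout clean in production.
        logger.debug("Could not read audio duration with mutagen")


span = {}
record_audio_duration(span, 12.5)   # sets the attribute
record_audio_duration({}, None)     # logs at debug level, sets nothing
```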

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/audio_wrappers.py
around lines 185 to 204, remove the debug print statement that spams stdout when
mutagen cannot read the audio file and instead either drop the message or
replace it with a structured logger call; also introduce a module-level constant
(e.g., AUDIO_INPUT_DURATION_ATTR) at the top of the file for the attribute key
and use that constant in the _set_span_attribute call instead of the hardcoded
string so future migration to
SpanAttributes.LLM_OPENAI_AUDIO_INPUT_DURATION_SECONDS is simple.

Comment on lines +9 to +32
```python
@pytest.mark.vcr
def test_audio_transcription(instrument_legacy, span_exporter, openai_client):
    # Create a mock audio file (in real test, use VCR cassette)
    audio_file = io.BytesIO(b"fake audio content")
    audio_file.name = "test_audio.mp3"

    openai_client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1",
    )

    spans = span_exporter.get_finished_spans()
    assert len(spans) == 1
    assert spans[0].name == "openai.audio.transcriptions"

    transcription_span = spans[0]
    assert (
        transcription_span.attributes[GenAIAttributes.GEN_AI_REQUEST_MODEL]
        == "whisper-1"
    )
    assert (
        transcription_span.attributes[SpanAttributes.LLM_OPENAI_API_BASE]
        == "https://api.openai.com/v1/"
    )
```

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Scripts executed against traceloop/openllmetry (output lengths omitted):

```shell
# Locate the test file
find . -type f -name "test_audio_transcription.py" | head -20
fd "test_audio_transcription.py"
git ls-files | grep -i "audio_transcription"
cat -n packages/opentelemetry-instrumentation-openai/tests/traces/test_audio_transcription.py

# Check conftest.py for VCR configuration in the same directory
cat -n packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py 2>/dev/null || echo "conftest.py not found in traces dir"

# Check parent conftest.py
cat -n packages/opentelemetry-instrumentation-openai/tests/conftest.py 2>/dev/null || echo "conftest.py not found in tests dir"

# Look for cassette directories and fixtures
fd "cassettes" -type d packages/opentelemetry-instrumentation-openai/tests/
find packages/opentelemetry-instrumentation-openai/tests -type d -name "*cassette*" -o -type d -name "*fixtures*"

# Look for YAML or JSON cassette files that might be VCR recordings
find packages/opentelemetry-instrumentation-openai/tests -type f \( -name "*.yaml" -o -name "*.yml" -o -name "*.json" \) | head -20
find packages/opentelemetry-instrumentation-openai/tests/traces/cassettes -name "*audio*"
ls -la packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ | grep -i audio

# List all cassettes in the traces cassettes directory
ls packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/

# Check if there's a test_audio_transcription.py file in traces directory listing
ls packages/opentelemetry-instrumentation-openai/tests/traces/ | grep -i audio

# Get complete file listing to see if test_audio_transcription is listed
ls packages/opentelemetry-instrumentation-openai/tests/traces/test_*.py | head -20

# Check if ruff or flake8 is configured and if ARG001 is enabled
find packages/opentelemetry-instrumentation-openai -name "pyproject.toml" -o -name "ruff.toml" -o -name ".flake8" | xargs grep -l "ARG001\|ARG\|flake8\|ruff" 2>/dev/null | head -5

# Check root and package linting config
cat pyproject.toml 2>/dev/null | grep -A 20 "\[tool.ruff\]\|\[tool.flake8\]" | head -40
cat packages/opentelemetry-instrumentation-openai/.flake8
cat packages/opentelemetry-instrumentation-openai/pyproject.toml | head -60
```


VCR cassettes must be recorded before these tests can run; fake audio content will cause API errors

Both test_audio_transcription and test_audio_transcription_async are decorated with @pytest.mark.vcr but have no cassette files recorded. Without cassettes, the tests will either make live API calls or fail immediately since the fake audio content (b"fake audio content") cannot be processed by the real OpenAI Whisper API. You must:

  1. Record VCR cassettes with real audio files (e.g., a valid MP3) against the actual OpenAI API, then commit them to the cassettes directory
  2. Once cassettes are in place, run tests in playback mode to verify they pass without hitting the live API

Additionally, the instrument_legacy fixture parameter is unused in both test bodies but required for instrumentation side effects. Consider renaming to _instrument_legacy (with # noqa: ARG001) to signal intentional non-use, though this may not be enforced by your current linting setup.

Also applies to: 35-60

🧰 Tools
🪛 Ruff (0.14.8)

10-10: Unused function argument: instrument_legacy

(ARG001)

@ataha322
Copy link
Contributor Author

I see the issues, will fix them next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants