Agent Online Evaluation Notebooks

This directory contains comprehensive sample notebooks for evaluating AI agents using the Azure AI Projects SDK with online evaluation capabilities. These notebooks demonstrate how to assess various aspects of agent performance and response quality using modern batch evaluation patterns.

Overview

The Agent Online Evaluation notebooks provide complete coverage of Azure AI evaluation capabilities designed specifically for agent scenarios. Each notebook uses the modern Azure AI Projects SDK with a standardized SourceFileContentContent(item={}) input format for consistent and maintainable evaluation workflows.
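
For orientation, the sketch below shows roughly how a test case is prepared in this format before being submitted to an evaluation run. It is a minimal sketch, not the notebooks' exact code: the project endpoint is a placeholder, the item field names (query, response, context) are illustrative, and the exact AIProjectClient constructor form depends on the azure-ai-projects version you have installed.

```python
# Minimal sketch: connect to an Azure AI Foundry project and shape one test
# case in the standardized per-row item format used by the notebooks.
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

# Placeholder endpoint; substitute your own project endpoint.
project_client = AIProjectClient(
    endpoint="https://<your-resource>.services.ai.azure.com/api/projects/<project-name>",
    credential=DefaultAzureCredential(),
)

# The notebooks wrap each row as SourceFileContentContent(item={...}); the
# field names below (query, response, context) are illustrative only and
# vary by evaluator.
test_cases = [
    {
        "item": {
            "query": "What is the capital of France?",
            "response": "The capital of France is Paris.",
            "context": "Paris is the capital and largest city of France.",
        }
    },
]
```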

Evaluators Included (14 Total)

Agent-Specific Evaluators

Quality Evaluators

  • Coherence - Measures natural flow, readability, and logical progression of responses
  • Fluency - Assesses linguistic quality, grammar, and syntax
  • Groundedness - Validates responses are grounded in provided context
  • Relevance - Evaluates relevance of responses to user queries
  • Response Completeness - Compares responses against ground truth expectations

Key Features

  • Modern SDK Integration: Uses azure-ai-projects with AIProjectClient and the OpenAI evals API
  • Standardized Format: Consistent SourceFileContentContent(item={}) format across all notebooks
  • Complete Type Coverage: All Union type variants demonstrated (str vs List[dict], dict vs List[dict])
  • Batch Evaluation: Efficient single evaluation run for multiple test cases using a run_evaluator helper (a sketch of such a helper appears after this list)
  • Practical Examples: Real-world scenarios covering various quality levels and edge cases
  • Flexible Data Sources: Support for both string and array inputs with anyOf schema patterns
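
The sketch below illustrates the general shape a run_evaluator-style batch helper could take, assuming the project client can hand back an OpenAI-compatible client and that the evals API's custom data_source_config and inline file_content data source are used. The function body, the schema fields, and the get_openai_client accessor are assumptions for illustration; the notebooks define the actual helper and the grader configuration passed as testing_criteria.

```python
# Hypothetical run_evaluator-style helper (a sketch, not the notebooks' exact
# implementation): define the eval once, then submit all test cases in a
# single batch run.
def run_evaluator(project_client, eval_name, testing_criteria, test_cases):
    # Assumption: newer azure-ai-projects versions expose an OpenAI-compatible
    # client this way; older versions use a different accessor.
    openai_client = project_client.get_openai_client()

    # Create the eval definition. The item_schema here is illustrative and
    # would mirror the fields each evaluator expects.
    evaluation = openai_client.evals.create(
        name=eval_name,
        data_source_config={
            "type": "custom",
            "item_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "response": {"type": "string"},
                },
                "required": ["query", "response"],
            },
        },
        testing_criteria=testing_criteria,
    )

    # Submit every test case in one run using inline JSONL file content.
    run = openai_client.evals.runs.create(
        eval_id=evaluation.id,
        data_source={
            "type": "jsonl",
            "source": {"type": "file_content", "content": test_cases},
        },
    )
    return run
```

In practice, the testing_criteria argument would carry the grader configuration for the specific evaluator being exercised (for example, Coherence or Groundedness), which each notebook sets up in its prerequisite section.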

Each notebook includes detailed documentation, prerequisite setup, an explanation of the scoring system, comprehensive samples demonstrating all input type variants, and batch evaluation examples.

Checklist

  • I have read the contribution guidelines.
  • I have coordinated with the docs team (mldocs@microsoft.com) if this PR deletes files or changes any file names or file extensions.
  • This notebook or file is added to the CODEOWNERS file, pointing to the author or the author's team.
