Skip to content

Conversation

@stkillen
Copy link
Contributor

@stkillen stkillen commented Nov 25, 2025

Summary

This PR adds support for multi-language testing, copilot capability tracking, and test run frequency classification. These additions enable better organization and automation of AI tests across different languages and execution scenarios.

  1. Multi-Language Support
    Added support for language for test inputs in the Data-Driven test extension. Localized versions of datasets are linked.
    Added support in the AI Test Tool for suites to have one or more languages, based on the localized versions of datasets.

  2. Copilot Capability Tracking
    Added Copilot Capability field to AIT Test Suite table. This enables categorization of test suites by the Copilot capability they test

  3. Updated test suite import/export
    Updated the XML import/export functionality to support all new fields. Example XML structure:

<AITSuite Code="W1-SOA" Description="..." Dataset="..." DefaultLanguage="en-US" Capability="Sales Order Agent" ...>
  <Language Tag="cs-CZ" />
  <Language Tag="da-DK" />
  <Language Tag="en-US" />
  ...
</AITSuite>

Work Item(s)

Fixes AB#578022

@github-actions github-actions bot added this to the Version 28.0 milestone Nov 25, 2025
@stkillen stkillen marked this pull request as ready for review November 27, 2025 08:24
@stkillen stkillen requested review from a team as code owners November 27, 2025 08:24
Editable = false;
ToolTip = 'Specifies the version of the current test run. It is used for comparing the results of the current test run with the results of the previous test run. The version will be stored in the Log entries.';
}
field(14; "Copilot Capability"; Enum "Copilot Capability")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this?
It could be solved in the different ways if we need to filter out the things differently. Do we need to store this information on the suite level?

Do we want to limit the test suites to test only single copilot capabilites?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds better categorization and filtering options for test suites. Test suites clearly linked to capabilities will also make it easier to create dashboards, etc., to get count of tests per feature or results per feature.

I think it's a fair restriction to make that a test suite tests one capability. This is also true for our existing test suites.

Copy link
Contributor

@nikolakukrika nikolakukrika Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stkillen Are we sure that partners will use it in the same way? Are we sure that partners will use copilot capabilities.
For both us and partners: Is it a must to have a capability for unit testing, e.g. I just want to test a part of the prompt?
We could have it as a filter, and introduce handling of the zero/not specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partners have to register their capability to use the AI SDK. Thus, there's no way to test e.g. prompt changes without testing a specific capability that you registered. The only way is if you build your own SDK and still want to use the test tool. If that's case, they can opt out of using the field as it's optional. But for all MS features and partner features built using the AI SDK, it makes sense.

I can introduce handling for 0 as "Not Specified" if there's any scenarios where it's not relevant.

/// <returns>The display name of the language.</returns>
procedure GetLanguageDisplayName(LanguageID: Integer): Text
var
WindowsLanguage: Record "Windows Language";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the correct table. Should we use languages for supported languages? I could be wrong too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the IETF language tag to store the languages in the test suite. This seems to be best practice for storing languages and it's only available on the Windows Language table.


table 149035 "AIT Test Suite Language"
{
Caption = 'AI Test Suite Language';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to persist this table? Can it be a temp table?
Second question - do we need a table at all? Can we get lists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say so. We want the test suite to list the languages that are available.

Since it's part of the test suite definition, it also makes sense to persist it.

E.g. like this:

<AITSuite Code="W1-SOA" Description="..." Dataset="..." DefaultLanguage="en-US" Capability="Sales Order Agent" ...>
  <Language Tag="cs-CZ" />
  <Language Tag="da-DK" />
  <Language Tag="en-US" />
  ...
</AITSuite>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stkillen I would still calculate it on the fly. I'm not against if you want to keep it in the table, however the less places it is defined in, less headaches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will calculate these instead of storing


local procedure GetDefaultLanguageID(): Integer
begin
exit(1033); // en-US
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be blank? I think we should distinguish between en-US and blank. If we want to use integers it can be zero, but blank would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why though?

If it's for backwards compatibility: having an input dataset with the "wrong" language won't matter, it will still run. But we should update all our datasets with the language they're in.

I think it's an advantage to enforce some standards for datasets. Without standards we'll see a lot of inconsistencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, I use the default language for grouping languages. So all datasets are grouped with English dataset first and then localized versions, e.g.:

DATASET 1 (English)

  • DATASET 1 (Danish)
  • DATASET 1 (Spanish)

DATASET 2 (English)

  • ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a default could be changed globally. If you want to enforce the language, it is OK.
I was just wondering do we have any scenarios with blank (not specified value). If not we can drop it.

@stkillen stkillen changed the title [AI Test Tool] Add Language Support, Capability and Frequency [AI Test Tool] Add Language Support and Capability Dec 1, 2025
AITTestMethodLine."Input Dataset" := TestInputGroupLanguageVersion.Code;
AITTestMethodLine.Modify();
end;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor double blank lines. You can bulk replace these.

page 149046 "AIT Test Suite Languages"
{
Caption = 'Languages';
PageType = List;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look into using standard dialog to avoid having both List and Listpart

Rec.Indentation := 1;
end;

local procedure UpdateGroup()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this logic; always create a parent group with empty language and add all language versions (including English) as a child group.

/// Updates the test method lines of the specified test suite to use the selected language version.
/// </summary>
/// <param name="AITTestSuite"></param>
procedure UpdateLanguagesForTestSuite(AITTestSuite: Record "AIT Test Suite")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, pass language to runner. Each input dataset should point to the parent group.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants