-
Notifications
You must be signed in to change notification settings - Fork 294
[AI Test Tool] Add Language Support and Capability #5623
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/Tools/AI Test Toolkit/src/TestSuite/AITALTestSuiteMgt.Codeunit.al
Outdated
Show resolved
Hide resolved
src/Tools/AI Test Toolkit/src/TestSuite/AITRunFrequency.Enum.al
Outdated
Show resolved
Hide resolved
| Editable = false; | ||
| ToolTip = 'Specifies the version of the current test run. It is used for comparing the results of the current test run with the results of the previous test run. The version will be stored in the Log entries.'; | ||
| } | ||
| field(14; "Copilot Capability"; Enum "Copilot Capability") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
It could be solved in the different ways if we need to filter out the things differently. Do we need to store this information on the suite level?
Do we want to limit the test suites to test only single copilot capabilites?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This adds better categorization and filtering options for test suites. Test suites clearly linked to capabilities will also make it easier to create dashboards, etc., to get count of tests per feature or results per feature.
I think it's a fair restriction to make that a test suite tests one capability. This is also true for our existing test suites.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stkillen Are we sure that partners will use it in the same way? Are we sure that partners will use copilot capabilities.
For both us and partners: Is it a must to have a capability for unit testing, e.g. I just want to test a part of the prompt?
We could have it as a filter, and introduce handling of the zero/not specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partners have to register their capability to use the AI SDK. Thus, there's no way to test e.g. prompt changes without testing a specific capability that you registered. The only way is if you build your own SDK and still want to use the test tool. If that's case, they can opt out of using the field as it's optional. But for all MS features and partner features built using the AI SDK, it makes sense.
I can introduce handling for 0 as "Not Specified" if there's any scenarios where it's not relevant.
| /// <returns>The display name of the language.</returns> | ||
| procedure GetLanguageDisplayName(LanguageID: Integer): Text | ||
| var | ||
| WindowsLanguage: Record "Windows Language"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is the correct table. Should we use languages for supported languages? I could be wrong too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the IETF language tag to store the languages in the test suite. This seems to be best practice for storing languages and it's only available on the Windows Language table.
src/Tools/AI Test Toolkit/src/TestSuite/AITTestSuiteLanguage.Codeunit.al
Outdated
Show resolved
Hide resolved
|
|
||
| table 149035 "AIT Test Suite Language" | ||
| { | ||
| Caption = 'AI Test Suite Language'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to persist this table? Can it be a temp table?
Second question - do we need a table at all? Can we get lists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say so. We want the test suite to list the languages that are available.
Since it's part of the test suite definition, it also makes sense to persist it.
E.g. like this:
<AITSuite Code="W1-SOA" Description="..." Dataset="..." DefaultLanguage="en-US" Capability="Sales Order Agent" ...>
<Language Tag="cs-CZ" />
<Language Tag="da-DK" />
<Language Tag="en-US" />
...
</AITSuite>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stkillen I would still calculate it on the fly. I'm not against if you want to keep it in the table, however the less places it is defined in, less headaches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will calculate these instead of storing
src/Tools/Test Framework/Test Runner/src/DataDrivenTest/DataInputs/TestInputGroup.Table.al
Outdated
Show resolved
Hide resolved
|
|
||
| local procedure GetDefaultLanguageID(): Integer | ||
| begin | ||
| exit(1033); // en-US |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be blank? I think we should distinguish between en-US and blank. If we want to use integers it can be zero, but blank would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why though?
If it's for backwards compatibility: having an input dataset with the "wrong" language won't matter, it will still run. But we should update all our datasets with the language they're in.
I think it's an advantage to enforce some standards for datasets. Without standards we'll see a lot of inconsistencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, I use the default language for grouping languages. So all datasets are grouped with English dataset first and then localized versions, e.g.:
DATASET 1 (English)
- DATASET 1 (Danish)
- DATASET 1 (Spanish)
DATASET 2 (English)
- ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a default could be changed globally. If you want to enforce the language, it is OK.
I was just wondering do we have any scenarios with blank (not specified value). If not we can drop it.
| AITTestMethodLine."Input Dataset" := TestInputGroupLanguageVersion.Code; | ||
| AITTestMethodLine.Modify(); | ||
| end; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor double blank lines. You can bulk replace these.
| page 149046 "AIT Test Suite Languages" | ||
| { | ||
| Caption = 'Languages'; | ||
| PageType = List; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look into using standard dialog to avoid having both List and Listpart
| Rec.Indentation := 1; | ||
| end; | ||
|
|
||
| local procedure UpdateGroup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this logic; always create a parent group with empty language and add all language versions (including English) as a child group.
| /// Updates the test method lines of the specified test suite to use the selected language version. | ||
| /// </summary> | ||
| /// <param name="AITTestSuite"></param> | ||
| procedure UpdateLanguagesForTestSuite(AITTestSuite: Record "AIT Test Suite") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, pass language to runner. Each input dataset should point to the parent group.
Summary
This PR adds support for multi-language testing, copilot capability tracking, and test run frequency classification. These additions enable better organization and automation of AI tests across different languages and execution scenarios.
Multi-Language Support
Added support for language for test inputs in the Data-Driven test extension. Localized versions of datasets are linked.
Added support in the AI Test Tool for suites to have one or more languages, based on the localized versions of datasets.
Copilot Capability Tracking
Added Copilot Capability field to AIT Test Suite table. This enables categorization of test suites by the Copilot capability they test
Updated test suite import/export
Updated the XML import/export functionality to support all new fields. Example XML structure:
Work Item(s)
Fixes AB#578022