Add minute range support to Dimensional TimeSlice Source Crawler framework #6368
+659
−43
Description
This change enhances the Dimensional Crawler framework to support minute-level time ranges for historical data ingestion.
Previously, the crawler relied on hour-based granularity when determining whether to run a historical or an incremental sync. As a result, sub-hour ranges such as PT15M or PT30M were truncated to zero hours, which incorrectly triggered an incremental sync and skipped the historical data pull.
This update replaces hour-based tracking with minute-based tracking across the framework and the Office365 source plugin, ensuring correct historical ingestion for any ISO-8601 duration expressed in minutes or hours.
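The truncation described above can be reproduced with the standard `java.time.Duration` API. The following is a minimal sketch, not the framework's actual code: the class `RangeGranularitySketch` and the methods `isHistoricalByHours` and `isHistoricalByMinutes` are hypothetical names introduced for illustration, and the "greater than zero means historical" rule is an assumption drawn from the description.

```java
import java.time.Duration;

// Hypothetical illustration only; these names do not come from the PR.
class RangeGranularitySketch {

    // Hour-based check: toHours() keeps whole hours only,
    // so any range shorter than one hour collapses to 0.
    static boolean isHistoricalByHours(Duration range) {
        return range.toHours() > 0;
    }

    // Minute-based check: toMinutes() keeps whole minutes,
    // so PT15M and PT30M remain non-zero.
    static boolean isHistoricalByMinutes(Duration range) {
        return range.toMinutes() > 0;
    }

    public static void main(String[] args) {
        String[] ranges = {"PT15M", "PT30M", "PT1H", "PT2H30M"};
        for (String iso : ranges) {
            Duration range = Duration.parse(iso);
            System.out.printf(
                "%-8s hours=%d minutes=%d byHours=%b byMinutes=%b%n",
                iso,
                range.toHours(),
                range.toMinutes(),
                isHistoricalByHours(range),
                isHistoricalByMinutes(range));
        }
    }
}
```

For `PT15M`, `toHours()` yields 0 while `toMinutes()` yields 15, so only the minute-based check classifies the range as historical. Ranges of one hour or more behave the same under both checks.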
How
Framework Updates
Office365 Plugin Updates
Is this change backward compatible?
Yes.
Testing
Unit / Functional Validation
Integration Verification
Local pipeline run succeeded:
2025-12-26T12:33:58,162 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.source.microsoft_office365.auth.Office365AuthenticationProvider - Getting new access token for Office 365 Management API
2025-12-26T12:33:58,162 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Retrieving latest secrets in aws:secrets:m365_secret.
2025-12-26T12:33:58,405 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Finished retrieving latest secret in aws:secrets:m365_secret.
2025-12-26T12:33:58,406 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Retrieving latest secrets in aws:secrets:m365_secret.
2025-12-26T12:33:58,651 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Finished retrieving latest secret in aws:secrets:m365_secret.
2025-12-26T12:33:58,869 [pool-7-thread-4] INFO org.opensearch.dataprepper.plugins.source.microsoft_office365.auth.Office365AuthenticationProvider - Received new access token. Expires in 3599 seconds
2025-12-26T12:33:58,869 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.source.microsoft_office365.auth.Office365AuthenticationProvider - Getting new access token for Office 365 Management API
2025-12-26T12:33:58,869 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Retrieving latest secrets in aws:secrets:m365_secret.
2025-12-26T12:33:59,111 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Finished retrieving latest secret in aws:secrets:m365_secret.
2025-12-26T12:33:59,111 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Retrieving latest secrets in aws:secrets:m365_secret.
2025-12-26T12:33:59,354 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Finished retrieving latest secret in aws:secrets:m365_secret.
2025-12-26T12:33:59,529 [pool-7-thread-5] INFO org.opensearch.dataprepper.plugins.source.microsoft_office365.auth.Office365AuthenticationProvider - Received new access token. Expires in 3599 seconds
{"CreationTime":"2025-12-26T06:24:11","Id":"cdf867b5-5bcc-4a18-9e8e-0f24e17fe798","Operation":"Update user.","OrganizationId":"e822651b-5027-4253-83f5-904854601a3b","RecordType":8,"ResultStatus":"Success","UserKey":"Not Available","UserType":4,"Version":1,"Workload":"AzureActiveDirectory","ObjectId":"demo.m3connector@trianzazuresb.onmicrosoft.com","UserId":"ServicePrincipal_3616d279-e97d-48d3-af3e-74ed7de78faf","AzureActiveDirectoryEventType":1,"ExtendedProperties":[{"Name":"additionalDetails","Value":"{\"UserType\":\"Member\",\"User-Agent\":\"Apache-HttpClient/4.5.13 (Java/17.0.17)\"}"},{"Name":"extendedAuditEventCategory","Value":"User"}],"ModifiedProperties":[{"Name":"JobTitle","NewValue":"[\r\n \"Updated by Canary test at 2025-12-26T06:24:11.153502640Z\"\r\n]","OldValue":"[\r\n \"Updated by Canary test at 2025-12-26T06:22:44.545736667Z\"\r\n]"},{"Name":"Included Updated Properties","NewValue":"JobTitle","OldValue":""},{"Name":"TargetId.UserType","NewValue":"Member","OldValue":""},{"Name":"ActorId.ServicePrincipalNames","NewValue":"fb6b0f13-8f1e-4a28-a772-d32d3133da23","OldValue":""},{"Name":"SPN","NewValue":"fb6b0f13-8f1e-4a28-a772-d32d3133da23","OldValue":""}],"Actor":[{"ID":"entraId_app","Type":1},{"ID":"fb6b0f13-8f1e-4a28-a772-d32d3133da23","Type":2},{"ID":"ServicePrincipal_3616d279-e97d-48d3-af3e-74ed7de78faf","Type":2},{"ID":"3616d279-e97d-48d3-af3e-74ed7de78faf","Type":2},{"ID":"ServicePrincipal","Type":2}],"ActorContextId":"e822651b-5027-4253-83f5-904854601a3b","InterSystemsId":"10d1c797-d154-4726-8dde-4cc559babbed","IntraSystemId":"9401b745-76c4-4518-891d-365a8739882e","SupportTicketId":"","Target":[{"ID":"User_0fba2a0b-2680-45c1-9ae6-d20b74edb3ec","Type":2},{"ID":"0fba2a0b-2680-45c1-9ae6-d20b74edb3ec","Type":2},{"ID":"User","Type":2},{"ID":"demo.m3connector@trianzazuresb.onmicrosoft.com","Type":5},{"ID":"1003200511619FF9","Type":3}],"TargetContextId":"e822651b-5027-4253-83f5-904854601a3b"}
2025-12-26T12:34:46,572 [pool-7-thread-1] INFO org.opensearch.dataprepper.plugins.source.source_crawler.base.DimensionalTimeSliceCrawler - Total partitions created in this crawl: 5.0
2025-12-26T12:35:46,588 [pool-7-thread-1] INFO org.opensearch.dataprepper.plugins.source.source_crawler.base.DimensionalTimeSliceCrawler - Total partitions created in this crawl: 5.0
2025-12-26T12:36:46,608 [pool-7-thread-1] INFO org.opensearch.dataprepper.plugins.source.source_crawler.base.DimensionalTimeSliceCrawler - Total partitions created in this crawl: 5.0
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.