Skip to content

Conversation

@muralibasani
Copy link
Contributor

@muralibasani muralibasani commented Oct 8, 2025

About this change - What it does

For the below tests
test_regression_config_for_inexisting_object_should_not_throw
test_regression_soft_delete_schemas_should_be_registered

  • After events are produced, a timout is necessary when getting the future object. timeout is added now.
  • Also producer acks is set to all, so that we wait for event to be produced for sure.
  • Added another assert to see if the offset is seen.

References: #xxxxx

Why this way

@github-actions
Copy link

github-actions bot commented Oct 8, 2025

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  src/karapace/core/coordinator
  schema_coordinator.py
  src/karapace/kafka_rest_apis
  consumer_manager.py
Project Total  

This report was generated by python-coverage-comment-action

@muralibasani muralibasani marked this pull request as ready for review October 8, 2025 13:30
@muralibasani muralibasani requested a review from a team as a code owner October 8, 2025 13:30
producer.flush()
msg = future.result()

schema_reader._offset_watcher.wait_for_offset(msg.offset(), timeout=5)
Copy link
Contributor

@jjaakola-aiven jjaakola-aiven Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a blocking call. If offset is marked as seen the schema must be in the in memory database.
Can you enable the strict mode in the config for the test and see if there is an exception catched when handing the message?
Option is also to see if the timeout is elapsed, the call returns the boolean if condition was met. And on timeout the return value is False. That can be asserted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was hard to get any logs.
Enabling strict mode, did not trigger any exception.

I thought we needed some timeouts when getting future. Added. wdyt

@muralibasani muralibasani force-pushed the fix-flaky-schema-delete branch 2 times, most recently from 0d380db to 19f8508 Compare October 13, 2025 09:29
@muralibasani muralibasani changed the title Fix flakiness of delete schema Fix flakiness of tests Oct 13, 2025
@muralibasani muralibasani force-pushed the fix-flaky-schema-delete branch 2 times, most recently from daec2e2 to 97695a4 Compare October 15, 2025 07:19
msg = future.result()

schema_reader._offset_watcher.wait_for_offset(msg.offset(), timeout=5)
msg = future.result(timeout=2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some justification for this to the commit message, I have no idea why we would add a timeout here, as we are flushing right before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, don't have a hard reason for this , but generally I think it is resource intensive sometimes.
And timeout on the future object is provided by the api to handle such cases where processes are slow.

)
producer.flush()
msg = future.result()
msg = future.result(timeout=2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we have flushed, waiting should not change anything. Have we actually observed something that would hint at this solving a problem we have? (I.e., I think this would raise if the future is not complete when calling .result(). Have we seen such exceptions?)

config.admin_metadata_max_age = 2
config.group_id = group_id
config.topic_name = topic_name
config.producer_acks = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add reasoning in commit message, why this change?

Copy link
Contributor Author

@muralibasani muralibasani Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking this will wait until the produced message is acked, adding some more wait time

Comment on lines -203 to +205
msg = future.result()
msg = future.result(timeout=2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

schema_reader._offset_watcher.wait_for_offset(msg.offset(), timeout=5)

seen = schema_reader._offset_watcher.wait_for_offset(msg.offset(), timeout=5)
assert seen is True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we stop asserting here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it back.

@muralibasani muralibasani force-pushed the fix-flaky-schema-delete branch from 97695a4 to 35ae5b4 Compare November 5, 2025 13:20
@muralibasani muralibasani force-pushed the fix-flaky-schema-delete branch from 02ffda7 to 3d51de0 Compare November 13, 2025 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants