Optionally use TF slice instead of TFcounter for CCDB cache validation #14652
|
REQUEST FOR PRODUCTION RELEASES: This will add The following labels are available |
|
Error while checking build/O2/fullCI_slc9 for 5acfb86 at 2025-09-05 03:03: Full log here. |
|
If I understand correctly, you check that the timeSlice counter did not jump much and, if it did not, you keep the cached CCDB object. Particularly in high-load situations, when the MI50 nodes go into backpressure but the MI100 nodes do not, there can easily be a difference in processing delay on the order of 1 minute between TFs arriving at the calib node from different reco nodes. Thus I think the tfCounter is the much safer choice. If we allow a timeslice difference of, let's say, 4, and in that range we can have a tfCounter difference of 200, shouldn't we then just allow a tfCounter difference of 200 for CCDB fetching? It should have a similar effect, but at least then we have a real limit on the lifetime of the validity. |
|
@davidrohr, I know, it is to address this: https://ali-bookkeeping.cern.ch/?page=log-detail&id=134453. We need an effective way to prescale the CCDB queries on the aggregator node, and given that the TFCounters arrive quasi-unordered and with a large spread, prescaling with the TFcounter difference is not effective. |
5acfb86 to 432055c
|
@ehellbar I've tested extra options of this PR as (imposing isOnline mode) on the interpolation workflow.
|
432055c to b57e509
|
@shahor02 : But TFCounters do not arrive fully unordered, they are shuffled within the processing latency of the EPNs. So if we allow something like a TFCounter difference of +/- 2 minutes or so, wouldn't that work? And it would be more precise and more explicit than a timeslice-based prescaling. |
|
Why it will be more precise with the update on |
|
OK, if you want a higher update rate for most cases, I would use the logical OR of both conditions. |
…TFcounter for CCDB cache validation is N!=0. If the --condition-tf-per-query-multiplier value is negative, the prescaling is simply applied to tfCounter%|query_rate| (or timeslice%|query_rate| if --condition-use-slice-for-prescaling is requested). If N>0, then enforce a check if the abs difference between the last checked and current TFCounters (not slices!) exceeds N, even if the slice difference is less than the requested check rate.
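For readers following the thread, the decision logic described in this commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `PrescaleConfig`, `PrescaleState` and `shouldQuery` are hypothetical names, `queryRate` is assumed to carry the `--condition-tf-per-query-multiplier` value and `sliceN` the `--condition-use-slice-for-prescaling` value N. The choice of `% rate == 0` for the negative case, `>=` for the difference test, and always querying on the first call are also assumptions.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical types for illustration only; not the O2 implementation.
struct PrescaleConfig {
  int queryRate = 1; // assumed: value of --condition-tf-per-query-multiplier
  int sliceN = 0;    // assumed: value N of --condition-use-slice-for-prescaling
};

struct PrescaleState {
  std::int64_t lastSlice = 0;     // timeslice at the last CCDB check
  std::int64_t lastTFCounter = 0; // tfCounter at the last CCDB check
  bool initialised = false;
};

// Returns true if the cached CCDB object should be re-validated for this TF.
bool shouldQuery(const PrescaleConfig& cfg, PrescaleState& st,
                 std::int64_t tfCounter, std::int64_t timeslice)
{
  const std::int64_t rate = std::llabs(cfg.queryRate);
  const bool useSlice = cfg.sliceN != 0; // N != 0: prescale on the timeslice
  const std::int64_t current = useSlice ? timeslice : tfCounter;

  bool query = false;
  if (!st.initialised || rate <= 1) {
    query = true; // assumption: first TF, or no prescaling requested
  } else if (cfg.queryRate < 0) {
    query = (current % rate) == 0; // negative value: plain modulo prescaling
  } else {
    const std::int64_t last = useSlice ? st.lastSlice : st.lastTFCounter;
    query = std::llabs(current - last) >= rate; // difference-based prescaling
  }

  // N > 0: force a check when the tfCounter (not the slice) moved by more
  // than N, i.e. the logical OR of both conditions.
  if (!query && cfg.sliceN > 0) {
    query = std::llabs(tfCounter - st.lastTFCounter) > cfg.sliceN;
  }

  if (query) {
    st.lastSlice = timeslice;
    st.lastTFCounter = tfCounter;
    st.initialised = true;
  }
  return query;
}
```

With `queryRate = 4` and `sliceN = 200`, a TF whose timeslice advanced by only 2 since the last check but whose tfCounter advanced by 250 would still trigger a query in this sketch, which is the bounded-lifetime behaviour asked for in the discussion above.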
b57e509 to 3a25c58
|
OK, modified the If it is N>0, then enforce a check if the abs difference between the last checked and current TFCounters (not slices!) exceeds N, |
No description provided.