Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change setting allowing size > 0 queries in request cache to be an int threshold #16570

Merged

Conversation

peteralfonsi
Copy link
Contributor

@peteralfonsi peteralfonsi commented Nov 5, 2024

Description

Followup to #16484. @kkhatua pointed out we may want to set some maximum size for a query to enter the cache, rather than just allowing or not allowing all size > 0 queries. For example, sizes as large as 10,000 are sometimes used, and letting them in may take up too much cache space. Also, queries with extremely large sizes spend more time in the fetch phase, so they benefit less from the request cache.

This PR changes the new setting to be an int, which is the maximum size allowed into the cache. The default value is 0, so the earlier behavior is still maintained.

Based on benchmarking so far, a reasonable non-zero value for this setting might be around 200, but benchmarks of very large sizes are still ongoing.

Since the original PR is not in any release, I think it's safe to rename the setting, but please let me know if this isn't true. We also shouldn't have to update the changelog as its original entry was not specific about the setting name/values.

Related doc PR: opensearch-project/documentation-website#8684

Tested in UT/IT and manually.

Related Issues

Resolves #16485

Check List

  • Functionality includes testing.
  • [N/A] API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

github-actions bot commented Nov 6, 2024

❕ Gradle check result for c07030d: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Nov 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.07%. Comparing base (0363aa7) to head (c07030d).
Report is 39 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16570      +/-   ##
============================================
+ Coverage     72.00%   72.07%   +0.06%     
- Complexity    65038    65054      +16     
============================================
  Files          5313     5314       +1     
  Lines        303454   303534      +80     
  Branches      43910    43919       +9     
============================================
+ Hits         218510   218776     +266     
+ Misses        67040    66812     -228     
- Partials      17904    17946      +42     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@peteralfonsi
Copy link
Contributor Author

@dbwiddis, would you be able to take a look at this one if you have time? It's a minor update to (#16484) which you reviewed recently.

Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a question.

@dbwiddis
Copy link
Member

@kkhatua can you take a look at this when you get a chance?

@kkhatua
Copy link
Member

kkhatua commented Nov 18, 2024

@dbwiddis LGTM.... @peteralfonsi and I discussed this and since the cache entry is only appended with a list of DocIDs, it made sense to make it a single setting ranging from 0 to K, rather than a boolean, with an upper bound on it.

@peteralfonsi
I see that while the range is set as 0 to 10_000, is there a reason we are not even trying to set a default of 200 (which you mentioned in the comments) ?

@peteralfonsi
Copy link
Contributor Author

peteralfonsi commented Nov 18, 2024

@kkhatua my thinking was we should keep the default as 0 for now since it matches earlier behavior of not caching size > 0 queries at all. This is preferable since, even though I've run some nyc_taxis based benchmarks that look good, we don't know for sure if turning this on would have bad effects for some kinds of workloads. For example, @upasagar mentioned he's seen workloads where the vast majority of all requests are size > 0, and suddenly caching them would be a significant change.

So, I think it makes more sense for this to be opt-in rather than opt-out until we have more results from actual clusters in production that have enabled it. Then we can suggest more tuning.

@kkhatua kkhatua self-requested a review November 18, 2024 19:25
Copy link
Member

@kkhatua kkhatua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 .... Ship it!

             /|~~~   
           ///|      
         /////|      
       ///////|      
     /////////|      
   \==========|===/  
~~~~~~~~~~~~~~~~~~~~~

@jainankitk jainankitk merged commit 2ac64a6 into opensearch-project:main Nov 19, 2024
71 of 72 checks passed
@jainankitk jainankitk added the backport 2.x Backport to 2.x branch label Nov 19, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 19, 2024
Signed-off-by: Peter Alfonsi <[email protected]>
Co-authored-by: Peter Alfonsi <[email protected]>
(cherry picked from commit 2ac64a6)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
jainankitk pushed a commit that referenced this pull request Nov 20, 2024
…6688)

(cherry picked from commit 2ac64a6)

Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Peter Alfonsi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Search:Performance skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add setting controlling whether to allow size > 0 requests in the request cache
4 participants