-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add search "EXPLAIN" endpoint #17992
Conversation
Related: #10047 |
83999c6
to
41e480a
Compare
89b5980
to
fbf0c43
Compare
The data tiering feature needs a way to indicate to the user that a search request will be reaching into indices that are stored in the warm tier. Those requests are possibly slow and expensive, so we want to inform the user about this fact. This new endpoint can be queried from the UI with a regular search request, before it is executed. It returns which indices will be used for the search and whether those indices are stored in the warm tier or not. Furthermore it contains the timerange info for each index, so the frontend can give the user suggestions on how to shorten the query so it does not reach the warm tiered indices. This kind of functionality fits perfectly into a more generic "explain" endpoint. So instead of a adding a special endpoint for just the warm tier index range info, we decided to build a more generic version which also contains a preview of the generated OS/ES query and a list of search validation errors. Example response: ``` { "search_id": "647f0565d060431199a12e96", "search": { "queries": { "a1647eb6-a064-4fe6-b459-1e4267d3f659": { "search_types": { "22249f29-f042-4bd8-b745-252b00a35891": { "query_string": "{\"from\":0,\"size\":0,\"query\":{\"bool\":{\"must\":[{\"bool\":{\"filter\":[{\"match_all\":{\"boost\":1.0}},{\"bool\":{\"adjust_pure_negative\":true,\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}},{\"range\":{\"timestamp\":{\"from\":\"2023-09-11 20:55:50.185\",\"to\":\"2024-01-18 14:49:10.185\",\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}},{\"terms\":{\"streams\":[\"63d6d52ebf9c684b3da2deb3\",\"63a5ab32e71520111ed3ce06\",\"000000000000000000000001\"],\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"track_total_hits\":2147483647,\"aggregations\":{\"agg\":{\"filters\":{\"filters\":[{\"bool\":{\"should\":[{\"exists\":{\"field\":\"source\",\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}}],\"other_bucket\":true,\"other_bucket_key\":\"_other_\"},\"aggregations\":{\"agg\":{\"terms\":{\"script\":{\"source\":\"(doc.containsKey('source') && doc['source'].size() > 0\\n? doc['source'].size() > 1\\n ? doc['source']\\n : String.valueOf(doc['source'].value)\\n: \\\"(Empty Value)\\\")\\n\",\"lang\":\"painless\"},\"size\":10,\"min_doc_count\":1,\"shard_min_doc_count\":0,\"show_term_doc_count_error\":false,\"order\":[{\"_count\":\"desc\"},{\"_key\":\"asc\"}]}}}},\"timestamp-min\":{\"min\":{\"field\":\"timestamp\"}},\"timestamp-max\":{\"max\":{\"field\":\"timestamp\"}}}}", "searched_index_ranges": [ { "index_name": "graylog_0", "begin": 0, "end": 0, "is_warm_tiered": false }, { "index_name": "bar_1512", "begin": 1705589036047, "end": 1705589284808, "is_warm_tiered": false }, { "index_name": "bar_1513", "begin": 0, "end": 0, "is_warm_tiered": false }, { "index_name": "bar_warm_1511", "begin": 1705588785906, "end": 1705589035782, "is_warm_tiered": true } ] }, "5e9a9bfe-7a97-4835-86fd-896f40b20531": { "query_string": "{\"from\":0,\"size\":0,\"query\":{\"bool\":{\"must\":[{\"bool\":{\"filter\":[{\"match_all\":{\"boost\":1.0}},{\"bool\":{\"adjust_pure_negative\":true,\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}},{\"range\":{\"timestamp\":{\"from\":\"2023-09-11 20:55:50.185\",\"to\":\"2024-01-18 14:49:10.185\",\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}},{\"terms\":{\"streams\":[\"63d6d52ebf9c684b3da2deb3\",\"63a5ab32e71520111ed3ce06\",\"000000000000000000000001\"],\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"track_total_hits\":2147483647,\"aggregations\":{\"agg\":{\"filters\":{\"filters\":[{\"bool\":{\"should\":[{\"exists\":{\"field\":\"source\",\"boost\":1.0}}],\"adjust_pure_negative\":true,\"boost\":1.0}}],\"other_bucket\":true,\"other_bucket_key\":\"_other_\"},\"aggregations\":{\"agg\":{\"terms\":{\"script\":{\"source\":\"(doc.containsKey('source') && doc['source'].size() > 0\\n? doc['source'].size() > 1\\n ? doc['source']\\n : String.valueOf(doc['source'].value)\\n: \\\"(Empty Value)\\\")\\n\",\"lang\":\"painless\"},\"size\":15,\"min_doc_count\":1,\"shard_min_doc_count\":0,\"show_term_doc_count_error\":false,\"order\":[{\"_count\":\"desc\"},{\"_key\":\"asc\"}]}}}},\"timestamp-min\":{\"min\":{\"field\":\"timestamp\"}},\"timestamp-max\":{\"max\":{\"field\":\"timestamp\"}}}}", "searched_index_ranges": [ { "index_name": "graylog_0", "begin": 0, "end": 0, "is_warm_tiered": false } ] } } } } }, "search_errors": [ ] } ``` Use uniqe class names for Search API The API generator cannot handle identical class names under the same endpoint. The resulting code will fail with errors like: ``` [INFO] $ /home/marco/code/graylog-project-repos/graylog2-server/graylog2-web-interface/node_modules/.bin/tsc [INFO] target/api/Search.ts(153,11): error TS6196: 'IndexRangeResult' is declared but never used. ```
fbf0c43
to
197f831
Compare
…n-api # Conflicts: # graylog-storage-elasticsearch7/src/test/java/org/graylog/storage/elasticsearch7/views/ElasticsearchBackendTest.java # graylog-storage-opensearch2/src/test/java/org/graylog/storage/opensearch2/views/OpenSearchBackendTest.java # graylog2-web-interface/src/views/components/searchbar/queryvalidation/QueryValidation.tsx
…n-api # Conflicts: # graylog-storage-elasticsearch7/src/test/java/org/graylog/storage/elasticsearch7/views/ElasticsearchBackendTest.java # graylog-storage-opensearch2/src/test/java/org/graylog/storage/opensearch2/views/OpenSearchBackendTest.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the UI and it looks good. I added some suggestions.
@@ -182,6 +183,36 @@ const deduplicateExplanations = (explanations: Explanations | undefined): Explan | |||
return deduplicated; | |||
}; | |||
|
|||
const warmTierExplanations = (searchesWarmTier: boolean, warmTierIndices: Array<IndexRange>) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move the warm tier logic to its own component and use Plugin
system. This component already supports it, with the views.elements.validationErrorExplanation
key.
@@ -240,9 +244,21 @@ const Widget = ({ id, editing, widget, title, position, onPositionsChange }: Pro | |||
const { config } = widget; | |||
const isFocused = focusedWidget?.id === id; | |||
|
|||
const view = useView(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here also, we could move the logic related to the warm tier into WidgetWarTierAlert
and pass the necessary props. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would second that, to prevent the widget component getting too cluttered.
…n-api # Conflicts: # graylog-storage-elasticsearch7/src/test/java/org/graylog/storage/elasticsearch7/views/ElasticsearchBackendTest.java # graylog-storage-opensearch2/src/test/java/org/graylog/storage/opensearch2/views/OpenSearchBackendTest.java # graylog2-server/src/main/java/org/graylog/plugins/views/search/engine/QueryEngine.java # graylog2-server/src/main/java/org/graylog/plugins/views/search/engine/SearchExecutor.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. I cannot test it completely, because I am missing a data tier setup right now, but the explain functionality works fine for me. I just added some suggestions for code (structure) improvements inline.
...sticsearch7/src/main/java/org/graylog/storage/elasticsearch7/views/ElasticsearchBackend.java
Outdated
Show resolved
Hide resolved
.build(); | ||
final Search search = Search.builder().queries(ImmutableSet.of(query)).build(); | ||
final SearchJob job = new SearchJob("deadbeef", search, "admin", "test-node-id"); | ||
final GeneratedQueryContext generatedQueryContext = backend.generate(query, Set.of(), DateTimeZone.UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is ViewsUtils.createTestContext
now, which makes creating a context a bit easier (and also future signature changes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review 🙏
I don't see how we could pass the actual query using ViewsUtils.createTestContext()
@@ -240,9 +244,21 @@ const Widget = ({ id, editing, widget, title, position, onPositionsChange }: Pro | |||
const { config } = widget; | |||
const isFocused = focusedWidget?.id === id; | |||
|
|||
const view = useView(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would second that, to prevent the widget component getting too cluttered.
...og2-web-interface/src/views/components/searchbar/queryvalidation/WarmTierQueryValidation.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UI looks good to me.
There one comment that still need to be addressed
https://github.com/Graylog2/graylog2-server/pull/17992/files#r1524540279
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested and looks good!
Merging this now. |
This PR adds Warm Tier warnings for dashboard widgets and search validation.
The data tiering feature needs a way to indicate to the user that a search request will be reaching into indices that are stored in the warm tier.
Those requests are possibly slow and expensive, so we want to inform the user about this fact.
This new endpoint can be queried from the UI with a regular search request, before it is executed.
It returns which indices will be used for the search and whether those indices are stored in the warm tier or not.
Furthermore it contains the timerange info for each index, so the frontend can give the user suggestions on how to shorten the query so it does not reach the warm tiered indices.
This kind of functionality fits perfectly into a more generic "explain" endpoint.
So instead of a adding a special endpoint for just the warm tier index range info, we decided to build a more generic version which also contains a preview of the generated OS/ES query and a list of search validation errors.
Example response:
refs https://github.com/Graylog2/graylog-plugin-enterprise/issues/6453
/jpd Graylog2/graylog-plugin-enterprise#6702
/nocl