LTR plugin digest #26

noCharger · 2023-11-15T17:02:47Z

Workflow

Core mapping: Grade (from judgment) - Features (feature name 1, feature name 2, ...) - document identifier
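A minimal sketch of this mapping, assuming the common RankLib-style text row (`grade qid:<query id> <ordinal>:<value> ... # <doc id>`). The function name, the grade, the query id, and the pairing of values with feature names are illustrative assumptions, not something the plugin produces:

```python
def to_training_row(grade, query_id, features, doc_id):
    """Join a judgment grade, feature values, and a document id into one row.

    `features` is an ordered list of (name, value) pairs; ordinals start at 1.
    """
    cols = [f"{i}:{value}" for i, (_, value) in enumerate(features, start=1)]
    return f"{grade} qid:{query_id} " + " ".join(cols) + f" # {doc_id}"


row = to_training_row(
    grade=4,      # hypothetical judgment grade
    query_id=1,   # hypothetical id for the "rambo" query
    features=[("title_query", 0.25069216), ("title_query_boost", 0.226041)],
    doc_id="7555",
)
print(row)
```

Each row carries exactly the three parts of the mapping: the grade, the ordered feature values, and the document identifier.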

Sequence Diagram

Step 1: Create ltr index

The LTR index contains metadata about features and models.

curl -X PUT "localhost:9200/_ltr"
{"acknowledged":true,"shards_acknowledged":true,"index":".ltrstore"}

Step 2: Create feature set

Features are templated OpenSearch queries. Users can select and experiment with features.

A feature set is a list of uniquely named features grouped together for logging and model evaluation.

POST _ltr/_featureset/more_movie_features
{
   "featureset": {
        "features": [
            {
                "name": "title_query",
                "params": [
                    "keywords"
                ],
                "template_language": "mustache",
                "template": {
                    "match": {
                        "title": "{{keywords}}"
                    }
                }
            },
            {
                "name": "title_query_boost",
                "params": [
                    "some_multiplier"
                ],
                "template_language": "derived_expression",
                "template": "title_query * some_multiplier"
            },
            {
                "name": "custom_title_query_boost",
                "params": [
                    "some_multiplier"
                ],
                "template_language": "script_feature",
                "template": {
                    "lang": "painless",
                    "source": "params.feature_vector.get('title_query') * (long)params.some_multiplier",
                    "params": {
                        "some_multiplier": "some_multiplier"
                    }
                }
            }
        ]
   }
}
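The two derived features above refer to `title_query` by name. A rough sketch of the arithmetic only, assuming the query feature has already produced a score; the function name and the input values are hypothetical, and the plugin evaluates these templates itself rather than through code like this:

```python
def evaluate_features(title_query_score, some_multiplier):
    """Mimic how later features in the set build on an earlier one."""
    vector = {"title_query": title_query_score}
    # derived_expression: "title_query * some_multiplier"
    vector["title_query_boost"] = vector["title_query"] * some_multiplier
    # script_feature: the Painless source casts the multiplier to long first
    vector["custom_title_query_boost"] = vector["title_query"] * int(some_multiplier)
    return vector


vector = evaluate_features(title_query_score=2.0, some_multiplier=3)
print(vector)
```

The feature order matters in this sketch: `title_query` has to be in the vector before either derived feature reads it.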

Step 3: Logging feature values with docs

POST tmdb/_search
{
    "query": {
        "bool": {
            "filter": [
                {
                    "terms": {
                        "_id": ["7555", "1370", "1369"]
                    }
                },
                {
                    "sltr": {
                        "_name": "logged_featureset",
                        "featureset": "more_movie_features",
                        "params": {
                            "keywords": "rambo"
                        }
                    }
                }
            ]
        }
    },
    "ext": {
        "ltr_log": {
            "log_specs": {
                "name": "log_entry1",
                "named_query": "logged_featureset"
            }
        }
    }
}

The SLTR query is rewritten into a ranker query that holds a list of disjunct queries, one rewritten from each feature (Query Phase).
The named query (_name) labels every document that the SLTR query matched (MatchedQueriesPhase).
The ranker query uses a HitLogConsumer to log features (feature name, with the score as the value) by appending a DocumentField to each SearchHit (LoggingFetchSubPhase).

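        // Fetch sub-phase hook, called for each hit being fetched.
        public void process(HitContext hitContext) throws IOException {
            // Only log when the ranker query actually matches this document.
            if (scorer != null && scorer.iterator().advance(hitContext.docId()) == hitContext.docId()) {
                // Point every logger at the current hit before scoring.
                loggers.forEach((l) -> l.nextDoc(hitContext.hit()));
                // Scoring will trigger log collection
                scorer.score();
            }
        }

        // Prepares this consumer to record feature values for the given hit.
        void nextDoc(SearchHit hit) {
            // Reuse the hit's log field if it exists; otherwise create and attach it.
            DocumentField logs = hit.getFields().get(FIELD_NAME);
            if (logs == null) {
                logs = newLogField();
                hit.setDocumentField(FIELD_NAME, logs);
            }
            Map<String, List<Map<String, Object>>> entries = logs.getValue();
            // Reset the current log, then register it under this logger's name.
            rebuild();
            currentHit = hit;
            entries.put(name, currentLog);
        }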
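The flow in these two methods can be mimicked with a toy model. This is only a sketch under assumptions: hits are plain dictionaries, `accept` is an assumed callback name for "one feature was scored", and none of this is the plugin's actual code:

```python
class HitLogConsumer:
    """Toy stand-in for the logger: collects (feature name, score) per hit."""

    def __init__(self, log_name, feature_names):
        self.log_name = log_name
        self.feature_names = feature_names
        self.current_log = None

    def next_doc(self, hit):
        # Get or create the per-hit log field, then start a fresh log list.
        logs = hit.setdefault("fields", {}).setdefault("_ltrlog", {})
        self.current_log = [{"name": name} for name in self.feature_names]
        logs[self.log_name] = self.current_log

    def accept(self, ordinal, score):
        # Called while scoring: record the score of one feature.
        self.current_log[ordinal]["value"] = score


def process(hit, consumer, feature_scores):
    consumer.next_doc(hit)
    # "Scoring" the hit is what feeds feature values to the consumer.
    for ordinal, score in enumerate(feature_scores):
        if score is not None:  # None stands for a feature that did not match
            consumer.accept(ordinal, score)


hit = {}
consumer = HitLogConsumer("log_entry1", ["title_query", "title_query_boost"])
process(hit, consumer, [0.5, None])
print(hit)
```

As in the Java excerpt, the log entry is attached to the hit before scoring, and the values are filled in as a side effect of scoring.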

Logs in search response

"fields": {
  "_ltrlog": [
    {
      "log_entry1": [
        {
          "name": "1",
          "value": 0.25069216
        },
        {
          "name": "2",
          "value": 0.226041
        }
      ]
    }
  ]
},
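A small sketch of reading these logged values back out of a search hit, assuming the response shape shown above. The helper name is hypothetical, and pairing this log with document `7555` is only for illustration:

```python
def extract_logged_features(hit, log_name):
    """Return {feature name: value} for one hit, or {} if nothing was logged."""
    entries = hit.get("fields", {}).get("_ltrlog", [])
    for entry in entries:
        if log_name in entry:
            # .get() so an entry without a "value" key yields None
            return {f["name"]: f.get("value") for f in entry[log_name]}
    return {}


hit = {
    "_id": "7555",  # assumed document id, for illustration only
    "fields": {
        "_ltrlog": [
            {
                "log_entry1": [
                    {"name": "1", "value": 0.25069216},
                    {"name": "2", "value": 0.226041},
                ]
            }
        ]
    },
}
features = extract_logged_features(hit, "log_entry1")
print(features)
```

The `log_name` argument corresponds to the `name` given under `log_specs` in the logging request.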

Search with models

Rescore phase with an SLTR query: in the rescore phase, an sltr query applies the trained model to rerank the top documents returned by the query phase, using the features defined in your learning-to-rank model.
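A sketch of what such a rescore request body could look like, built here as a Python dictionary. The model name `my_model`, the `window_size` value, and the first-pass `match` query are assumptions; no model is created anywhere in this thread:

```python
import json


def build_rescore_request(keywords, model_name, window_size=100):
    """Build a search body that reranks the top hits with an sltr rescore query."""
    return {
        # First pass: an ordinary query selects the candidate documents.
        "query": {"match": {"title": keywords}},
        # Second pass: the model rescores the top `window_size` hits.
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "sltr": {
                        "params": {"keywords": keywords},
                        "model": model_name,
                    }
                }
            },
        },
    }


body = build_rescore_request("rambo", "my_model")  # "my_model" is hypothetical
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of `POST tmdb/_search`, in the same way as the logging request in Step 3.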


msfroh · 2023-11-15T17:11:29Z

Should this be part of a plugin README? Maybe contribute to the doc website once we launch?

macohen · 2023-11-20T12:44:56Z

Generally, I see the "how to use it" documentation going on the doc website. The "how to build it" material and the details of how it works should go into the repo as a README (great idea). The sequence diagram should be part of the README along with details about the code. Requests and responses should go into the docs. Also, even if we only have a self-install plugin, as long as it works we can add this to the documentation site.

@noCharger do you want to make an attempt at this separation when you get a chance? BTW, nice job on the diagram. Maybe good for a review in an upcoming public search relevance meeting.

cc: @epugh for any feedback...

msfroh added documentation Improvements or additions to documentation and removed untriaged labels Nov 15, 2023

noCharger mentioned this issue Dec 13, 2023

[FEATURE] Support remote inference on LTR plugin #27

Open
