diff --git a/_automating-configurations/workflow-templates.md b/_automating-configurations/workflow-templates.md
new file mode 100644
index 0000000000..22b0a6e488
--- /dev/null
+++ b/_automating-configurations/workflow-templates.md
@@ -0,0 +1,144 @@
+---
+layout: default
+title: Workflow templates
+nav_order: 25
+---
+
+# Workflow templates
+
+OpenSearch provides several workflow templates for some common machine learning (ML) use cases, such as semantic or conversational search.
+
+You can specify a workflow template when you call the [Create Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/). To provision the workflow, specify `provision=true` as a query parameter. For example, you can configure [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/) by using the `local_neural_sparse_search_bi_encoder` workflow template, as shown in the following request:
+
+```json
+POST /_plugins/_flow_framework/workflow?use_case=local_neural_sparse_search_bi_encoder
+```
+{% include copy-curl.html %}
+
+The workflow created using this template performs the following configuration steps:
+
+- Deploys the default pretrained sparse encoding model (`amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`)
+- Creates an ingest pipeline that contains a `sparse_encoding` processor, which converts the text in a document field to vector embeddings using the deployed model
+- Creates a sample index for sparse search, specifying the default pipeline as the newly created ingest pipeline
+
+## Parameters
+
+Each workflow template has a defined schema and a set of APIs with predefined defaults for each step. For more information about template parameter default values, see [Supported workflow templates](#supported-workflow-templates).
+
+### Overwriting default values
+
+To overwrite the default values, provide the new values in the request body when sending a create workflow request. For example, the following request changes the Cohere model, the name of the `text_embedding` processor output field, and the name of the index:
+
+```json
+POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding
+{
+  "create_connector.model" : "embed-multilingual-v3.0",
+  "text_embedding.field_map.output": "book_embedding",
+  "create_index.name": "sparse-book-index"
+}
+```
+{% include copy-curl.html %}
+
+## Example
+
+In this example, you'll configure the `semantic_search_with_cohere_embedding` workflow template. The workflow created using this template performs the following configuration steps:
+
+- Deploys a Cohere externally hosted model
+- Creates an ingest pipeline using the model
+- Creates a sample k-NN index and configures a search pipeline to define the default model ID for that index
+
+### Step 1: Create and provision the workflow
+
+Send the following request to create and provision a workflow using the `semantic_search_with_cohere_embedding` workflow template. The only required request body field for this template is the API key for the Cohere Embed model:
+
+```json
+POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding&provision=true
+{
+  "create_connector.credential.key" : "<YOUR API KEY>"
+}
+```
+{% include copy-curl.html %}
+
+OpenSearch responds with a workflow ID for the created workflow:
+
+```json
+{
+  "workflow_id" : "8xL8bowB8y25Tqfenm50"
+}
+```
+
+The workflow in the previous step creates a default k-NN index. The default index name is `my-nlp-index`:
+
+```json
+{
+  "create_index.name": "my-nlp-index"
+}
+```
+
+For all default parameter values for this workflow template, see [Cohere Embed semantic search defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json).
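The Step 1 request can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_create_workflow_request` is hypothetical, the endpoint path and parameter names are taken from the requests above, and the API key value is a placeholder. The sketch only builds the method, path, and JSON body. It does not send anything to a cluster.

```python
import json
from urllib.parse import urlencode

# Endpoint path as it appears in the requests above.
BASE_PATH = "/_plugins/_flow_framework/workflow"


def build_create_workflow_request(use_case, overrides=None, provision=False):
    """Return (method, path, body) for a Create Workflow call.

    Hypothetical helper for illustration only; nothing is sent.
    """
    params = {"use_case": use_case}
    if provision:
        # The page states that provision=true provisions the workflow.
        params["provision"] = "true"
    path = f"{BASE_PATH}?{urlencode(params)}"
    # Overrides (or required parameters) go in the request body as flat keys.
    body = json.dumps(overrides or {})
    return "POST", path, body


method, path, body = build_create_workflow_request(
    "semantic_search_with_cohere_embedding",
    overrides={"create_connector.credential.key": "<YOUR API KEY>"},  # placeholder
    provision=True,
)
print(method, path)
print(body)
```

The returned triple can be passed to whichever HTTP client you already use for your cluster.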
+
+### Step 2: Ingest documents into the index
+
+To ingest documents into the index created in the previous step, send the following request:
+
+```json
+PUT /my-nlp-index/_doc/1
+{
+  "passage_text": "Hello world",
+  "id": "s1"
+}
+```
+{% include copy-curl.html %}
+
+### Step 3: Perform vector search
+
+To perform a vector search on your index, use a [`neural` query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/) clause:
+
+```json
+GET /my-nlp-index/_search
+{
+  "_source": {
+    "excludes": [
+      "passage_embedding"
+    ]
+  },
+  "query": {
+    "neural": {
+      "passage_embedding": {
+        "query_text": "Hi world",
+        "k": 100
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Viewing workflow resources
+
+The workflow you created provisioned all the necessary resources for semantic search. To view the provisioned resources, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/) and provide the `workflow_id` for your workflow:
+
+```json
+GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_status
+```
+{% include copy-curl.html %}
+
+## Supported workflow templates
+
+| Template name | Description | Required parameters | Defaults |
+| `bedrock-titan-embedding_model_deploy` | Creates and deploys an Amazon Bedrock embedding model (by default, `titan-embed-text-v1`). | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-embedding-defaults.json) |
+| `bedrock-titan-multimodal_model_deploy` | Creates and deploys an Amazon Bedrock multimodal embedding model (by default, `titan-embed-image-v1`). | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-multimodal-defaults.json) |
+| `cohere-embedding_model_deploy`| Creates and deploys a Cohere embedding model (by default, `embed-english-v3.0`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-defaults.json) |
+| `cohere-chat_model_deploy` | Creates and deploys a Cohere chat model (by default, Cohere Command). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-chat-defaults.json) |
+| `open_ai_embedding_model_deploy` | Creates and deploys an OpenAI embedding model (by default, `text-embedding-ada-002`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-embedding-defaults.json) |
+| `openai-chat_model_deploy` | Creates and deploys an OpenAI chat model (by default, `gpt-3.5-turbo`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-chat-defaults.json) |
+| `local_neural_sparse_search_bi_encoder` | Configures [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): <br> - Deploys a pretrained sparse encoding model <br> - Creates an ingest pipeline with a sparse encoding processor <br> - Creates a sample index to use for sparse search, specifying the newly created pipeline as the default pipeline | None |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/local-sparse-search-biencoder-defaults.json) |
+| `semantic_search` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): <br> - Creates an ingest pipeline with a `text_embedding` processor and a k-NN index <br> You must provide the model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) |
+| `semantic_search_with_query_enricher` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) similarly to the `semantic_search` template. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID for neural queries. You must provide the model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-query-enricher-defaults.json) |
+| `semantic_search_with_cohere_embedding` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) and deploys a Cohere embedding model. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID for neural queries. You must provide the API key for the Cohere model. | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json) |
+| `multi_modal_search` | Configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide the model ID of the multimodal embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multi-modal-search-defaults.json) |
+| `multi_modal_search_with_bedrock_titan_multi_modal` | Deploys an Amazon Bedrock multimodal model and configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide your AWS credentials. | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multimodal-search-bedrock-titan-defaults.json) |
+| `hybrid_search` | Configures [hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): <br> - Creates an ingest pipeline, a k-NN index, and a search pipeline with a `normalization_processor`. You must provide the model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/hybrid-search-defaults.json) |
+| `conversational_search_with_llm_deploy` | Deploys a large language model (LLM) (by default, Cohere Chat) and configures a search pipeline with a `retrieval_augmented_generation` processor for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/conversational-search-defaults.json) |
+
+
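The "Required parameters" column of the table above lends itself to a client-side check before a request is sent. The following Python sketch is one possible way to do that, under stated assumptions: the dictionary copies only a few rows of the table, and the function name `missing_required_parameters` is hypothetical and not part of OpenSearch. How the server itself reacts to a missing parameter is not covered here.

```python
# Required parameters copied from a few rows of the table above.
# This mapping is illustrative and deliberately incomplete.
REQUIRED_PARAMETERS = {
    "local_neural_sparse_search_bi_encoder": [],
    "semantic_search": ["create_ingest_pipeline.model_id"],
    "semantic_search_with_cohere_embedding": ["create_connector.credential.key"],
    "hybrid_search": ["create_ingest_pipeline.model_id"],
}


def missing_required_parameters(use_case, body):
    """Return required parameters that are absent or empty in the request body.

    Hypothetical helper; raises KeyError for templates not listed above.
    """
    required = REQUIRED_PARAMETERS[use_case]
    return [name for name in required if not body.get(name)]


# An empty body for semantic_search lacks the model ID.
print(missing_required_parameters("semantic_search", {}))

# A body that supplies the (placeholder) API key passes the check.
print(
    missing_required_parameters(
        "semantic_search_with_cohere_embedding",
        {"create_connector.credential.key": "<YOUR API KEY>"},
    )
)
```

An empty result list means every required parameter listed for that template is present in the body.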