This repository has been archived by the owner on Aug 16, 2022. It is now read-only.

Merge pull request #20 from opendistro/master
merge
ashwinkumar12345 authored Feb 18, 2021
2 parents 47ddf27 + 6db6437 commit dc77c72
Showing 34 changed files with 693 additions and 159 deletions.
15 changes: 8 additions & 7 deletions Gemfile
@@ -8,21 +8,22 @@ source "https://rubygems.org"
#
# This will help ensure the proper Jekyll version is running.
# Happy Jekylling!
gem "jekyll", "~> 3.8.5"
# gem "jekyll", "~> 3.9.0"

# This is the default theme for new Jekyll sites. You may change this to anything you like.
gem "just-the-docs", "~> 0.3.3"

# If you want to use GitHub Pages, remove the "gem "jekyll"" above and
# uncomment the line below. To upgrade, run `bundle update github-pages`.
# gem "github-pages", group: :jekyll_plugins

gem 'github-pages', group: :jekyll_plugins

# If you have any plugins, put them here!
group :jekyll_plugins do
# gem "jekyll-feed", "~> 0.6"
gem "jekyll-remote-theme"
gem "jekyll-redirect-from"
end
# group :jekyll_plugins do
# # gem "jekyll-feed", "~> 0.6"
# gem "jekyll-remote-theme"
# gem "jekyll-redirect-from"
# end

# Windows does not include zoneinfo files, so bundle the tzinfo-data gem
gem "tzinfo-data", platforms: [:mingw, :mswin, :x64_mingw, :jruby]
6 changes: 5 additions & 1 deletion README.md
@@ -201,10 +201,14 @@ If you're making major changes to the documentation and need to see the rendered

Use `curl -XGET https://localhost:9200 -u admin:admin -k` to verify the Elasticsearch version.

1. Update the plugin compatibility table in `docs/install/plugin.md` and `docs/kibana/plugins.md`.
1. Update the plugin compatibility table in `docs/install/plugin.md`.

Use `curl -XGET https://localhost:9200/_cat/plugins -u admin:admin -k` to get the correct version strings.

1. Update the plugin compatibility table in `docs/kibana/plugins.md`.

Use `docker ps` to find the ID for the Kibana node. Then use `docker exec -it <kibana-node-id> /bin/bash` to get shell access. Finally, run `./bin/kibana-plugin list` to get the plugins and version strings.

1. Run a build (`build.sh`), and look for any warnings or errors you introduced.
1. Verify that the individual plugin download links in `docs/install/plugins.md` and `docs/kibana/plugins.md` work.
1. Check for any other bad links (`check-links.sh`). Expect a few false positives for the `localhost` links.
4 changes: 2 additions & 2 deletions _config.yml
@@ -20,8 +20,8 @@ baseurl: "/for-elasticsearch-docs" # the subpath of your site, e.g. /blog
url: "https://opendistro.github.io" # the base hostname & protocol for your site, e.g. http://example.com
permalink: pretty

odfe_version: 1.12.0
es_version: 7.10.0
odfe_version: 1.13.0
es_version: 7.10.2

# Build settings
markdown: kramdown
251 changes: 251 additions & 0 deletions docs/async/index.md
@@ -0,0 +1,251 @@
---
layout: default
title: Asynchronous Search
nav_order: 51
has_children: true
---

# Asynchronous Search

Searching large volumes of data can take a long time, especially if you're searching across warm nodes or multiple remote clusters.

Asynchronous search lets you run search requests in the background. You can monitor the progress of these searches and get back partial results as they become available. After the search finishes, you can save the results to examine at a later time.

## REST API

To perform an asynchronous search, send requests to `_opendistro/_asynchronous_search`, with your query in the request body:

```json
POST _opendistro/_asynchronous_search
```

You can specify the following options.

Options | Description | Default value | Required
:--- | :--- | :--- | :---
`wait_for_completion_timeout` | Specifies the amount of time that you plan to wait for the results. You can see whatever results you get within this time just like in a normal search. You can poll the remaining results based on an ID. The maximum value is 300 seconds. | 1 second | No
`keep_on_completion` | Specifies whether you want to save the results in the cluster after the search is complete. You can examine the stored results at a later time. | `false` | No
`keep_alive` | Specifies the amount of time that the result is saved in the cluster. For example, `2d` means that the results are stored in the cluster for 48 hours. The saved search results are deleted after this period or if the search is canceled. Note that this period includes the query execution time. If the query overruns this time, the process cancels the query automatically. | 12 hours | No

#### Sample request

```json
POST _opendistro/_asynchronous_search/?pretty&size=10&wait_for_completion_timeout=1ms&keep_on_completion=true&request_cache=false
{
"aggs": {
"city": {
"terms": {
"field": "city",
"size": 10
}
}
}
}
```
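
As a rough illustration of how the options above combine into a request, the following Python sketch builds the URL and JSON body without sending anything. The helper name `build_async_search_request`, the base URL `https://localhost:9200`, and the default strings `1s` and `12h` (standing in for the documented defaults of 1 second and 12 hours) are assumptions made for this example, not part of the plugin.

```python
import json
from urllib.parse import urlencode


# Hypothetical helper for illustration only; not part of the plugin or any client library.
def build_async_search_request(base_url, query,
                               wait_for_completion_timeout="1s",
                               keep_on_completion=False,
                               keep_alive="12h"):
    """Return the (url, body) pair for an asynchronous search submission."""
    params = {
        "wait_for_completion_timeout": wait_for_completion_timeout,
        # Query-string booleans are sent as lowercase text.
        "keep_on_completion": str(keep_on_completion).lower(),
        "keep_alive": keep_alive,
    }
    url = f"{base_url}/_opendistro/_asynchronous_search?{urlencode(params)}"
    return url, json.dumps(query)


# Assumed local cluster address; adjust for your environment.
url, body = build_async_search_request(
    "https://localhost:9200",
    {"aggs": {"city": {"terms": {"field": "city", "size": 10}}}},
    wait_for_completion_timeout="1ms",
    keep_on_completion=True,
    keep_alive="2d",
)
print(url)
print(body)
```

Sending the request (for example, as a `POST` with basic authentication) is left out so that the sketch stays independent of any particular HTTP client.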

#### Sample response

```json
{
"id": "FklfVlU4eFdIUTh1Q1hyM3ZnT19fUVEUd29KLWZYUUI3TzRpdU5wMjRYOHgAAAAAAAAABg==",
"state": "RUNNING",
"start_time_in_millis": 1599833301297,
"expiration_time_in_millis": 1600265301297,
"response": {
"took": 15,
"timed_out": false,
"terminated_early": false,
"num_reduce_phases": 4,
"_shards": {
"total": 21,
"successful": 4,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 807,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"city": {
"doc_count_error_upper_bound": 16,
"sum_other_doc_count": 403,
"buckets": [
{
"key": "downsville",
"doc_count": 1
},
....
....
....
{
"key": "blairstown",
"doc_count": 1
}
]
}
}
}
}
```

#### Response parameters

Parameter | Description
:--- | :---
`id` | The ID of an asynchronous search. Use this ID to monitor the progress of the search, get its partial results, and/or delete the results. If the asynchronous search finishes within the timeout period, the response doesn't include the ID because the results aren't stored in the cluster.
`state` | Specifies whether the search is still running or if it has finished, and if the results persist in the cluster. The possible states are `RUNNING`, `COMPLETED`, and `PERSISTED`.
`start_time_in_millis` | The start time in milliseconds.
`expiration_time_in_millis` | The expiration time in milliseconds.
`took` | The total time, in milliseconds, that the search has been running.
`response` | The actual search response.
`num_reduce_phases` | The number of times that the coordinating node aggregates results from batches of shard responses (5 by default). If this number increases compared to the last retrieved results, you can expect additional results to be included in the search response.
`total` | The total number of shards that run the search.
`successful` | The number of shard responses that the coordinating node received successfully.
`aggregations` | The partial aggregation results that have been completed by the shards so far.
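
One way to read these fields together is to derive a simple progress summary from a parsed response. The sketch below is illustrative only: the function name `summarize` and the ID `example-id` are hypothetical, and the remaining values are copied from the sample response above.

```python
def summarize(async_response):
    """Extract progress indicators from a parsed asynchronous search response."""
    inner = async_response.get("response", {})
    shards = inner.get("_shards", {})
    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    return {
        # The ID is absent when the search finishes within the timeout period.
        "id": async_response.get("id"),
        "state": async_response.get("state"),
        # Fraction of shards that have responded successfully so far.
        "shard_progress": successful / total if total else None,
        "num_reduce_phases": inner.get("num_reduce_phases"),
    }


# Trimmed copy of the sample response; "example-id" is a placeholder.
sample = {
    "id": "example-id",
    "state": "RUNNING",
    "response": {
        "took": 15,
        "num_reduce_phases": 4,
        "_shards": {"total": 21, "successful": 4, "skipped": 0, "failed": 0},
    },
}

summary = summarize(sample)
print(summary)
```

Comparing `num_reduce_phases` between two successive summaries indicates whether additional results have been merged since the last retrieval.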

## Get partial results

After you submit an asynchronous search request, you can request partial responses with the ID that you see in the asynchronous search response.

```json
GET _opendistro/_asynchronous_search/<ID>?pretty
```

#### Sample response

```json
{
"id": "Fk9lQk5aWHJIUUltR2xGWnpVcWtFdVEURUN1SWZYUUJBVkFVMEJCTUlZUUoAAAAAAAAAAg==",
"state": "STORE_RESIDENT",
"start_time_in_millis": 1599833907465,
"expiration_time_in_millis": 1600265907465,
"response": {
"took": 83,
"timed_out": false,
"_shards": {
"total": 20,
"successful": 20,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1000,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "bank",
"_type": "_doc",
"_id": "1",
"_score": 1,
"_source": {
"email": "[email protected]",
"city": "Brogan",
"state": "IL"
}
},
{....}
]
},
"aggregations": {
"city": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 997,
"buckets": [
{
"key": "belvoir",
"doc_count": 2
},
{
"key": "aberdeen",
"doc_count": 1
},
{
"key": "abiquiu",
"doc_count": 1
}
]
}
}
}
}
```

After the response is successfully persisted, you get back the `STORE_RESIDENT` state in the response.

You can poll the ID with the `wait_for_completion_timeout` parameter to wait for results for up to the amount of time that you specify.

For asynchronous searches with `keep_on_completion` set to `true` and a sufficiently long `keep_alive` time, you can keep polling the IDs until the search finishes. If you don’t want to periodically poll each ID, you can retain the results in your cluster with the `keep_alive` parameter and come back to them at a later time.
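
The polling pattern described above might look like the following minimal sketch. It assumes a caller-supplied `fetch` function that returns the parsed JSON of `GET _opendistro/_asynchronous_search/<ID>`, and it treats any state other than `RUNNING` as finished, which is a simplification. The names `poll_until_done` and `fake_fetch` are hypothetical, and the fake responses stand in for a real cluster.

```python
import time


def poll_until_done(fetch, search_id, interval_seconds=1.0, max_polls=10):
    """Call fetch(search_id) until the state is no longer RUNNING or max_polls is reached."""
    last = None
    for _ in range(max_polls):
        last = fetch(search_id)
        # Assumption: any state other than RUNNING means the search has finished.
        if last.get("state") != "RUNNING":
            return last
        time.sleep(interval_seconds)
    return last


# Stand-in for real HTTP calls: two running responses, then a persisted one.
canned = iter([
    {"state": "RUNNING"},
    {"state": "RUNNING"},
    {"state": "STORE_RESIDENT"},
])
calls = []


def fake_fetch(search_id):
    calls.append(search_id)
    return next(canned)


result = poll_until_done(fake_fetch, "example-id", interval_seconds=0)
print(result["state"], len(calls))
```

Passing the fetch function in as a parameter keeps the loop independent of the HTTP client and makes it easy to exercise without a cluster.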

## Delete searches and results

You can use the DELETE API operation to delete any asynchronous search by its ID. If the search is still running, it’s canceled. If the search is complete, the saved search results are deleted.

```json
DELETE _opendistro/_asynchronous_search/<ID>?pretty
```

#### Sample response

```json
{
"acknowledged": "true"
}
```

## Monitor stats

You can use the stats API operation to monitor asynchronous searches that are running, completed, and/or persisted.

```json
GET _opendistro/_asynchronous_search/stats
```

#### Sample response

```json
{
"_nodes": {
"total": 8,
"successful": 8,
"failed": 0
},
"cluster_name": "264071961897:asynchronous-search",
"nodes": {
"JKEFl6pdRC-xNkKQauy7Yg": {
"asynchronous_search_stats": {
"submitted": 18236,
"initialized": 112,
"search_failed": 56,
"search_completed": 56,
"rejected": 18124,
"persist_failed": 0,
"cancelled": 1,
"running_current": 399,
"persisted": 100
}
}
}
}
```

#### Response parameters

Parameter | Description
:--- | :---
`submitted` | The number of asynchronous search requests that were submitted.
`initialized` | The number of asynchronous search requests that were initialized.
`rejected` | The number of asynchronous search requests that were rejected.
`search_completed` | The number of asynchronous search requests that completed with a successful response.
`search_failed` | The number of asynchronous search requests that completed with a failed response.
`persisted` | The number of asynchronous search requests whose final result successfully persisted in the cluster.
`persist_failed` | The number of asynchronous search requests whose final result failed to persist in the cluster.
`running_current` | The number of asynchronous search requests that are running on a given coordinator node.
`cancelled` | The number of asynchronous search requests that were canceled while the search was running.
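
Because the stats response reports counters per node, a cluster-wide view requires summing them. The sketch below shows one possible way to do that on a parsed response. The function name `total_stats`, the second node ID, and that node's counter values are hypothetical; the first node's values come from the sample response above.

```python
from collections import Counter


def total_stats(stats_response):
    """Sum the asynchronous_search_stats counters across all nodes."""
    totals = Counter()
    for node in stats_response.get("nodes", {}).values():
        totals.update(node.get("asynchronous_search_stats", {}))
    return dict(totals)


stats = {
    "nodes": {
        # Values taken from the sample response.
        "JKEFl6pdRC-xNkKQauy7Yg": {
            "asynchronous_search_stats": {
                "submitted": 18236,
                "rejected": 18124,
                "running_current": 399,
            }
        },
        # Hypothetical second node, added only to show the aggregation.
        "hypothetical-node-2": {
            "asynchronous_search_stats": {
                "submitted": 10,
                "rejected": 0,
                "running_current": 1,
            }
        },
    }
}

totals = total_stats(stats)
print(totals)
```
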
76 changes: 76 additions & 0 deletions docs/async/security.md
@@ -0,0 +1,76 @@
---
layout: default
title: Asynchronous Search Security
nav_order: 2
parent: Asynchronous Search
has_children: false
---

# Asynchronous search security

You can use the security plugin with asynchronous searches to limit non-admin users to specific actions. For example, you might want some users to only be able to submit or delete asynchronous searches, while you might want others to only view the results.

All asynchronous search indices are protected as system indices. Only a super admin user or an admin user with a Transport Layer Security (TLS) certificate can access system indices. For more information, see [System indices](../../security/configuration/system-indices/).

## Basic permissions

As an admin user, you can use the security plugin to assign specific permissions to users based on which API operations they need access to. For a list of supported API operations, see [Asynchronous search](../).

The security plugin has two built-in roles that cover most asynchronous search use cases: `asynchronous_search_full_access` and `asynchronous_search_read_access`. For descriptions of each, see [Predefined roles](../../security/access-control/users-roles/#predefined-roles).

If these roles don’t meet your needs, mix and match individual asynchronous search permissions to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opendistro/asynchronous_search/delete` permission lets you delete a previously submitted asynchronous search.

## (Advanced) Limit access by backend role

Use backend roles to configure fine-grained access to asynchronous searches based on roles. For example, users of different departments in an organization can view asynchronous searches owned by their own department.

First, make sure that your users have the appropriate [backend roles](../../security/access-control/). Backend roles usually come from an [LDAP server](../../security/configuration/ldap/) or [SAML provider](../../security/configuration/saml/). However, if you use the internal user database, you can use the REST API to [add them manually](../../security/access-control/api/#create-user).

Now when users view asynchronous search resources in Kibana (or make REST API calls), they only see asynchronous searches that are submitted by users whose backend roles are a subset of their own.
For example, consider two users: `judy` and `elon`.

`judy` has an IT backend role:

```json
PUT _opendistro/_security/api/internalusers/judy
{
"password": "judy",
"backend_roles": [
"IT"
],
"attributes": {}
}
```

`elon` has an admin backend role:

```json
PUT _opendistro/_security/api/internalusers/elon
{
"password": "elon",
"backend_roles": [
"admin"
],
"attributes": {}
}
```

Both `judy` and `elon` have full access to asynchronous search:

```json
PUT _opendistro/_security/api/rolesmapping/asynchronous_search_full_access
{
"backend_roles": [],
"hosts": [],
"users": [
"judy",
"elon"
]
}
```

Because they have different backend roles, an asynchronous search submitted by `judy` will not be visible to `elon` and vice versa.

To see `elon`'s asynchronous searches, `judy` needs to have at least a superset of all the backend roles that `elon` has.

For example, if `judy` has five backend roles and `elon` has only one of these roles, then `judy` can see asynchronous searches submitted by `elon`, but `elon` can’t see the asynchronous searches submitted by `judy`. This means that `judy` can perform GET and DELETE operations on asynchronous searches that are submitted by `elon`, but not the reverse.
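
The superset rule can be expressed as a small set comparison. The following sketch only models the rule as described here and is not the security plugin's implementation. The function name `can_access` and the role names other than `IT` and `admin` are hypothetical, and the behavior for users with no backend roles is not covered by this page.

```python
def can_access(viewer_roles, owner_roles):
    """Return True if the viewer's backend roles are a superset of the owner's."""
    return set(viewer_roles) >= set(owner_roles)


# The users as defined above: disjoint roles, so neither sees the other's searches.
judy_sees_elon = can_access(["IT"], ["admin"])
elon_sees_judy = can_access(["admin"], ["IT"])

# The five-role example: judy holds five roles, elon holds one of them.
judy_roles = ["IT", "admin", "finance", "hr", "ops"]  # hypothetical role names
elon_roles = ["admin"]
judy_sees_elon_five = can_access(judy_roles, elon_roles)
elon_sees_judy_five = can_access(elon_roles, judy_roles)

print(judy_sees_elon, elon_sees_judy, judy_sees_elon_five, elon_sees_judy_five)
```
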