Add documentation for k-NN Faiss SQfp16 #6249
Thank you, @naveentatikonda! A couple of suggestions, and then we'll move this PR to editorial review.
f73a858
to
9ef2e13
Compare
@kolchfa-aws Thanks for reviewing it. I have addressed your review comments.
@kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!
@kolchfa-aws @natebower I need to make more changes to this existing documentation. I will address all the review comments and update it on Monday.
Unfortunately, we need to postpone this feature to 2.13 due to some build-related issues. @kolchfa-aws, can you please help update the labels on the PR and the GitHub issue? Thanks!
@naveentatikonda - Has anything changed, or is this content good to go? Thanks!
This documentation needs to be updated. I will make changes this week. Thanks!
9ef2e13
to
3541335
Compare
fd2db18
to
7459e8d
Compare
Signed-off-by: Naveen Tatikonda <[email protected]>
7459e8d
to
8980923
Compare
@naveentatikonda I addressed your comments.
Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
…tion-website into add_knn_sqfp16
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws can this be merged?
@jmazanec15 @naveentatikonda requested a tech review on this PR. Once the tech review is done, we will do an editorial review and then we'll merge.
@jmazanec15 I'm waiting for @vamshin to review this PR before moving it to editorial review.
_search-plugins/knn/knn-index.md
Outdated
## Lucene byte vector

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount of storage space needed. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).

## SIMD optimization for the Faiss engine

Starting with version 2.13, the k-NN plugin supports [Single Instruction Multiple Data (SIMD)](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) processing if the underlying hardware supports SIMD instructions (AVX2 on x64 architecture and Neon on ARM64 architecture). SIMD is supported by default on Linux machines only for the Faiss engine. SIMD architecture helps boost the overall performance by improving indexing throughput and reducing search latency.
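As a rough illustration of the hardware condition described above, the following sketch checks `/proc/cpuinfo`-style text for AVX2 or Neon support. It is not part of the k-NN plugin; the function name `simd_family` is hypothetical, and the sketch assumes the usual Linux conventions in which x64 kernels list capabilities on a `flags` line and ARM64 kernels report Neon (Advanced SIMD) as `asimd` on a `Features` line. It only says what the CPU advertises, not what the plugin actually uses.

```python
from typing import Optional


def simd_family(cpuinfo_text: str) -> Optional[str]:
    """Return "avx2", "neon", or None for /proc/cpuinfo-style text."""
    tokens = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x64 kernels use a "flags" line; ARM64 kernels use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            tokens.update(value.split())
    if "avx2" in tokens:
        return "avx2"
    # Assumption: ARM64 Linux reports Neon (Advanced SIMD) as "asimd".
    if "asimd" in tokens or "neon" in tokens:
        return "neon"
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(simd_family(f.read()))
    except OSError:
        print("cpuinfo not available on this platform")
```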
> SIMD is supported by default on Linux machines only for the Faiss engine.

SIMD should be CPU architecture dependent, right? Why do we say only Linux machines?
Yes, SIMD is CPU architecture dependent. However, right now we are running into some issues on Windows because of compiler limitations, so we are supporting SIMD only on Linux and on macOS (for development only). That's why we are explicitly calling out that it works on Linux.
_search-plugins/knn/knn-index.md
Outdated
- You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. faiss has several encoder types, but the plugin currently only supports *flat* and *pq* encoding.
+ You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. Faiss has several encoder types, but the plugin currently only supports `flat`, `pq`, and `sq` encoding.
> Faiss has several encoder types, but the plugin currently only supports `flat`, `pq`, and `sq` encoding

k-NN plugin currently supports `flat`, `pq`, and `sq` encoders from Faiss library?
ack

Parameter name | Required | Default | Updatable | Description
:--- | :--- | :-- | :--- | :---
`type` | false | `fp16` | false | The type of scalar quantization to be used to encode 32-bit float vectors into the corresponding type. As of OpenSearch 2.13, only the `fp16` encoder type is supported. For the `fp16` encoder, vector values must be in the [-65504.0, 65504.0] range.
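
For illustration only, here is a sketch of where the `type` parameter from the table above could sit in an index creation request body, built as a Python dictionary. The `sq` encoder name, the `fp16` type, and the `method.parameters.encoder` path come from this PR; the surrounding layout (`settings`, `knn_vector`, `hnsw`, `space_type`) is assumed from general k-NN index mapping conventions, and the helper name `sq_fp16_mapping`, the field name, and the dimension are hypothetical.

```python
import json


def sq_fp16_mapping(field: str, dimension: int) -> dict:
    """Build a request body for a Faiss index that uses the sq/fp16 encoder."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        # Assumed method settings, not taken from this PR.
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {
                            # Encoder settings described in the table above.
                            "encoder": {
                                "name": "sq",
                                "parameters": {"type": "fp16"},
                            }
                        },
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # "my_vector" and 8 are placeholder values.
    print(json.dumps(sq_fp16_mapping("my_vector", 8), indent=2))
```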
> For the `fp16` encoder, vector values must be in the [-65504.0, 65504.0] range.

By default `fp16` encoder expects vector values to be in the [-65504.0, 65504.0] range.
Also, let's add the above as a note and probably bold/highlight it.
ack
We normally don't format sentences as a note in the parameter table.
Got it. Shall we add a note about this in the Faiss scalar quantization section?
_search-plugins/knn/knn-index.md
Outdated
@@ -221,6 +322,8 @@ If you want to use less memory and index faster than HNSW, while maintaining sim

If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

If you want to reduce the memory requirements by a factor of 2 (with very minimal loss of search quality) or by a factor of 4 (with a significant drop in search quality), consider vector quantization. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).
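
The factors of 2 and 4 follow from the bytes stored per vector value: 4 for a 32-bit float, 2 for `fp16`, and 1 for a byte. The sketch below works through that arithmetic for the raw vector data only. It deliberately ignores graph or index structure overhead, so it is not a sizing formula for a real cluster, and the names `BYTES_PER_VALUE` and `raw_vector_bytes` are hypothetical.

```python
# Bytes needed to store one vector component under each encoding.
BYTES_PER_VALUE = {"float32": 4, "fp16": 2, "byte": 1}


def raw_vector_bytes(num_vectors: int, dimension: int, encoding: str) -> int:
    """Return the bytes needed for the vector values alone (no index overhead)."""
    return num_vectors * dimension * BYTES_PER_VALUE[encoding]


if __name__ == "__main__":
    # Placeholder workload: one million 128-dimensional vectors.
    baseline = raw_vector_bytes(1_000_000, 128, "float32")
    for encoding in ("float32", "fp16", "byte"):
        size = raw_vector_bytes(1_000_000, 128, encoding)
        print(f"{encoding}: {size} bytes, reduction factor {baseline // size}")
```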
You can reduce the memory footprint by a factor of 2 by using the fp_16 encoder technique (provide link?) with minimal loss in search quality. If your vector dimensions fit in the byte range [-128, 128], we recommend using the byte quantizer (provide link?) to cut down the memory footprint by a factor of 4.
ack
The byte range is [-128, 127], correct?
Yes, the byte range is [-128, 127].
Signed-off-by: Fanit Kolchina <[email protected]>
LGTM! Thanks
@naveentatikonda @kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

Optionally, you can specify the parameters in `method.parameters.encoder`. For more information about parameters within the `encoder` object, see [SQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#sq-parameters).

The `fp16` encoder converts 32-bit vectors into their 16-bit counterparts. For this encoder type, the vector values must be in the [-65504.0, 65504.0] range. To define handling out-of-range values, the preceding request specifies the `clip` parameter. By default, this parameter is `false` and any vectors containing out-of-range values are rejected. When `clip` is set to `true` (as in the preceding request), out-of-range vector values are rounded up or down so that they are in the supported range. For example, if the original 32-bit vector is `[65510.82, -65504.1]`, the vector will be indexed as a 16-bit vector `[65504.0, -65504.0]`.
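
The clipping behavior described above can be sketched in plain Python. This is only a model of the documented semantics, not the plugin's or Faiss's implementation: the function name `encode_fp16` is hypothetical, and the round trip through the standard library's half-precision `struct` format (`"e"`) stands in for the actual 16-bit encoding.

```python
import math
import struct

FP16_MAX = 65504.0  # Largest finite half-precision value.


def encode_fp16(vector, clip=False):
    """Model fp16 encoding: reject or clip out-of-range values, then round to fp16."""
    encoded = []
    for value in vector:
        if abs(value) > FP16_MAX:
            if not clip:
                # Models the default behavior: out-of-range vectors are rejected.
                raise ValueError(f"{value} is outside [-{FP16_MAX}, {FP16_MAX}]")
            # Models clip=true: move the value to the nearest supported bound.
            value = math.copysign(FP16_MAX, value)
        # Round-trip through IEEE 754 half precision to apply fp16 rounding.
        encoded.append(struct.unpack("<e", struct.pack("<e", value))[0])
    return encoded


if __name__ == "__main__":
    print(encode_fp16([65510.82, -65504.1], clip=True))  # [65504.0, -65504.0]
```

In-range values still lose some precision in this model because half precision keeps fewer significant bits than a 32-bit float.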
What do we mean by "To define handling"?
Reworded.
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Description
Add documentation for the new k-NN Faiss encoder `SQfp16`, which quantizes 32-bit float vectors into 16-bit float values using scalar quantization, resulting in memory optimization with a very minimal loss of precision. It also boosts overall performance by enabling SIMD support (the vector dimension must be a multiple of 8) on Linux and macOS.
Issues Resolved
Closes #5038
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.