8343689: AArch64: Optimize MulReduction implementation #225
+320
−110
Add an SVE specialization of the reduce_mul intrinsic for vectors of 256 bits or longer. It multiplies the two halves of the source vector using SVE instructions, repeating as needed, until the partial result is 128 bits long and fits into a SIMD&FP register. From that point on, the existing ASIMD implementation is used.
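The halving scheme described above can be illustrated with a scalar Java model. This is only a sketch of the idea, not the code from this PR: the class name `MulReduceSketch`, the constant `LANES_128`, and both methods are hypothetical, the input length is assumed to be a power of two, and a plain loop stands in for the existing 128-bit ASIMD path. It uses `int` lanes because wrapping integer multiplication is associative and commutative, so the pairing order does not change the result.

```java
// Scalar model of the halving multiply-reduction (illustrative only).
final class MulReduceSketch {
    // Hypothetical constant: number of 32-bit int lanes in a 128-bit register.
    static final int LANES_128 = 4;

    // Reduces all lanes to a single product by multiplying the upper half
    // into the lower half until only 128 bits (4 int lanes) remain, then
    // finishing with a simple loop that stands in for the 128-bit path.
    // Assumes lanes.length is a power of two.
    static int reduceMul(int[] lanes) {
        int[] v = lanes.clone();
        int len = v.length;

        // Phase 1: models the wide-vector steps (e.g. 512 -> 256 -> 128 bits).
        while (len > LANES_128) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                v[i] *= v[i + half];
            }
            len = half;
        }

        // Phase 2: stand-in for the existing 128-bit reduction.
        int result = 1;
        for (int i = 0; i < len; i++) {
            result *= v[i];
        }
        return result;
    }

    // Reference: straightforward left-to-right product of all lanes.
    static int reduceMulSequential(int[] lanes) {
        int result = 1;
        for (int x : lanes) {
            result *= x;
        }
        return result;
    }
}
```

For an 8-lane input (a 256-bit vector of ints) the first phase runs once; for 16 lanes (512 bits) it runs twice. Floating-point lanes are not modeled here, since reordering floating-point multiplications can change rounding.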
Benchmark results for an AArch64 CPU with support for SVE with 256-bit vector length:
Benchmark results for an AArch64 CPU with support for SVE with 512-bit vector length:
Progress
Issue
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/panama-vector.git pull/225/head:pull/225
$ git checkout pull/225
Update a local copy of the PR:
$ git checkout pull/225
$ git pull https://git.openjdk.org/panama-vector.git pull/225/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 225
View PR using the GUI difftool:
$ git pr show -t 225
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/panama-vector/pull/225.diff
Using Webrev
Link to Webrev Comment