Half factorization #1712

Merged
merged 9 commits into from
Dec 3, 2024

Conversation

yhmtsai
Member

@yhmtsai yhmtsai commented Oct 25, 2024

This PR adds half-precision support to the factorizations.

HIP currently does not support atomic operations on 16-bit types.
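For context, a common way to get a 16-bit atomic update where only wider atomics exist is a compare-and-swap loop on the enclosing 32-bit word. The sketch below shows that idea on the host with `std::atomic`; it is not what this PR does for HIP and not a Ginkgo or HIP API. The helper name `atomic_add_u16` is hypothetical, and it adds 16-bit integers to stay self-contained; a half-precision version would convert, add, and convert back inside the loop.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Ginkgo or HIP): atomically adds `operand`
// to one 16-bit half of a 32-bit word using only a 32-bit compare-and-swap.
// `upper` selects the high (true) or low (false) 16 bits.
// Returns the previous 16-bit value.
inline std::uint16_t atomic_add_u16(std::atomic<std::uint32_t>& word,
                                    bool upper, std::uint16_t operand)
{
    const unsigned shift = upper ? 16u : 0u;
    const std::uint32_t mask = std::uint32_t{0xFFFFu} << shift;
    std::uint32_t expected = word.load();
    std::uint32_t desired;
    do {
        const auto old16 =
            static_cast<std::uint16_t>((expected & mask) >> shift);
        // the 16-bit addition wraps around on overflow
        const auto new16 = static_cast<std::uint16_t>(old16 + operand);
        // keep the untouched half, replace only the selected half
        desired = (expected & ~mask) | (std::uint32_t{new16} << shift);
        // on failure, `expected` is refreshed with the current word
    } while (!word.compare_exchange_weak(expected, desired));
    return static_cast<std::uint16_t>((expected & mask) >> shift);
}

// Usage: update the low half without disturbing the high half.
inline void atomic_add_u16_example()
{
    std::atomic<std::uint32_t> word{0x00010002u};
    const auto previous = atomic_add_u16(word, false, 5);
    assert(previous == 2);
    assert(word.load() == 0x00010007u);
}
```

The loop retries whenever another thread changed either half of the word in the meantime, so contention on the neighbouring 16-bit value also costs retries.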

NVHPC 23.3 appears to miscompile the index assignment on a custom class under optimization when IndexType is long. We work around it by setting the index explicitly through a volatile. NVHPC 24.1 seems to have fixed this issue.
https://godbolt.org/z/srYhGndKn
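A minimal sketch of what "setting the index explicitly with volatile" can look like, assuming a simple (index, value) class. The names `entry` and `set_index` are hypothetical and are not taken from the Ginkgo sources or from the linked reproducer.

```cpp
#include <cassert>

// Hypothetical stand-in for a custom (index, value) class; the names are
// illustrative only.
template <typename IndexType, typename ValueType>
struct entry {
    IndexType index;
    ValueType value;
};

// Writes `new_index` through a volatile-qualified lvalue. A volatile access
// counts as an observable side effect, so the compiler has to emit the store
// instead of optimizing it away.
template <typename IndexType, typename ValueType>
void set_index(entry<IndexType, ValueType>& e, IndexType new_index)
{
    volatile IndexType& idx = e.index;
    idx = new_index;
}

// Usage with IndexType = long, the case mentioned above.
inline void set_index_example()
{
    entry<long, float> e{0L, 1.0f};
    set_index(e, 42L);
    assert(e.index == 42L);
}
```

Whether this exact form sidesteps the NVHPC 23.3 behaviour is an assumption; the sketch only shows the mechanism.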

TODO:

  • add the fix for the triangular solve with half

@yhmtsai yhmtsai added the 1:ST:WIP This PR is a work in progress. Not ready for review. label Oct 25, 2024
@yhmtsai yhmtsai self-assigned this Oct 25, 2024
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. type:solver This is related to the solvers type:factorization This is related to the Factorizations reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. mod:all This touches all Ginkgo modules. labels Oct 25, 2024
@yhmtsai yhmtsai force-pushed the half_factorization branch from 3db59fd to cd9677a Compare October 28, 2024 16:12
@yhmtsai yhmtsai force-pushed the half_factorization branch from cd9677a to 5e5cd03 Compare October 28, 2024 17:19
@yhmtsai yhmtsai force-pushed the half_factorization branch from 5e5cd03 to c276034 Compare October 29, 2024 09:17
@yhmtsai yhmtsai force-pushed the half_factorization branch from c276034 to bbefde6 Compare October 29, 2024 18:21
@yhmtsai yhmtsai mentioned this pull request Oct 30, 2024
12 tasks
@yhmtsai yhmtsai added this to the Ginkgo 1.9.0 milestone Oct 30, 2024
@yhmtsai yhmtsai force-pushed the half_factorization branch from bbefde6 to 72d9d50 Compare November 4, 2024 14:24
@yhmtsai yhmtsai force-pushed the half_factorization branch from 72d9d50 to 88967e6 Compare November 4, 2024 18:15
@yhmtsai yhmtsai added 1:ST:ready-for-review This PR is ready for review and removed 1:ST:WIP This PR is a work in progress. Not ready for review. labels Nov 5, 2024
@yhmtsai yhmtsai force-pushed the half_factorization branch from 88967e6 to e667ec0 Compare November 5, 2024 18:03
@yhmtsai yhmtsai force-pushed the half_solver branch 2 times, most recently from 50ae4c1 to bba40e0 Compare November 7, 2024 14:40
@yhmtsai yhmtsai force-pushed the half_factorization branch from e667ec0 to c32201d Compare November 7, 2024 14:40
@MarcelKoch MarcelKoch self-requested a review November 11, 2024 11:25
Member

@MarcelKoch MarcelKoch left a comment

Generally LGTM. I have a question regarding atomics and HIP. The latest ROCm shows support for fp16 atomic operations: https://rocm.docs.amd.com/en/latest/reference/precision-support.html#atomic-operations-support, but TBH I can't figure out which operations exactly they mean by that. Did you try anything in that regard?

PairTypenameNameGenerator);


TYPED_TEST(ParIlut, KernelThresholdSelectIsEquivalentToRef)
{
using value_type = typename TestFixture::value_type;
Member

Many of the tests here are missing SKIP_HALF if compiling for HIP.

Member Author

We do not support compute_l_u_factors with half precision in HIP, but the other kernels still work with half precision in HIP.

Member Author

I see what you mean now.

@@ -212,13 +212,15 @@ struct CudaSolveStruct : gko::solver::SolveStruct {

size_type work_size{};

// TODO: In nullptr is considered nullptr_t not casted to const
// it does not work in cuda110/100 images
Member

nit:

Suggested change
// it does not work in cuda110/100 images
// Explicitly cast `nullptr` to `const ValueType*` to prevent compiler issues with cuda 10/11

Member Author

I think it is more of a host-compiler issue, because the call goes through our binding first, which takes a specific type.
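To illustrate why the cast matters on the host side, here is a minimal sketch that assumes the bindings are plain overloads per value type. The `binding` and `query` functions are hypothetical stand-ins, not the actual Ginkgo or cuSPARSE wrappers.

```cpp
#include <cassert>

// Hypothetical type-specific bindings, standing in for per-type wrappers
// around a vendor library call. They return a marker value so the chosen
// overload can be checked.
inline int binding(const float*) { return 4; }
inline int binding(const double*) { return 8; }

template <typename ValueType>
int query()
{
    // binding(nullptr) would not compile here: std::nullptr_t converts
    // equally well to both pointer types, so the call is ambiguous.
    // Casting to the concrete pointer type selects exactly one overload.
    return binding(static_cast<const ValueType*>(nullptr));
}

// Usage: each value type reaches its own overload.
inline void query_example()
{
    assert(query<float>() == 4);
    assert(query<double>() == 8);
}
```

With a bare `nullptr` the host compiler has no value type to pick an overload with, which fits the description above that the problem shows up before the CUDA library is ever reached.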

@yhmtsai yhmtsai force-pushed the half_factorization branch 2 times, most recently from d66627a to b58712a Compare November 30, 2024 01:30
@yhmtsai yhmtsai force-pushed the half_solver branch 2 times, most recently from 3a98c11 to ac216bc Compare November 30, 2024 18:36
@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. 1:ST:skip-full-test and removed 1:ST:ready-for-review This PR is ready for review labels Dec 3, 2024
Base automatically changed from half_solver to develop December 3, 2024 01:22
@yhmtsai yhmtsai force-pushed the half_factorization branch from 53a1d80 to e0e42b0 Compare December 3, 2024 01:24
@yhmtsai yhmtsai merged commit 304755d into develop Dec 3, 2024
7 of 11 checks passed
@yhmtsai yhmtsai deleted the half_factorization branch December 3, 2024 01:26
@ginkgo-bot
Member

Error: PR already merged!

Labels
1:ST:ready-to-merge This PR is ready to merge. 1:ST:skip-full-test mod:all This touches all Ginkgo modules. reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. reg:testing This is related to testing. type:factorization This is related to the Factorizations type:solver This is related to the solvers
4 participants