Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Float.toBits and Float.fromBits #6094

Merged
merged 1 commit into from
Nov 15, 2024
Merged

feat: add Float.toBits and Float.fromBits #6094

merged 1 commit into from
Nov 15, 2024

Conversation

leodemoura
Copy link
Member

This PR adds raw transmutation of floating-point numbers to and from UInt64. Floats and UInts share the same endianness across all supported platforms. The IEEE 754 standard precisely specifies the bit layout of floats. Note that Float.toBits is distinct from Float.toUInt64, which attempts to preserve the numeric value rather than the bitwise value.

closes #6071

This PR adds raw transmutation of floating-point numbers to and from `UInt64`. Floats and UInts share the same endianness across all supported platforms. The IEEE 754 standard precisely specifies the bit layout of floats. Note that `Float.toBits` is distinct from `Float.toUInt64`, which attempts to preserve the numeric value rather than the bitwise value.
Copy link

@seanmcl seanmcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow that was fast. Thanks!

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc November 15, 2024 19:44 Inactive
@digama0
Copy link
Collaborator

digama0 commented Nov 15, 2024

I don't think this is sound unless you clear NaN payloads first, or take a proof that the float is a canonical NaN. Because float implementations are nondeterministic in NaN payload bits and this violates the expectation of opaque functions being functional.

@leodemoura leodemoura added this pull request to the merge queue Nov 15, 2024
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Nov 15, 2024
@leanprover-community-bot
Copy link
Collaborator

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 688ee4c88722ca191981246f7732f25caed81cac --onto 9a8543347796e52070ff7936661ae48fcebfea60. (2024-11-15 19:46:29)

@seanmcl
Copy link

seanmcl commented Nov 15, 2024

I don't think this is sound unless you clear NaN payloads first

Canonicalization would be costly. Is what's missing is that no pair of nans should compare equal, even identical bit patterns?

Merged via the queue into master with commit ecbaeff Nov 15, 2024
18 checks passed
@digama0
Copy link
Collaborator

digama0 commented Nov 15, 2024

I don't think this is sound unless you clear NaN payloads first

Canonicalization would be costly. Is what's missing is that no pair of nans should compare equal, even identical bit patterns?

The opposite: two nans with different payloads should be equal according to lean's = relation. This is impossible if you have Float.toBits because UInt64 obviously can distinguish them, unless you make this function unsafe. Effectively, Float is really a quotient type and toBits is a variation on Quot.unquot.

@seanmcl
Copy link

seanmcl commented Nov 15, 2024

Why should they be equal? IEEE-754 semantics is encodable in Lean.

@digama0
Copy link
Collaborator

digama0 commented Nov 15, 2024

If NaNs are not treated as lean-equal, then you have much worse consequences: Float.add and friends cannot be functions of type Float -> Float -> Float at all, because they can produce different results given the same input, which is not something lean functions can do (provably). Taking a quotient allows for the nondeterminism allowed by the IEEE spec to be unobservable in lean which restores soundness, but it does so at the cost of making Float.toBits unsafe unless it targets not UInt64 itself but rather a quotient thereof (where the quotient relation is "these two bit patterns are equal, or they are both encodings of NaN").

IMO the most pragmatic solution which solves peoples' immediate needs here is to just provide this function as unsafe and let people wrap it with a safe function if they decide they don't want to care about this issue.

@nomeata
Copy link
Collaborator

nomeata commented Nov 15, 2024

I don't think this is sound

The function is defined as opaque - doesn't that mean that logically, its an unspecified function, and soundness is not under threat? Or am I missing something here?

@digama0
Copy link
Collaborator

digama0 commented Nov 15, 2024

Even opaque functions have logical constraints, because you can prove for any lean function that x = y -> f x = f y. In this case, the requirement is that when you call a lean function with the same input you get the same output every time, i.e. it's a logically pure function. This is not true if the underlying computation is nondeterministic. IEEE-754 is generally bit-precise, but it is deliberately underspecified around NaN payload bits, and a combination of compiler optimizations and hardware behavior mean that it is possible to observe actual nondeterminism in practice, although it requires some work to get it to appear directly in tests.

@nomeata
Copy link
Collaborator

nomeata commented Nov 15, 2024

Ok, I think get the issue now. Hmm, tricky.

@shigoel
Copy link

shigoel commented Nov 15, 2024

This may be of interest: double-float support in the ACL2 theorem prover:
https://www.cs.utexas.edu/~moore/publications/double-float.pdf

@seanmcl
Copy link

seanmcl commented Nov 15, 2024

Float.add and friends cannot be functions of type Float -> Float -> Float at all, because they can produce different results given the same input

So an example of this concern is something like having distinct NaN bit patterns, nan1 and nan2, and some expression like nan + 7 returns nan1 and nan2 nondeterministically, as allowed by the IEEE spec. I'm not an expert. Is there evidence this really happens on the same processor?

@digama0
Copy link
Collaborator

digama0 commented Nov 16, 2024

Two common reasons you might encounter nondeterminism like that:

  • The processor puts information regarding the instruction on which the failure happened into the payload bits as a kind of exception handling. This is not stable since you can easily perform the same operation on the same inputs in two different places.
  • One of the nan operations was evaluated by the hardware and the other one was evaluated by the compiler via constant propagation, and they differ on NaN handling. This was a recent source of issues in the Rust language, which tries to be correct around this kind of thing, and you can see that the spec they landed on is explicitly nondeterministic because they were not able to promise more than this under the circumstances. (I recommend reading that RFC, it is a good introduction to these issues.)

@digama0
Copy link
Collaborator

digama0 commented Nov 16, 2024

Note that this is not the first time this issue has come up, see #1459 for nondeterminism leakage through the Float.toString function.

@nomeata
Copy link
Collaborator

nomeata commented Nov 16, 2024

Does #6097 address these concerns raised here?

@digama0
Copy link
Collaborator

digama0 commented Nov 16, 2024

Yes it does. Was there also a function exposed for converting float arrays to byte arrays? That might need the same treatment.

JovanGerb pushed a commit to JovanGerb/lean4 that referenced this pull request Jan 21, 2025
This PR adds raw transmutation of floating-point numbers to and from
`UInt64`. Floats and UInts share the same endianness across all
supported platforms. The IEEE 754 standard precisely specifies the bit
layout of floats. Note that `Float.toBits` is distinct from
`Float.toUInt64`, which attempts to preserve the numeric value rather
than the bitwise value.

closes leanprover#6071
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-library Library toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC: Float <-> UInt8/ByteArray conversions
6 participants