Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract ML-KEM SIMD code to C #296

Merged
merged 99 commits into from
Jun 4, 2024
Merged
Changes from 1 commit
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
8c02e75
workarounds for charon
franziskuskiefer May 20, 2024
594ad84
Merge branch 'dev' into franziskus/ml-kem-c-extraction
franziskuskiefer May 20, 2024
9d0c481
refactor: kill `GenericOperations` in favor of functions
W95Psp May 21, 2024
f85d218
refactor: kill strings in unreachable
W95Psp May 21, 2024
1dfd5d5
inlining
W95Psp May 23, 2024
a2bc1fb
kill two debug_asserts that cause trait issues in Charon
W95Psp May 24, 2024
3b3164c
commit C snapshot
W95Psp May 24, 2024
2180172
wip
W95Psp May 24, 2024
d712003
wip
W95Psp May 24, 2024
8511062
update sha3 api to hide internal state
franziskuskiefer May 24, 2024
886f965
add dummy intrinsics file for extraction of avx2
franziskuskiefer May 20, 2024
35d28a0
F* extraction of poly-avx2
franziskuskiefer May 20, 2024
5232b29
move neon hash functions into sha3 for ml-kem
franziskuskiefer May 22, 2024
ec4959c
inline hashing and avx2 hax extraction
franziskuskiefer May 22, 2024
b46607b
changes for F* extraction
franziskuskiefer May 22, 2024
3239743
updated F*
franziskuskiefer May 22, 2024
aca1b2a
portable rejection sampling for neon for now
franziskuskiefer May 23, 2024
6f0495d
feat(ci/F*): get rid of timestamps in patch files
W95Psp May 22, 2024
4e777f8
chore(kyber/fstar): update patches following hacspec/hax#676
W95Psp May 22, 2024
67618a4
Added scaffolding to get work started on ML-DSA. (#289)
xvzcf May 23, 2024
6ca5766
wip
W95Psp May 24, 2024
9226a0b
fixup
franziskuskiefer May 26, 2024
4411d7f
actually inline neon/avx2
franziskuskiefer May 26, 2024
8738c18
fixups and allow portable only extraction
franziskuskiefer May 26, 2024
4a445d0
fixup avx2
franziskuskiefer May 26, 2024
b13f986
fixup neon
franziskuskiefer May 26, 2024
a84d517
fixups for avx2 extraction
franziskuskiefer May 27, 2024
e78b2d7
Merge branch 'ml-kem-c-extraction-inlining-crates' of github.com:crys…
franziskuskiefer May 27, 2024
0b6dd0f
allow extracting everywhere
franziskuskiefer May 27, 2024
66c2e55
fixup c.sh
franziskuskiefer May 27, 2024
03461b6
add new config c2.yml
franziskuskiefer May 27, 2024
cd1de6b
externalize intrinsics into a `libcrux-intrinsics` crate
W95Psp May 28, 2024
d950056
fixup neon; add new extraction (broken)
franziskuskiefer May 28, 2024
6127191
update extraction
franziskuskiefer May 28, 2024
4c90f8e
externalize more intrinsics
W95Psp May 28, 2024
2a53025
fix: assert_debug, expect 32, not 16
W95Psp May 28, 2024
40bb6e8
kill `avx2/debug` module
W95Psp May 28, 2024
1cbc9a7
kill stale intrinsics modules
W95Psp May 28, 2024
15a7b29
check in `c` extractions
W95Psp May 28, 2024
f75fee4
Revert "check in `c` extractions"
W95Psp May 28, 2024
c7724dd
check in a snapshot of the `c` folder in `c-snapshot`
W95Psp May 28, 2024
3015d42
made sha3 safe
karthikbhargavan May 28, 2024
99909f5
add a script for extracting c
W95Psp May 28, 2024
49ca8f7
work around eurydice issues in sha3
franziskuskiefer May 28, 2024
c8a38e2
use config to extract
franziskuskiefer May 28, 2024
12fb7ff
fixup sha3
franziskuskiefer May 28, 2024
e12075e
eurydice workaround
franziskuskiefer May 28, 2024
ed89380
A config to do separate bundling
msprotz May 28, 2024
fe31398
testing one wrapper before going deeper
karthikbhargavan May 29, 2024
39e5d0d
testing one wrapper before going deeper
karthikbhargavan May 29, 2024
04843f2
re-extract into `c-snapshot/`
W95Psp May 29, 2024
540556f
c.yaml: replicate neon options for avx2
W95Psp May 29, 2024
7536322
sha3 c.yaml
franziskuskiefer May 29, 2024
88b6c42
Fix bundling
msprotz May 29, 2024
c243e01
c code sha3 with build
franziskuskiefer May 29, 2024
8515d8b
drop chunk
franziskuskiefer May 29, 2024
3cc1486
Dummies to make compilation work without hand edits
msprotz May 29, 2024
38dd756
missing intrinsics and functions
karthikbhargavan May 29, 2024
cd9ad91
patch for intrinsics ordering
karthikbhargavan May 29, 2024
29e23ec
fixup build
franziskuskiefer May 29, 2024
49b7904
add sha3 test
franziskuskiefer May 29, 2024
9df56fd
fix intrinsics
franziskuskiefer May 29, 2024
165e74e
refresh
msprotz May 29, 2024
798f69c
refresh source
msprotz May 29, 2024
2c373e6
And fix up the config for neon, too
msprotz May 29, 2024
ba2cbcb
Single run of Eurydice -- much easier
msprotz May 30, 2024
1366394
Merge branch 'lucas/extract-intrinsics' of github.com:cryspen/libcrux…
msprotz May 30, 2024
98edcf1
WIP simplifying build
msprotz May 30, 2024
b015f6f
Snapshot of the C code now lives in libcrux-ml-kem/c
msprotz May 30, 2024
0382b43
Regenerate the code on AVX2... and update config so that AVX2 code-ge…
msprotz May 30, 2024
bc9b3ac
Restore some config tweaks, reinstate the build of the sha3 test
msprotz May 30, 2024
af4e0cf
Tweaks, good progress on whole mlkem compilation
msprotz May 30, 2024
a0b0b93
Work around aeneasverif/eurydice#12
msprotz May 30, 2024
381a0a4
Use a two-dimensional array now that it's supported
msprotz May 30, 2024
0ac31ce
It compiles
msprotz May 30, 2024
d89558c
zomg
msprotz May 30, 2024
3b85af7
updated C build and mlkem tests
franziskuskiefer May 30, 2024
8b112db
Regenerate the code with revised general-purpose slice index operation
msprotz May 30, 2024
34f92e1
With loop unrolling
msprotz May 30, 2024
5b5b73e
benches
karthikbhargavan May 30, 2024
08aee25
bench
karthikbhargavan May 30, 2024
746f356
benches
karthikbhargavan May 30, 2024
87a5209
a snapshot of hand edits to speed up sha3
karthikbhargavan May 30, 2024
8f297df
wip bundling
msprotz May 30, 2024
5bc3ad6
Regenerated C code with fresh config
msprotz May 31, 2024
4efb699
Revised config
msprotz May 31, 2024
a91effc
Refresh
msprotz May 31, 2024
a9888dd
clang-format
franziskuskiefer May 31, 2024
67d7b0f
update boringssl benchmarks
franziskuskiefer May 31, 2024
a62ee49
readme for C code
franziskuskiefer May 31, 2024
95c3ab5
inline sha3
franziskuskiefer Jun 3, 2024
ab7694b
Merge branch 'dev' into franziskus/mlkem-c
franziskuskiefer Jun 3, 2024
be73700
rustfmt
franziskuskiefer Jun 3, 2024
a5394b9
fixup ml-dsa again
franziskuskiefer Jun 3, 2024
cd246f5
Add more instructions to c README.md
franziskuskiefer Jun 3, 2024
295f500
drop c experiments
franziskuskiefer Jun 4, 2024
09f13ef
drop more files
franziskuskiefer Jun 4, 2024
8246792
keep supporting old rustc
franziskuskiefer Jun 4, 2024
bc3565b
Merge branch 'dev' into franziskus/mlkem-c
franziskuskiefer Jun 4, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
fixup avx2
  • Loading branch information
franziskuskiefer committed May 26, 2024

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
commit 4a445d09751a0ac096cb36976400ffa5b9e7afd0
3 changes: 2 additions & 1 deletion libcrux-ml-kem/Cargo.toml
Original file line number Diff line number Diff line change
@@ -20,10 +20,11 @@ libcrux-sha3 = { version = "0.0.2-pre.2", path = "../libcrux-sha3" }
hax-lib = { git = "https://github.com/hacspec/hax/" }

[features]
default = []
default = ["std"]
simd128 = ["libcrux-sha3/simd128"]
simd256 = ["libcrux-sha3/simd256"]
tests = [] # Test utilities. DO NOT USE.
std = []

[dev-dependencies]
rand = { version = "0.8" }
3 changes: 3 additions & 0 deletions libcrux-ml-kem/src/lib.rs
Original file line number Diff line number Diff line change
@@ -41,6 +41,9 @@
#![warn(rust_2018_idioms, unused_lifetimes, unused_qualifications)]
#![allow(clippy::needless_range_loop)]

#[cfg(feature = "std")]
extern crate std;

pub(crate) mod hax_utils;

// This module is declared here since otherwise, hax reports the following error:
29 changes: 8 additions & 21 deletions libcrux-ml-kem/src/vector.rs
Original file line number Diff line number Diff line change
@@ -29,7 +29,7 @@ pub(crate) use neon::SIMD128Vector;
#[cfg(feature = "simd256")]
mod avx2;
#[cfg(feature = "simd256")]
pub(crate) use libcrux_polynomials_avx2::SIMD256Vector;
pub(crate) use avx2::SIMD256Vector;

/// Values having this type hold a representative 'x' of the Kyber field.
/// We use 'fe' as a shorthand for this type.
@@ -212,11 +212,6 @@ fn zero() -> PortableVector {
}
}

#[inline(always)]
fn to_i16_array(v: PortableVector) -> [i16; FIELD_ELEMENTS_IN_VECTOR] {
v.elements
}

#[inline(always)]
fn from_i16_array(array: &[i16]) -> PortableVector {
PortableVector {
@@ -269,14 +264,14 @@ fn shift_right<const SHIFT_BY: i32>(mut v: PortableVector) -> PortableVector {
v
}

#[inline(always)]
fn shift_left<const SHIFT_BY: i32>(mut lhs: PortableVector) -> PortableVector {
for i in 0..FIELD_ELEMENTS_IN_VECTOR {
lhs.elements[i] = lhs.elements[i] << SHIFT_BY;
}
// #[inline(always)]
// fn shift_left<const SHIFT_BY: i32>(mut lhs: PortableVector) -> PortableVector {
// for i in 0..FIELD_ELEMENTS_IN_VECTOR {
// lhs.elements[i] = lhs.elements[i] << SHIFT_BY;
// }

lhs
}
// lhs
// }

#[inline(always)]
fn cond_subtract_3329(mut v: PortableVector) -> PortableVector {
@@ -1077,10 +1072,6 @@ impl Operations for PortableVector {
zero()
}

fn to_i16_array(v: Self) -> [i16; FIELD_ELEMENTS_IN_VECTOR] {
to_i16_array(v)
}

fn from_i16_array(array: &[i16]) -> Self {
from_i16_array(array)
}
@@ -1105,10 +1096,6 @@ impl Operations for PortableVector {
shift_right::<{ SHIFT_BY }>(v)
}

fn shift_left<const SHIFT_BY: i32>(v: Self) -> Self {
shift_left::<{ SHIFT_BY }>(v)
}

fn cond_subtract_3329(v: Self) -> Self {
cond_subtract_3329(v)
}
14 changes: 2 additions & 12 deletions libcrux-ml-kem/src/vector/avx2.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use libcrux_traits::Operations;
use super::traits::Operations;

#[cfg(test)]
mod debug;
@@ -7,7 +7,7 @@ mod debug;
#[cfg(not(any(eurydice, hax)))]
mod intrinsics;
#[cfg(not(any(eurydice, hax)))]
pub(crate) use crate::intrinsics::*;
pub(crate) use intrinsics::*;
#[cfg(any(eurydice, hax))]
mod intrinsics_extraction;
#[cfg(any(eurydice, hax))]
@@ -51,10 +51,6 @@ impl Operations for SIMD256Vector {
zero()
}

fn to_i16_array(v: Self) -> [i16; 16] {
to_i16_array(v)
}

fn from_i16_array(array: &[i16]) -> Self {
from_i16_array(array)
}
@@ -89,12 +85,6 @@ impl Operations for SIMD256Vector {
}
}

fn shift_left<const SHIFT_BY: i32>(vector: Self) -> Self {
Self {
elements: arithmetic::shift_left::<{ SHIFT_BY }>(vector.elements),
}
}

fn cond_subtract_3329(vector: Self) -> Self {
Self {
elements: arithmetic::cond_subtract_3329(vector.elements),
13 changes: 7 additions & 6 deletions libcrux-ml-kem/src/vector/avx2/arithmetic.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
use crate::*;
use libcrux_traits::{FIELD_MODULUS, INVERSE_OF_MODULUS_MOD_MONTGOMERY_R};
use crate::vector::{traits::INVERSE_OF_MODULUS_MOD_MONTGOMERY_R, FIELD_MODULUS};

use super::*;

#[inline(always)]
pub(crate) fn add(lhs: Vec256, rhs: Vec256) -> Vec256 {
@@ -26,10 +27,10 @@ pub(crate) fn shift_right<const SHIFT_BY: i32>(vector: Vec256) -> Vec256 {
mm256_srai_epi16::<{ SHIFT_BY }>(vector)
}

#[inline(always)]
pub(crate) fn shift_left<const SHIFT_BY: i32>(vector: Vec256) -> Vec256 {
mm256_slli_epi16::<{ SHIFT_BY }>(vector)
}
// #[inline(always)]
// pub(crate) fn shift_left<const SHIFT_BY: i32>(vector: Vec256) -> Vec256 {
// mm256_slli_epi16::<{ SHIFT_BY }>(vector)
// }

#[inline(always)]
pub(crate) fn cond_subtract_3329(vector: Vec256) -> Vec256 {
5 changes: 3 additions & 2 deletions libcrux-ml-kem/src/vector/avx2/compress.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
use crate::*;
use libcrux_traits::FIELD_MODULUS;
use crate::vector::FIELD_MODULUS;

use super::*;

// Multiply the 32-bit numbers contained in |lhs| and |rhs|, and store only
// the upper 32 bits of the resulting product.
4 changes: 4 additions & 0 deletions libcrux-ml-kem/src/vector/avx2/debug.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
#![allow(unsafe_code)]

use std::println;

#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
4 changes: 4 additions & 0 deletions libcrux-ml-kem/src/vector/avx2/intrinsics.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
// TODO: Move this into a separate intrinsics crate.

#![allow(unsafe_code)]

#[cfg(target_arch = "x86")]
pub(crate) use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
2 changes: 1 addition & 1 deletion libcrux-ml-kem/src/vector/avx2/ntt.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use crate::*;
use super::*;

#[inline(always)]
pub(crate) fn ntt_layer_1_step(
2 changes: 1 addition & 1 deletion libcrux-ml-kem/src/vector/avx2/portable.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
pub use libcrux_traits::FIELD_ELEMENTS_IN_VECTOR;
use crate::vector::FIELD_ELEMENTS_IN_VECTOR;

type FieldElement = i16;

5 changes: 2 additions & 3 deletions libcrux-ml-kem/src/vector/avx2/sampling.rs
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
use crate::{
use super::{
super::traits::FIELD_MODULUS,
serialize::{deserialize_12, serialize_1},
*,
};

use libcrux_traits::FIELD_MODULUS;

const REJECTION_SAMPLE_SHUFFLE_TABLE: [[u8; 16]; 256] = [
[
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
6 changes: 3 additions & 3 deletions libcrux-ml-kem/src/vector/avx2/serialize.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use crate::*;
use super::*;

#[inline(always)]
pub(crate) fn serialize_1(vector: Vec256) -> [u8; 2] {
@@ -388,7 +388,7 @@ pub(crate) fn deserialize_10(bytes: &[u8]) -> Vec256 {

#[inline(always)]
pub(crate) fn serialize_11(vector: Vec256) -> [u8; 22] {
let input = portable::from_i16_array(crate::to_i16_array(SIMD256Vector { elements: vector }));
let input = portable::from_i16_array(to_i16_array(SIMD256Vector { elements: vector }));

portable::serialize_11(input)
}
@@ -397,7 +397,7 @@ pub(crate) fn serialize_11(vector: Vec256) -> [u8; 22] {
pub(crate) fn deserialize_11(bytes: &[u8]) -> Vec256 {
let output = portable::deserialize_11(bytes);

crate::from_i16_array(&portable::to_i16_array(output)).elements
from_i16_array(&portable::to_i16_array(output)).elements
}

#[inline(always)]
8 changes: 0 additions & 8 deletions libcrux-ml-kem/src/vector/neon.rs
Original file line number Diff line number Diff line change
@@ -18,10 +18,6 @@ impl Operations for SIMD128Vector {
ZERO()
}

fn to_i16_array(v: Self) -> [i16; 16] {
to_i16_array(v)
}

fn from_i16_array(array: &[i16]) -> Self {
from_i16_array(array)
}
@@ -46,10 +42,6 @@ impl Operations for SIMD128Vector {
shift_right::<SHIFT_BY>(v)
}

fn shift_left<const SHIFT_BY: i32>(v: Self) -> Self {
shift_left::<SHIFT_BY>(v)
}

fn cond_subtract_3329(v: Self) -> Self {
cond_subtract_3329(v)
}
12 changes: 6 additions & 6 deletions libcrux-ml-kem/src/vector/neon/simd128ops.rs
Original file line number Diff line number Diff line change
@@ -73,12 +73,12 @@ pub(crate) fn shift_right<const SHIFT_BY: i32>(mut v: SIMD128Vector) -> SIMD128V
v
}

#[inline(always)]
pub(crate) fn shift_left<const SHIFT_BY: i32>(mut lhs: SIMD128Vector) -> SIMD128Vector {
lhs.low = _vshlq_n_s16::<SHIFT_BY>(lhs.low);
lhs.high = _vshlq_n_s16::<SHIFT_BY>(lhs.high);
lhs
}
// #[inline(always)]
// pub(crate) fn shift_left<const SHIFT_BY: i32>(mut lhs: SIMD128Vector) -> SIMD128Vector {
// lhs.low = _vshlq_n_s16::<SHIFT_BY>(lhs.low);
// lhs.high = _vshlq_n_s16::<SHIFT_BY>(lhs.high);
// lhs
// }

#[inline(always)]
pub(crate) fn cond_subtract_3329(mut v: SIMD128Vector) -> SIMD128Vector {
3 changes: 1 addition & 2 deletions libcrux-ml-kem/src/vector/traits.rs
Original file line number Diff line number Diff line change
@@ -7,7 +7,6 @@ pub(crate) trait Operations: Copy + Clone {
#[allow(non_snake_case)]
fn ZERO() -> Self;

fn to_i16_array(v: Self) -> [i16; 16];
fn from_i16_array(array: &[i16]) -> Self;

// Basic arithmetic
@@ -18,7 +17,7 @@ pub(crate) trait Operations: Copy + Clone {
// Bitwise operations
fn bitwise_and_with_constant(v: Self, c: i16) -> Self;
fn shift_right<const SHIFT_BY: i32>(v: Self) -> Self;
fn shift_left<const SHIFT_BY: i32>(v: Self) -> Self;
// fn shift_left<const SHIFT_BY: i32>(v: Self) -> Self;

// Modular operations
fn cond_subtract_3329(v: Self) -> Self;
1 change: 1 addition & 0 deletions libcrux-sha3/src/lib.rs
Original file line number Diff line number Diff line change
@@ -419,6 +419,7 @@ pub mod neon {
}
#[cfg(all(feature = "simd128", target_arch = "aarch64"))]
type KeccakState2Internal = KeccakState<2, core::arch::aarch64::uint64x2_t>;
#[allow(dead_code)]
#[cfg(not(all(feature = "simd128", target_arch = "aarch64")))]
pub struct KeccakState2 {
state: [crate::portable::KeccakState1; 2],