diff --git a/book/src/SUMMARY.md b/book/src/SUMMARY.md index 1ae5454019..f30839da67 100644 --- a/book/src/SUMMARY.md +++ b/book/src/SUMMARY.md @@ -12,7 +12,7 @@ - [Multiplying Polynomials](./fundamentals/zkbook_multiplying_polynomials.md) - [Fast Fourier Transform](./fundamentals/zkbook_fft.md) -# Cryptographic tools +# Cryptographic Tools - [Commitments](./fundamentals/zkbook_commitment.md) - [Polynomial Commitments](./plonk/polynomial_commitments.md) @@ -27,56 +27,46 @@ - [Half Gate](./fundamentals/zkbook_2pc/halfgate.md) - [Full Description](./fundamentals/zkbook_2pc/fulldesc.md) - [Fixed-Key-AES Hashes](./fundamentals/zkbook_2pc/fkaes.md) - - [Oblivious Transfer](./fundamentals/zkbook_2pc/ot.md) - [Base OT](./fundamentals/zkbook_2pc/baseot.md) - [OT Extension](./fundamentals/zkbook_2pc/ote.md) - - [Full Protocol](./fundamentals/zkbook_2pc/2pc.md) - -# Proof systems - -- [Overview](./fundamentals/proof_systems.md) -- [zk-SNARKs](./fundamentals/zkbook_plonk.md) -- [Custom constraints](./fundamentals/custom_constraints.md) +- [Proof Systems](./fundamentals/proof_systems.md) + - [zk-SNARKs](./fundamentals/zkbook_plonk.md) # Background on PLONK - [Overview](./plonk/overview.md) -- [Glossary](./plonk/glossary.md) + - [Glossary](./plonk/glossary.md) - [Domain](./plonk/domain.md) - [Lagrange Basis in Multiplicative Subgroups](./plonk/lagrange.md) -- [Non-interaction with Fiat-Shamir](./plonk/fiat_shamir.md) +- [Non-Interactivity via Fiat-Shamir](./plonk/fiat_shamir.md) - [Plookup](./plonk/plookup.md) - [Maller's Optimization](./plonk/maller.md) +- [Zero-Column Approach to Zero-Knowledge](./plonk/zkpm.md) # Kimchi - [Overview](./kimchi/overview.md) - - [Arguments](./kimchi/arguments.md) - - [Custom Gates](./kimchi/gates.md) - - [Permutation](./kimchi/permut.md) - - [Lookup](./kimchi/lookup.md) +- [Arguments](./kimchi/arguments.md) +- [Final Check](./kimchi/final_check.md) +- [Maller's Optimization for Kimchi](./kimchi/maller_15.md) +- [Lookup Tables](./kimchi/lookup.md) + - [Extended Lookup Tables](./kimchi/extended-lookup-tables.md) +- [Custom Constraints](./kimchi/custom_constraints.md) +- [Custom Gates](./kimchi/gates.md) + - [Foreign Field Addition](./kimchi/foreign_field_add.md) + - [Foreign Field Multiplication](./kimchi/foreign_field_mul.md) + - [Keccak](./kimchi/keccak.md) + # Pickles & Inductive Proof Systems - [Overview](./fundamentals/zkbook_ips.md) - [Accumulation](./pickles/accumulation.md) - [Deferred Computation](./pickles/deferred.md) -- [Passthough & Me-Only](./pickles/passthrough.md) - -# RFCs - -- [RFC 0: Alternative Zero-Knowledge](./plonk/zkpm.md) -- [RFC 1: Final Check](./plonk/final_check.md) -- [RFC 2: Maller's Optimization for Kimchi](./plonk/maller_15.md) -- [RFC 3: Plookup Integration in Kimchi](./rfcs/3-lookup.md) -- [RFC 4: Extended Lookup Tables](./rfcs/extended-lookup-tables.md) -- [RFC 5: Foreign Field Addition](./rfcs/foreign_field_add.md) -- [RFC 6: Foreign Field Multiplication](./rfcs/foreign_field_mul.md) -- [RFC 7: Keccak](./rfcs/keccak.md) -# Specifications +# Technical Specifications - [Poseidon hash](./specs/poseidon.md) - [Polynomial Commitment](./specs/poly-commitment.md) diff --git a/book/src/fundamentals/custom_constraints.md b/book/src/fundamentals/custom_constraints.md index e2c7256d68..8773465d1c 100644 --- a/book/src/fundamentals/custom_constraints.md +++ b/book/src/fundamentals/custom_constraints.md @@ -1,58 +1 @@ -This section explains how to design and add a custom constraint to our `proof-systems` library. - -PLONK is an AIOP. 
That is, it is a protocol in which the prover sends polynomials as messages and the verifier sends random challenges, and then evaluates the prover's polynomials and performs some final checks on the outputs. - -PLONK is very flexible. It can be customized with constraints specific to computations of interest. For example, in Mina, we use a PLONK configuration called kimchi that has custom constraints for poseidon hashing, doing elliptic curve operations, and more. - -A "PLONK configuration" specifies -- The set of types of constraints that you would like to be able to enforce. We will describe below how these types of constraints are specified. -- A number of "eq-able" columns `W` -- A number of "advice" columns `A` - -Under such configuration, a circuit is specified by -- A number of rows `n` -- A vector `cs` of constraint-types of length `n`. I.e., a vector that specifies, for each row, which types of constraints should be enforced on that row. -- A vector `eqs : Vec<(Position, Position)>` of equalities to enforce, where `struct Position { row: usize, column: usize }`. E.g., if the pair `(Position { row: 0, col: 8 }, Position { row: 10, col: 2 })` is in `eqs`, then the circuit is saying the entries in those two positions should be equal, or in other words that they refer to the same value. This is where the distinction between "eq-able" and "advice" columns comes in. The `column` field of a position in the `eqs` array can only refer to one of the first `W` columns. Equalities cannot be enforced on entries in the `A` columns after that. - -Then, given such a circuit, PLONK lets you produce proofs for the statement - -> I know `W + A` "column vectors" of field elements `vs: [Vec; W + A]` such that for each row index `i < n`, the constraint of type `cs[i]` holds on the values `[vs[0][i], ..., vs[W+A - 1][i], vs[0][i+1], ..., vs[W+A - 1][i+1]` and all the equalities in `eqs` hold. I.e., for `(p1, p2)` in `eqs` we have `vs[p1.col][p1.row] == vs[p2.col][p2.row]`. So, a constraint can check the values in two adjacent rows. - -## Specifying a constraint - -Mathematically speaking, a constraint is a multivariate polynomial over the variables $c_{\mathsf{Curr},i}, \dots, v_{\mathsf{Curr}, W+A-1}, v_{\mathsf{Next}, 0}, \dots, v_{\mathsf{Next}, W+A-1}$. In other words, there is one variable corresponding to the value of each column in the "current row" and one variable correspond to the value of each column in the "next row". - -In Rust, $v_{r, i}$ is written `E::cell(Column::Witness(i), r)`. So, for example, the variable $v_{\mathsf{Next}, 3}$ is written -`E::cell(Column::Witness(3), CurrOrNext::Next)`. - - - - let w = |i| v(Column::Witness(i)); -Let's - -## Defining a PLONK configuration - -The art in proof systems comes from knowing how to design a PLONK configuration to ensure maximal efficiency for the sorts of computations you are trying to prove. That is, how to choose the numbers of columns `W` and `A`, and how to define the set of constraint types. - -Let's describe the trade-offs involved here. - -The majority of the proving time for the PLONK prover is in -- committing to the `W + A` column polynomials, which have length equal to the number of rows `n` -- committing to the "permutation accumulator polynomial, which has length `n`. -- committing to the quotient polynomial, which reduces to computing `max(k, W)` MSMs of size `n`, where `k` is the max degree of a constraint. -- performing the commitment opening proof, which is mostly dependent on the number of rows `n`. 
- -So all in all, the proving time is approximately equal to the time to perform `W + A + 1 + max(k - 1, W)` MSMs of size `n`, plus the cost of an opening proof for polynomials of degree `n - 1`. - -and maybe -- computing the combined constraint polynomial, which has degree `k * n` where `k` is the maximum degree of a constraint - -- Increasing `W` and `A` increase proof size, and they potentially impact the prover-time as the prover must compute polynomial commitments to each column, and computing a polynomial commitment corresponds to doing one MSM (multi-scalar multiplication, also called a multi-exponentiation.) - - However, often increasing the number of columns allows you to decrease the number of rows required for a given computation. For example, if you can perform one Poseidon hash in 36 rows with 5 total columns, then you can also perform it in 12 (= 36 / 3) rows with 15 (= 5 * 3) total columns. - - **Decreasing the number of rows (even while keeping the total number of table entries the same) is desirable because it reduces the cost of the polynomial commitment opening proof, which is dominated by a factor linear in the number of rows, and barely depends on the number of columns.** - - Increasing the number of columns also increases verifier time, as the verifier must perform one scalar-multiplication and one hash per column. Proof length is also affected by a larger number of columns, as more polynomials need to be committed and sent along to the verifier. - -There is typically some interplay between these +# Custom constraints diff --git a/book/src/fundamentals/proof_systems.md b/book/src/fundamentals/proof_systems.md index 84b5f8fe97..d281048974 100644 --- a/book/src/fundamentals/proof_systems.md +++ b/book/src/fundamentals/proof_systems.md @@ -1,31 +1,29 @@ -# Overview +# Proof Systems Design Overview Many modern proof systems (and I think all that are in use) are constructed according to the following recipe. 1. You start out with a class of computations. 2. You devise a way to *arithmetize* those computations. That is, to express your computation as a statement about polynomials. - + More specifically, you describe what is often called an "algebraic interactive oracle proof" (AIOP) that encodes your computation. An AIOP is a protocol describing an interaction between a prover and a verifier, in which the prover sends the verifier some "polynomial oracles" (basically a black box function that given a point evaluates a polynomial at that point), the verifier sends the prover random challenges, and at the end, the verifier queries the prover's polynomials at points of its choosing and makes a decision as to whether it has been satisfied by the proof. 3. An AIOP is an imagined interaction between parties. It is an abstract description of the protocol that will be "compiled" into a SNARK. There are several "non-realistic" aspects about it. One is that the prover sends the verifier black-box polynomials that the verifier can evaluate. These polynomials have degree comparable to the size of the computation being verified. If we implemented these "polynomial oracles" by having the prover really send the $O(n)$ size polynomials (say by sending all their coefficients), then we would not have a zk-SNARK at all, since the verifier would have to read this linearly sized polynomial so we would lose succinctness, and the polynomials would not be black-box functions, so we may lose zero-knowledge. 
-    
+    Instead, when we concretely instantiate the AIOP, we have the prover send constant-sized, hiding *polynomial commitments*. Then, in the phase of the AIOP where the verifier queries the polynomials, the prover sends an *opening proof* for the polynomial commitments which the verifier can check, thus simulating the evaluation of the prover's polynomials at the verifier's chosen points.
-    
+    So this is the next step of making a SNARK: instantiating the AIOP with a polynomial commitment scheme of one's choosing. There are several choices here and these affect the properties of the SNARK you are constructing, as the SNARK will inherit efficiency and setup properties of the polynomial commitment scheme used.
-4. An AIOP describes an interactive protocol between the verifier and the prover. In reality, typically, we also want our proofs to be non-interactive. 
-    
+4. An AIOP describes an interactive protocol between the verifier and the prover. In practice, however, we typically also want our proofs to be non-interactive.
+    
     This is accomplished by what is called the [Fiat--Shamir transformation](). The basic idea is this: all that the verifier is doing is sampling random values to send to the prover. Instead, to generate a "random" value, the prover simulates the verifier by hashing its messages. The resulting hash is used as the "random" challenge.
-    
+    
     At this point we have a fully non-interactive proof. Let's review our steps.
-    
+    
     1. Start with a computation.
-    
+    
     2. Translate the computation into a statement about polynomials and design a corresponding AIOP.
-    
-    3. Compile the AIOP into an interactive protocol by having the prover send hiding polynomial commitments instead of polynomial oracles.
-    
-    4. Get rid of the verifier-interaction by replacing it with a hash function. I.e., apply the Fiat--Shamir transform.
+    3. Compile the AIOP into an interactive protocol by having the prover send hiding polynomial commitments instead of polynomial oracles.
+    4. Get rid of the verifier interaction by replacing it with a hash function. I.e., apply the Fiat--Shamir transform.
diff --git a/book/src/kimchi/custom_constraints.md b/book/src/kimchi/custom_constraints.md
new file mode 100644
index 0000000000..e2c7256d68
--- /dev/null
+++ b/book/src/kimchi/custom_constraints.md
@@ -0,0 +1,58 @@
+This section explains how to design and add a custom constraint to our `proof-systems` library.
+
+PLONK is an AIOP (algebraic interactive oracle proof). That is, it is a protocol in which the prover sends polynomials as messages, the verifier sends random challenges, and at the end the verifier evaluates the prover's polynomials and performs some final checks on the outputs.
+
+PLONK is very flexible. It can be customized with constraints specific to computations of interest. For example, in Mina, we use a PLONK configuration called kimchi that has custom constraints for Poseidon hashing, elliptic curve operations, and more.
+
+A "PLONK configuration" specifies
+- The set of types of constraints that you would like to be able to enforce. We will describe below how these types of constraints are specified.
+- A number of "eq-able" columns `W`
+- A number of "advice" columns `A`
+
+Under such a configuration, a circuit is specified by
+- A number of rows `n`
+- A vector `cs` of constraint-types of length `n`. I.e., a vector that specifies, for each row, which types of constraints should be enforced on that row.
+- A vector `eqs : Vec<(Position, Position)>` of equalities to enforce, where `struct Position { row: usize, column: usize }`. E.g., if the pair `(Position { row: 0, col: 8 }, Position { row: 10, col: 2 })` is in `eqs`, then the circuit is saying the entries in those two positions should be equal, or in other words that they refer to the same value. This is where the distinction between "eq-able" and "advice" columns comes in. The `column` field of a position in the `eqs` array can only refer to one of the first `W` columns. Equalities cannot be enforced on entries in the `A` columns after that.
+
+Then, given such a circuit, PLONK lets you produce proofs for the statement
+
+> I know `W + A` "column vectors" of field elements `vs: [Vec; W + A]` such that for each row index `i < n`, the constraint of type `cs[i]` holds on the values `[vs[0][i], ..., vs[W+A - 1][i], vs[0][i+1], ..., vs[W+A - 1][i+1]]` and all the equalities in `eqs` hold. I.e., for `(p1, p2)` in `eqs` we have `vs[p1.col][p1.row] == vs[p2.col][p2.row]`. So, a constraint can check the values in two adjacent rows.
+
+## Specifying a constraint
+
+Mathematically speaking, a constraint is a multivariate polynomial over the variables $v_{\mathsf{Curr},0}, \dots, v_{\mathsf{Curr}, W+A-1}, v_{\mathsf{Next}, 0}, \dots, v_{\mathsf{Next}, W+A-1}$. In other words, there is one variable corresponding to the value of each column in the "current row" and one variable corresponding to the value of each column in the "next row".
+
+In Rust, $v_{r, i}$ is written `E::cell(Column::Witness(i), r)`. So, for example, the variable $v_{\mathsf{Next}, 3}$ is written
+`E::cell(Column::Witness(3), CurrOrNext::Next)`.
+
+For example, one can define a shorthand for the witness variables of the current row:
+
+    let v = |col| E::cell(col, CurrOrNext::Curr);
+    let w = |i| v(Column::Witness(i));
+
+## Defining a PLONK configuration
+
+The art in proof systems comes from knowing how to design a PLONK configuration to ensure maximal efficiency for the sorts of computations you are trying to prove. That is, how to choose the numbers of columns `W` and `A`, and how to define the set of constraint types.
+
+Let's describe the trade-offs involved here.
+
+The majority of the proving time for the PLONK prover is in
+- committing to the `W + A` column polynomials, which have length equal to the number of rows `n`
+- committing to the "permutation accumulator" polynomial, which has length `n`
+- committing to the quotient polynomial, which reduces to computing `max(k, W)` MSMs of size `n`, where `k` is the max degree of a constraint
+- performing the commitment opening proof, which is mostly dependent on the number of rows `n`
+
+So all in all, the proving time is approximately equal to the time to perform `W + A + 1 + max(k - 1, W)` MSMs of size `n`, plus the cost of an opening proof for polynomials of degree `n - 1`, and possibly also
+- computing the combined constraint polynomial, which has degree `k * n` where `k` is the maximum degree of a constraint
+
+- Increasing `W` and `A` increases proof size, and potentially impacts prover time, as the prover must compute a polynomial commitment for each column, and computing a polynomial commitment corresponds to doing one MSM (multi-scalar multiplication, also called a multi-exponentiation).
+
+  However, often increasing the number of columns allows you to decrease the number of rows required for a given computation. For example, if you can perform one Poseidon hash in 36 rows with 5 total columns, then you can also perform it in 12 (= 36 / 3) rows with 15 (= 5 * 3) total columns.
+
+  **Decreasing the number of rows (even while keeping the total number of table entries the same) is desirable because it reduces the cost of the polynomial commitment opening proof, which is dominated by a factor linear in the number of rows, and barely depends on the number of columns.**
+
+  Increasing the number of columns also increases verifier time, as the verifier must perform one scalar multiplication and one hash per column. Proof length is also affected by a larger number of columns, as more polynomials need to be committed and sent along to the verifier.
+
+There is typically some interplay between these considerations, and the right balance depends on the computations you care about.
diff --git a/book/src/rfcs/extended-lookup-tables.md b/book/src/kimchi/extended-lookup-tables.md
similarity index 99%
rename from book/src/rfcs/extended-lookup-tables.md
rename to book/src/kimchi/extended-lookup-tables.md
index ed8d29bf7c..a2205ddebd 100644
--- a/book/src/rfcs/extended-lookup-tables.md
+++ b/book/src/kimchi/extended-lookup-tables.md
@@ -1,6 +1,6 @@
-# RFC: Extended lookup tables
+# Extended lookup tables
 
-This RFC proposes an extension to our use of lookup tables using the PLOOKUP
+This (old) RFC proposes an extension to our use of lookup tables using the PLOOKUP
 multiset inclusion argument, so that values within lookup tables can be
 chosen after the constraint system for a circuit has been fixed.
 
diff --git a/book/src/plonk/final_check.md b/book/src/kimchi/final_check.md
similarity index 100%
rename from book/src/plonk/final_check.md
rename to book/src/kimchi/final_check.md
diff --git a/book/src/rfcs/foreign_field_add.md b/book/src/kimchi/foreign_field_add.md
similarity index 100%
rename from book/src/rfcs/foreign_field_add.md
rename to book/src/kimchi/foreign_field_add.md
diff --git a/book/src/rfcs/foreign_field_mul.md b/book/src/kimchi/foreign_field_mul.md
similarity index 100%
rename from book/src/rfcs/foreign_field_mul.md
rename to book/src/kimchi/foreign_field_mul.md
diff --git a/book/src/kimchi/keccak.md b/book/src/kimchi/keccak.md
new file mode 100644
index 0000000000..9c00ad6d7c
--- /dev/null
+++ b/book/src/kimchi/keccak.md
@@ -0,0 +1,168 @@
+# Keccak Gate
+
+The Keccak gadget comprises 3 circuit gates (`Xor16`, `Rot64`, and `Zero`).
+
+Keccak works with 64-bit words. The state is represented using a $5\times 5$ matrix
+of 64-bit words. Each compression step of Keccak consists of 24 rounds. Let us
+denote the state matrix by $A$ (indexing elements as $A[x,y]$), from which we derive
+further states in each round. Each round consists of the following 5 steps:
+
+$$
+\begin{align}
+C[x] &= A[x,0] \oplus A[x,1] \oplus A[x,2] \oplus A[x,3] \oplus A[x,4] \\
+D[x] &= C[x-1] \oplus ROT(C[x+1],1) \\
+E[x,y] &= A[x,y] \oplus D[x] \\
+B[y,2x+3y] &= ROT(E[x,y],\rho[x,y]) \\
+F[x,y] &= B[x,y] \oplus ((NOT\ B[x+1,y])\ AND\ B[x+2,y]) \\
+Fp[0,0] &= F[0,0] \oplus RC
+\end{align}
+$$
+
+for $0\leq x, y \leq 4$, where $\rho[x,y]$ is the rotation offset defined for Keccak.
+The values, extracted from the Keccak reference, are given in the table below.
+
+| | x = 3 | x = 4 | x = 0 | x = 1 | x = 2 |
+| ----- | ----- | ----- | ----- | ----- | ----- |
+| y = 2 | 153 | 231 | 3 | 10 | 171 |
+| y = 1 | 55 | 276 | 36 | 300 | 6 |
+| y = 0 | 28 | 91 | 0 | 1 | 190 |
+| y = 4 | 120 | 78 | 210 | 66 | 253 |
+| y = 3 | 21 | 136 | 105 | 45 | 15 |
+
+## Design Approach
+
+The atomic operations are XOR, ROT, NOT, AND. In the sections below, we will describe
+the gates for these operations. Below are some common approaches followed in their design.
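+
+Before turning to the gate-level approaches, here is a hypothetical reference sketch of one round of the permutation, written directly from the equations above. It is plain Rust on `u64` words, not the kimchi gadget or its witness generator; the names `keccak_round` and `RHO` are made up, and `RHO` is the offset table above already reduced mod 64:
+
+```rust
+/// One round of Keccak-f[1600], following the equations above.
+/// `rc` is the round constant for this round.
+fn keccak_round(a: &mut [[u64; 5]; 5], rc: u64) {
+    // rho offsets reduced mod 64, indexed as RHO[x][y]
+    const RHO: [[u32; 5]; 5] = [
+        [0, 36, 3, 41, 18],
+        [1, 44, 10, 45, 2],
+        [62, 6, 43, 15, 61],
+        [28, 55, 25, 21, 56],
+        [27, 20, 39, 8, 14],
+    ];
+    // theta: C[x], D[x], E[x,y]
+    let mut c = [0u64; 5];
+    for x in 0..5 {
+        c[x] = a[x][0] ^ a[x][1] ^ a[x][2] ^ a[x][3] ^ a[x][4];
+    }
+    let mut e = [[0u64; 5]; 5];
+    for x in 0..5 {
+        let d = c[(x + 4) % 5] ^ c[(x + 1) % 5].rotate_left(1);
+        for y in 0..5 {
+            e[x][y] = a[x][y] ^ d;
+        }
+    }
+    // rho and pi: B[y, 2x + 3y] = ROT(E[x, y], rho[x, y])
+    let mut b = [[0u64; 5]; 5];
+    for x in 0..5 {
+        for y in 0..5 {
+            b[y][(2 * x + 3 * y) % 5] = e[x][y].rotate_left(RHO[x][y]);
+        }
+    }
+    // chi: F[x, y] = B[x, y] xor ((not B[x+1, y]) and B[x+2, y])
+    for x in 0..5 {
+        for y in 0..5 {
+            a[x][y] = b[x][y] ^ (!b[(x + 1) % 5][y] & b[(x + 2) % 5][y]);
+        }
+    }
+    // iota: Fp[0,0] = F[0,0] xor RC
+    a[0][0] ^= rc;
+}
+```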
+
+To fit within 15 wires, we first decompose each word into its lower and upper 32-bit
+components. A gate for an atomic operation then works on those 32-bit components at a time.
+
+Before we describe the specific gate design approaches, below are some constraints in the
+Kimchi framework that dictated those approaches.
+* only 4 lookups per row
+* only the first 7 columns are available to the permutation polynomial
+
+## Rot64
+
+It is clear from the definition of the rotation gate that its constraints are complete
+(meaning that honest instances always satisfy the constraints). It is left to be proven
+that the proposal is sound. In this section, we will give a proof that as soon as we
+perform the range checks on the excess and shifted parts of the input, only one possible
+assignment satisfies the constraints. This means that there is no dishonest instance that
+can make the constraints pass. We will also give an example where one could find wrong
+rotation witnesses that would satisfy the constraints if we did not check the range.
+
+### Necessity of range checks
+
+First of all, we will illustrate the necessity of range checks with a simple example.
+For the sake of readability, we will use some toy lengths. In particular, let us
+assume that our words have 4 bits, meaning all of the elements between `0x0` and `0xF`.
+Next, we will work modulo $32$ as a toy stand-in for the native field (the integers
+modulo $32$ do not actually form a field, but this does not matter for the example).
+
+As we will later see, this choice of lengths is not enough to perform any 4-bit
+rotation, since the operations in the constraints would overflow the native field.
+Nonetheless, it will be sufficient for our example, where we will only rotate by 1 bit.
+
+Assume we want to rotate the word `0b1101` (meaning 13) by 1 bit to the left. This gives
+us the rotated word `0b1011` (meaning 11). The excess part of the word is `0b1`, whereas
+the shifted part corresponds to `0b1010`. We recall the constraints for the rotation gate:
+
+$$
+\begin{align*}
+word \cdot 2^{rot} &= excess \cdot 2^{len} + shifted \\
+rotated &= excess + shifted
+\end{align*}
+$$
+
+Applied to our example, this results in the following equations:
+
+$$
+\begin{align*}
+13 \cdot 2 &= excess \cdot 16 + shifted \\
+11 &= excess + shifted
+\end{align*}
+$$
+
+We can easily check that the proposed shifted value `0b1010=10` and excess value
+`0b1=1` satisfy the above constraints, because $26 = 1 \cdot 16 + 10$ and $11 = 1 + 10$.
+Now, the question is: _can we find other values for excess and shifted, such that their addition results in an incorrect rotated word?_
+
+The answer to this question is yes, due to __Diophantine equations__. We basically want to find $x,y$ such that $26 = x \cdot 16 + y \pmod{32}$. The solutions to this equation are:
+
+$$
+\begin{align*}
+\forall k \in [0 \ldots 31]: x &= k \ \land \\
+y &= 26 - 16 \cdot k
+\end{align*}
+$$
+
+We chose these word and field lengths to better understand the behaviour of the solution. Here, we can see two "classes" of evaluations.
+
+- If we choose an even $k$, then $y$ will have the following shape:
+  - $$26 - 16 \cdot (2 \cdot n) = 26 - 32n \equiv_{32} 26 $$
+  - Meaning, if $x = 2n$ then $y = 26$.
+
+- If, on the other hand, we choose an odd $k$, then $y$ will have the following shape instead:
+  - $$26 - 16 \cdot (2 \cdot n + 1) = 26 - 32n - 16 \equiv_{32} 26 - 16 = 10$$
+  - Meaning, if $x = 2n+1$ then $y = 10$.
+
+Thus, possible solutions to the Diophantine equation are:
+
+$$
+\begin{align*}
+x &= 0, 1, 2, 3, 4, 5 \ldots \\
+y &= 26, 10, 26, 10, 26, 10 \ldots
+\end{align*}
+$$
+
+Note that our valid witness is part of that set of solutions, meaning $x=1$ and $y=10$. Of course, we can also find another dishonest instantiation such as $x=0$ and $y=26$. Perhaps one could think that we do not need to worry about this case, because the resulting rotation word would be $0+26=26$, and if we later use that result as an input to a subsequent gate such as XOR, the value $26$ would not fit and at some point the constraint system would complain. Nonetheless, we still have other solutions to worry about, such as $(x=3, y=10)$ or $(x=5, y=10)$, since they would result in a rotated word that would fit in the word length of 4 bits, yet would be incorrect (not equal to $11$).
+
+All of the above incorrect solutions differ in one thing: they have different bit lengths. This means that we need to range check the values for the excess and shifted witnesses to make sure they have the correct length.
+
+### Sufficiency of range checks
+
+In the following, we will give a proof that performing range checks for these values is not only necessary but also sufficient to prove that the rotation gate is sound. In other words, we will prove there are no two possible solutions of the decomposition constraint that have the correct bit lengths. Now, for the sake of robustness, we will consider 64-bit words and fields with at least twice the bit length of the words (as is our case).
+
+We will proceed by __contradiction__. Suppose there are two different solutions to the following Diophantine equation:
+
+$$
+\begin{align*}
+\forall k \in \mathbb{F}_n: x &= k \ \land \\
+y &= w \cdot 2^r - 2^{64} \cdot k
+\end{align*}
+$$
+
+where $k$ is a parameter to instantiate the solutions, $w$ is the word to be rotated, $r$ is the rotation amount, and $n$ is the field length.
+
+Then, that means that there are two different solutions, $(0\leq x=a<2^r, 0\leq y=b<2^{64})$ and $(0\leq x=a'<2^r, 0\leq y=b'<2^{64})$, with at least $a \neq a'$ or $b \neq b'$. We will show that this is impossible.
+
+If both are solutions to the same equation, then:
+$$
+\begin{align*}
+w \cdot 2^r &= a \cdot 2^{64} + b \\
+w \cdot 2^r &= a'\cdot 2^{64} + b'
+\end{align*}
+$$
+means that $a \cdot 2^{64} + b = a'\cdot 2^{64} + b'$. Moving terms to the left side, we have an equivalent equality: $2^{64}(a-a') + (b-b')=0 \mod{n}$. There are three cases to consider:
+
+- $a = a'$ and $b \neq b'$: then $(b - b') \equiv_n 0$, and this can only happen if $b' = b + kn$. But since $n > 2^{64}$, then $b'$ cannot be smaller than $2^{64}$ as was assumed. CONTRADICTION.
+
+- $b = b'$ and $a \neq a'$: then $2^{64}(a - a') \equiv_n 0$, and this can only happen if $a' = a + kn$. But since $n > 2^r$, then $a'$ cannot be smaller than $2^r$ as was assumed. CONTRADICTION.
+
+- $a\neq a'$ and $b \neq b'$: then we have something like $2^{64} \alpha + \beta \equiv_n 0$.
+  - This means $\beta \equiv_n -2^{64} \alpha = k \cdot n - 2^{64} \alpha$ for any $k$.
+  - According to the assumption, both $0\leq a<2^r$ and $0\leq a'<2^r$. This means the difference $\alpha:=(a - a')$ lies in the following interval:
+    $$1 - 2^r \leq \alpha \leq 2^r - 1$$
+  - We plug this interval into the above equation to obtain the following interval for $\beta$:
+    $$k\cdot n - 2^{64}(1-2^r)\leq \beta \leq k\cdot n - 2^{64}(2^r - 1) $$
+  - We look at this interval from both sides of the inequality: $\beta \geq kn - 2^{64} + 2^{64+r}$ and $\beta \leq kn + 2^{64} - 2^{64+r}$, and we wonder if $kn - 2^{64} + 2^{64+r} \leq kn + 2^{64} - 2^{64+r}$ is at all possible. We rewrite as follows:
+    $$
+    \begin{align*}
+    2^{64+r} - 2^{64} &\leq 2^{64} - 2^{64+r}\\
+    2\cdot2^{64+r} &\leq 2\cdot2^{64} \\
+    2^{64+r} &\leq 2^{64}
+    \end{align*}
+    $$
+  - But this can only happen if $r\leq 0$, which is impossible since we assume $0 < r < 64$. CONTRADICTION.
+
+This concludes the proof.
diff --git a/book/src/kimchi/lookup.md b/book/src/kimchi/lookup.md
index 9989ac5557..a261672716 100644
--- a/book/src/kimchi/lookup.md
+++ b/book/src/kimchi/lookup.md
@@ -1,3 +1,318 @@
-## Lookup
-
-TO-DO
\ No newline at end of file
+# Plookup in Kimchi
+
+In 2020, [plookup](https://eprint.iacr.org/2020/315.pdf) showed how to create lookup proofs: proofs that some witness values are part of a [lookup table](https://en.wikipedia.org/wiki/Lookup_table). Two years later, an independent team published [plonkup](https://eprint.iacr.org/2022/086), showing how to integrate Plookup into Plonk.
+
+This document specifies how we integrate plookup in kimchi. It assumes that the reader understands the basics behind plookup.
+
+## Overview
+
+We integrate plookup in kimchi with the following differences:
+
+* we snake-ify the sorted table instead of wrapping it around (see later)
+* we allow fixed-ahead-of-time linear combinations of columns of the queries we make
+* we only use a single table (XOR) at the moment of this writing
+* we allow several lookups (or queries) to be performed within the same row
+* zero-knowledgeness is added in a specific way (see later)
+
+The following document explains the protocol in more detail.
+
+### Recap on the grand product argument of plookup
+
+As per the Plookup paper, the prover will have to compute three vectors:
+
+* $f$, the (secret) **query vector**, containing the witness values that the prover wants to prove are part of the lookup table.
+* $t$, the (public) **lookup table**.
+* $s$, the (secret) concatenation of $f$ and $t$, sorted by $t$ (where elements are listed in the order they are listed in $t$).
+
+Essentially, plookup proves that all the elements in $f$ are indeed in the lookup table $t$ if and only if the following multisets are equal:
+
+* $\{(1+\beta)f, \text{diff}(t)\}$
+* $\text{diff}(\text{sorted}(f, t))$
+
+where $\text{diff}$ is a new multiset derived by applying a "randomized difference" between every successive pair of a vector. For example:
+
+* $f = \{5, 4, 1, 5\}$
+* $t = \{1, 4, 5\}$
+* $\{\color{blue}{(1+\beta)f}, \color{green}{\text{diff}(t)}\} = \{\color{blue}{(1+\beta)5, (1+\beta)4, (1+\beta)1, (1+\beta)5}, \color{green}{1+\beta 4, 4+\beta 5}\}$
+* $\text{diff}(\text{sorted}(f, t)) = \{1+\beta 1, 1+\beta 4, 4+\beta 4, 4+\beta 5, 5+\beta 5, 5+\beta 5\}$
+
+> Note: This assumes that the lookup table is a single column. You will see in the next section how to address lookup tables with more than one column.
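+
+As a sanity check, the identity can be tested directly on the toy example above. The following sketch is illustrative only (none of these names exist in the kimchi codebase): it represents each randomized difference $a + \beta b$ symbolically as the pair $(a, b)$, so that multiset equality of the pairs implies the multiset equality for every choice of $\beta$.
+
+```rust
+// Randomized differences of successive pairs, as symbolic (a, b) pairs
+// standing for a + beta * b.
+fn diff(v: &[u64]) -> Vec<(u64, u64)> {
+    v.windows(2).map(|w| (w[0], w[1])).collect()
+}
+
+fn main() {
+    let f = vec![5u64, 4, 1, 5];
+    let t = vec![1u64, 4, 5];
+
+    // s is the concatenation of f and t, sorted by the order of t.
+    let mut s = [f.clone(), t.clone()].concat();
+    s.sort_by_key(|x| t.iter().position(|y| y == x).unwrap());
+
+    // Left side: (1 + beta) * f_i, i.e. the pair (f_i, f_i), plus diff(t).
+    let mut lhs: Vec<(u64, u64)> = f.iter().map(|&x| (x, x)).collect();
+    lhs.extend(diff(&t));
+
+    // Right side: diff(sorted(f, t)).
+    let mut rhs = diff(&s);
+
+    lhs.sort();
+    rhs.sort();
+    assert_eq!(lhs, rhs); // the two multisets agree
+    println!("plookup multiset identity holds on the toy example");
+}
+```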
+
+The equality between the multisets can be proved with the permutation argument of plonk, which would look like enforcing constraints on the following accumulator:
+
+* init: $\mathsf{acc}_0 = 1$
+* final: $\mathsf{acc}_n = 1$
+* for every $0 < i \leq n$:
+    $$
+    \mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(\gamma + (1+\beta) f_{i-1}) \cdot (\gamma + t_{i-1} + \beta t_i)}{(\gamma + s_{i-1} + \beta s_{i})(\gamma + s_{n+i-1} + \beta s_{n+i})}
+    $$
+
+Note that the plookup paper uses a slightly different equation to make the proof work. It is possible that the proof would work with the above equation, but for simplicity let's just use the equation published in plookup:
+
+$$
+\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot (\gamma + f_{i-1}) \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})(\gamma(1+\beta) + s_{n+i-1} + \beta s_{n+i})}
+$$
+
+> Note: in plookup $s$ is longer than $n$ ($|s| = |f| + |t|$), and thus it needs to be split into multiple vectors to enforce the constraint at every $i \in [[0;n]]$. This leads to the two terms in the denominator as shown above, so that the degree of $\gamma (1 + \beta)$ is equal in the numerator and denominator.
+
+### Lookup tables
+
+Kimchi uses a single **lookup table** at the moment of this writing: the XOR table. The XOR table for values of 1 bit is the following:
+
+| l | r | o |
+| --- | --- | --- |
+| 1 | 0 | 1 |
+| 0 | 1 | 1 |
+| 1 | 1 | 0 |
+| 0 | 0 | 0 |
+
+Kimchi, however, uses the XOR table for values of $4$ bits, which has $2^{8}$ entries.
+
+Note: the $(0, 0, 0)$ **entry** is at the very end on purpose (as it will be used as the dummy entry for rows of the witness that don't care about lookups).
+
+### Querying the table
+
+The plookup paper handles a vector of lookups $f$, which we do not have. So the first step is to create such a table from the witness columns (or registers). To do this, we define the following objects:
+
+* a **query** tells us what registers, in what order, and scaled by how much, are part of a query
+* a **query selector** tells us which rows are using the query. It is pretty much the same as a gate selector.
+
+Let's go over the first item in this section.
+
+For example, the following **query** tells us that we want to check if $r_0 \oplus r_2 = 2\cdot r_1$
+
+| l | r | o |
+| :---: | :---: | :---: |
+| 1, $r_0$ | 1, $r_2$ | 2, $r_1$ |
+
+The grand product argument for the lookup constraint will look like this at this point:
+
+$$
+\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot {\color{green}(\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i))} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})(\gamma(1+\beta) + s_{n+i-1} + \beta s_{n+i})}
+$$
+
+Not all rows need to perform queries into a lookup table. We will use a query selector in the next section to make the constraints work with this in mind.
+
+### Query selector
+
+The associated **query selector** tells us on which rows the query into the XOR lookup table occurs.
+
+| row | query selector |
+| :---: | :------------: |
+| 0 | 1 |
+| 1 | 0 |
+
+Both the (XOR) lookup table and the query are built-ins in kimchi. The query selector is derived from the circuit at setup time. Currently only the ChaCha gates make use of the lookups.
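+
+To make the query-combination step concrete, here is a small illustrative sketch (toy modulus and made-up names, not kimchi code) of how one query's scaled registers are folded with the joint combiner $j$ into the green term above:
+
+```rust
+// Toy Mersenne prime standing in for the native field, for illustration only.
+const P: u128 = (1 << 61) - 1;
+
+/// Combine one query: gamma + sum_i j^i * coeff_i * w_i (mod P).
+/// `cols` holds the (coefficient, register value) pairs of the query.
+fn combine_query(gamma: u128, j: u128, cols: &[(u128, u128)]) -> u128 {
+    let mut acc = gamma % P;
+    let mut j_pow = 1u128;
+    for &(coeff, w) in cols {
+        acc = (acc + j_pow * (coeff * w % P) % P) % P;
+        j_pow = j_pow * j % P;
+    }
+    acc
+}
+
+fn main() {
+    let (gamma, j) = (7, 3);
+    // Toy register values satisfying r0 xor r2 = 2 * r1.
+    let (r0, r1, r2) = (10, 4, 2);
+    // The query from the table above: l = (1, r0), r = (1, r2), o = (2, r1).
+    let q = combine_query(gamma, j, &[(1, r0), (1, r2), (2, r1)]);
+    assert_eq!(q, (gamma + r0 + j * r2 + j * j * 2 * r1) % P);
+    println!("combined query term: {q}");
+}
+```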
+
+With the selectors, the grand product argument for the lookup constraint has the following form:
+
+$$
+\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot {\color{green}\mathsf{query}} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})}
+$$
+
+where $\color{green}{\mathsf{query}}$ is constructed so that a dummy query ($0 \oplus 0 = 0$) is used on rows that don't have a query.
+
+$$
+\begin{align}
+\mathsf{query} := &\ \mathsf{selector} \cdot (\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i)) + \\
+&\ (1- \mathsf{selector}) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)
+\end{align}
+$$
+
+### Supporting multiple queries
+
+Since we would like to allow multiple table lookups per row, we define multiple **queries**, where each query is associated with a **lookup selector**.
+
+At the moment of this writing, the `ChaCha` gates all perform $4$ queries in a row. Thus, $4$ is trivially the largest number of queries that happen in a row.
+
+**Important**: to make constraints work, this means that each row must make $4$ queries. Potentially some or all of them are dummy queries.
+
+For example, the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates will jointly apply the following 4 XOR queries on the current and following rows:
+
+| l | r | o | - | l | r | o | - | l | r | o | - | l | r | o |
+| :---: | :---: | :----: | --- | :---: | :---: | :----: | --- | :---: | :---: | :----: | --- | :---: | :----: | :----: |
+| 1, $r_3$ | 1, $r_7$ | 1, $r_{11}$ | - | 1, $r_4$ | 1, $r_8$ | 1, $r_{12}$ | - | 1, $r_5$ | 1, $r_9$ | 1, $r_{13}$ | - | 1, $r_6$ | 1, $r_{10}$ | 1, $r_{14}$ |
+
+which you can understand as checking for the current and following row that
+
+$$
+\begin{align}
+r_3 \oplus r_7 &= r_{11}\\
+r_4 \oplus r_8 &= r_{12}\\
+r_5 \oplus r_9 &= r_{13}\\
+r_6 \oplus r_{10} &= r_{14}
+\end{align}
+$$
+
+The `ChaChaFinal` gate also performs $4$ (somewhat similar) queries in the XOR lookup table. In total this is $8$ different queries that could be associated with $8$ selector polynomials.
+
+### Grouping queries by query pattern
+
+Associating each query with a selector polynomial is not necessarily efficient. To summarize:
+
+* the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates together make $4$ queries into the XOR table
+* the `ChaChaFinal` gate makes another $4$ different queries into the XOR table
+
+Using the previous section's method, we'd have to use $8$ different lookup selector polynomials, one for each of the $8$ different queries. Since there are only $2$ use-cases, we can simply group them by **query patterns** to reduce the number of lookup selector polynomials to $2$.
+
+The grand product argument for the lookup constraint looks like this now:
+
+$$
+\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta)^4 \cdot \mathsf{query}} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})\times \ldots}
+$$
+
+where $\color{green}{\mathsf{query}}$ is constructed as:
+
+$$
+\begin{align}
+\mathsf{query} = &\ \mathsf{selector}_1 \cdot \mathsf{pattern}_1 + \\
+&\ \mathsf{selector}_2 \cdot \mathsf{pattern}_2 + \\
+&\ (1 - \mathsf{selector}_1 - \mathsf{selector}_2) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)^4
+\end{align}
+$$
+
+where, for example, the first pattern for the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates looks like this:
+
+$$
+\begin{align}
+\mathsf{pattern}_1 = &\ (\gamma + w_3(g^i) + j \cdot w_7(g^i) + j^2 \cdot w_{11}(g^i)) \cdot \\
+&\ (\gamma + w_4(g^i) + j \cdot w_8(g^i) + j^2 \cdot w_{12}(g^i)) \cdot \\
+&\ (\gamma + w_5(g^i) + j \cdot w_9(g^i) + j^2 \cdot w_{13}(g^i)) \cdot \\
+&\ (\gamma + w_6(g^i) + j \cdot w_{10}(g^i) + j^2 \cdot w_{14}(g^i))
+\end{align}
+$$
+
+Note that there are now $4$ dummy queries, and they only appear when none of the lookup selectors are active.
+If a pattern uses fewer than $4$ queries, it has to be padded with dummy queries as well.
+
+Finally, note that the denominator of the grand product argument is incomplete in the formula above.
+Since the numerator has degree $5$ in $\gamma (1 + \beta)$, the denominator must match too.
+This is achieved by having a longer $s$, and referring to it $5$ times.
+The denominator thus becomes $\prod_{k=1}^{5} (\gamma (1+\beta) + s_{kn+i-1} + \beta s_{kn+i})$.
+
+## Back to the grand product argument
+
+There are two things that we haven't touched on:
+
+* The vector $t$ representing the **combined lookup table** (after its columns have been combined with a joint combiner $j$). The **non-combined lookup table** is fixed at setup time and derived based on the lookup tables used in the circuit (for now only one, the XOR lookup table, can be used in the circuit).
+* The vector $s$ representing the sorted multiset of both the queries and the lookup table. This is created by the prover and sent as a commitment to the verifier.
+
+The first vector $t$ is quite straightforward to think about:
+
+* if it is smaller than the domain (of size $n$), then we can repeat the last entry enough times to make the table of size $n$.
+* if it is larger than the domain, then we can either increase the domain or split the vector in two (or more) vectors. This is most likely what we will have to do to support multiple lookup tables later.
+
+What about the second vector $s$?
+
+## The sorted vector $s$
+
+We said earlier that in the original plookup paper the size of $s$ is equal to $|s| = |f|+|t|$, where $f$ encodes queries, and $t$ encodes the lookup table.
+With our multi-query approach, the second vector $s$ is of the size
+
+$$n \cdot |\#\text{queries}| + |\text{lookup\_table}|$$
+
+That is, it contains the $n$ elements of each **query vector** (the actual values being looked up, after being combined with the joint combinator, that's $4$ per row), as well as the elements of our lookup table (after being combined as well).
+
+Because the vector $s$ is larger than the domain size $n$, it is split into several vectors of size $n$. Specifically, in the plonkup paper, $s$ is split into two halves, which are then interpolated as $h_1$ and $h_2$.
+The denominator in plookup in vector form is
+$$
+\big(\gamma(1+\beta) + s_{i-1} + \beta s_{i}\big)\big(\gamma(1+\beta)+s_{n+i-1} + \beta s_{n+i}\big)
+$$
+which, when interpolated into $h_1$ and $h_2$, becomes
+$$
+\big(\gamma(1+\beta) + h_1(x) + \beta h_1(g \cdot x)\big)\big(\gamma(1+\beta) + h_2(x) + \beta h_2(g \cdot x)\big)
+$$
+
+Since one has to compute the difference of every contiguous pair, the last element of the first half is replicated as the first element of the second half ($s_{n-1} = s_{n}$).
+Hence, a separate constraint must be added to enforce that continuity on the interpolated polynomials $h_1$ and $h_2$:
+
+$$L_{n-1}(X)\cdot\big(h_1(X) - h_2(g \cdot X)\big) \equiv 0$$
+
+which is equivalent to checking that $h_1(g^{n-1}) = h_2(1)$.
+
+## The sorted vector $s$ in kimchi
+
+Since this vector is known only by the prover, and is evaluated as part of the protocol, zero-knowledge must be added to the corresponding polynomial (in the case of the plookup approach, to $h_1(X), h_2(X)$). To do this in kimchi, we use the same technique as with the other prover polynomials: we randomize the last evaluations (or rows, on the domain) of the polynomial.
+
+This means two things for the lookup grand product argument:
+
+1. We cannot use the wrap-around trick to make sure that the list is split in two correctly (enforced by $L_{n-1}(h_1(x) - h_2(g \cdot x)) = 0$, which is equivalent to $h_1(g^{n-1}) = h_2(1)$ in the plookup paper)
+2. We have even less space to store an entire query vector. This is actually fine, as the witness also has some zero-knowledge rows at the end that should not be part of the queries anyway.
+
+The first problem can be solved in two ways:
+
+* **Zig-zag technique**. By reorganizing $s$ to alternate its values between the columns. For example, $h_1 = (s_0, s_2, s_4, \cdots)$ and $h_2 = (s_1, s_3, s_5, \cdots)$, so that you can simply write the denominator of the grand product argument as
+  $$(\gamma(1+\beta) + h_1(x) + \beta h_2(x))(\gamma(1+\beta)+ h_2(x) + \beta h_1(x \cdot g))$$
+  This approach is taken by the [plonkup](https://eprint.iacr.org/2022/086) paper.
+* **Snake technique**. By reorganizing $s$ as a snake. This is what is currently implemented in kimchi.
+
+The snake technique rearranges $s$ into the following shape:
+
+```
+     __ _
+  s_0     |  s_{2n-1} |  |  |  |
+  ...     |  ...      |  |  |  |
+  s_{n-1} |  s_n      |  |  |  |
+  ‾‾‾‾‾‾‾‾‾‾‾‾        ‾‾    ‾
+      h1       h2     h3  ...
+```
+
+Assuming that for now we have only one bend and two polynomials $h_1(X), h_2(X)$, the denominator has the following form:
+
+$$\big(\gamma(1+\beta) + h_1(x) + \beta h_1(x \cdot g)\big)\big(\gamma(1+\beta)+ h_2(x \cdot g) + \beta h_2(x)\big)$$
+
+and the snake doing a U-turn is constrained via $s_{n-1} = s_n$, enforced by the following equation:
+
+$$L_{n-1} \cdot (h_1(x) - h_2(x)) = 0$$
+
+In practice, $s$ will have more sections than just two.
+Assume that we have $k$ sections in total; then the denominator generalizes to
+
+$$
+\prod_{i=1}^k \big(\gamma(1+\beta) + h_i(x \cdot g^{\delta_{0,\ i \text{ mod } 2}}) + \beta h_i(x \cdot g^{\delta_{1,\ i \text{ mod } 2}})\big)
+$$
+
+where $\delta_{i,j}$ is the Kronecker delta: the first shift by $g$ applies when $i$ is even, the second when $i$ is odd, and the exponent is $0$ otherwise.
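+
+To illustrate the snake layout, here is a hypothetical sketch (not kimchi code) that splits an already-sorted vector into $k$ sections of length $n$, reversing every other section; as a simplification of the convention above, the bend value is shared between adjacent sections rather than duplicated, which is exactly the agreement that the U-turn constraints below enforce:
+
+```rust
+/// Lay out a sorted vector as a "snake": section i runs top-to-bottom for
+/// even i and bottom-to-top for odd i, with consecutive sections sharing
+/// the value at the bend. Assumes s.len() == k * n - (k - 1).
+fn snakeify(s: &[u64], n: usize, k: usize) -> Vec<Vec<u64>> {
+    assert_eq!(s.len(), k * n - (k - 1));
+    let mut sections = Vec::with_capacity(k);
+    for i in 0..k {
+        // consecutive sections overlap in exactly one element (the bend)
+        let start = i * (n - 1);
+        let mut section: Vec<u64> = s[start..start + n].to_vec();
+        if i % 2 == 1 {
+            section.reverse(); // odd sections run upwards
+        }
+        sections.push(section);
+    }
+    sections
+}
+```
+
+With this layout, $h_1$ and $h_2$ agree on the last row, and $h_2$ and $h_3$ agree on the first row, matching the constraints below.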
+
+The U-turn constraints generalize similarly:
+
+$$
+\begin{align*}
+L_{n-1}(X) \cdot (h_2(X) - h_1(X)) &\equiv 0\\
+\color{green}L_{0}(X) \cdot (h_3(X) - h_2(X)) &\color{green}\equiv 0\\
+\color{green}L_{n-1}(X) \cdot (h_4(X) - h_3(X)) &\color{green}\equiv 0\\
+\ldots
+\end{align*}
+$$
+
+In our concrete case with $4$ simultaneous lookups, the vector $s$ has to be split into $k = 5$ sections: each denominator term in the accumulator accounts for $4$ queries ($f$) and $1$ table consistency check ($t$).
+
+## Unsorted $t$ in $s$
+
+Note that at setup time, $t$ cannot be sorted lexicographically as it is not combined yet. Since $s$ must be sorted by $t$ (in other words, the sorting of $s$ must follow the order of the elements of $t$), there are two solutions:
+
+1. Both the prover and the verifier can sort the combined $t$ lexicographically, so that $s$ can be sorted lexicographically too using typical sorting algorithms.
+2. The prover can directly sort $s$ by $t$, so that the verifier doesn't have to do any pre-sorting and can just rely on the commitment of the columns of $t$ (which the prover can evaluate in the protocol).
+
+We take the second approach.
+However, this must be done carefully, since the combined $t$ entries can repeat. For some $i, l$ such that $i \neq l$, we might have
+
+$$
+t_0[i] + j \cdot t_1[i] + j^2 \cdot t_2[i] = t_0[l] + j \cdot t_1[l] + j^2 \cdot t_2[l]
+$$
+
+For example, if $f = \{1, 2, 2, 3\}$ and $t = \{2, 1, 2, 3\}$, then $\text{sorted}(f, t) = \{2, 2, 2, 1, 1, 2, 3, 3\}$ would be a way of correctly sorting the combined vector $s$. At the same time $\text{sorted}(f, t) = \{ 2, 2, 2, 2, 1, 1, 3, 3 \}$ is incorrect since it does not have a second block of $2$s, and thus such an $s$ is not sorted by $t$.
+
+## Recap
+
+So to recap, to create the sorted polynomials $h_i$, the prover:
+
+1. creates a large query vector which contains the concatenation of the $4$ per-row (combined with the joint combinator) queries (that might contain dummy queries) for all rows
+2. creates the (combined with the joint combinator) table vector
+3. sorts all of that into a big vector $s$
+4. divides that vector $s$ into as many $h_i$ vectors as necessary, following the snake method
+5. interpolates these $h_i$ vectors into $h_i$ polynomials
+6. commits to them, and evaluates them as part of the protocol.
diff --git a/book/src/plonk/maller_15.md b/book/src/kimchi/maller_15.md
similarity index 100%
rename from book/src/plonk/maller_15.md
rename to book/src/kimchi/maller_15.md
diff --git a/book/src/kimchi/permut.md b/book/src/kimchi/permut.md
index 3d62293eae..cc82360ef6 100644
--- a/book/src/kimchi/permut.md
+++ b/book/src/kimchi/permut.md
@@ -1,4 +1 @@
-## Permutation
-
-TO-DO
-
+# Permutation
diff --git a/book/src/kimchi/zkpm.md b/book/src/kimchi/zkpm.md
new file mode 100644
index 0000000000..947a334538
--- /dev/null
+++ b/book/src/kimchi/zkpm.md
@@ -0,0 +1 @@
+# Zero-Column Approach to Zero-Knowledge
diff --git a/book/src/pickles/passthrough.md b/book/src/pickles/passthrough.md
deleted file mode 100644
index cc182c4e51..0000000000
--- a/book/src/pickles/passthrough.md
+++ /dev/null
@@ -1 +0,0 @@
-# Passthrough and Me-Only
diff --git a/book/src/plonk/fiat_shamir.md b/book/src/plonk/fiat_shamir.md
index fd486f6aba..f1e2b264ef 100644
--- a/book/src/plonk/fiat_shamir.md
+++ b/book/src/plonk/fiat_shamir.md
@@ -1,4 +1,4 @@
-# non-interaction with fiat-shamir
+# Non-Interactivity via Fiat-Shamir
 
 So far we've talked about an interactive protocol between a prover and a verifier.
-The zero-knowledge proof was also in the honest verifier zero-knowedlge (HVZK) model, which is problematic.
+The zero-knowledge proof was also in the honest-verifier zero-knowledge (HVZK) model, which is problematic.
@@ -15,7 +15,7 @@ This is important as our technique to transform an interactive protocol to a non
 The whole idea is to replace the verifier by a random oracle, which in practice is a hash function. Note that by doing this, we remove potential leaks that can happen when the verifier acts dishonestly.
 
-Initially the Fiat-Shamir transformation was only applied to sigma protocols, named after the greek letter $\Sigma$ due to its shape resembling the direction of messages (prover sends a commit to a verifier, verifier sends a challenge to a prover, prover sends the final proof to a verifier).
+Initially the Fiat-Shamir transformation was only applied to sigma protocols, named after the Greek letter $\Sigma$ due to its shape resembling the direction of messages (prover sends a commit to a verifier, verifier sends a challenge to a prover, prover sends the final proof to a verifier). A $Z$ would have made more sense but here we are.
 
 ## Generalization of Fiat-Shamir
 
@@ -27,6 +27,6 @@ This is simple: every verifier move can be replaced by a hash of the transcript
 
-While we use a hash function for that, a different construction called the [duplex construction](https://keccak.team/sponge_duplex.html) is particularly useful in such situations as they allow to continuously absorb the transcript and produce challenges, while automatically authenticating the fact that they produced a challenge.
+While we use a hash function for that, a different construction called the [duplex construction](https://keccak.team/sponge_duplex.html) is particularly useful in such situations, as it allows one to continuously absorb the transcript and produce challenges, while automatically authenticating the fact that it produced a challenge.
 
-[Merlin](https://merlin.cool/) is a standardization of such a construction using the [Strobe protocol framework](https://strobe.sourceforge.io/) (a framework to make use of a duplex construction).
-Note that the more recent [Xoodyak](https://keccak.team/xoodyak.html) (part of NIST's lightweight competition) could have been used for this as well.
+[Merlin](https://merlin.cool/) is a standardization of such a construction using the [Strobe protocol framework](https://strobe.sourceforge.io/) (a framework to make use of a duplex construction).
+Note that the more recent [Xoodyak](https://keccak.team/xoodyak.html) (part of NIST's lightweight competition) could have been used for this as well. Note also that Mina uses none of these standards; instead it simply uses Poseidon (see the section on Poseidon).
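+
+To make this concrete, here is a minimal sketch of a Fiat-Shamir transcript. It is illustrative only: real implementations (including Mina's Poseidon-based one) bind domain separators and encode messages much more carefully, and `sha2` is used here merely as a stand-in hash; the `Transcript` type and its methods are made up.
+
+```rust
+use sha2::{Digest, Sha256};
+
+/// A toy Fiat-Shamir transcript: the prover "simulates the verifier" by
+/// hashing everything sent so far to derive each challenge.
+struct Transcript {
+    data: Vec<u8>,
+}
+
+impl Transcript {
+    fn new(label: &[u8]) -> Self {
+        Transcript { data: label.to_vec() }
+    }
+
+    /// Absorb a prover message (e.g. a serialized commitment).
+    fn absorb(&mut self, msg: &[u8]) {
+        self.data.extend_from_slice(msg);
+    }
+
+    /// Squeeze a challenge: hash the transcript so far. The challenge is
+    /// absorbed back in, so later challenges depend on earlier ones.
+    fn challenge(&mut self) -> [u8; 32] {
+        let digest: [u8; 32] = Sha256::digest(&self.data).into();
+        self.data.extend_from_slice(&digest);
+        digest
+    }
+}
+
+fn main() {
+    let mut transcript = Transcript::new(b"example-protocol");
+    transcript.absorb(b"commitment bytes go here");
+    let c1 = transcript.challenge();
+    transcript.absorb(b"another prover message");
+    let c2 = transcript.challenge();
+    assert_ne!(c1, c2); // each challenge depends on the full transcript
+}
+```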
diff --git a/book/src/plonk/glossary.md b/book/src/plonk/glossary.md
index 4880894fb4..6769ac5f83 100644
--- a/book/src/plonk/glossary.md
+++ b/book/src/plonk/glossary.md
@@ -1,9 +1,9 @@
 # Glossary
 
-* size = number of rows
-* columns = number of variables per rows
-* cell = a pair (row, column)
-* witness = the values assigned in all the cells
-* gate = polynomials that act on the variables in a row
-* selector vector = a vector of values 1 or 0 (not true for constant vector I think) that toggles gates and variables in a row
-* gadget = a series of contiguous rows with some specific gates set (via selector vectors)
+* Size: number of rows
+* Columns: number of variables per row
+* Cell: a pair (row, column)
+* Witness: the values assigned in all the cells
+* Gate: polynomials that act on the variables in a row
+* Selector vector: a vector of values 1 or 0 that toggles gates and variables in a row (the constant vector is an exception, as it can contain arbitrary values)
+* Gadget: a series of contiguous rows with some specific gates set (via selector vectors)
diff --git a/book/src/plonk/zkpm.md b/book/src/plonk/zkpm.md
index a4a1ba291c..84f46f3ec0 100644
--- a/book/src/plonk/zkpm.md
+++ b/book/src/plonk/zkpm.md
@@ -19,7 +19,9 @@ Let $f(X)$ be an interpolation polynomial of degree $n-1$ such that $f(h_i) = v_
 
 **Proof sketch.** Recall that the interpolation polynomial is
 
-$f(X) = \sum_{j = 1}^n \prod_{k \neq j} \frac{(X-h_k)}{(h_j-h_k)} v_j$
+$$
+f(X) = \sum_{j = 1}^n \prod_{k \neq j} \frac{(X-h_k)}{(h_j-h_k)} v_j
+$$
 
 With $V_{w+1}, \ldots, V_{w+k}$ as random variables, we have,
 
 $f(X) = a_{w+1} V_{w+1} + a_{w+2} V_{w+2} + \ldots + a_{w+k} V_{w+k} + a$
 
@@ -32,16 +34,22 @@ The idea here is to set the last $k$ evaluations to be uniformly random elements
 
 **Modified permutation polynomial.** Specifically, set $z(X)$ as follows.
 
-$z(X) = L_1(X) + \sum_{i = 1}^{\blue{n-k-2}} \left(L_{i+1} \prod_{j=1}^i \mathsf{frac}_{i,j} \right) + \blue{t_1 L_{n-k}(X) + \ldots + t_k L_{n}(X) }$
+$$
+z(X) = L_1(X) + \sum_{i = 1}^{\blue{n-k-2}} \left(L_{i+1} \prod_{j=1}^i \mathsf{frac}_{i,j} \right) + \blue{t_1 L_{n-k}(X) + \ldots + t_k L_{n}(X) }
+$$
 
 From Lemma 1, the above $z(X)$ has the desired zero knowledge property when $k$ evaluations are revealed. However, we need to modify the other parts of the protocol so that the last $k$ elements are not subject to the permutation evaluation, since they will no longer satisfy the permutation check. Specifically, we will need to modify the permutation polynomial to disregard those random elements, as follows.
 
-$ \begin{aligned} & t(X) = \\
-  & (a(X)b(X)q_M(X) + a(X)q_L(X) + b(X)q_R(X) + c(X)q_O(X) + PI(X) + q_C(X)) \frac{1}{z_H(X)} \\
-  &+ ((a(X) + \beta X + \gamma)(b(X) + \beta k_1 X + \gamma)(c(X) + \beta k_2X + \gamma)z(X) \blue{(X-h_{n-k}) \ldots (X-h_{n-1})(X-h_n)} ) \frac{\alpha}{z_{H}(X)} \\
-  & - ((a(X) + \beta S_{\sigma1}(X) + \gamma)(b(X) + \beta S_{\sigma2}(X) + \gamma)(c(X) + \beta S_{\sigma3}(X) + \gamma)z(X\omega) \blue{(X-h_{n-k}) \ldots (X-h_{n-1})(X-h_n)}) \frac{\alpha}{z_{H}(X)} \\
-  & + (z(X)-1)L_1(X) \frac{\alpha^2}{z_H(X)} \\
-  & + \blue{(z(X)-1)L_{n-k}(X) \frac{\alpha^3}{z_H(X)} } \end{aligned} $
+$$
+\begin{aligned} & t(X) = \\
+  & \Big(a(X)b(X)q_M(X) + a(X)q_L(X) + b(X)q_R(X) + c(X)q_O(X) + PI(X) + q_C(X)\Big) \frac{1}{z_H(X)} \\
+  &+ \Big((a(X) + \beta X + \gamma)(b(X) + \beta k_1 X + \gamma)(c(X) + \beta k_2X + \gamma)z(X)\\
+  &\qquad\qquad\qquad\times{\blue{(X-h_{n-k}) \ldots (X-h_{n-1})(X-h_n)}} \Big) \frac{\alpha}{z_{H}(X)} \\
+  & - \Big((a(X) + \beta S_{\sigma1}(X) + \gamma)(b(X) + \beta S_{\sigma2}(X) + \gamma)(c(X) + \beta S_{\sigma3}(X) + \gamma)z(X\omega)\\
+  &\qquad\qquad\qquad\times{\blue{(X-h_{n-k}) \ldots (X-h_{n-1})(X-h_n)}}\Big) \frac{\alpha}{z_{H}(X)} \\
+  & + \Big(z(X)-1\Big)\cdot L_1(X) \frac{\alpha^2}{z_H(X)} \\
+  & + \blue{\Big(z(X)-1\Big)\cdot L_{n-k}(X) \frac{\alpha^3}{z_H(X)} } \end{aligned}
+$$
 
 **Modified permutation checks.** To recall, the permutation check was originally as follows. For all $h \in H$,
 
 * $Z(h)[(a(h) + \beta h + \gamma)(b(h) + \beta k_1 h + \gamma)(c(h) + \beta k_2 h + \gamma)]
   = Z(\omega h)[(a(h) + \beta S_{\sigma1}(h) + \gamma)(b(h) + \beta S_{\sigma2}(h) + \gamma)(c(h) + \beta S_{\sigma3}(h) + \gamma)]$
 
-
-The modified permuation checks that ensures that the check is performed only on all the values except the last $k$ elements in the witness polynomials are as follows.
+The modified permutation checks, which ensure that the check is performed on all values except the last $k$ elements of the witness polynomials, are as follows.
 
 * For all $h \in H$, $L_1(h)(Z(h) - 1) = 0$
diff --git a/book/src/rfcs/3-lookup.md b/book/src/rfcs/3-lookup.md
deleted file mode 100644
index 32dc9d39b8..0000000000
--- a/book/src/rfcs/3-lookup.md
+++ /dev/null
@@ -1,318 +0,0 @@
-# RFC: Plookup in Kimchi
-
-In 2020, [plookup](https://eprint.iacr.org/2020/315.pdf) showed how to create lookup proofs. Proofs that some witness values are part of a [lookup table](https://en.wikipedia.org/wiki/Lookup_table). Two years later, an independent team published [plonkup](https://eprint.iacr.org/2022/086) showing how to integrate Plookup into Plonk.
-
-This document specifies how we integrate plookup in kimchi. It assumes that the reader understands the basics behind plookup.
-
-## Overview
-
-We integrate plookup in kimchi with the following differences:
-
-* we snake-ify the sorted table instead of wrapping it around (see later)
-* we allow fixed-ahead-of-time linear combinations of columns of the queries we make
-* we only use a single table (XOR) at the moment of this writing
-* we allow several lookups (or queries) to be performed within the same row
-* zero-knowledgeness is added in a specific way (see later)
-
-The following document explains the protocol in more detail
-
-### Recap on the grand product argument of plookup
-
-As per the Plookup paper, the prover will have to compute three vectors:
-
-* $f$, the (secret) **query vector**, containing the witness values that the prover wants to prove are part of the lookup table.
-* $t$, the (public) **lookup table**.
-* $s$, the (secret) concatenation of $f$ and $t$, sorted by $t$ (where elements are listed in the order they are listed in $t$).
- -Essentially, plookup proves that all the elements in $f$ are indeed in the lookup table $t$ if and only if the following multisets are equal: - -* $\{(1+\beta)f, \text{diff}(t)\}$ -* $\text{diff}(\text{sorted}(f, t))$ - -where $\text{diff}$ is a new set derived by applying a "randomized difference" between every successive pairs of a vector. For example: - -* $f = \{5, 4, 1, 5\}$ -* $t = \{1, 4, 5\}$ -* $\{\color{blue}{(1+\beta)f}, \color{green}{\text{diff}(t)}\} = \{\color{blue}{(1+\beta)5, (1+\beta)4, (1+\beta)1, (1+\beta)5}, \color{green}{1+\beta 4, 4+\beta 5}\}$ -* $\text{diff}(\text{sorted}(f, t)) = \{1+\beta 1, 1+\beta 4, 4+\beta 4, 4+\beta 5, 5+\beta 5, 5+\beta 5\}$ - -> Note: This assumes that the lookup table is a single column. You will see in the next section how to address lookup tables with more than one column. - -The equality between the multisets can be proved with the permutation argument of plonk, which would look like enforcing constraints on the following accumulator: - -* init: $\mathsf{acc}_0 = 1$ -* final: $\mathsf{acc}_n = 1$ -* for every $0 < i \leq n$: - $$ - \mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(\gamma + (1+\beta) f_{i-1}) \cdot (\gamma + t_{i-1} + \beta t_i)}{(\gamma + s_{i-1} + \beta s_{i})(\gamma + s_{n+i-1} + \beta s_{n+i})} - $$ - -Note that the plookup paper uses a slightly different equation to make the proof work. It is possible that the proof would work with the above equation, but for simplicity let's just use the equation published in plookup: - -$$ -\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot (\gamma + f_{i-1}) \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})(\gamma(1+\beta) + s_{n+i-1} + \beta s_{n+i})} -$$ - -> Note: in plookup $s$ is longer than $n$ ($|s| = |f| + |t|$), and thus it needs to be split into multiple vectors to enforce the constraint at every $i \in [[0;n]]$. This leads to the two terms in the denominator as shown above, so that the degree of $\gamma (1 + \beta)$ is equal in the nominator and denominator. - -### Lookup tables - -Kimchi uses a single **lookup table** at the moment of this writing; the XOR table. The XOR table for values of 1 bit is the following: - - -| l | r | o | -| --- | --- | --- | -| 1 | 0 | 1 | -| 0 | 1 | 1 | -| 1 | 1 | 0 | -| 0 | 0 | 0 | - -Whereas kimchi uses the XOR table for values of $4$ bits, which has $2^{8}$ entries. - -Note: the $(0, 0, 0)$ **entry** is at the very end on purpose (as it will be used as dummy entry for rows of the witness that don't care about lookups). - -### Querying the table - -The plookup paper handles a vector of lookups $f$ which we do not have. So the first step is to create such a table from the witness columns (or registers). To do this, we define the following objects: - -* a **query** tells us what registers, in what order, and scaled by how much, are part of a query -* a **query selector** tells us which rows are using the query. It is pretty much the same as a [gate selector](). - -Let's go over the first item in this section. 
-
-For example, the following **query** tells us that we want to check if $r_0 \oplus r_2 = 2\cdot r_1$:
-
-| l | r | o |
-| :---: | :---: | :---: |
-| 1, $r_0$ | 1, $r_2$ | 2, $r_1$ |
-
-The grand product argument for the lookup constraint will look like this at this point:
-
-$$
-\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot {\color{green}(\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i))} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})(\gamma(1+\beta) + s_{n+i-1} + \beta s_{n+i})}
-$$
-
-Not all rows need to perform queries into a lookup table. We will use a query selector in the next section to make the constraints work with this in mind.
-
-### Query selector
-
-The associated **query selector** tells us on which rows the query into the XOR lookup table occurs.
-
-| row | query selector |
-| :---: | :------------: |
-| 0 | 1 |
-| 1 | 0 |
-
-Both the (XOR) lookup table and the query are built-ins in kimchi. The query selector is derived from the circuit at setup time. Currently only the ChaCha gates make use of the lookups.
-
-With the selectors, the grand product argument for the lookup constraint has the following form:
-
-$$
-\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{(1+\beta) \cdot {\color{green}\mathsf{query}} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})}
-$$
-
-where $\color{green}{\mathsf{query}}$ is constructed so that a dummy query ($0 \oplus 0 = 0$) is used on rows that don't have a query.
-
-$$
-\begin{align}
-\mathsf{query} := &\ \mathsf{selector} \cdot (\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i)) + \\
-&\ (1- \mathsf{selector}) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)
-\end{align}
-$$
-
-### Supporting multiple queries
-
-Since we would like to allow multiple table lookups per row, we define multiple **queries**, where each query is associated with a **lookup selector**.
-
-At the moment of this writing, the `ChaCha` gates all perform $4$ queries in a row. Thus, $4$ is trivially the largest number of queries that happen in a row.
-
-**Important**: to make the constraints work, this means that each row must make $4$ queries. Potentially some or all of them are dummy queries.
-
-For example, the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates will jointly apply the following $4$ XOR queries on the current and following rows:
-
-| l | r | o | - | l | r | o | - | l | r | o | - | l | r | o |
-| :---: | :---: | :----: | --- | :---: | :---: | :----: | --- | :---: | :---: | :----: | --- | :---: | :----: | :----: |
-| 1, $r_3$ | 1, $r_7$ | 1, $r_{11}$ | - | 1, $r_4$ | 1, $r_8$ | 1, $r_{12}$ | - | 1, $r_5$ | 1, $r_9$ | 1, $r_{13}$ | - | 1, $r_6$ | 1, $r_{10}$ | 1, $r_{14}$ |
-
-which you can understand as checking, for the current and following row, that
-
-$$
-\begin{align}
-r_3 \oplus r_7 &= r_{11}\\
-r_4 \oplus r_8 &= r_{12}\\
-r_5 \oplus r_9 &= r_{13}\\
-r_6 \oplus r_{10} &= r_{14}
-\end{align}
-$$
-
-The `ChaChaFinal` gate also performs $4$ (somewhat similar) queries into the XOR lookup table. In total this is $8$ different queries that could be associated with $8$ selector polynomials.
-
-### Grouping queries by query patterns
-
-Associating each query with a selector polynomial is not necessarily efficient. To summarize:
-
-* the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates make $4$ queries in total into the XOR table
-* the `ChaChaFinal` gate makes another $4$ different queries into the XOR table
-
-Using the previous section's method, we'd have to use $8$ different lookup selector polynomials, one for each of the $8$ different queries. Since there are only $2$ use cases, we can simply group them by **query patterns** to reduce the number of lookup selector polynomials to $2$.
-
-The grand product argument for the lookup constraint looks like this now:
-
-$$
-\mathsf{acc}_i = \mathsf{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta)^4 \cdot \mathsf{query}} \cdot (\gamma(1 + \beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_{i})\times \ldots}
-$$
-
-where $\color{green}{\mathsf{query}}$ is constructed as:
-
-$$
-\begin{align}
-\mathsf{query} = &\ \mathsf{selector}_1 \cdot \mathsf{pattern}_1 + \\
-&\ \mathsf{selector}_2 \cdot \mathsf{pattern}_2 + \\
-&\ (1 - \mathsf{selector}_1 - \mathsf{selector}_2) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)^4
-\end{align}
-$$
-
-where, for example, the first pattern for the `ChaCha0`, `ChaCha1`, and `ChaCha2` gates looks like this:
-
-$$
-\begin{align}
-\mathsf{pattern}_1 = &\ (\gamma + w_3(g^i) + j \cdot w_7(g^i) + j^2 \cdot w_{11}(g^i)) \cdot \\
-&\ (\gamma + w_4(g^i) + j \cdot w_8(g^i) + j^2 \cdot w_{12}(g^i)) \cdot \\
-&\ (\gamma + w_5(g^i) + j \cdot w_9(g^i) + j^2 \cdot w_{13}(g^i)) \cdot \\
-&\ (\gamma + w_6(g^i) + j \cdot w_{10}(g^i) + j^2 \cdot w_{14}(g^i))
-\end{align}
-$$
-
-Note that there are now $4$ dummy queries, and they only appear when none of the lookup selectors are active.
-If a pattern uses fewer than $4$ queries, it has to be padded with dummy queries as well.
-
-Finally, note that the denominator of the grand product argument is incomplete in the formula above.
-Since the numerator has degree $5$ in $\gamma (1 + \beta)$, the denominator must match too.
-This is achieved by having a longer $s$, and referring to it $5$ times.
-The denominator thus becomes $\prod_{k=0}^{4} (\gamma (1+\beta) + s_{kn+i-1} + \beta s_{kn+i})$.
-
-## Back to the grand product argument
-
-There are two things that we haven't touched on:
-
-* The vector $t$ representing the **combined lookup table** (after its columns have been combined with a joint combiner $j$). The **non-combined lookup table** is fixed at setup time and derived based on the lookup tables used in the circuit (for now only one, the XOR lookup table, can be used in the circuit).
-* The vector $s$ representing the sorted multiset of both the queries and the lookup table. This is created by the prover and sent as a commitment to the verifier.
-
-The first vector $t$ is quite straightforward to think about:
-
-* if it is smaller than the domain (of size $n$), then we can repeat the last entry enough times to make the table of size $n$.
-* if it is larger than the domain, then we can either increase the domain or split the vector in two (or more) vectors. This is most likely what we will have to do to support multiple lookup tables later.
-
-What about the second vector $s$?
-
-## The sorted vector $s$
-
-We said earlier that in the original plookup paper the size of $s$ is equal to $|s| = |f|+|t|$, where $f$ encodes queries, and $t$ encodes the lookup table.
-With our multi-query approach, the second vector $s$ is of size
-
-$$n \cdot |\#\text{queries}| + |\text{lookup\_table}|$$
-
-That is, it contains the $n$ elements of each **query vector** (the actual values being looked up, after being combined with the joint combiner; that's $4$ per row), as well as the elements of our lookup table (after being combined as well).
-
-Because the vector $s$ is larger than the domain size $n$, it is split into several vectors of size $n$. Specifically, in the plonkup paper, $s$ is split into two halves, which are then interpolated as $h_1$ and $h_2$.
-The denominator of the grand product argument, in vector form, is
-
-$$
-\big(\gamma(1+\beta) + s_{i-1} + \beta s_{i}\big)\big(\gamma(1+\beta)+s_{n+i-1} + \beta s_{n+i}\big)
-$$
-
-which, when interpolated into $h_1$ and $h_2$, becomes
-
-$$
-\big(\gamma(1+\beta) + h_1(x) + \beta h_1(g \cdot x)\big)\big(\gamma(1+\beta) + h_2(x) + \beta h_2(g \cdot x)\big)
-$$
-
-Since one has to compute the difference of every contiguous pair, the last element of the first half is replicated as the first element of the second half ($s_{n-1} = s_{n}$).
-Hence, a separate constraint must be added to enforce that continuity on the interpolated polynomials $h_1$ and $h_2$:
-
-$$L_{n-1}(X)\cdot\big(h_1(X) - h_2(g \cdot X)\big) \equiv 0$$
-
-which is equivalent to checking that $h_1(g^{n-1}) = h_2(1)$.
-
-## The sorted vector $s$ in kimchi
-
-Since this vector is known only to the prover, and is evaluated as part of the protocol, zero-knowledge must be added to the corresponding polynomials (in the case of the plookup approach, to $h_1(X)$ and $h_2(X)$). To do this in kimchi, we use the same technique as with the other prover polynomials: we randomize the last evaluations (or rows, on the domain) of the polynomial.
-
-This means two things for the lookup grand product argument:
-
-1. We cannot use the wrap-around trick to make sure that the list is split in two correctly (enforced by $L_{n-1}(h_1(x) - h_2(g \cdot x)) = 0$, which is equivalent to $h_1(g^{n-1}) = h_2(1)$ in the plookup paper).
-2. We have even less space to store an entire query vector. This is actually convenient, as the witness also has some zero-knowledge rows at the end that should not be part of the queries anyway.
-
-The first problem can be solved in two ways:
-
-* **Zig-zag technique.** By reorganizing $s$ to alternate its values between the columns. For example, $h_1 = (s_0, s_2, s_4, \cdots)$ and $h_2 = (s_1, s_3, s_5, \cdots)$, so that you can simply write the denominator of the grand product argument as
-  $$(\gamma(1+\beta) + h_1(x) + \beta h_2(x))(\gamma(1+\beta)+ h_2(x) + \beta h_1(x \cdot g))$$
-  This approach is taken by the [plonkup](https://eprint.iacr.org/2022/086) paper.
-* **Snake technique.** By reorganizing $s$ as a snake. This is what is currently implemented in kimchi.
-
-The snake technique rearranges $s$ into the following shape:
-
-```
-    _______________      __      _
-   s_0 | s_{2n-1}  |    |  |    | |
-   ... |   ...     |    |  |    | |
-   s_{n-1} | s_n   |    |  |    | |
-    ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾      ‾‾      ‾
-      h1      h2        h3     ...
-```
-
-Assuming that for now we have only one bend and two polynomials $h_1(X)$ and $h_2(X)$, the denominator has the following form:
-
-$$\big(\gamma(1+\beta) + h_1(x) + \beta h_1(x \cdot g)\big)\big(\gamma(1+\beta)+ h_2(x \cdot g) + \beta h_2(x)\big)$$
-
-and the snake doing a U-turn is constrained via $s_{n-1} = s_n$, enforced by the following equation:
-
-$$L_{n-1}(x) \cdot (h_1(x) - h_2(x)) = 0$$
-
-In practice, $s$ will have more sections than just two.
-Assume that we have $k$ sections in total; then the denominator generalizes to
-
-$$
-\prod_{i=1}^k \Big(\gamma(1+\beta) + h_i(x \cdot g^{\delta_{0,\ i \text{ mod } 2}}) + \beta h_i(x \cdot g^{\delta_{1,\ i \text{ mod } 2}})\Big)
-$$
-
-where $\delta_{i,j}$ is the Kronecker delta, equal to $1$ when $i$ is even (for the first term) or odd (for the second one), and equal to $0$ otherwise.
-
-Similarly, the U-turn constraints now become
-
-$$
-\begin{align*}
-L_{n-1}(X) \cdot (h_2(X) - h_1(X)) &\equiv 0\\
-\color{green}L_{0}(X) \cdot (h_3(X) - h_2(X)) &\color{green}\equiv 0\\
-\color{green}L_{n-1}(X) \cdot (h_4(X) - h_3(X)) &\color{green}\equiv 0\\
-\ldots
-\end{align*}
-$$
-
-In our concrete case with $4$ simultaneous lookups, the vector $s$ has to be split into $k = 5$ sections: each denominator term in the accumulator accounts for $4$ queries ($f$) plus $1$ table consistency check ($t$).
-
-## Unsorted $t$ in $s$
-
-Note that at setup time, $t$ cannot be sorted lexicographically, as it is not combined yet. Since $s$ must be sorted by $t$ (in other words, the sorting of $s$ must follow the elements of $t$), there are two solutions:
-
-1. Both the prover and the verifier can sort the combined $t$ lexicographically, so that $s$ can be sorted lexicographically too, using typical sorting algorithms.
-2. The prover can directly sort $s$ by $t$, so that the verifier doesn't have to do any pre-sorting and can just rely on the commitment of the columns of $t$ (which the prover can evaluate in the protocol).
-
-We take the second approach.
-However, this must be done carefully, since the combined $t$ entries can repeat. For some $i, l$ such that $i \neq l$, we might have
-
-$$
-t_0[i] + j \cdot t_1[i] + j^2 \cdot t_2[i] = t_0[l] + j \cdot t_1[l] + j^2 \cdot t_2[l]
-$$
-
-For example, if $f = \{1, 2, 2, 3\}$ and $t = \{2, 1, 2, 3\}$, then $\text{sorted}(f, t) = \{2, 2, 2, 1, 1, 2, 3, 3\}$ would be a way of correctly sorting the combined vector $s$. At the same time, $\text{sorted}(f, t) = \{ 2, 2, 2, 2, 1, 1, 3, 3 \}$ is incorrect, since it does not have a second block of $2$s, and thus such an $s$ is not sorted by $t$.
-
-## Recap
-
-So, to recap, to create the sorted polynomials $h_i$, the prover:
-
-1. creates a large query vector containing the concatenation of the $4$ per-row queries (combined with the joint combiner, and possibly containing dummy queries) for all rows
-2. creates the table vector (also combined with the joint combiner)
-3. sorts all of that into a big vector $s$
-4. divides that vector $s$ into as many $h_i$ vectors as necessary, following the snake method
-5. interpolates these $h_i$ vectors into $h_i$ polynomials
-6. commits to them, and evaluates them as part of the protocol.
diff --git a/book/src/rfcs/keccak.md b/book/src/rfcs/keccak.md
index 457c247b97..03174df193 100644
--- a/book/src/rfcs/keccak.md
+++ b/book/src/rfcs/keccak.md
@@ -1,166 +1 @@
-# RFC: Keccak
-
-The Keccak gadget is composed of 3 circuit gates (`Xor16`, `Rot64`, and `Zero`).
-
-Keccak works with 64-bit words. The state is represented using a $5\times 5$ matrix
-of 64-bit words. Each compression step of Keccak consists of 24 rounds. Let us
-denote the state matrix with $A$ (indexing elements as $A[x,y]$), from which we derive
-further states in each round.
-Each round then consists of the following 5 steps:
-
-$$
-\begin{align}
-C[x] &= A[x,0] \oplus A[x,1] \oplus A[x,2] \oplus A[x,3] \oplus A[x,4] \\
-D[x] &= C[x-1] \oplus ROT(C[x+1],1) \\
-E[x,y] &= A[x,y] \oplus D[x] \\
-B[y,2x+3y] &= ROT(E[x,y],\rho[x,y]) \\
-F[x,y] &= B[x,y] \oplus ((NOT\ B[x+1,y])\ AND\ B[x+2,y]) \\
-Fp[0,0] &= F[0,0] \oplus RC
-\end{align}
-$$
-
-for $0\leq x, y \leq 4$, where $\rho[x,y]$ is the rotation offset defined for Keccak.
-The values, extracted from the Keccak reference, are given in the table below:
-
-|       | x = 3 | x = 4 | x = 0 | x = 1 | x = 2 |
-| ----- | ----- | ----- | ----- | ----- | ----- |
-| y = 2 | 153   | 231   | 3     | 10    | 171   |
-| y = 1 | 55    | 276   | 36    | 300   | 6     |
-| y = 0 | 28    | 91    | 0     | 1     | 190   |
-| y = 4 | 120   | 78    | 210   | 66    | 253   |
-| y = 3 | 21    | 136   | 105   | 45    | 15    |
-
-## Design Approach
-
-The atomic operations are XOR, ROT, NOT, and AND. In the sections below, we describe
-the gates for these operations. Below are some common approaches followed in their design.
-
-To fit within 15 wires, we first decompose each word into its lower and upper 32-bit
-components. A gate for an atomic operation works with those 32-bit components at a time.
-
-Before we describe the specific gate design approaches, below are some constraints in the
-Kimchi framework that dictated those approaches:
-* only $4$ lookups per row
-* only the first $7$ columns are available to the permutation polynomial
-
-## Rot64
-
-It is clear from the definition of the rotation gate that its constraints are complete
-(meaning that honest instances always satisfy the constraints). It remains to be proven
-that the proposal is sound. In this section, we give a proof that, as soon as we
-perform the range checks on the excess and shifted parts of the input, only one possible
-assignment satisfies the constraints. This means that there is no dishonest instance that
-can make the constraints pass. We also give an example where one could find wrong
-rotation witnesses that would satisfy the constraints if we did not check the range.
-
-### Necessity of range checks
-
-First of all, we illustrate the necessity of range checks with a simple example.
-For the sake of readability, we use toy field lengths. In particular, let us
-assume that our words have $4$ bits, meaning all of the elements between `0x0` and `0xF`.
-Next, we will be using the native field $\mathbb{F}_{32}$.
-
-As we will later see, this choice of field lengths is not enough to perform any $4$-bit
-rotation, since the operations in the constraints would overflow the native field.
-Nonetheless, it will be sufficient for our example, where we will only rotate by $1$ bit.
-
-Assume we want to rotate the word `0b1101` (meaning $13$) by $1$ bit to the left. This gives
-us the rotated word `0b1011` (meaning $11$). The excess part of the word is `0b1`, whereas
-the shifted part corresponds to `0b1010`. We recall the constraints for the rotation gate:
-
-$$
-\begin{align*}
-word \cdot 2^{rot} &= excess \cdot 2^{len} + shifted \\
-rotated &= excess + shifted
-\end{align*}
-$$
-
-Applied to our example, this results in the following equations:
-
-$$
-\begin{align*}
-13 \cdot 2 &= excess \cdot 16 + shifted \\
-11 &= excess + shifted
-\end{align*}
-$$
-
-We can easily check that the proposed shifted value `0b1010 = 10` and excess value `0b1 = 1` satisfy the above constraints, because $26 = 1 \cdot 16 + 10$ and $11 = 1 + 10$.
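-
-This toy setting is small enough to check exhaustively. As an aside (our own illustration, not part of the original RFC), the standalone Rust sketch below enumerates every pair satisfying the decomposition constraint modulo $32$; as the next paragraphs explain, the honest pair $(1, 10)$ is not the only one.
-
-```rust
-fn main() {
-    // Toy parameters from the example: 4-bit words over integers mod 32,
-    // rotating the word 13 (0b1101) left by 1 bit, so word * 2^rot = 26.
-    let (modulus, two_len, lhs) = (32u64, 16u64, 26u64);
-    for excess in 0..modulus {
-        for shifted in 0..modulus {
-            if (excess * two_len + shifted) % modulus == lhs {
-                // Every such pair passes the decomposition constraint and
-                // yields the claimed rotation excess + shifted (mod 32).
-                let rotated = (excess + shifted) % modulus;
-                println!("excess = {excess:2}, shifted = {shifted:2}, rotated = {rotated:2}");
-            }
-        }
-    }
-}
-```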
-
-Now, the question is: _can we find other values for excess and shifted, such that their addition results in an incorrect rotated word?_
-
-The answer to this question is yes: the decomposition constraint is a __Diophantine equation__ with many solutions modulo the field size. We basically want to find $x,y$ such that $26 = x \cdot 16 + y \ (\text{mod } 32)$. The solutions to this equation are:
-
-$$
-\begin{align*}
-\forall k \in [0..31]: & \\
-x &= k \\
-y &= 26 - 16 \cdot k
-\end{align*}
-$$
-
-We chose these word and field lengths to better understand the behaviour of the solutions. Here, we can see two "classes" of evaluations.
-
-- If we choose an even $k$, then $y$ will have the following shape:
-
-  $$26 - 16 \cdot (2 \cdot n) = 26 - 32n \equiv_{32} 26 $$
-
-  Meaning, if $x = 2n$ then $y = 26$.
-
-- If, on the other hand, we choose an odd $k$, then $y$ will have the following shape instead:
-
-  $$26 - 16 \cdot (2 \cdot n + 1) = 26 - 32n - 16 \equiv_{32} 26 - 16 = 10$$
-
-  Meaning, if $x = 2n+1$ then $y = 10$.
-
-Thus, possible solutions to the Diophantine equation are:
-
-$$
-\begin{align*}
-x &= 0, 1, 2, 3, 4, 5... \\
-y &= 26, 10, 26, 10, 26, 10...
-\end{align*}
-$$
-
-Note that our valid witness is part of that set of solutions, namely $x=1$ and $y=10$. Of course, we can also find another dishonest instantiation such as $x=0$ and $y=26$. Perhaps one could think that we do not need to worry about this case, because the resulting rotated word would be $0+26=26$, and if we later use that result as an input to a subsequent gate such as XOR, the value $26$ would not fit and at some point the constraint system would complain. Nonetheless, we still have other solutions to worry about, such as $(x=3, y=10)$ or $(x=5, y=10)$, since they would result in a rotated word that fits in the word length of $4$ bits, yet is incorrect (not equal to $11$).
-
-All of the above incorrect solutions differ in one thing: they have different bit lengths. This means that we need to range check the values for the excess and shifted witnesses to make sure they have the correct lengths.
-
-### Sufficiency of range checks
-
-In the following, we give a proof that performing range checks for these values is not only necessary but also sufficient to prove that the rotation gate is sound. In other words, we prove that there are no two possible solutions of the decomposition constraint that have the correct bit lengths. For the sake of robustness, we consider $64$-bit words and fields with at least twice the bit length of the words (as is our case).
-
-We will proceed by __contradiction__. Suppose there are two different solutions to the following Diophantine equation:
-
-$$
-\begin{align*}
-\forall k \in \mathbb{F}_n: & \\
-x &= k \\
-y &= w \cdot 2^r - 2^{64} \cdot k
-\end{align*}
-$$
-
-where $k$ is a parameter to instantiate the solutions, $w$ is the word to be rotated, $r$ is the rotation amount, and $n$ is the field size.
-
-Then, that means that there are two different solutions, $(0\leq x=a<2^r, 0\leq y=b<2^{64})$ and $(0\leq x=a'<2^r, 0\leq y=b'<2^{64})$, with at least $a \neq a'$ or $b \neq b'$. We will show that this is impossible.
-
-If both are solutions to the same equation, then:
-
-$$
-\begin{align*}
-w \cdot 2^r &= a \cdot 2^{64} + b \\
-w \cdot 2^r &= a'\cdot 2^{64} + b'
-\end{align*}
-$$
-
-means that $a \cdot 2^{64} + b \equiv a'\cdot 2^{64} + b' \pmod{n}$. Moving terms to the left side, we obtain the equivalent equality $2^{64}(a-a') + (b-b') \equiv 0 \pmod{n}$.
There are three cases to consider:
-
-- $a = a'$ and $b \neq b'$: then $(b - b') \equiv_n 0$, which can only happen if $b' = b + kn$ for some nonzero integer $k$. But since $n > 2^{64}$, $b'$ would then fall outside the range $[0, 2^{64})$ it was assumed to be in. CONTRADICTION.
-
-- $b = b'$ and $a \neq a'$: then $2^{64}(a - a') \equiv_n 0$, and since $2^{64}$ is invertible modulo $n$, this can only happen if $a' = a + kn$ for some nonzero integer $k$. But since $n > 2^r$, $a'$ would then fall outside the range $[0, 2^r)$ it was assumed to be in. CONTRADICTION.
-
-- $a\neq a'$ and $b \neq b'$: then we have $2^{64} \alpha + \beta \equiv_n 0$, where $\alpha := a - a'$ and $\beta := b - b'$.
-
-  According to the assumption, both $0\leq a<2^r$ and $0\leq a'<2^r$, so the difference $\alpha$ lies in the interval $1 - 2^r \leq \alpha \leq 2^r - 1$; likewise, $\beta$ lies in the interval $1 - 2^{64} \leq \beta \leq 2^{64} - 1$.
-
-  Bounding the left-hand side, $|2^{64} \alpha + \beta| \leq 2^{64}(2^r - 1) + 2^{64} - 1 = 2^{64+r} - 1$. Because the field is at least twice as long as the words, $n > 2^{64+r} - 1$ for any $0 < r < 64$, so the only multiple of $n$ the left-hand side can equal is $0$. The congruence is thus an equality over the integers: $2^{64}\alpha + \beta = 0$.
-
-  But then $|\beta| = 2^{64} \cdot |\alpha| \geq 2^{64}$, since $\alpha \neq 0$, which contradicts $|\beta| \leq 2^{64} - 1$. CONTRADICTION.
-
-- EOP.
+# RFC 7: Keccak
diff --git a/book/src/specs/kimchi.md b/book/src/specs/kimchi.md
index 13a51330d9..d84af63172 100644
--- a/book/src/specs/kimchi.md
+++ b/book/src/specs/kimchi.md
@@ -1641,8 +1641,8 @@ If lookup is used, the following values are added to the common index:
 To create the index, follow these steps:
 1. If no lookup is used in the circuit, do not create a lookup index
-2. Get the lookup selectors and lookup tables (TODO: how?)
-3. Concatenate runtime lookup tables with the ones used by gates
+2. Get the lookup selectors and lookup tables that are specified implicitly by the lookup gates.
+3. Concatenate explicit fixed lookup tables with the ones (implicitly) used by gates.
 4. Get the highest number of columns `max_table_width` that a lookup table can have.
 5. Create the concatenated table of all the fixed lookup tables.
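As a reader's aid (our own sketch, not code from the repository), steps 4 and 5 above amount to a column-wise concatenation of all fixed tables, padded to the widest table, together with a parallel table-id column. Assuming a simplified table representation, the logic looks roughly like this:

```rust
/// Simplified sketch of building the combined fixed lookup table:
/// every table contributes `height` rows, narrower tables are padded
/// with zeros, and a parallel vector records each row's table id.
/// Assumes every table is nonempty and rectangular.
fn concatenate_tables(tables: &[(i32, Vec<Vec<u64>>)]) -> (Vec<Vec<u64>>, Vec<i32>) {
    let max_table_width = tables.iter().map(|(_, data)| data.len()).max().unwrap_or(0);
    let mut combined: Vec<Vec<u64>> = vec![Vec::new(); max_table_width];
    let mut table_ids: Vec<i32> = Vec::new();
    for (id, data) in tables {
        let height = data[0].len();
        for (i, column) in combined.iter_mut().enumerate() {
            match data.get(i) {
                Some(col) => column.extend_from_slice(col),
                // Pad the columns this table does not have.
                None => column.extend(std::iter::repeat(0).take(height)),
            }
        }
        table_ids.extend(std::iter::repeat(*id).take(height));
    }
    (combined, table_ids)
}
```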
diff --git a/kimchi/src/circuits/constraints.rs b/kimchi/src/circuits/constraints.rs
index 115a696965..31fdf5c4e0 100644
--- a/kimchi/src/circuits/constraints.rs
+++ b/kimchi/src/circuits/constraints.rs
@@ -5,7 +5,11 @@ use crate::{
     domain_constant_evaluation::DomainConstantEvaluations,
     domains::EvaluationDomains,
     gate::{CircuitGate, GateType},
-    lookup::{index::LookupConstraintSystem, lookups::LookupFeatures, tables::LookupTable},
+    lookup::{
+        index::LookupConstraintSystem,
+        lookups::LookupFeatures,
+        tables::{GateLookupTables, LookupTable},
+    },
     polynomial::{WitnessEvals, WitnessOverDomains, WitnessShifts},
     polynomials::permutation::Shifts,
     wires::*,
@@ -32,6 +36,11 @@ use std::sync::Arc;
 //
 /// Flags for optional features in the constraint system
+#[cfg_attr(
+    feature = "ocaml_types",
+    derive(ocaml::IntoValue, ocaml::FromValue, ocaml_gen::Struct)
+)]
+#[cfg_attr(feature = "wasm_types", wasm_bindgen::prelude::wasm_bindgen)]
 #[derive(Copy, Clone, Serialize, Deserialize, Debug)]
 pub struct FeatureFlags {
     /// RangeCheck0 gate
@@ -614,6 +623,47 @@ pub fn zk_rows_strict_lower_bound(num_chunks: usize) -> usize {
     (2 * (PERMUTS + 1) * num_chunks - 2) / PERMUTS
 }
 
+impl FeatureFlags {
+    pub fn from_gates_and_lookup_features<F: PrimeField>(
+        gates: &[CircuitGate<F>],
+        lookup_features: LookupFeatures,
+    ) -> FeatureFlags {
+        let mut feature_flags = FeatureFlags {
+            range_check0: false,
+            range_check1: false,
+            lookup_features,
+            foreign_field_add: false,
+            foreign_field_mul: false,
+            xor: false,
+            rot: false,
+        };
+
+        for gate in gates {
+            match gate.typ {
+                GateType::RangeCheck0 => feature_flags.range_check0 = true,
+                GateType::RangeCheck1 => feature_flags.range_check1 = true,
+                GateType::ForeignFieldAdd => feature_flags.foreign_field_add = true,
+                GateType::ForeignFieldMul => feature_flags.foreign_field_mul = true,
+                GateType::Xor16 => feature_flags.xor = true,
+                GateType::Rot64 => feature_flags.rot = true,
+                _ => (),
+            }
+        }
+
+        feature_flags
+    }
+
+    pub fn from_gates<F: PrimeField>(
+        gates: &[CircuitGate<F>],
+        uses_runtime_tables: bool,
+    ) -> FeatureFlags {
+        FeatureFlags::from_gates_and_lookup_features(
+            gates,
+            LookupFeatures::from_gates(gates, uses_runtime_tables),
+        )
+    }
+}
+
 impl<F: PrimeField> Builder<F> {
     /// Set up the number of public inputs.
     /// If not invoked, it equals `0` by default.
@@ -682,22 +732,11 @@ impl<F: PrimeField> Builder<F> {
         // for some reason we need more than 1 gate for the circuit to work, see TODO below
         assert!(gates.len() > 1);
 
-        let lookup_features = LookupFeatures::from_gates(&gates, runtime_tables.is_some());
+        let feature_flags = FeatureFlags::from_gates(&gates, runtime_tables.is_some());
 
         let lookup_domain_size = {
             // First we sum over the lookup table sizes
-            let mut lookup_domain_size: usize = lookup_tables
-                .iter()
-                .map(
-                    |LookupTable { data, id: _ }| {
-                        if data.is_empty() {
-                            0
-                        } else {
-                            data[0].len()
-                        }
-                    },
-                )
-                .sum();
+            let mut lookup_domain_size: usize = lookup_tables.iter().map(|lt| lt.len()).sum();
             // After that on the runtime tables
             if let Some(runtime_tables) = runtime_tables.as_ref() {
                 for runtime_table in runtime_tables.iter() {
@@ -705,12 +744,19 @@ impl<F: PrimeField> Builder<F> {
                 }
             }
             // And we add the built-in tables, depending on the features.
-            let LookupFeatures { patterns, .. } = &lookup_features;
+            let LookupFeatures { patterns, .. } = &feature_flags.lookup_features;
+            let mut gate_lookup_tables = GateLookupTables {
+                xor: false,
+                range_check: false,
+            };
             for pattern in patterns.into_iter() {
                 if let Some(gate_table) = pattern.table() {
-                    lookup_domain_size += gate_table.table_size();
+                    gate_lookup_tables[gate_table] = true
                 }
             }
+            for gate_table in gate_lookup_tables.into_iter() {
+                lookup_domain_size += gate_table.table_size();
+            }
             lookup_domain_size
         };
@@ -787,28 +833,6 @@ impl<F: PrimeField> Builder<F> {
             .collect();
         gates.append(&mut padding);
 
-        let mut feature_flags = FeatureFlags {
-            range_check0: false,
-            range_check1: false,
-            lookup_features,
-            foreign_field_add: false,
-            foreign_field_mul: false,
-            xor: false,
-            rot: false,
-        };
-
-        for gate in &gates {
-            match gate.typ {
-                GateType::RangeCheck0 => feature_flags.range_check0 = true,
-                GateType::RangeCheck1 => feature_flags.range_check1 = true,
-                GateType::ForeignFieldAdd => feature_flags.foreign_field_add = true,
-                GateType::ForeignFieldMul => feature_flags.foreign_field_mul = true,
-                GateType::Xor16 => feature_flags.xor = true,
-                GateType::Rot64 => feature_flags.rot = true,
-                _ => (),
-            }
-        }
-
         //~ 1. sample the `PERMUTS` shifts.
         let shifts = Shifts::new(&domain.d1);
 
@@ -822,7 +846,7 @@ impl<F: PrimeField> Builder<F> {
             &domain,
             zk_rows as usize,
         )
-        .map_err(|e| SetupError::ConstraintSystem(e.to_string()))?;
+        .map_err(SetupError::LookupCreation)?;
 
         let sid = shifts.map[0].clone();
 
diff --git a/kimchi/src/circuits/lookup/index.rs b/kimchi/src/circuits/lookup/index.rs
index ab3e4545ab..41a9709737 100644
--- a/kimchi/src/circuits/lookup/index.rs
+++ b/kimchi/src/circuits/lookup/index.rs
@@ -1,10 +1,10 @@
-use super::runtime_tables::{RuntimeTableCfg, RuntimeTableSpec};
 use crate::circuits::{
     domains::EvaluationDomains,
     gate::CircuitGate,
     lookup::{
         constraints::LookupConfiguration,
         lookups::{LookupInfo, LookupPattern},
+        runtime_tables::{RuntimeTableCfg, RuntimeTableSpec},
         tables::LookupTable,
     },
 };
@@ -21,17 +21,15 @@ use std::iter;
 use thiserror::Error;
 
 /// Represents an error found when computing the lookup constraint system
-#[derive(Debug, Error)]
+#[derive(Debug, Error, Clone)]
 pub enum LookupError {
-    #[error("One of the lookup tables has columns of different lengths")]
-    InconsistentTableLength,
     #[error("The combined lookup table is larger than allowed by the domain size. Observed: {length}, expected: {maximum_allowed}")]
     LookupTableTooLong {
         length: usize,
         maximum_allowed: usize,
     },
-    #[error("The table with id 0 must have an entry of all zeros")]
-    TableIDZeroMustHaveZeroEntry,
+    #[error("Cannot create a combined table since ids for sub-tables are colliding. The collision type is: {collision_type}")]
+    LookupTableIdCollision { collision_type: String },
 }
 
 /// Lookup selectors
@@ -200,7 +198,7 @@ impl<F: PrimeField> LookupConstraintSystem<F> {
     /// Will give an error if input validation does not pass.
     pub fn create(
         gates: &[CircuitGate<F>],
-        lookup_tables: Vec<LookupTable<F>>,
+        fixed_lookup_tables: Vec<LookupTable<F>>,
         runtime_tables: Option<Vec<RuntimeTableCfg<F>>>,
         domain: &EvaluationDomains<F>,
         zk_rows: usize,
        // product is 1, we cannot use those rows to store any values.
        let max_num_entries = d1_size - zk_rows - 1;

-        //~ 2. Get the lookup selectors and lookup tables (TODO: how?)
+        //~ 2. Get the lookup selectors and lookup tables that are specified implicitly
+        // by the lookup gates.
         let (lookup_selectors, gate_lookup_tables) =
             lookup_info.selector_polynomials_and_tables(domain, gates);
 
-        //~ 3. Concatenate runtime lookup tables with the ones used by gates
-        let mut lookup_tables: Vec<_> = gate_lookup_tables
+        // Checks whether an iterator contains any duplicates, and if yes, raises
+        // a corresponding LookupTableIdCollision error.
+        fn check_id_duplicates<'a, I: Iterator<Item = &'a i32>>(
+            iter: I,
+            msg: &str,
+        ) -> Result<(), LookupError> {
+            use itertools::Itertools;
+            match iter.duplicates().collect::<Vec<_>>() {
+                dups if !dups.is_empty() => Err(LookupError::LookupTableIdCollision {
+                    collision_type: format!("{}: {:?}", msg, dups),
+                }),
+                _ => Ok(()),
+            }
+        }
+
+        // If there is a gate using a lookup table, this table must not be added
+        // explicitly to the constraint system.
+        let fixed_gate_joint_ids: Vec<i32> = fixed_lookup_tables
+            .iter()
+            .map(|lt| lt.id())
+            .chain(gate_lookup_tables.iter().map(|lt| lt.id()))
+            .collect();
+        check_id_duplicates(
+            fixed_gate_joint_ids.iter(),
+            "duplicates between fixed given and fixed from-gate tables",
+        )?;
+
+        //~ 3. Concatenate explicit fixed lookup tables with the ones (implicitly) used by gates.
+        let mut lookup_tables: Vec<LookupTable<F>> = fixed_lookup_tables
             .into_iter()
-            .chain(lookup_tables)
+            .chain(gate_lookup_tables)
             .collect();
 
-        let mut has_table_id_0 = false;
+        let fixed_lookup_tables_ids: Vec<i32> =
+            lookup_tables.iter().map(|lt| lt.id()).collect();
+        check_id_duplicates(
+            fixed_lookup_tables_ids.iter(),
+            "fixed lookup table duplicates",
+        )?;
 
         // if we are using runtime tables
         let (runtime_table_offset, runtime_selector) =
             if let Some(runtime_tables) = &runtime_tables {
+                let runtime_tables_ids: Vec<i32> =
+                    runtime_tables.iter().map(|rt| rt.id).collect();
+                check_id_duplicates(runtime_tables_ids.iter(), "runtime table duplicates")?;
+                check_id_duplicates(
+                    runtime_tables_ids
+                        .iter()
+                        .chain(fixed_lookup_tables_ids.iter()),
+                    "duplicates between runtime and fixed tables",
+                )?;
+
                 // save the offset of the end of the table
                 let mut runtime_table_offset = 0;
                 for table in &lookup_tables {
@@ -272,18 +313,15 @@ impl<F: PrimeField> LookupConstraintSystem<F> {
                     let (id, first_column) =
                         (runtime_table.id, runtime_table.first_column.clone());
 
-                    // record if table ID 0 is used in one of the runtime tables
-                    // note: the check later will still force you to have a fixed table with ID 0
-                    if id == 0 {
-                        has_table_id_0 = true;
-                    }
-
                     // important: we still need a placeholder column to make sure that
                     // if all other tables have a single column
                     // we don't use the second table as table ID column.
                     let placeholders = vec![F::zero(); first_column.len()];
                     let data = vec![first_column, placeholders];
-                    let table = LookupTable { id, data };
+                    // TODO: check that this actually cannot fail. Maybe it should throw a different error.
+                    let table = LookupTable::create(id, data)
+                        .expect("Runtime table creation must succeed");
+
                     lookup_tables.push(table);
                 }
 
@@ -345,31 +383,21 @@ impl<F: PrimeField> LookupConstraintSystem<F> {
         let mut table_ids: Vec<F> = Vec::with_capacity(d1_size);
 
         let mut non_zero_table_id = false;
-        let mut has_table_id_0_with_zero_entry = false;
 
         for table in &lookup_tables {
             let table_len = table.len();
 
-            if table.id == 0 {
-                has_table_id_0 = true;
-                if table.has_zero_entry() {
-                    has_table_id_0_with_zero_entry = true;
-                }
-            } else {
+            if table.id() != 0 {
                 non_zero_table_id = true;
             }
 
             //~~ * Update the corresponding entries in a table id vector (of size the domain as well)
             //~ with the table ID of the table.
-            let table_id: F = i32_to_field(table.id);
+            let table_id: F = i32_to_field(table.id());
             table_ids.extend(repeat_n(table_id, table_len));
 
             //~~ * Copy the entries from the table to new rows in the corresponding columns of the concatenated table.
-            for (i, col) in table.data.iter().enumerate() {
-                // See GH issue: https://github.com/MinaProtocol/mina/issues/14097
-                if col.len() != table_len {
-                    return Err(LookupError::InconsistentTableLength);
-                }
+            for (i, col) in table.data().iter().enumerate() {
                 lookup_table[i].extend(col);
             }
 
@@ -379,12 +407,6 @@
             }
         }
 
-        // If a table has ID 0, then it must have a zero entry.
-        // This is for the dummy lookups to work.
-        if has_table_id_0 && !has_table_id_0_with_zero_entry {
-            return Err(LookupError::TableIDZeroMustHaveZeroEntry);
-        }
-
         // Note: we use `>=` here to leave space for the dummy value.
         if lookup_table[0].len() >= max_num_entries {
             return Err(LookupError::LookupTableTooLong {
@@ -411,6 +433,8 @@
             lookup_table8.push(eval);
         }
 
+        // @volhovm: Do we need to enforce that there is at least one table
+        // with id 0?
         //~ 9. pre-compute polynomial and evaluation form for the table IDs,
         //~ only if a table with an ID different from zero was used.
         let (table_ids, table_ids8) = if non_zero_table_id {
@@ -443,3 +467,105 @@
         }
     }
 }
+
+#[cfg(test)]
+mod tests {
+
+    use super::{LookupError, LookupTable, RuntimeTableCfg};
+    use crate::{
+        circuits::constraints::ConstraintSystem, circuits::gate::CircuitGate,
+        circuits::lookup::tables::xor, circuits::polynomials::range_check, error::SetupError,
+    };
+    use mina_curves::pasta::Fp;
+
+    #[test]
+    fn colliding_table_ids() {
+        let (_, gates) = CircuitGate::<Fp>::create_multi_range_check(0);
+        let collision_id: i32 = 5;
+
+        let cs = ConstraintSystem::<Fp>::create(gates.clone())
+            .lookup(vec![range_check::gadget::lookup_table()])
+            .build();
+
+        assert!(
+            matches!(
+                cs,
+                Err(SetupError::LookupCreation(
+                    LookupError::LookupTableIdCollision { .. }
+                ))
+            ),
+            "LookupConstraintSystem::create(...) must fail due to range table passed twice"
+        );
+
+        let cs = ConstraintSystem::<Fp>::create(gates.clone())
+            .lookup(vec![xor::xor_table()])
+            .build();
+
+        assert!(
+            cs.is_ok(),
+            "LookupConstraintSystem::create(...) must succeed, no duplicates exist"
+        );
+
+        let cs = ConstraintSystem::<Fp>::create(gates.clone())
+            .lookup(vec![
+                LookupTable::create(collision_id, vec![vec![From::from(0); 16]]).unwrap(),
+                LookupTable::create(collision_id, vec![vec![From::from(1); 16]]).unwrap(),
+            ])
+            .build();
+
+        assert!(
+            matches!(
+                cs,
+                Err(SetupError::LookupCreation(
+                    LookupError::LookupTableIdCollision { .. }
+                ))
+            ),
+            "LookupConstraintSystem::create(...) must fail, collision in fixed ids"
+        );
+
+        let cs = ConstraintSystem::<Fp>::create(gates.clone())
+            .runtime(Some(vec![
+                RuntimeTableCfg {
+                    id: collision_id,
+                    first_column: vec![From::from(0); 16],
+                },
+                RuntimeTableCfg {
+                    id: collision_id,
+                    first_column: vec![From::from(1); 16],
+                },
+            ]))
+            .build();
+
+        assert!(
+            matches!(
+                cs,
+                Err(SetupError::LookupCreation(
+                    LookupError::LookupTableIdCollision { .. }
+                ))
+            ),
+            "LookupConstraintSystem::create(...) must fail, collision in runtime ids"
+        );
+
+        let cs = ConstraintSystem::<Fp>::create(gates.clone())
+            .lookup(vec![LookupTable::create(
+                collision_id,
+                vec![vec![From::from(0); 16]],
+            )
+            .unwrap()])
+            .runtime(Some(vec![RuntimeTableCfg {
+                id: collision_id,
+                first_column: vec![From::from(1); 16],
+            }]))
+            .build();
+
+        assert!(
+            matches!(
+                cs,
+                Err(SetupError::LookupCreation(
+                    LookupError::LookupTableIdCollision { .. }
+                ))
+            ),
+            "LookupConstraintSystem::create(...) must fail, collision between runtime and lookup ids"
+        );
+    }
+}
diff --git a/kimchi/src/circuits/lookup/lookups.rs b/kimchi/src/circuits/lookup/lookups.rs
index b126e1be00..12e4bd058d 100644
--- a/kimchi/src/circuits/lookup/lookups.rs
+++ b/kimchi/src/circuits/lookup/lookups.rs
@@ -222,7 +222,7 @@ impl LookupInfo {
         };
 
         // TODO: is take(n) useful here? I don't see why we need this
-        for (i, gate) in gates.iter().enumerate().take(n) {
+        for (i, gate) in gates.iter().take(n).enumerate() {
             let typ = gate.typ;
 
             if let Some(lookup_pattern) = LookupPattern::from_gate(typ, CurrOrNext::Curr) {
diff --git a/kimchi/src/circuits/lookup/runtime_tables.rs b/kimchi/src/circuits/lookup/runtime_tables.rs
index 98b018ed8a..d05a0a5a72 100644
--- a/kimchi/src/circuits/lookup/runtime_tables.rs
+++ b/kimchi/src/circuits/lookup/runtime_tables.rs
@@ -18,6 +18,8 @@ pub struct RuntimeTableSpec {
     pub len: usize,
 }
 
+/// A configuration of the runtime table as known at setup time.
+///
 /// Use this type at setup time, to list all the runtime tables.
 ///
 /// Note: care must be taken as table IDs can collide with IDs of other types of lookup tables.
@@ -25,7 +27,8 @@ pub struct RuntimeTableSpec {
 pub struct RuntimeTableCfg<F> {
     /// The table ID.
     pub id: i32,
-    /// The content of the first column of the runtime table.
+    /// The content of the first column of the runtime table, i.e.
+    /// the keys when the table is viewed as a (key, value) array.
     pub first_column: Vec<F>,
 }
 
@@ -56,12 +59,13 @@ impl<F> From<RuntimeTableCfg<F>> for RuntimeTableSpec {
 }
 
 /// A runtime table. Runtime tables must match the configuration
-/// that was specified in [`RuntimeTableCfg`].
+/// specified in [`RuntimeTableCfg`].
 #[derive(Debug, Clone)]
 pub struct RuntimeTable<F> {
     /// The table id.
     pub id: i32,
-    /// A single column.
+    /// A single column. Represents the runtime table values, where
+    /// [`RuntimeTableCfg`] defines the compile-time keys.
     pub data: Vec<F>,
 }
 
diff --git a/kimchi/src/circuits/lookup/tables/mod.rs b/kimchi/src/circuits/lookup/tables/mod.rs
index d6be11f9bd..7760086512 100644
--- a/kimchi/src/circuits/lookup/tables/mod.rs
+++ b/kimchi/src/circuits/lookup/tables/mod.rs
@@ -1,45 +1,71 @@
-use ark_ff::{FftField, One, Zero};
+use ark_ff::{Field, One, Zero};
 use poly_commitment::PolyComm;
 use serde::{Deserialize, Serialize};
+use thiserror::Error;
 
 pub mod range_check;
 pub mod xor;
 
-//~ spec:startcode
-/// The table ID associated with the XOR lookup table.
-pub const XOR_TABLE_ID: i32 = 0;
-
-/// The range check table ID.
-pub const RANGE_CHECK_TABLE_ID: i32 = 1;
-//~ spec:endcode
-
-/// Enumerates the different 'fixed' lookup tables used by individual gates
-#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
-pub enum GateLookupTable {
-    Xor,
-    RangeCheck,
-}
-
 /// A table of values that can be used for a lookup, along with the ID for the table.
 #[derive(Debug, Clone)]
 pub struct LookupTable<F> {
-    pub id: i32,
-    pub data: Vec<Vec<F>>,
+    id: i32,
+    data: Vec<Vec<F>>,
+}
+
+/// Represents inconsistency errors during table construction and composition.
+#[derive(Debug, Error)]
+pub enum LookupTableError {
+    #[error("Table must be nonempty")]
+    InputTableDataEmpty,
+    #[error("One of the lookup tables has columns of different lengths")]
+    InconsistentTableLength,
+    #[error("The table with id 0 must have an entry of all zeros")]
+    TableIDZeroMustHaveZeroEntry,
 }
 
 impl<F> LookupTable<F>
 where
-    F: FftField,
+    F: Field,
 {
-    /// Return true if the table has an entry containing all zeros.
-    pub fn has_zero_entry(&self) -> bool {
+    pub fn create(id: i32, data: Vec<Vec<F>>) -> Result<Self, LookupTableError> {
+        let res = LookupTable { id, data };
+
+        // Empty tables are not allowed
+        if res.data.is_empty() {
+            return Err(LookupTableError::InputTableDataEmpty);
+        }
+
+        // All columns in the table must have the same length
+        let table_len = res.len();
+        for col in res.data.iter() {
+            if col.len() != table_len {
+                return Err(LookupTableError::InconsistentTableLength);
+            }
+        }
+
+        // If a table has ID 0, then it must have a zero entry.
+        // This is for the dummy lookups to work.
+        if id == 0 && !res.has_zero_entry() {
+            return Err(LookupTableError::TableIDZeroMustHaveZeroEntry);
+        }
+
+        Ok(res)
+    }
+
+    /// Return true if the table has an entry (row) containing all zeros.
+    fn has_zero_entry(&self) -> bool {
         // reminder: a table is written as a list of columns,
         // not as a list of row entries.
         for row in 0..self.len() {
+            let mut row_zero = true;
             for col in &self.data {
                 if !col[row].is_zero() {
-                    continue;
+                    row_zero = false;
+                    break;
                 }
+            }
+            if row_zero {
                 return true;
             }
         }
@@ -54,8 +80,11 @@ where
         self.data.len()
     }
 
-    /// Returns the length of the table.
+    /// Returns the length (height) of the table.
     pub fn len(&self) -> usize {
+        if self.is_empty() {
+            panic!("LookupTable#len() is called on an empty table")
+        }
         self.data[0].len()
     }
 
     pub fn is_empty(&self) -> bool {
         self.data.is_empty()
     }
+
+    /// Returns the table id.
+    pub fn id(&self) -> i32 {
+        self.id
+    }
+
+    /// Returns the table data.
+    pub fn data(&self) -> &Vec<Vec<F>> {
+        &self.data
+    }
+}
+
+//~ spec:startcode
+/// The table ID associated with the XOR lookup table.
+pub const XOR_TABLE_ID: i32 = 0;
+
+/// The range check table ID.
+pub const RANGE_CHECK_TABLE_ID: i32 = 1;
+//~ spec:endcode
+
+/// Enumerates the different 'fixed' lookup tables used by individual gates
+#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
+pub enum GateLookupTable {
+    Xor,
+    RangeCheck,
+}
+
+/// Flags indicating which of the 'fixed' lookup tables are used by the gates
+#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
+pub struct GateLookupTables {
+    pub xor: bool,
+    pub range_check: bool,
+}
+
+impl std::ops::Index<GateLookupTable> for GateLookupTables {
+    type Output = bool;
+
+    fn index(&self, index: GateLookupTable) -> &Self::Output {
+        match index {
+            GateLookupTable::Xor => &self.xor,
+            GateLookupTable::RangeCheck => &self.range_check,
+        }
+    }
+}
+
+impl std::ops::IndexMut<GateLookupTable> for GateLookupTables {
+    fn index_mut(&mut self, index: GateLookupTable) -> &mut Self::Output {
+        match index {
+            GateLookupTable::Xor => &mut self.xor,
+            GateLookupTable::RangeCheck => &mut self.range_check,
+        }
+    }
+}
+
+impl IntoIterator for GateLookupTables {
+    type Item = GateLookupTable;
+    type IntoIter = std::vec::IntoIter<GateLookupTable>;
+
+    fn into_iter(self) -> Self::IntoIter {
+        // Destructuring pattern, to make sure we don't forget newly added lookup patterns.
+        let GateLookupTables { xor, range_check } = self;
+
+        let mut patterns = Vec::with_capacity(2);
+
+        if xor {
+            patterns.push(GateLookupTable::Xor)
+        }
+        if range_check {
+            patterns.push(GateLookupTable::RangeCheck)
+        }
+        patterns.into_iter()
+    }
 }
 
 /// Returns the lookup table associated to a [`GateLookupTable`].
-pub fn get_table<F: FftField>(table_name: GateLookupTable) -> LookupTable<F> {
+pub fn get_table<F: Field>(table_name: GateLookupTable) -> LookupTable<F> {
     match table_name {
         GateLookupTable::Xor => xor::xor_table(),
         GateLookupTable::RangeCheck => range_check::range_check_table(),
     }
 }
@@ -108,6 +209,7 @@ where
     F: Zero + One + Clone,
     I: DoubleEndedIterator<Item = &'a F>,
 {
+    // TODO: unnecessary cloning if binops between F and &F are supported
     v.rev()
         .fold(F::zero(), |acc, x| joint_combiner.clone() * acc + x.clone())
         + table_id_combiner.clone() * table_id.clone()
@@ -208,3 +310,50 @@ pub mod caml {
     }
 }
+
+#[cfg(test)]
+mod tests {
+
+    use super::*;
+    use mina_curves::pasta::Fp;
+
+    #[test]
+    fn test_zero_table_zero_row() {
+        let lookup_r: u64 = 32;
+        // A table column that does /not/ contain zeros
+        let lookup_table_values_1: Vec<_> = (1..lookup_r + 1).map(From::from).collect();
+        // Another table column that /does/ contain zeros.
+        let lookup_table_values_2: Vec<_> = (0..lookup_r).map(From::from).collect();
+
+        // Jointly, the two columns do not contain a fully zero row.
+        let table: Result<LookupTable<Fp>, _> =
+            LookupTable::create(0, vec![lookup_table_values_1, lookup_table_values_2]);
+
+        assert!(
+            matches!(table, Err(LookupTableError::TableIDZeroMustHaveZeroEntry)),
+            "LookupTable::create(...) must fail when the zero table has no zero rows"
+        );
+    }
+
+    #[test]
+    fn test_invalid_data_inputs() {
+        let table: Result<LookupTable<Fp>, _> = LookupTable::create(0, vec![]);
+        assert!(
+            matches!(table, Err(LookupTableError::InputTableDataEmpty)),
+            "LookupTable::create(...) must fail when empty table creation is attempted"
+        );
+
+        let lookup_r: u64 = 32;
+        // Two columns of different lengths
+        let lookup_table_values_1: Vec<_> = (0..2 * lookup_r).map(From::from).collect();
+        let lookup_table_values_2: Vec<_> = (0..lookup_r).map(From::from).collect();
+
+        let table: Result<LookupTable<Fp>, _> =
+            LookupTable::create(0, vec![lookup_table_values_1, lookup_table_values_2]);
+
+        assert!(
+            matches!(table, Err(LookupTableError::InconsistentTableLength)),
+            "LookupTable::create(...) must fail when columns are not of the same length"
+        );
+    }
+}
diff --git a/kimchi/src/circuits/lookup/tables/range_check.rs b/kimchi/src/circuits/lookup/tables/range_check.rs
index 001ffedd52..2bf00e64f7 100644
--- a/kimchi/src/circuits/lookup/tables/range_check.rs
+++ b/kimchi/src/circuits/lookup/tables/range_check.rs
@@ -14,11 +14,9 @@ pub fn range_check_table<F>() -> LookupTable<F>
 where
     F: Field,
 {
-    let table = vec![(0..RANGE_CHECK_UPPERBOUND).map(F::from).collect()];
-    LookupTable {
-        id: RANGE_CHECK_TABLE_ID,
-        data: table,
-    }
+    let data = vec![(0..RANGE_CHECK_UPPERBOUND).map(F::from).collect()];
+    LookupTable::create(RANGE_CHECK_TABLE_ID, data)
+        .expect("range_check_table creation must succeed")
 }
 
 pub const TABLE_SIZE: usize = RANGE_CHECK_UPPERBOUND as usize;
diff --git a/kimchi/src/circuits/lookup/tables/xor.rs b/kimchi/src/circuits/lookup/tables/xor.rs
index d846942a31..6f87377d96 100644
--- a/kimchi/src/circuits/lookup/tables/xor.rs
+++ b/kimchi/src/circuits/lookup/tables/xor.rs
@@ -37,10 +37,8 @@ pub fn xor_table<F: Field>() -> LookupTable<F> {
         // Just to be safe.
         assert!(r[r.len() - 1].is_zero());
     }
 
-    LookupTable {
-        id: XOR_TABLE_ID,
-        data,
-    }
+
+    LookupTable::create(XOR_TABLE_ID, data).expect("xor_table creation must succeed")
 }
 
 pub const TABLE_SIZE: usize = 256;
diff --git a/kimchi/src/error.rs b/kimchi/src/error.rs
index 8c801fd38f..8f2e8406a1 100644
--- a/kimchi/src/error.rs
+++ b/kimchi/src/error.rs
@@ -1,5 +1,6 @@
 //! This module implements the [`ProverError`] type.
 
+use crate::circuits::lookup::index::LookupError; // not sure about hierarchy
 use poly_commitment::error::CommitmentError;
 use thiserror::Error;
 
@@ -100,6 +101,9 @@ pub enum SetupError {
 
     #[error("the domain could not be constructed: {0}")]
     DomainCreation(DomainCreationError),
+
+    #[error("the lookup constraint system could not be constructed: {0}")]
+    LookupCreation(LookupError),
 }
 
 /// Errors that can arise when creating a verifier index
diff --git a/kimchi/src/precomputed_srs.rs b/kimchi/src/precomputed_srs.rs
index 116554e262..11f60edec3 100644
--- a/kimchi/src/precomputed_srs.rs
+++ b/kimchi/src/precomputed_srs.rs
@@ -33,6 +33,9 @@ where
     let file = File::open(srs_path.clone())
         .unwrap_or_else(|_| panic!("missing SRS file: {srs_path:?}"));
     let reader = BufReader::new(file);
+    // Note: in tests, this read takes a significant amount of time (~2 min) due
+    // to the -O0 optimisation level. Compile tests with -O1 or the --release flag.
+    // See: https://github.com/o1-labs/proof-systems/blob/develop/CONTRIBUTING.md#development
     rmp_serde::from_read(reader).unwrap()
 }
 
diff --git a/kimchi/src/prover_index.rs b/kimchi/src/prover_index.rs
index 523d583e18..b3fa552669 100644
--- a/kimchi/src/prover_index.rs
+++ b/kimchi/src/prover_index.rs
@@ -210,6 +210,7 @@ pub mod testing {
         override_srs_size,
         |d1: D<G::ScalarField>, size: usize| {
             let log2_size = size.ilog2();
+            // Run the test_srs_serialization test to generate the test SRS and enable this branch.
             let mut srs = if log2_size <= precomputed_srs::SERIALIZED_SRS_SIZE {
                 // TODO: we should trim it if it's smaller
                 precomputed_srs::get_srs()
diff --git a/kimchi/src/tests/and.rs b/kimchi/src/tests/and.rs
index e344af6da4..9c1a86fbcd 100644
--- a/kimchi/src/tests/and.rs
+++ b/kimchi/src/tests/and.rs
@@ -255,6 +255,7 @@ fn verify_bad_and_decomposition<G: KimchiCurve>(
         );
         witness[col][and_row] += G::ScalarField::one();
     }
+
     if col == 2 {
         assert_eq!(
             cs.gates[0].verify_witness::<G>(0, witness, &cs, &witness[0][0..cs.public]),
diff --git a/kimchi/src/tests/foreign_field_add.rs b/kimchi/src/tests/foreign_field_add.rs
index 24c1d5a8c1..760c7fa2d5 100644
--- a/kimchi/src/tests/foreign_field_add.rs
+++ b/kimchi/src/tests/foreign_field_add.rs
@@ -1490,7 +1490,6 @@ fn test_ffadd_finalization() {
     let index = {
         let cs = ConstraintSystem::create(gates.clone())
-            .lookup(vec![range_check::gadget::lookup_table()])
             .public(num_public_inputs)
             .build()
             .unwrap();
diff --git a/kimchi/src/tests/lookup.rs b/kimchi/src/tests/lookup.rs
index a674eba78f..269235d35e 100644
--- a/kimchi/src/tests/lookup.rs
+++ b/kimchi/src/tests/lookup.rs
@@ -22,10 +22,12 @@ type BaseSponge = DefaultFqSponge<VestaParameters, SpongeParams>;
 type ScalarSponge = DefaultFrSponge<Fp, SpongeParams>;
 
 fn setup_lookup_proof(use_values_from_table: bool, num_lookups: usize, table_sizes: Vec<usize>) {
-    let lookup_table_values: Vec<Vec<Fp>> = table_sizes
+    let mut lookup_table_values: Vec<Vec<Fp>> = table_sizes
         .iter()
         .map(|size| (0..*size).map(|_| rand::random()).collect())
         .collect();
+    // The zero table must have a zero row
+    lookup_table_values[0][0] = From::from(0);
     let lookup_tables = lookup_table_values
         .iter()
         .enumerate()
         .map(|(id, lookup_table_values)| {
             let index_column = (0..lookup_table_values.len() as u64)
                 .map(Into::into)
                 .collect();
-            LookupTable {
-                id: id as i32,
-                data: vec![index_column, lookup_table_values.clone()],
-            }
+            LookupTable::create(id as i32, vec![index_column, lookup_table_values.clone()])
+                .expect("setup_lookup_proof: Table creation must succeed")
         })
         .collect();
 
@@ -200,7 +200,7 @@ fn test_runtime_table() {
     let len = first_column.len();
 
     let mut runtime_tables_setup = vec![];
-    for table_id in 0..num {
+    for table_id in 1..num + 1 {
         let cfg = RuntimeTableCfg {
             id: table_id,
             first_column: first_column.into_iter().map(Into::into).collect(),
         };
@@ -236,7 +236,7 @@ fn test_runtime_table() {
 
     for row in 0..20 {
         // the first register is the table id. We pick one random table.
-        lookup_cols[0][row] = (rng.gen_range(0..num) as u32).into();
+        lookup_cols[0][row] = (rng.gen_range(1..num + 1) as u32).into();
 
         // create queries into our runtime lookup table.
        // We will set [w1, w2], [w3, w4] and [w5, w6] to random indexes and
diff --git a/kimchi/src/tests/range_check.rs b/kimchi/src/tests/range_check.rs
index 27240db942..577e2aa4f7 100644
--- a/kimchi/src/tests/range_check.rs
+++ b/kimchi/src/tests/range_check.rs
@@ -68,7 +68,10 @@ fn create_test_prover_index(
     gates,
     public_size,
     0,
-    vec![range_check::gadget::lookup_table()],
+    // Specifying the lookup table is not necessary,
+    // since it's already passed through the patterns implicitly.
+    //vec![range_check::gadget::lookup_table()],
+    vec![],
     None,
     false,
     None,
diff --git a/kimchi/src/tests/rot.rs b/kimchi/src/tests/rot.rs
index f88125cf9b..f9a1308b86 100644
--- a/kimchi/src/tests/rot.rs
+++ b/kimchi/src/tests/rot.rs
@@ -328,7 +328,6 @@ fn test_rot_finalization() {
     let index = {
         let cs = ConstraintSystem::create(gates.clone())
             .public(num_public_inputs)
-            .lookup(vec![rot::lookup_table()])
             .build()
             .unwrap();
         let mut srs = SRS::<Vesta>::create(cs.domain.d1.size());
diff --git a/kimchi/src/tests/xor.rs b/kimchi/src/tests/xor.rs
index 0344e0aea3..7ab28b4008 100644
--- a/kimchi/src/tests/xor.rs
+++ b/kimchi/src/tests/xor.rs
@@ -392,7 +392,6 @@ fn test_xor_finalization() {
     let index = {
         let cs = ConstraintSystem::create(gates.clone())
-            .lookup(vec![xor::lookup_table()])
             .public(num_inputs)
             .build()
             .unwrap();
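Taken together, the test changes above reflect the new builder behaviour: tables required by gate patterns are collected via `LookupFeatures` and injected implicitly, so passing them again through `.lookup(...)` now fails with a `LookupTableIdCollision` error. A minimal sketch of the resulting usage (our own example, assuming the usual test imports) is:

```rust
use kimchi::circuits::{constraints::ConstraintSystem, gate::CircuitGate};
use mina_curves::pasta::Fp;

fn build_range_check_cs() {
    // Gates that rely on the range-check table; the table itself is implied.
    let (_, gates) = CircuitGate::<Fp>::create_multi_range_check(0);

    // No explicit `.lookup(...)` call: the range-check table is added implicitly.
    let _cs = ConstraintSystem::<Fp>::create(gates).build().unwrap();
}
```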