Utilize NEW_TOKEN frames #1912

Open

gretchenfrage wants to merge 9 commits into main from new-token
Conversation

@gretchenfrage (Collaborator) commented Jun 30, 2024

The server now sends the client NEW_TOKEN frames, and the client now stores and utilizes them.

The main motivation is that this allows 0.5-RTT data to not be subject to anti-amplification limits (until the client's address is validated, a server may send no more than three times the bytes it has received, per RFC 9000). This scenario is likely to occur in HTTP/3 requests, as one example: a client makes a 0-RTT GET request for something like a jpeg, such that the response will be much bigger than the request. Unless NEW_TOKEN frames are used, the response may begin to be transmitted but then hit the anti-amplification limit and have to pause until the full 1-RTT handshake completes.

For example, here's some experimental data from a setup that should be similar in the relevant ways:

  • The client sends the server an integer and the server responds with that number of bytes
  • They do it in 0-RTT if they can
  • For each iteration, the client endpoint performs the request twice and measures its request/response time on the second attempt (so that it has 0-RTT and NEW_TOKEN material)
  • 100ms localhost latency was simulated by running sudo tc qdisc add dev lo root netem delay 100ms (and undone with sudo tc qdisc del dev lo root netem)

This experiment was performed on Nov 24, 2024, with 2edf192 as main and 478b325 as feature.

[Graph (new-token-nov24): request/response time vs. response size, feature vs. main]

For responses in a certain size range, avoiding the anti-amplification limits by using NEW_TOKEN frames made the request/response complete in 1 RTT on this branch versus 2 RTT on main.

Reproducible experimental setup

newtoken.rs can be placed into quinn/examples/:

use std::{
    sync::Arc,
    net::ToSocketAddrs as _,
};
use anyhow::Error;
use quinn::*;
use tracing::*;
use tracing_subscriber::prelude::*;


#[tokio::main]
async fn main() -> Result<(), Error> {
    // init logging
    let log_fmt = tracing_subscriber::fmt::format()
        .compact()
        .with_timer(tracing_subscriber::fmt::time::uptime())
        .with_line_number(true);
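    // route logs to stderr so stdout carries only the measured numbers (science.py captures stdout)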
    let stdout_log = tracing_subscriber::fmt::layer()
        .event_format(log_fmt)
        .with_writer(std::io::stderr);
    let log_filter = tracing_subscriber::EnvFilter::new(
        std::env::var(tracing_subscriber::EnvFilter::DEFAULT_ENV).unwrap_or("info".into())
    );
    let log_subscriber = tracing_subscriber::Registry::default()
        .with(log_filter)
        .with(stdout_log);
    tracing::subscriber::set_global_default(log_subscriber).expect("unable to install logger");

    // get args
    let args = std::env::args().collect::<Vec<_>>();
    anyhow::ensure!(args.len() == 2, "wrong number of args");
    let num_bytes = args[1].parse::<u32>()?;

    // generate keys
    let rcgen_cert = rcgen::generate_simple_self_signed(vec!["localhost".into()]).unwrap();
    let key = rustls::pki_types::PrivatePkcs8KeyDer::from(rcgen_cert.key_pair.serialize_der());
    let cert = rustls::pki_types::CertificateDer::from(rcgen_cert.cert);
    let mut roots = rustls::RootCertStore::empty();
    roots.add(cert.clone()).unwrap();
    let certs = vec![cert];

    let mut tasks = tokio::task::JoinSet::new();

    // start server
    let (send_stop_server, mut recv_stop_server) = tokio::sync::oneshot::channel();
    tasks.spawn(log_err(async move {
        let mut server_crypto = rustls::ServerConfig::builder()
                .with_no_client_auth()
                .with_single_cert(certs, key.into())?;
        // make sure to configure this:
        server_crypto.max_early_data_size = u32::MAX;
        let server_crypto = quinn::crypto::rustls::QuicServerConfig::try_from(Arc::new(server_crypto))?;
        let server_config = ServerConfig::with_crypto(Arc::new(server_crypto));
        let endpoint = Endpoint::server(
            server_config,
            "127.0.0.1:4433".to_socket_addrs().unwrap().next().unwrap(),
        )?;
        loop {
            let incoming = tokio::select! {
                option = endpoint.accept() => match option { Some(incoming) => incoming, None => break },
                result = &mut recv_stop_server => if result.is_ok() { break } else { continue },
            };
            // spawn subtask for connection
            tokio::spawn(log_err(async move {
                // attempt to accept 0-RTT data
                let conn = match incoming.accept()?.into_0rtt() {
                    Ok((conn, _)) => conn,
                    Err(connecting) => connecting.await?,
                };
                loop {
                    let (mut send, mut recv) = match conn.accept_bi().await {
                        Ok(stream) => stream,
                        Err(ConnectionError::ApplicationClosed(_)) => break,
                        Err(e) => Err(e)?,
                    };
                    // spawn subtask for stream
                    tokio::spawn(log_err(async move {
                        let requested_len_le_vec = recv.read_to_end(4).await?;
                        anyhow::ensure!(requested_len_le_vec.len() == 4, "malformed request {:?}", requested_len_le_vec);
                        let mut requested_len_le = [0; 4];
                        requested_len_le.copy_from_slice(&requested_len_le_vec);
                        let requested_len = u32::from_le_bytes(requested_len_le) as usize;
                        info!(%requested_len, "received request");
                        const BUF_LEN: usize = 8 << 10;
                        let mut buf = [0; BUF_LEN];
                        for i in 0..requested_len {
                            buf[i % BUF_LEN] = (i % 0xff) as u8;
                            if i % BUF_LEN == BUF_LEN - 1 {
                                send.write_all(&buf).await?;
                            }
                        }
                        if requested_len % BUF_LEN != 0 {
                            send.write_all(&buf[..requested_len % BUF_LEN]).await?;
                        }
                        info!("wrote response");
                        Ok(())
                    }.instrument(info_span!("server stream"))));
                }
                Ok(())
            }.instrument(info_span!("server conn"))));
        }
        // shut down server endpoint cleanly
        endpoint.wait_idle().await;
        Ok(())
    }.instrument(info_span!("server"))));

    // start client
    async fn send_request(conn: &Connection, num_bytes: u32) -> Result<std::time::Duration, Error> {
        let (mut send, mut recv) = conn.open_bi().await?;

        let start_time = std::time::Instant::now();

        debug!("sending request");
        send.write_all(&num_bytes.to_le_bytes()).await?;
        send.finish()?;
        debug!("receiving response");
        let response = recv.read_to_end(num_bytes as _).await?;
        anyhow::ensure!(response.len() == num_bytes as usize, "response is the wrong number of bytes");
        debug!("response received");

        let end_time = std::time::Instant::now();
        Ok(end_time.duration_since(start_time))
    }
    tasks.spawn(log_err(async move {
        let mut client_crypto = rustls::ClientConfig::builder()
                .with_root_certificates(roots)
                .with_no_client_auth();
        // make sure to configure this:
        client_crypto.enable_early_data = true;
        let mut endpoint = Endpoint::client(
            "0.0.0.0:0".to_socket_addrs().unwrap().next().unwrap()
        )?;
        let client_crypto =
                quinn::crypto::rustls::QuicClientConfig::try_from(Arc::new(client_crypto))?;
        endpoint.set_default_client_config(ClientConfig::new(Arc::new(client_crypto)));
        // twice, so as to allow 0-rtt to work on the second time
        for i in 0..2 {
            info!(%i, "client iteration");
            let connecting = endpoint.connect(
                "127.0.0.1:4433".to_socket_addrs().unwrap().next().unwrap(),
                "localhost",
            )?;
            // attempt to transmit 0-RTT data
            let duration = match connecting.into_0rtt() {
                Ok((conn, zero_rtt_accepted)) => {
                    debug!("attempting 0-rtt request");
                    let send_request_0rtt = send_request(&conn, num_bytes);
                    let mut send_request_0rtt_pinned = std::pin::pin!(send_request_0rtt);
                    tokio::select! {
                        result = &mut send_request_0rtt_pinned => result?,
                        accepted = zero_rtt_accepted => {
                            if accepted {
                                debug!("0-rtt accepted");
                                send_request_0rtt_pinned.await?
                            } else {
                                debug!("0-rtt rejected");
                                send_request(&conn, num_bytes).await?
                            }
                        }
                    }
                }
                Err(connecting) => {
                    debug!("not attempting 0-rtt request");
                    let conn = connecting.await?;
                    send_request(&conn, num_bytes).await?
                }
            };
            if i == 1 {
                println!("{}", duration.as_millis());
            }
            println!();
        }
        // tell the server to shut down so this process doesn't idle forever
        let _ = send_stop_server.send(());
        Ok(())
    }.instrument(info_span!("client"))));

    while tasks.join_next().await.is_some() {}
    Ok(())
}

async fn log_err<F: std::future::IntoFuture<Output=Result<(), Error>>>(task: F) {
    if let Err(e) = task.await {
        error!("{}", e);
    }
}

science.py creates the data:

import subprocess
import csv
import os

def run_cargo_command(n):
    try:
        result = subprocess.run(
            ["cargo", "run", "--example", "newtoken", "--", str(n)],
            capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    except subprocess.CalledProcessError as e:
        print(f"An error occurred: {e}")
        return None

def initialize_from_file():
    try:
        with open('0rtt_time.csv', mode='r', newline='') as file:
            last_line = list(csv.reader(file))[-1]
            return int(last_line[0])
    except (FileNotFoundError, IndexError):
        return -100  # Start from -100 since 0 is the first increment

def main():
    start_n = initialize_from_file() + 100
    with open('0rtt_time.csv', mode='a', newline='') as file:
        writer = csv.writer(file)
        if os.stat('0rtt_time.csv').st_size == 0:
            writer.writerow(['n', 'output'])  # Write header if file is empty
        
        for n in range(start_n, 20001, 100):
            output = run_cargo_command(n)
            if output is not None:
                writer.writerow([n, output])
                file.flush()  # Flush after every write operation
                print(f"Written: {n}, {output}")
            else:
                print(f"Failed to get output for n = {n}")

if __name__ == "__main__":
    main()

graph_it.py graphs the data, after you've manually renamed the output files to 0rtt_time_feature.csv and 0rtt_time_main.csv:

import matplotlib.pyplot as plt
import csv

def read_data(filename):
    response_sizes = []
    response_times = []
    try:
        with open(filename, mode='r') as file:
            reader = csv.reader(file)
            next(reader)  # Skip the header row
            for row in reader:
                response_sizes.append(int(row[0]))
                response_times.append(int(row[1]))
    except FileNotFoundError:
        print(f"The file {filename} was not found. Please ensure the file exists.")
    except Exception as e:
        print(f"An error occurred while reading {filename}: {e}")

    return response_sizes, response_times

def plot_data(response_sizes1, response_times1, response_sizes2, response_times2):
    plt.figure(figsize=(10, 5))
    # Plotting points with lines for the feature data
    plt.plot(response_sizes1, response_times1, 'o-', color='blue', label='Feature Data', alpha=0.5, markersize=5)
    # Plotting points with lines for the main data
    plt.plot(response_sizes2, response_times2, 'o-', color='red', label='Main Data', alpha=0.5, markersize=5)
    
    plt.title('Comparison of Feature and Main Data')
    plt.xlabel('Response Size')
    plt.ylabel('Request/Response Time')
    plt.grid(True)
    plt.ylim(bottom=0)  # Ensuring the y-axis starts at 0
    plt.legend()
    plt.show()



def main():
    response_sizes1, response_times1 = read_data('0rtt_time_feature.csv')
    response_sizes2, response_times2 = read_data('0rtt_time_main.csv')
    
    if response_sizes1 and response_times1 and response_sizes2 and response_times2:
        plot_data(response_sizes1, response_times1, response_sizes2, response_times2)

if __name__ == "__main__":
    main()

Here's a nix-shell for the Python graphing:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [
    pkgs.python3
    pkgs.python3Packages.matplotlib
  ];

  shellHook = ''
    echo "Python with matplotlib is ready to use."
  '';
}

Other motivations may include:

  • A server may wish for all connections to be validated before it serves them. If it responds to every initial connection attempt with .retry(), requests take a minimum of 3 round trips to complete even for 1-RTT data, and 0-RTT becomes impossible. If NEW_TOKEN frames are used, however, 1-RTT requests can once more be done in only 2 round trips, and 0-RTT requests become possible again (see the sketch after this list).
  • A system may wish to allow 0-RTT data while mitigating, or even preventing, replay attacks. If a server only accepts 0-RTT requests when their connection is validated, then replays are only possible to the extent that the server's TokenLog has false negatives, which may range from "sometimes" to "never," in contrast to the current situation of "always."
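
For illustration, here's a minimal sketch of the first scenario. It assumes quinn's Incoming::remote_address_validated and Incoming::retry APIs, elides error handling, and is not code from this PR:

use quinn::Endpoint;

// Accept loop that insists on address validation without penalizing
// clients that already hold a token from a NEW_TOKEN frame.
async fn accept_validated_only(endpoint: Endpoint) {
    while let Some(incoming) = endpoint.accept().await {
        if incoming.remote_address_validated() {
            // The client presented a valid token (from a retry or a
            // NEW_TOKEN frame): serve it without an extra round trip.
            tokio::spawn(async move {
                if let Ok(conn) = incoming.await {
                    // ... handle the connection ...
                    drop(conn);
                }
            });
        } else {
            // No token: demand one via a retry packet. Thanks to
            // NEW_TOKEN frames, repeat clients skip this round trip.
            let _ = incoming.retry();
        }
    }
}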

@Ralith (Collaborator) left a comment

Overall this looks pretty good, and seems well motivated. Thanks!

Thanks also for your patience while I got around to this; day job has been very busy lately.

(Resolved review threads on: quinn-proto/src/config.rs, quinn-proto/src/connection/mod.rs, quinn-proto/src/token.rs, quinn-proto/src/token_reuse_preventer.rs)

@gretchenfrage (Collaborator, Author) commented Nov 25, 2024

Normally I wouldn't be the one to mark your comments as resolved. But since your comments were from when this was a draft PR, I marked as resolved the ones that seem clearly irrelevant to the current version of it.

As mentioned on Discord, the MSRV CI failure does not seem to actually be caused by this PR.

@gretchenfrage gretchenfrage marked this pull request as ready for review November 25, 2024 00:55
@gretchenfrage gretchenfrage requested a review from Ralith November 25, 2024 00:56

(Resolved review threads on: quinn-proto/src/config.rs, quinn-proto/src/token.rs, quinn-proto/src/endpoint.rs, quinn-proto/src/token_log.rs, quinn-proto/src/bloom_token_log.rs, quinn-proto/Cargo.toml)

@gretchenfrage gretchenfrage force-pushed the new-token branch 3 times, most recently from bbebc09 to 84b70fe on November 30, 2024

(Review threads on: quinn-proto/src/connection/mod.rs, quinn-proto/src/token.rs)

@@ -209,6 +209,10 @@ pub struct ServerConfig {
/// rebinding. Enabled by default.
pub(crate) migration: bool,

pub(crate) validation_token_lifetime: Duration,
pub(crate) validation_token_log: Option<Arc<dyn TokenLog>>,

Ralith (Collaborator) commented:

From the TokenLog docs, it sounds like the behavior of None here is equivalent to a log that always returns Err. Would it be simpler to omit the Option wrapper and provide a ZST impl with that behavior?

@@ -460,6 +460,9 @@ pub struct ClientConfig {
/// Cryptographic configuration to use
pub(crate) crypto: Arc<dyn crypto::ClientConfig>,

/// Validation token store to use
pub(crate) token_store: Option<Arc<dyn TokenStore>>,
Ralith (Collaborator) commented:

Would it be simpler to omit the Option and provide an impl that drops everything?

(Resolved review thread on: quinn-proto/src/tests/util.rs)

@gretchenfrage (Collaborator, Author) replied:

> From the TokenLog docs, it sounds like the behavior of None here is equivalent to a log that always returns Err. Would it be simpler to omit the Option wrapper and provide a ZST impl with that behavior?

> Would it be simpler to omit the Option and provide an impl that drops everything?

Experiment commit pushed to side branch: gretchenfrage@1dbc6eb.

Cons:

  • Net increase of 11 LOC
  • IMO, the Option makes it more "self-documenting" that this object can be implemented vacuously, as opposed to something "load-bearing" such as the TLS session
  • Potential avoidable dyn calls and Arc clones, though I assume the cost is trivial

Pros:

  • Although the net increase in LOC is positive, the added lines are more than accounted for by structs and traits whose complexity is highly isolated; loci of complexity such as IncomingToken::from_header actually get simpler, which could be considered more salient (a vacuous impl is sketched below)
  • It's less of a footgun: a user could build a library that wraps Quinn's constructors but lets the end user thread these things through, and then realize too late that they should have accepted an Option<Arc<dyn Thing>> rather than an Arc<dyn Thing>
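
For concreteness, the vacuous ZST impl might look something like the following sketch. The trait shape here is assumed for illustration only; the PR's actual TokenLog signature may differ:

use std::time::{Duration, SystemTime};

// Hypothetical trait and error type, for illustration only.
pub struct TokenReuseError;

pub trait TokenLog: Send + Sync {
    /// Record a token's unique nonce, or return Err if it has been seen
    /// before (or cannot be recorded, e.g. because there is no log).
    fn check_and_insert(
        &self,
        nonce: u128,
        issued: SystemTime,
        lifetime: Duration,
    ) -> Result<(), TokenReuseError>;
}

/// ZST standing in for "no log": every token is treated as potentially
/// reused, so NEW_TOKEN tokens never validate the address. Behaviorally
/// equivalent to the Option::None configuration discussed above.
pub struct NoneTokenLog;

impl TokenLog for NoneTokenLog {
    fn check_and_insert(
        &self,
        _: u128,
        _: SystemTime,
        _: Duration,
    ) -> Result<(), TokenReuseError> {
        Err(TokenReuseError)
    }
}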

@Ralith (Collaborator) commented Dec 26, 2024

I like it, but I don't feel strongly; happy to defer to other judgement if someone else does.

@gretchenfrage (Collaborator, Author) commented:

Will wait to hear djc's thoughts on it then. Everything else should be addressed.

@djc (Member) commented Dec 27, 2024

> Will wait to hear djc's thoughts on it then. Everything else should be addressed.

The experimental commit LGTM, and I agree that moving complexity loci out of common paths is probably a better direction.

(Will try to schedule a full review soon.)

@gretchenfrage (Collaborator, Author) commented:

Naming bikeshed:

I realized that this PR in its current state adds the ServerConfig options validation_token_lifetime, validation_token_log, and validation_tokens_sent, but adds the ClientConfig option token_store. It also adds the traits TokenLog and TokenStore. Do we want to be more consistent with the "validation" prefix, such as by renaming the token_store config option to validation_token_store, before we get relatively locked into these names?

  • Argument for never using the validation prefix for anything: verbosity (I dislike this because I like the next bullet point)
  • Argument for using the validation prefix for ServerConfig: makes the 3 related options grouped together in docs, when searching, etc (this is the current situation with this PR) (I like this)
  • Argument for using the validation prefix for ClientConfig: consistency with ServerConfig, if we're using it in server config (I'm 50/50 on this)
  • Argument for using the validation prefix in trait names: consistency with config option names (I'm against this, because verbosity)

@gretchenfrage gretchenfrage force-pushed the new-token branch 2 times, most recently from d8a856d to 5f27a96 on December 28, 2024
@Ralith (Collaborator) commented Dec 28, 2024

I think obviousness easily trumps verbosity for identifiers that are typically used once or never like obscure config fields, so I'm in favor of keeping the prefix for ServerConfig setters and adding it to the ClientConfig setter. No strong opinion either way on the trait names; universal consistency would be nice, but it's harder to encounter those out of context, and most users will never see them.

@gretchenfrage (Collaborator, Author) commented:

Renamed the client config option to validation_token_store. This way, all relevant configuration options appear when searching for "validation":

[Screenshot (2025-01-01): docs search for "validation" listing the validation_token_* options]

Commit messages:

Moves all the fields of Token to a new RetryTokenPayload struct, and makes Token have a single `payload: RetryTokenPayload` field. This may seem strange at first, but it sets up for the next commit, which adds an additional field to Token.

Previously, retry tokens were encrypted using the retry src cid as the key derivation input. This has been described as "cheeky" by a reputable individual (who, coincidentally, wrote that code in the first place). More importantly, it presents obstacles to using NEW_TOKEN frames.

With this commit, tokens carry a random 128-bit value, which is used to derive the key for encrypting the rest of the token.
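
A rough sketch of the new scheme, assuming quinn-proto's existing HandshakeTokenKey::aead_from_hkdf helper (encoding details elided; this is not the PR's literal code):

use quinn_proto::crypto::HandshakeTokenKey;
use rand::RngCore;

// Seal a token payload under a key derived from a fresh random value,
// which travels in the clear alongside the ciphertext inside the token.
fn seal_token(master: &dyn HandshakeTokenKey, payload: &[u8]) -> Vec<u8> {
    let mut random_bytes = [0u8; 16];
    rand::thread_rng().fill_bytes(&mut random_bytes);
    // Per-token AEAD key derived from the random value, not the cid.
    let aead_key = master.aead_from_hkdf(&random_bytes);
    let mut buf = payload.to_vec();
    aead_key.seal(&mut buf, &[]).expect("sealing should not fail");
    // assumed wire format: random value || ciphertext
    let mut token = random_bytes.to_vec();
    token.extend_from_slice(&buf);
    token
}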

The ability for the server to process tokens from NEW_TOKEN frames will create the possibility of Incoming which are validated, but may still be retried. This commit creates an API for that. This means that rather than Incoming.remote_address_validated being tied to retry_src_cid, it is tied to a new `validated: bool` of `IncomingToken`.

Currently, this field is initialized to true iff retry_src_cid is some. However, subsequent commits will introduce the possibility for divergence.

As of this commit, the token payload type has only a single variant, Retry. However, the next commit will add an additional variant. In addition to pure refactors, a discriminant byte is used when encoding.

When a path becomes validated, the server may send the client NEW_TOKEN frames. These may cause an Incoming to be validated.

- Adds TokenPayload::Validation variant
- Adds relevant configuration to ServerConfig
- Adds `TokenLog` object to server to mitigate token reuse

As of this commit, the only provided implementation of TokenLog is NoneTokenLog, which is equivalent to the lack of a token log, and is the default.

When a client receives a token from a NEW_TOKEN frame, it submits it to a TokenStore object for storage. When an endpoint connects to a server, it queries the TokenStore object for a token applicable to the server name, and uses it if one is retrieved.

As of this commit, the only provided implementation of TokenStore is NoneTokenStore, which is equivalent to the lack of a token store, and is the default.
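
A client-side store beyond NoneTokenStore could be as simple as the following sketch. The trait shape is assumed for illustration, mirroring the commit's description; the PR's actual TokenStore signature may differ:

use std::collections::HashMap;
use std::sync::Mutex;
use bytes::Bytes;

// Hypothetical trait shape, for illustration only.
pub trait TokenStore: Send + Sync {
    fn insert(&self, server_name: &str, token: Bytes);
    fn take(&self, server_name: &str) -> Option<Bytes>;
}

/// Keeps at most one unused token per server name, in memory.
#[derive(Default)]
pub struct InMemoryTokenStore(Mutex<HashMap<String, Bytes>>);

impl TokenStore for InMemoryTokenStore {
    fn insert(&self, server_name: &str, token: Bytes) {
        self.0.lock().unwrap().insert(server_name.into(), token);
    }

    fn take(&self, server_name: &str) -> Option<Bytes> {
        // Remove rather than copy: a reused token would be rejected by
        // a server's TokenLog anyway.
        self.0.lock().unwrap().remove(server_name)
    }
}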

When we first added tests::util::IncomingConnectionBehavior, we opted to use an enum instead of a callback because it seemed cleaner. However, the number of variants has grown, and adding integration tests for validation tokens from NEW_TOKEN frames threatens to make this logic even more complicated. Moreover, there is another advantage to callbacks we have not been exploiting: a stateful FnMut can assert that incoming connection handling within a test follows a certain expected sequence of Incoming properties.

As such, this commit replaces TestEndpoint.incoming_connection_behavior with a handle_incoming callback, modifies some existing tests to exploit this functionality to test more things than they were previously, and adds new integration tests for server and client usage of tokens from NEW_TOKEN frames.