Testing on GCP using encrypted reports

Prerequisites

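To test the aggregation service with support for encrypted reports, you need a Google Cloud project, the service accounts created in the Adtech Setup Terraform section below, and a submitted aggregation service onboarding form.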

Once you've submitted the onboarding form, we will contact you to verify your information. Then, we'll send you the remaining instructions and information needed for this setup.
You won't be able to successfully set up your GCP deployment without completing the onboarding process!

To set up the aggregation service in GCP, you'll use Terraform.

Set up gcloud CLI

Make sure you install and authenticate the latest gcloud CLI.

Set up Terraform

Change into the <repository_root>/terraform/gcp folder. If you have not cloned the repository yet, see clone the repository.

The setup scripts require Terraform version 1.2.3. You can download Terraform version 1.2.3 from https://releases.hashicorp.com/terraform/1.2.3/ or, at your own risk, install and use a Terraform version manager instead.

If you have the Terraform version manager tfenv installed, run the following in your <repository_root> to set Terraform to version 1.2.3.

tfenv install 1.2.3;
tfenv use 1.2.3
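
Before running the setup scripts, it can help to confirm that the active Terraform binary really is version 1.2.3. The following is a minimal sketch, not part of the repository: the function name terraform_version_ok and the variable REQUIRED_TF_VERSION are hypothetical, and the check assumes that the first line of the terraform version output has the form "Terraform v<version>".

```shell
# Hypothetical helper (not part of the repository): checks the Terraform version.
REQUIRED_TF_VERSION="1.2.3"

# Succeeds if the first line of the given `terraform version` output
# is exactly "Terraform v<REQUIRED_TF_VERSION>".
terraform_version_ok() {
  first_line=$(printf '%s\n' "$1" | head -n 1)
  [ "$first_line" = "Terraform v${REQUIRED_TF_VERSION}" ]
}

# Example usage (requires terraform on your PATH):
#   terraform_version_ok "$(terraform version)" \
#     || echo "Expected Terraform ${REQUIRED_TF_VERSION}" >&2
```

The function takes the version output as an argument instead of calling terraform itself, so it can be tried without Terraform installed.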

We recommend you store the Terraform state in a cloud bucket. Create a Google Cloud Storage bucket via the console/cli, which we'll reference as tf_state_bucket_name. Consider enabling versioning to preserve, retrieve, and restore previous versions and set appropriate policies for this bucket to prevent accidental changes and deletion.

gsutil mb gs://<tf_state_bucket_name>
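
If you want to follow the versioning suggestion above, one possible approach is to enable object versioning right after creating the bucket. This is a sketch under assumptions: the helper state_bucket_commands is hypothetical and only prints the gsutil commands so you can review them before running anything; the bucket name in the usage comment is an example.

```shell
# Hypothetical helper (not part of the repository): prints the gsutil commands
# that create the state bucket and enable object versioning on it.
state_bucket_commands() {
  bucket="$1"
  printf 'gsutil mb gs://%s\n' "$bucket"
  printf 'gsutil versioning set on gs://%s\n' "$bucket"
}

# Example usage: review the printed commands first, then run them, for example:
#   state_bucket_commands "my-tf-state-bucket"
#   state_bucket_commands "my-tf-state-bucket" | bash
```

Retention and access policies for the bucket are not covered by this sketch and still need to be chosen for your project.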

Authenticate the gcloud CLI for Terraform.

gcloud auth application-default login

Download Terraform scripts and prebuilt dependencies

If you would like to build the Confidential Space container and the Cloud Function jars in your own account, follow the instructions in build-scripts/gcp. Skip running bash download_prebuilt_dependencies.sh and run bash fetch_terraform.sh instead. After building and downloading your self-built jars, continue with the next deployment step.

The Terraform scripts that deploy the aggregation service depend on two packaged jars for the Cloud Functions deployment. These jars are hosted on Google Cloud Storage (https://storage.googleapis.com/aggregation-service-published-artifacts/aggregation-service/{version}/{jar_file}) and can be downloaded with the <repository_root>/terraform/gcp/download_prebuilt_dependencies.sh script. The script downloads the Terraform scripts and jars and stores them in <repository_root>/terraform/gcp. License information for the downloaded dependencies can be found in DEPENDENCIES.md.

Run the following script in the <repository_root>/terraform/gcp folder to download the prebuilt dependencies.

bash download_prebuilt_dependencies.sh

Note: The above script must be run with bash and does not support sh.

To download the jars manually into the <repository_root>/terraform/gcp/jars folder, use the links on our releases page.

Adtech Setup Terraform

Make sure you have completed the steps above before following the next instructions.

Make the following adjustments in the <repository_root>/terraform/gcp/environments/adtech_setup folder:

  1. Copy main.tf_sample to main.tf and add the tf_state_bucket_name to your main.tf by uncommenting and replacing the values using <...>:

    # backend "gcs" {
    #   bucket = "<tf_state_bucket_name>"
    #   prefix = "adtech_setup-tfstate"
    # }
  2. Copy adtech_setup.auto.tfvars_sample to adtech_setup.auto.tfvars and replace the values using <...> following instructions in the file.

  3. Once you've adjusted the configuration, run the following in the <repository_root>/terraform/gcp/environments/adtech_setup folder

    Install all Terraform modules:

    terraform init

    Get an infrastructure setup plan:

    terraform plan

    If you see the following output on a fresh project:

    ...
    Plan: ?? to add, 0 to change, 0 to destroy.

    you can continue to apply the changes (needs confirmation after the planning step)

    terraform apply

    If you see the following output, your setup was successful:

    ...
    Apply complete! Resources: 54 added, 0 changed, 0 destroyed.
    ...

Note: Running terraform destroy for the Adtech Setup environment deletes all resources created within that environment.

Set up your deployment environment

Note: Please make sure that you have completed the Prerequisites above, including the onboarding process.

We use the following folder structure <repository_root>/terraform/gcp/environments/<environment_name> to separate deployment environments.

To set up your first environment (e.g. dev), copy the demo environment. Run the following commands from the <repository_root>/terraform/gcp/environments folder:

mkdir dev
cp -R demo/* dev
cd dev

Make the following adjustments in the <repository_root>/terraform/gcp/environments/dev folder:

  1. Add the tf_state_bucket_name to your main.tf by uncommenting and replacing the values using <...>:

    # backend "gcs" {
    #   bucket = "<tf_state_bucket_name>"
    #   prefix = "<environment_name>-tfstate"
    # }
  2. Rename example.auto.tfvars to <environment>.auto.tfvars and replace the values using <...>. Leave all other values as-is for the initial deployment.

    project_id  = "<YourProjectID>"
    environment = "<environment_name>"
    ...
    alarms_enabled           = true
    alarm_notification_email = "<alarm_notification_email>"
    • project_id: Google Cloud project ID for your deployment
    • environment: name of your environment
    • user_provided_worker_sa_email: Set to the worker service account created in the Adtech Setup section and submitted in the onboarding form
    • alarms_enabled: boolean flag to enable alarms (default: false)
    • alarm_notification_email: Email to receive alarm notifications.
  3. Skip this step if you use our prebuilt container image and Cloud Function jars

    If you self-build your container image and jars, copy the contents of the release_params.auto.tfvars file into a new file named self_build_params.auto.tfvars, then remove the release_params.auto.tfvars file.

    To copy the file contents rather than the symlink, run the following in the <repository_root>/terraform/gcp/environments/dev folder:

    cp -L release_params.auto.tfvars self_build_params.auto.tfvars

    Then delete the symlinked file:

    rm release_params.auto.tfvars

    Then change the worker_image line to the location of your self-built container image.

  4. To run the aggregation service deployment with the deploy service account created in Adtech Setup, set the following environment variable:

    export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT="<YourDeployServiceAccountName>@<ProjectID>.iam.gserviceaccount.com"
  5. Once you've adjusted the configuration, run the following in the <repository_root>/terraform/gcp/environments/dev folder

    Install all Terraform modules:

    terraform init

    Get an infrastructure setup plan:

    terraform plan

    If you see the following output on a fresh project:

    ...
    Plan: 54 to add, 0 to change, 0 to destroy.

    you can continue to apply the changes (needs confirmation after the planning step)

    terraform apply

    If you see the following output, your setup was successful:

    ...
    Apply complete! Resources: 54 added, 0 changed, 0 destroyed.
    
    Outputs:
    frontend_service_cloudfunction_url = "https://<environment>-us-central1-frontend-service-<cloud-function-id>-uc.a.run.app"
    vpc_network = "https://www.googleapis.com/compute/v1/projects/<project>/global/networks/<environment>-network"

    The Terraform scripts create createJob and getJob API endpoints:

    • Create Job Endpoint: https://<environment>-<region>-frontend-service-<cloud-function-id>-uc.a.run.app/v1alpha/createJob
    • Get Job Endpoint: https://<environment>-<region>-frontend-service-<cloud-function-id>-uc.a.run.app/v1alpha/getJob

    These endpoints require authentication. Refer to the Testing the system section to learn how to use them.

    If you run into any issues during deployment of your system, please consult the Support section.
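
The endpoint URLs listed above share the base URL reported in the frontend_service_cloudfunction_url output, so they can be assembled from it rather than typed by hand. The sketch below assumes exactly that relationship; the helper names create_job_url and get_job_url are hypothetical and not part of the repository.

```shell
# Hypothetical helpers (not part of the repository): build the job API URLs
# from the frontend_service_cloudfunction_url Terraform output.
# "${1%/}" drops a single trailing slash from the base URL, if present.
create_job_url() { printf '%s/v1alpha/createJob\n' "${1%/}"; }
get_job_url()    { printf '%s/v1alpha/getJob\n' "${1%/}"; }

# Example usage, run in the environment folder after `terraform apply`:
#   base_url=$(terraform output -raw frontend_service_cloudfunction_url)
#   create_job_url "$base_url"
#   get_job_url "$base_url"
```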

Testing the system

To test the system, you'll need encrypted aggregatable reports in Avro batch format (follow the collecting and batching instructions) that are accessible by the aggregation service.

If your inputs are larger than a few hundred MB, we suggest sharding the input reports and domain file into smaller shards.

  1. Create a Google Cloud Storage bucket for your input and output data (if not done during Adtech Setup). We will refer to it as data_bucket. Create this bucket in the same Google Cloud project where you set up the aggregation service.

    Consider enabling versioning to preserve, retrieve, and restore previous versions and set appropriate policies for this bucket to prevent accidental changes and deletion.

  2. Copy your reports.avro with batched encrypted aggregatable reports to <data_bucket>/input.

  3. Create an aggregation job with the createJob API.

    POST https://<environment>-<region>-frontend-service-<cloud-function-id>-uc.a.run.app/v1alpha/createJob

    {
        "input_data_blob_prefix": "input/reports.avro",
        "input_data_bucket_name": "<data_bucket>",
        "output_data_blob_prefix": "output/summary_report.avro",
        "output_data_bucket_name": "<data_bucket>",
        "job_parameters": {
            "attribution_report_to": "<your_attribution_domain>",
            "output_domain_blob_prefix": "domain/domain.avro",
            "output_domain_bucket_name": "<data_bucket>"
        },
        "job_request_id": "test01"
    }

    Note: This API requires authentication. Follow the Google Cloud Function instructions for sending an authenticated request.

  4. Check the status of your job with the getJob API, replacing the values in <...>:

    GET https://<environment>-<region>-frontend-service-<cloud-function-id>-uc.a.run.app/v1alpha/getJob?job_request_id=test01

    Note: This API requires authentication. Follow the Google Cloud Function instructions for sending an authenticated request. For request and response details, see the detailed API spec.
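
The createJob and getJob calls in steps 3 and 4 can be sketched with curl as follows. This is an illustration under assumptions, not the project's official client: the helper names are hypothetical, it assumes curl and an authenticated gcloud CLI are installed, and it assumes the identity returned by gcloud auth print-identity-token is allowed to invoke the frontend service.

```shell
# Hypothetical helpers (not part of the repository) for calling the job API.
# Assumes curl and an authenticated gcloud CLI are available.

# Builds the getJob URL from a base URL and a job_request_id.
get_job_request_url() {
  printf '%s/v1alpha/getJob?job_request_id=%s\n' "${1%/}" "$2"
}

# POSTs a JSON request body (read from a file) to the createJob endpoint.
create_job() {
  base_url="$1"
  body_file="$2"
  curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
    -H "Content-Type: application/json" \
    --data @"$body_file" \
    "${base_url%/}/v1alpha/createJob"
}

# GETs the status of a job by its job_request_id.
get_job() {
  curl -sS \
    -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
    "$(get_job_request_url "$1" "$2")"
}
```

For example, after saving the request body from step 3 to a file (create_job.json is an example name), you could run create_job "<base_url>" create_job.json followed by get_job "<base_url>" test01, where <base_url> is your frontend service URL without the /v1alpha/... suffix.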

Upgrade Environment

Run the following in the <repository_root>.

git fetch origin && git checkout -b dev-v{VERSION} v{VERSION}
cd terraform/gcp
bash download_prebuilt_dependencies.sh

Execute the following commands with your own Google Cloud account. If you were previously impersonating a service account, clear the environment variable:

export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT=""
cd environments/adtech_setup
terraform plan
terraform apply

Execute the following commands while impersonating the Deploy Service Account:

export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT="<YourDeployServiceAccountName>@<ProjectID>.iam.gserviceaccount.com"
cd ../dev
terraform apply

Note: If you use the self-built artifacts described in build-scripts/gcp, run bash fetch_terraform.sh instead of bash download_prebuilt_dependencies.sh and make sure you have updated your dependencies in the jars folder.

Note: When migrating to the new coordinator pair from version 2.[4|5|6].z to 2.7.z or later, ensure that the file /terraform/gcp/environments/shared/release_params.auto.tfvars has been updated with the following values:

coordinator_a_impersonate_service_account = "a-opallowedusr@ps-msmt-coord-prd-g3p-svcacc.iam.gserviceaccount.com"
coordinator_b_impersonate_service_account = "[email protected]"