ci: speed up workflows and reduce costs #1545

Merged
merged 260 commits into from
Nov 13, 2023

Conversation

strophy
Collaborator

@strophy strophy commented Oct 31, 2023

Issue being fixed or feature implemented

  • Builds were running in parallel, often building the same code to run different tests.
  • The Rust build cache is a complex solution with multiple downsides. It causes compilation failures due to pollution from different branches. The cache is huge (5 to 120 GB), which makes it inefficient.
  • Costs for self-hosted runners are pretty high.
  • The release workflow takes about 1 hour.
  • The test workflow takes 30 minutes without dashmate tests and 1 hour with dashmate tests.

What was done?

  • Use sccache instead of target dir cache for Rust builds
  • Cache only with s3
  • Changed hardcoded Rust toolchain to "stable"
  • Updated change filter paths to trigger package tests
  • Build JS artifacts only once and then share them with subsequent jobs
  • Build test docker images only once and then share them with test suite and dashmate workflows
  • Use free runners where possible
  • Removed Rust compilation error job
  • Cache local network setup data to speed up test suite and dashmate workflows
  • Run dashmate tests in parallel
  • Run Clippy only for the specific package (without deps)
  • Make sure that Rust builds stick to the lock file
  • Changed test workflow structure to control the flow and increase visibility
  • Support S3 storage in sccache in Dockerfile
  • Introduced DASHMATE_E2E_TESTS_SKIP_IMAGE_BUILD to disable image builds in dashmate tests
  • Introduced DASHMATE_E2E_TESTS_LOCAL_HOMEDIR to set specific home dir in dashmate tests
  • Build arm and amd images for release in parallel without emulation
  • Switch to cheaper arm runners
  • Run test suite in 2 parallel jobs for Node.JS
  • Introduce BROWSER_TEST_BATCH_INDEX and BROWSER_TEST_BATCH_TOTAL to run only a batch of test suite tests in browsers
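
To illustrate the batching idea behind BROWSER_TEST_BATCH_INDEX and BROWSER_TEST_BATCH_TOTAL, here is a minimal sketch of one way a runner could pick its share of test files. The `select_batch` helper, the 0-based index, the round-robin assignment, and the file names are all assumptions made for illustration; the PR's actual splitting logic may differ.

```shell
# Hypothetical helper: prints the test files that belong to one batch.
# Assumptions: batch indexes are 0-based and files are assigned round-robin.
select_batch() {
  local index="$1" total="$2"
  shift 2
  local i=0 file
  for file in "$@"; do
    # Keep every file whose position modulo the batch count matches this batch.
    if [ $(( i % total )) -eq "$index" ]; then
      printf '%s\n' "$file"
    fi
    i=$(( i + 1 ))
  done
}

# Example: split four made-up test files across two batches.
BROWSER_TEST_BATCH_INDEX="${BROWSER_TEST_BATCH_INDEX:-0}"
BROWSER_TEST_BATCH_TOTAL="${BROWSER_TEST_BATCH_TOTAL:-2}"
select_batch "$BROWSER_TEST_BATCH_INDEX" "$BROWSER_TEST_BATCH_TOTAL" \
  a.spec.js b.spec.js c.spec.js d.spec.js
```

Each parallel job would set a different index with the same total, so every file runs in exactly one job.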

How Has This Been Tested?

Running workflows

Breaking Changes

None

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

@strophy strophy force-pushed the ci/build-cache branch 2 times, most recently from 8124b92 to d279ced Compare October 31, 2023 06:21
@shumkov shumkov changed the title ci: pre-cache builds [WIP] ci: pre-cache builds Oct 31, 2023
@shumkov shumkov marked this pull request as ready for review October 31, 2023 11:53
@shumkov shumkov changed the title [WIP] ci: pre-cache builds ci: pre-cache builds Oct 31, 2023
@shumkov shumkov changed the title ci: pre-cache builds [WIP] ci: pre-cache builds Oct 31, 2023
@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@shumkov shumkov changed the title [WIP] ci: speed up workflows and reduce costs ci: speed up workflows and reduce costs Nov 12, 2023
Collaborator Author

@strophy strophy left a comment

Didn't review changes under packages/ much but the CI stuff looks good to me! Couple of small changes to make

.github/actions/docker/action.yaml Show resolved Hide resolved
.github/actions/nodejs/action.yaml Show resolved Hide resolved
all = true
keepBytes = 30000000000 # 30 GB
keepDuration = 864000 # 10 days
gc = false
Contributor

With gc = false, the cache size will grow indefinitely.

Member

We drop it every time. Only the cargo registry mount is stored on S3, where we have our own retention policy for files (3 days).
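
As a quick check of the inline comments in the config snippet above, the units can be verified with shell arithmetic. This sketch assumes keepDuration is expressed in seconds and keepBytes in decimal gigabytes, which matches the "10 days" and "30 GB" comments.

```shell
# Values copied from the config snippet in this thread.
keep_bytes=30000000000
keep_duration=864000   # assumed to be seconds

# Convert to the units used in the inline comments.
echo "keepBytes    = $(( keep_bytes / 1000000000 )) GB"
echo "keepDuration = $(( keep_duration / 86400 )) days"
```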

shell: bash
run: echo "sha=$(git log -1 --format="%h" -- packages/dashmate)" >> $GITHUB_OUTPUT

# TODO: Use upload artifacts action instead
Contributor

check this todo

Member

I wasn't going to do it this iteration
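
For readers following the fingerprint step quoted in this thread, here is a small sketch of how the short commit hash could feed a cache key. The `git log` command is the one from the step above; the `path_fingerprint` and `cache_key` helpers and the placeholder hash are hypothetical, and the key prefix is assumed from the cache step quoted elsewhere in this review.

```shell
# Short hash of the last commit that touched a path (same command as the step above).
path_fingerprint() {
  git log -1 --format="%h" -- "$1"
}

# Hypothetical helper: builds a cache key from a fingerprint.
cache_key() {
  printf 'local-network-volumes/%s\n' "$1"
}

# Example with a placeholder hash.
# In CI this could be: cache_key "$(path_fingerprint packages/dashmate)"
cache_key "abc1234"
```

Because the key changes only when a commit touches packages/dashmate, unrelated commits would keep hitting the same cached network data.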

key: local-network-volumes/${{ steps.dashmate-fingerprint.outputs.sha }}
if: steps.local-network-data.outputs.cache-hit != 'true'

- name: Configure pre-built docker images
Contributor

Where do we get these images from? Looks like we assume they are already pushed (= we should have some dependency on another job declared above). Please add dependency/comment explaining that.

Member

It's not a job, it's an action that pulls images to do the work. Workflows that use this action declare a dependency on the build-images job.

- name: Cache NPM build artifacts (S3 bucket cache)
uses: everpcpc/actions-cache@v1
if: contains(runner.name, 'ubuntu-platform')
uses: strophy/actions-cache@opendal-update
Contributor

We should prefer @master, + we should migrate that repo to dashpay org

Member

We expect some updates upstream soon. If they don't arrive, we will change it.

- name: Install protoc
id: deps-protoc
shell: bash
run: |
  curl -Lo /tmp/protoc.zip \
-   https://github.com/protocolbuffers/protobuf/releases/download/v22.0/protoc-22.0-linux-x86_64.zip
+   https://github.com/protocolbuffers/protobuf/releases/download/v22.0/protoc-22.0-linux-aarch_64.zip
Contributor

Will it work on different architectures? Maybe we should add an `if` that depends on the runner arch?

Member

It works this way according to the tests. We want to move this step to AMI. When infra guys have time they will do it.
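
For reference, the architecture check suggested in this thread could look roughly like the sketch below. It is not what the PR does (the step hardcodes one archive and, per the reply above, works in the tests). The `protoc_arch` and `protoc_url` helpers are hypothetical; the URL pattern and the two archive suffixes are taken from the step under review.

```shell
# Hypothetical helper: maps `uname -m` output to the suffix used in protoc archive names.
protoc_arch() {
  case "$1" in
    x86_64|amd64)  echo "x86_64" ;;
    aarch64|arm64) echo "aarch_64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: builds the download URL for protoc v22.0 on Linux.
protoc_url() {
  local version="22.0" arch
  arch="$(protoc_arch "$1")" || return 1
  echo "https://github.com/protocolbuffers/protobuf/releases/download/v${version}/protoc-${version}-linux-${arch}.zip"
}

# Print the URL for the current machine; the step could pass it to `curl -Lo /tmp/protoc.zip`.
protoc_url "$(uname -m)" || echo "no prebuilt protoc URL for this machine" >&2
```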

target: ${{ inputs.target }}
platform: linux/arm64
push_tags: true
dockerhub_username: ${{ secrets.DOCKERHUB_USERNAME }}
Contributor

Why do we use dockerhub_username if we use ECR? If this is correct (e.g. it's just the variable name), please add a comment.

Member

We log in to our paid Docker Hub user to extend pull limits.

Member

@shumkov shumkov left a comment

Great job guys!

@shumkov shumkov merged commit f14e3e6 into master Nov 13, 2023
78 checks passed
@shumkov shumkov deleted the ci/build-cache branch November 13, 2023 15:06