Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: update git fetch command and remove git pull command #548

Conversation

tromai
Copy link
Member

@tromai tromai commented Nov 7, 2023

Closes #547
Closes #519

This PR does the following changes to the check_out_repo_target function.

Add additional flags to the git fetch command

For more details on why and which flags were used, see #547
See commit 46cb61d

Remove unnecessary logging in the check_out_repo_target function

See commit fe81a0c

Remove git pull, instead check out the latest commit directly

For more details on why, see #519.
This change is incorporated with the new design of check_out_repo_target.

Implement a new design for the check_out_repo_target function

See commit ce3dda7

Current supported use cases of "checking out" a specific branch and/or commit in Macaron:

  • No branch and commit hash are given: Macaron will checkout the latest commit of the default branch. (1)
  • No branch and commit hash is given: The CLI options won't allow the user to provide a commit hash without specifying a branch first. (2)
  • Branch and no commit hash given: Macaron will checkout the latest commit of that branch. (3)
  • Branch and commit hash are given: Macaron will checkout the commit with the given hash in that branch. (4)

The new design will remove the need for checking out a branch, and instead check out the target commit directly.

  • In scenario (1): The latest commit of the default branch of a repository can be obtained through origin/HEAD -> git checkout --force origin/HEAD.
  • In scenario (2): We don't support at the moment but this is a totally normal use case. A git commit hash is supposed to be unique. If we are only interested in viewing the repository state at a given commit, a commit hash alone is sufficient.
  • In scenario (3): To checkout the latest commit of a branch, we can checkout the reference of that branch in remote: git checkout --force origin/<branch_name>. Assume that git fetch has updated the local repository to be the up-to-date version of remote, this command will ensure that the reference origin/<branch_name> always point to the latest commit of that branch as observed from remote. For more information on why it is, see the Remotes section in this article.
  • In scenario (4): This use case can be rephrased as: we want to checkout a commit A, we have the information that commit A exists in the history of branch B. Therefore, it's Macaron responsibility to only checkout commit A if commit A exists in branch B. With that being said, we still only need to checkout once which is git checkout --force <commit> if we have made sure that the commit exists in the user provided branch.

Why "origin" remote but not other remote?

  • With all repositories that Macaron clones into <output>/git_repos, Macaron just clones and do nothing else. This mean that the default origin remote always exist.
  • If we just pick one origin remote, we can prevent any extra complexity:
    • Given a branch name, which remote we check the branch name against ? what if multiple remotes have the same branch name (totally possible), but that branch in each remote have a different history. This is especially important for use case (4) as the result is not correct anymore if we don't be specific on the remote we check.
    • When we fetch, which remote do we need to fetch from?
    • How do we find the default branch ? What if different remote have different default branch ?

The drawback of pinning the remote named origin is not large. The only issue I can see right now is if we run the analysis on a local repository provided by the user. However, we could put some documentation saying that we expect the origin remote is the repository we want to analyze.

How to find all the branches from origin remote whose history contains a commit hash.

To find all branches that contains a specified commit we could use the git branch command as documented here.
Ideally, what we want to run is:

git branch -r --list origin/* --contains <commit_hash>

The -r flag ensures that it only returns the branch from all remotes in the repositories
The --list origin/* makes sure to only filter out the branches that are tracked from the origin remote.
The --contains <commit_hash> makes sure that only return the branch names that contain the commit hash we are interested in.

For example, let's try that on the Macaron repository:

$ git clone https://github.com/oracle/macaron
$ cd macaron
$ git rev-parse v0.1.0 # obtain the commit that was tagged with v0.1.0
9217e9319ce4b3e08b4e4eb12c951f6a97b33c6e
$ git branch -r --list origin/* --contains 9217e9319ce4b3e08b4e4eb12c951f6a97b33c6e
...
origin/main
origin/staging
...

I have create a function in git_url.py to perform the above command and extract the list of branches from the output. With the above example, we will have a list of string: [..., "origin/main", "origin/staging", ...] whose elements are the branches that contains the commit we are interested in.

@tromai tromai added bug Something isn't working git_operations The issues realated to the git operations that Macaron makes on the target repository labels Nov 7, 2023
@tromai tromai self-assigned this Nov 7, 2023
@tromai tromai requested a review from behnazh-w as a code owner November 7, 2023 00:42
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Nov 7, 2023
@tromai tromai changed the title chore: use add --force and --tags flags to git fetch chore: use add --force and --tags flags to git fetch (WIP) Nov 7, 2023
@tromai tromai marked this pull request as draft November 7, 2023 00:43
@behnazh-w
Copy link
Member

Please also address this issue in this PR and make it a more general optimization for git operations.

@tromai tromai removed the bug Something isn't working label Nov 7, 2023
@tromai tromai force-pushed the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch from c91440e to 28144e7 Compare November 7, 2023 04:04
@tromai tromai marked this pull request as ready for review November 7, 2023 04:32
@tromai tromai marked this pull request as draft November 7, 2023 05:06
@tromai tromai changed the title chore: use add --force and --tags flags to git fetch (WIP) chore: update git fetch command and remove git pull command Nov 7, 2023
@tromai tromai force-pushed the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch 2 times, most recently from ce3dda7 to 079e515 Compare November 10, 2023 04:54
@tromai tromai marked this pull request as ready for review November 10, 2023 06:17
@tromai tromai force-pushed the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch from 079e515 to 73f3e87 Compare November 13, 2023 06:31
@tromai tromai force-pushed the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch from 7ee578f to 141b4de Compare December 6, 2023 00:22
@tromai tromai force-pushed the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch from 141b4de to b37c190 Compare December 19, 2023 02:29
* Remove git pull and use the new design where the branch is not used to checked out anymore.
* Update some integrations tests where the tag is used as the branch, which is not correct.
* Implement a function to find all remote branches that contain a commit.

Signed-off-by: Trong Nhan Mai <[email protected]>
…et_branches_containing_commit for clarity

Signed-off-by: Trong Nhan Mai <[email protected]>
docs/source/pages/using.rst Outdated Show resolved Hide resolved
src/macaron/slsa_analyzer/git_url.py Outdated Show resolved Hide resolved
src/macaron/slsa_analyzer/git_url.py Outdated Show resolved Hide resolved
src/macaron/slsa_analyzer/git_url.py Outdated Show resolved Hide resolved
tests/slsa_analyzer/test_git_url.py Show resolved Hide resolved
@tromai tromai merged commit 8056f9f into staging Dec 22, 2023
9 checks passed
@tromai tromai deleted the 547-running-git-fetch-only-does-not-guarantee-that-we-have-fetched-all-changes-related-to-tags branch December 22, 2023 05:12
art1f1c3R pushed a commit that referenced this pull request Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
git_operations The issues realated to the git operations that Macaron makes on the target repository OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants