So I tried a whole bunch of other approaches.
CircleCI support suggested using https://api.github.com/repos/${CIRCLE_USERNAME}/${CIRCLE_REPO_NAME}/pulls/${CIRCLE_PR_NUMBER} to get the correct data about the PR, in particular base.ref/base.sha:
resp=$(curl -Ls "https://api.github.com/repos/huggingface/transformers/pulls/${CIRCLE_PR_NUMBER}")
base_sha=$(jq -r .base.sha <<< "$resp")
but first I found a whole bunch of reports that base.sha is unreliable, and when I tested it anyway, it sort of worked for a few runs and then sent me totally bogus information: it set base.sha to a commit from many days back, and now it's telling me that I made 92 commits in a PR. Horrors.
edit: later I discovered that someone had force-pushed to master and rewritten history, so that was not GitHub's problem. I still don't know whether the base.sha coming from the above GitHub API is reliable or not.
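One thing that can make the API look more broken than it is: when a request fails (for example, an unauthenticated call that hits the rate limit), GitHub answers with a JSON object carrying a message instead of the PR, and jq -r then prints the literal string null for .base.sha. Below is a minimal sketch of a guard, assuming jq is installed. json_field is a hypothetical helper name, not something CircleCI or GitHub provides.

```shell
# Hypothetical helper: print one field of a JSON document read from stdin,
# or print nothing when the field is missing or null (plain `jq -r` would
# print the string "null" in that case). Note that `// empty` also hides
# a literal false value.
json_field() {
    jq -r "$1 // empty"
}

# Example use (not run here):
#   base_sha=$(printf '%s' "$resp" | json_field .base.sha)
#   test -n "$base_sha" || echo "no usable base.sha in the API response"
```

With this, an empty result means "the API gave me nothing usable", which is easier to test for than a stray null.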
I also tried to use commit count:
resp=$(curl -Ls "https://api.github.com/repos/huggingface/transformers/pulls/${CIRCLE_PR_NUMBER}")
commits=$(jq -r .commits <<< "$resp")
echo "count backward $commits commits"
git --no-pager diff --name-only "HEAD~$commits"
but again, with bogus info from GitHub, I can't even test how reliable HEAD~$commits is. Somehow I feel it will give wrong info if someone rewrites the PR's history.
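An alternative that does not depend on the API at all is to ask git where the branch diverged from the base branch. This is a swapped-in technique, not what support suggested, and only a sketch: it assumes the base branch exists as a local ref (origin/master in the usage comment is an assumption about the checkout) and that the clone is not shallow. changed_since_base is a hypothetical name.

```shell
# Hypothetical helper: list files changed on the current branch since it
# diverged from the given base ref (for example origin/master).
changed_since_base() {
    base_ref=$1
    # Most recent common ancestor of the base ref and the current commit.
    merge_base=$(git merge-base "$base_ref" HEAD) || return 1
    git --no-pager diff --name-only "$merge_base" HEAD
}

# Example use (not run here):
#   changed_since_base origin/master
```

Commits that landed on the base branch after the fork point are not counted, because the diff starts at the common ancestor rather than at the tip of the base branch.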
Probably the only solution that could work is checking out the real user branch in the forked repo and doing all the figuring out there. But this only seems to work when a normal PR is submitted. If the PR is made via a file edit in the GitHub UI, $CIRCLE_PR_NUMBER is not set!
Here is my emulate-user’s-clone-solution:
commands:
  skip-job-on-doc-only-changes:
    description: "Do not continue this job and exit with success for PRs with only doc changes"
    steps:
      - run:
          name: docs-only changes skip check
          command: |
            if test -n "$CIRCLE_PR_NUMBER"
            then
                echo "$CIRCLE_PR_NUMBER"
                resp=$(curl -Ls "https://api.github.com/repos/huggingface/transformers/pulls/${CIRCLE_PR_NUMBER}")
                # "// empty" keeps a missing field from turning into the string "null"
                user=$(jq -r '.user.login // empty' <<< "$resp")   # PR creator username
                head_ref=$(jq -r '.head.ref // empty' <<< "$resp") # PR user's branch name
                echo "head_ref=$head_ref, user=$user"
            fi
            if test -n "$user" && test -n "$head_ref"
            then
                git clone "https://github.com/$user/transformers" user-clone
                cd user-clone
                git checkout "$head_ref"
                fork_point_sha=$(git merge-base --fork-point master)
                cd -
            fi
            if test -n "$fork_point_sha" && test -n "$(git diff --name-only "$fork_point_sha")"
            then
                git --no-pager diff --name-only "$fork_point_sha"
                # grep -v succeeds if at least one changed file is not .md/.rst
                if git diff --name-only "$fork_point_sha" | grep -qvE '\.(md|rst)$'
                then
                    echo "Non-docs were modified in this PR, proceeding normally"
                else
                    echo "Only docs were modified in this PR, quitting this job"
                    # enable skipping once we get this sorted out
                    # circleci step halt
                fi
            else
                echo "Not enough data to perform a skipping check - continuing the job"
            fi
It probably can be further simplified, but I wanted lots of debug info for now.
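As one possible simplification, the docs-only test at the core of that step can be pulled out into a small function that only looks at a list of file names. This is a sketch, not a drop-in replacement; only_docs_changed is a hypothetical name, and treating an empty list as "not docs-only" is my assumption, chosen so the job keeps running when there is no data.

```shell
# Hypothetical helper: succeeds (exit 0) only when every path read from
# stdin ends in .md or .rst. An empty list counts as "not docs-only".
only_docs_changed() (
    files=$(cat)
    [ -n "$files" ] || exit 1
    # grep -v selects the non-doc paths; -q makes grep succeed if any exist.
    if printf '%s\n' "$files" | grep -qvE '\.(md|rst)$'; then
        exit 1
    fi
    exit 0
)

# Example use (not run here):
#   if git diff --name-only "$fork_point_sha" | only_docs_changed; then
#       echo "Only docs were modified in this PR"
#   fi
```

Keeping the check separate from the git plumbing also makes it easy to try out by hand with printf before trusting it in CI.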
edit: I can see that this code will probably fail if the branch is on the original repo and not in the PR submitter’s forked repo, since that branch won’t exist in their repo. Grrr.
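One way around that might be to stop guessing the repo from the username and read it from the same API response: the pull request JSON has head.repo.clone_url next to head.ref, and that should point at whichever repo actually holds the branch. This is an untested-in-CI sketch assuming jq is installed; pr_head_source is a hypothetical name.

```shell
# Hypothetical helper: from a GitHub pull request JSON response on stdin,
# print the clone URL of the repo holding the PR branch, then the branch
# name, one per line. Missing or null fields print nothing.
pr_head_source() {
    jq -r '(.head.repo.clone_url // empty), (.head.ref // empty)'
}

# Example use inside the step (not run here; assumes no spaces in either value):
#   set -- $(printf '%s' "$resp" | pr_head_source)
#   git clone --branch "$2" "$1" user-clone
```

If the fork has been deleted, head.repo comes back as null, so the output has fewer than two lines and the step should fall back to continuing the job.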