CIRCLE_OIDC_TOKEN is sometimes missing with parallel builds

We have a job set up to use openid-connect-tokens for AWS. Recently it has started hitting a situation where, say, 3 of the 24 concurrent jobs are missing CIRCLE_OIDC_TOKEN in the env, and of course those specific steps of the overall parallel job immediately fail in the app boot phase.

The other ~21 steps of that concurrent job have the value correctly set and happily work as designed.

Any suggestions?

As a note: in the end I contacted CircleCI support, and the engineers pushed out a fix for this.

We’re still seeing side effects of this, though the frequency of failures has been decreasing over the past few days.

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements
== Setup AWS profile (assumed roles) =====================

The issues seem to have resolved for us in the past 48 hours, now that AWS is able to get a response from CircleCI within 2 seconds.

In my case, since I was told by CircleCI support that this was a “known aws problem”, I ended up having to implement the auth attempts in a retry/sleep loop (in addition to tuning SDK behavior around retries, etc.). I still randomly see failures here and there, but agreed, it's far less common now vs. ~last week.
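For anyone wanting to try the same workaround, here is a rough sketch of what a retry/sleep loop around the auth attempt could look like. This is not the exact code I run; the helper name `retry_with_backoff`, the attempt count, and the delays are all made up for illustration, and the `attempt` callable stands in for whatever performs the AssumeRoleWithWebIdentity call in your setup.

```python
import time


def retry_with_backoff(attempt, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call attempt() until it succeeds or max_attempts is used up.

    Hypothetical helper: waits base_delay * 2**n seconds after the n-th
    failure (n starting at 0) and re-raises the last error if every try fails.
    """
    for n in range(max_attempts):
        try:
            return attempt()
        except Exception:
            # Out of attempts: surface the original error to the caller.
            if n == max_attempts - 1:
                raise
            # Exponential backoff before the next try: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** n)
```

You would pass in a function that does the actual auth, for example one that calls boto3's STS `assume_role_with_web_identity` with the value of CIRCLE_OIDC_TOKEN. In a real job it is probably better to catch only the InvalidIdentityToken error rather than every exception, so that genuine misconfigurations still fail fast.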

That “known aws problem” is CircleCI taking more than 2 seconds to respond to an OIDC request. In the error message, CircleCI is the identity provider, so the error is surfaced by AWS, but the root cause is CircleCI's response time:

Couldn't retrieve verification key from your identity provider
