What happens to builds during GitHub outages


First of all, sorry: I wasn’t sure where to put this, and since I think this is also a bug, I chose this tag.

I’m currently building a service that depends on the GitHub status API, and we’re testing a few things with CircleCI as well as with other CI vendors. During the last GitHub outage on Jan 13, we had all kinds of trouble, and some builds behaved strangely. One case stood out: I have a build that has been stuck in the status “NOT RUNNING” since Jan 13. My assumption is as follows:

  • GitHub sent the webhook request to CircleCI
  • CircleCI registered the build and tried to send the “pending” status update to the GitHub API, but that request failed
  • The build has been stuck ever since

Of course, when GitHub is down there are many more error scenarios, most of which are pretty hard to recover from, but I’d love to know whether my assumptions are correct and whether you think some sort of retry mechanism (exponential backoff, etc.) would be possible here. Maybe it already exists, but only on paid plans?
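
To make the retry idea concrete, here is a minimal sketch in Python of what I have in mind. I have no idea how CircleCI handles this internally, so everything here is an assumption: `retry_with_backoff` and its parameters are hypothetical names, and `operation` stands in for whatever call posts the “pending” status to GitHub.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Hypothetical helper, not an existing CircleCI or GitHub API.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles on every failure (1s, 2s, 4s, ...),
            # capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Pick a random wait below the ceiling so that many builds
            # do not all retry at the same instant once GitHub is back.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

With something like this, a failed status update would be retried a few times over an increasing interval instead of leaving the build in limbo after the first failure. For an outage lasting hours, the attempts would probably have to be spread out much further or moved to a queue, which this sketch does not cover.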

The reason I’d love to know more about these scenarios is that I’m trying to figure out how to correctly distinguish build errors or service errors from “this repo currently doesn’t have CI set up”.
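
For context, this is roughly the classification I’m trying to arrive at, sketched in Python. It rests on several assumptions of mine: the input is a simplified list of commit statuses (each with a `state` of `error`, `failure`, `pending` or `success` and an `updated_at` timestamp), `upstream_degraded` is a flag my service would set from a status page, and the two-hour staleness threshold is arbitrary. `classify` and `CiVerdict` are hypothetical names. One wrinkle, as far as I understand it: GitHub’s combined status also reports `pending` when a commit has no statuses at all, so an empty list has to be checked separately.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, List


class CiVerdict(Enum):
    NO_CI = "no_ci"                  # nothing ever reported a status
    BUILD_FAILED = "build_failed"    # CI ran and the build failed
    SERVICE_ERROR = "service_error"  # CI or GitHub itself misbehaved
    IN_PROGRESS = "in_progress"
    PASSED = "passed"


def classify(
    statuses: List[Dict],
    upstream_degraded: bool,
    now: datetime,
    stale_after: timedelta = timedelta(hours=2),  # arbitrary threshold
) -> CiVerdict:
    """Hypothetical classifier; the input shape is my own simplification."""
    if not statuses:
        # During an outage, a missing status does not prove CI is absent.
        return CiVerdict.SERVICE_ERROR if upstream_degraded else CiVerdict.NO_CI

    states = {s["state"] for s in statuses}
    if "failure" in states:
        return CiVerdict.BUILD_FAILED
    if "error" in states:
        # Assumption: "error" means the CI service could not finish the build.
        return CiVerdict.SERVICE_ERROR
    if "pending" in states:
        oldest = min(s["updated_at"] for s in statuses if s["state"] == "pending")
        # A status that has been pending for too long is treated as stuck.
        if now - oldest > stale_after:
            return CiVerdict.SERVICE_ERROR
        return CiVerdict.IN_PROGRESS
    return CiVerdict.PASSED
```

The stuck build described above would not even show up here if the “pending” status never reached GitHub, which is exactly why I’d like to understand what happens on the CI side in that case.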

I can provide build details, but for obvious reasons I’d rather not share them on this semi-public forum.