Continuous Deployment Serialisability

Hi all, we’ve been using CircleCI on my team for about a year now, having switched from Jenkins. One of the things we understood about CircleCI was that it was both a CI and CD tool.

There are a number of requirements for reliable delivery, but one of the core requirements for most automated deployments is strictly ordered deployments.

This means that if you push changes A and then B (ordered that way in git history), then B will never deploy before A. There are nuances to this: some systems may coalesce A and B into the same deployment, others may define critical regions so that A and B run in parallel for most of the build but are strictly ordered for the parts with side effects, and so on. The core premise is that you can’t deploy in the wrong order.
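
To make the coalescing variant concrete, here is a minimal Python sketch. It assumes each change gets an integer sequence number matching its position in git history, and that deploying a change also ships every earlier change. The function name `next_deployment` and its parameters are made up for illustration and are not part of any CI product.

```python
from typing import Optional, Sequence


def next_deployment(pending: Sequence[int], last_deployed: int) -> Optional[int]:
    """Pick the next change to deploy, or None if there is nothing to do.

    `pending` holds the sequence numbers of changes that are built and
    ready (higher number = later in git history). Anything at or before
    `last_deployed` is skipped, so an older change can never go out after
    a newer one. All newer pending changes are coalesced into a single
    deployment of the latest one.
    """
    newer = [seq for seq in pending if seq > last_deployed]
    return max(newer) if newer else None
```

With A = 1 and B = 2: if both are ready and nothing has shipped yet, `next_deployment([1, 2], 0)` picks B, which carries A with it. If B has already shipped and A’s build finishes late, `next_deployment([1], 2)` returns `None` and A is never deployed on top of B.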

This is critical for anything that deploys schema changes or does data migrations, but is typically a very useful property to have in all circumstances. It’s essentially table stakes for a continuous delivery product.

Jenkins, for all its faults, does a pretty good job at this, and it was never something that required any thought from us. In ~6 years of operating Jenkins, this was never an issue we encountered.

Unfortunately CircleCI doesn’t have this. We were pointed towards the Queue Orb, which has helped us with the transition to CircleCI. But with our team growing and deployment velocity increasing, the race conditions in the orb and its lack of flexibility around critical sections mean we’re having more and more issues with deployments going out in the wrong order. Also, because the queueing is implemented in the build, we pay build minutes for the time our builds spend queueing, which is starting to become a significant proportion of our build-time bill.

We’ve attempted to speak to our account manager about this, but after a few emails back and forth he went silent on us, and hasn’t responded to an email in months. The answer I always get from CircleCI support is a link to their “ideas” board, which isn’t a great sign for critical product features.

So I was wondering, how do others solve this?
How do you ensure your deployments go out in-order?
How do you enable queueing builds?
Is it possible to queue builds without being billed build minutes for the queueing time?
Are there alternative services that get this right?

This has been a challenge for a long time. I think CircleCI is going to continue to invest in the “CD” portion of the platform, but for now, there is no great out of the box solution for this.

In the past, I have leveraged the API along with some hacky scripts to poll for running jobs and trigger new ones. My scripts are no better than the queue orb: they also have some race conditions and run inside the build.
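
For anyone who hasn’t seen this kind of script, the polling approach looks roughly like the Python sketch below. It is not the actual script or the queue orb’s code. `list_running_builds` is a hypothetical callable you would have to implement yourself against the CI API; it is assumed to return the build numbers currently running for the project. The other names and defaults are also just illustrative.

```python
import time
from typing import Callable, Iterable


def wait_for_turn(
    my_build: int,
    list_running_builds: Callable[[], Iterable[int]],  # hypothetical API wrapper
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until no older build is running; return False on timeout."""
    waited = 0.0
    while True:
        # Any running build with a lower number was started before ours.
        older = [b for b in list_running_builds() if b < my_build]
        if not older:
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

The weaknesses are visible in the sketch: the check and the deployment that follows it are not atomic, so two builds can both conclude it is their turn, and the loop runs inside the job, so every second spent sleeping counts as build time.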

Recently I have been exploring using Spinnaker or Octopus for the deployment portion while using CircleCI for the CI portion. They integrate pretty seamlessly. Depending on your specific application and deployment needs, these might be good options.

I don’t believe there is a commercial “all-in-one” CI and CD platform that gets this 100% right today.


EDIT: Clarified “commercial all-in-one CI and CD platform” since, as you mentioned, Jenkins is able to handle this well with plugins.

P.S - Also, for anyone else who is interested in this feature, this is the Idea (Feature Request). Please vote!

@levlaz thanks for your input. We’ve considered things like Spinnaker in the past, but it’s probably an order of magnitude more complex than the system we have at the moment, so it’s not something we really want to introduce. It also looks like it would add a lot more overhead: we’re currently about 15 mins from push to code being live and would prefer to reduce that, not increase it.

I don’t believe there is a commercial “all-in-one” CI and CD platform that gets this 100% right today.

This is a tricky one. I’m inclined to agree with you in parts. Jenkins does manage this and technically has commercial offerings, although it’s not easy and most aren’t using it commercially.

Semaphore actually manages to do this, but that’s because they don’t run any parallel test runs for a given branch, which is actually probably a smart way to get around the problem – I can’t think of a great use-case for parallel test runs on the same branch.