I’ve shared this with our account rep already, but I figured it would be good to get a conversation started around feedback on workflows, because I’ve compiled a list which would help our company in our journey, and I’m sure others have things to add.
Note: This was written a couple of weeks before I published it on the forum, so some points may have changed.
Repetitive Setup and Teardown
On parallel builds, commands can be run in parallel, and all other commands run on all containers. Unfortunately, this works on a per-job basis only; it would be nice to be able to define steps that always run for every job in a workflow, similar to the way 1.0's `dependencies` section works.
If I’m building a bunch of Docker containers, pushing the images to a private registry, and deploying them to a container service like GKE or ECS, I have to copy and paste the same steps into every job to check out the code, set up the Google Cloud context (zone, project, cluster, credentials), set up remote Docker, and so on.
In this instance, I think a solution would be to allow an individual job's config to define `setup_steps` and `teardown_steps` for jobs within the workflow, so that it becomes easy to make a default set of YAML steps with an anchor and include them in a job by reference, like so:
```yaml
defaults:
  docker_builds: &docker_builds
    setup_steps:
      - checkout
      - setup_remote_docker:
          reusable: true
      - run:
          name: "Set up Google Cloud"
          command: ...
    teardown_steps:
      - run:
          name: "My API Call"
          command: ...

version: 2
jobs:
  my_job:
    ...
    <<: *docker_builds
    steps:
      - run:
          name: "My step unique to this job"
          command: ...
```
In the above example, `my_job` would be a single Docker container build for, say, a microservice. Given many microservices, this sort of modularity could come in quite handy.
An added bonus of this method would be that the setup steps for the next job in the workflow could run before the job it requires finishes, so that the workflow completes far more quickly (assuming parallel containers are available).
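For what it's worth, a partial workaround available today might be sharing an entire `steps` list through a YAML anchor. It only helps when jobs share *all* of their steps, though, since YAML can't splice a shared list into a longer per-job one. A minimal sketch (the job name and image are made up):

```yaml
shared_steps: &shared_steps
  - checkout
  - setup_remote_docker
  - run:
      name: "Set up Google Cloud"
      command: ...

version: 2
jobs:
  build_service_a:
    docker:
      - image: docker:17.06-git
    steps: *shared_steps    # whole list reused; no way to append job-specific steps
```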
Specifying Manual Builds Only
Obviously CCI is working on API calls for workflows, because the docs reflect that, but at the moment every workflow runs as long as the branch is included and/or not excluded. We need a way to specify which jobs should run automatically and which should run manually only. Beyond the API, having an interface to run defined workflows on-demand (with build parameters) would be fantastic.
At the moment we have to either push to git and then cancel the jobs we don’t want, or manually rebuild the workflow.
Run Job Only on Failure
Some steps of the workflow we're setting up create a Kubernetes namespace in order to sandbox our test environment deployments from each other. If a build fails and is re-run, we get an error because a namespace with the same name already exists (we use an identifier similar to a Git hash for the namespace names). It's easy enough to add a command that checks for the existence of the namespace (which is how we currently get around it), but it would be nice to have a way to run jobs only on failure (similar to `requires` on the job) so that we can spin down anything we spun up that is no longer needed.
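The existence check we use could be sketched roughly like this (the namespace naming scheme and exact commands are simplified stand-ins, not our literal config):

```yaml
- run:
    name: "Create sandbox namespace if absent"
    command: |
      # Derive a namespace name from the Git revision (simplified).
      NAMESPACE="test-${CIRCLE_SHA1:0:8}"
      # Only create it when it does not already exist, so re-runs don't fail.
      if ! kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
        kubectl create namespace "$NAMESPACE"
      fi
```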
Thoughts on Parallelism
With many projects on the same account, parallelism settings become important so that one developer's builds don't block everyone else's for too long.
With workflows, depending on the jobs' `requires` settings and the total number of containers, the queue could max out quite easily. For this reason, it might be worthwhile to look into a `parallelism` configuration option that caps the number of containers used by the entire workflow at a time. I can see how this would be tricky, since jobs have their own parallelism, but perhaps if the workflow's `parallelism` were configured below the highest `parallelism` setting of any of its jobs, a configuration error would appear?
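To make the idea concrete, the hypothetical workflow-level cap might look something like this (a `parallelism` key under a workflow does not exist today; this is purely proposed syntax with invented job names):

```yaml
workflows:
  version: 2
  deploy_test_env:
    parallelism: 4          # proposed: never occupy more than 4 containers at once
    jobs:
      - build_service_a
      - build_service_b
      - build_service_c
      - deploy:
          requires:
            - build_service_a
            - build_service_b
            - build_service_c
```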
Also—from what I can tell—the jobs queue up in random order rather than the order in which they appear in the configuration file, which is slightly confusing.
Our two most active projects in CircleCI are `qa` and `platform`. The `qa` project is making use of Workflows, and the `platform` project is still on 1.0 with a setting of 3 containers per build.
I have set up our test environment deployment workflows in such a way that 9 microservice containers build at once (in parallel) and there are 11 jobs in total (each with a `parallelism` of 1). Given that math, one build of the workflow on the `qa` repo could mean that a developer who triggers a second build has to wait for the entirety of the microservices to build, or for the first `platform` build to complete, before their build even starts.
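That fan-out could be sketched like this (job names are invented; the real config has 9 parallel build jobs plus a setup and a deploy job, for 11 total):

```yaml
workflows:
  version: 2
  deploy_test_env:
    jobs:
      - prepare
      - build_service_1:
          requires: [prepare]
      # ...build_service_2 through build_service_9, all requiring `prepare`...
      - deploy:
          requires:
            - build_service_1
            # ...through build_service_9...
```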
Smaller Bullet Points
- The UI does not update the workflows list for a good 7-10 seconds when they fail, are canceled, are rebuilt, or a new workflow is executed.
- The “Builds” list does not list workflow builds on the main repo list of builds, but does list them when focused on a particular branch.
- Not sure if this is more of an over-arching CCI 2.0 thing or a workflows thing, but when we debug with SSH, the container is not in the same state that it was upon failure of the build. The filesystem is the same, but I’m assuming since it’s Docker magic that there’s a reason for this.