Firstly, let me say I love the 2.0 product, which we’ve been happily using for about a month.
We’ve been seeing dramatic variation in build time, which seems to imply fairly intense network congestion or contention on some build nodes. Some builds take much, much longer to do network-bound things like pulling down Docker images, cloning git repos, and sending build context to remote Docker engines.
On a “fast” node, the git clone takes ~10s, but on a slow node it takes 3-4 minutes. The slowdown compounds, because we also do Docker builds, which require sending context over to a remote Docker engine, and there’s a tight correlation between slow git clones and slow Docker builds. It all points to network contention on some nodes, but not on all of them.
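In case it helps pin this down, here is a minimal sketch of how the network-bound steps could be timed so that fast and slow nodes can be compared side by side. The helper names (`time_command`, `format_timing`) are made up for illustration and nothing here is a CircleCI API; it only wraps an arbitrary command and measures wall-clock time.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()  # monotonic clock is unaffected by system clock changes
    result = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed


def format_timing(label, exit_code, elapsed):
    """Build one log line that can be grepped out of the build output later."""
    return f"{label}: exit {exit_code} in {elapsed:.1f}s"
```

A build step could then call something like `time_command(["git", "clone", repo_url])`, where `repo_url` is a placeholder, and print the result of `format_timing` so the timings from different nodes end up in the build logs.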
On a “fast” node, we get builds on the order of 3-4 minutes, which is fine for my team, but on a slow node that also happens not to have cached our build image, a build can take nearly 20 minutes. We gate merging of PRs on successful builds, so it’s potentially a real productivity hit for us.
Did something change in the network configuration or utilization in the past week? It’s gotten much worse recently.