Non-graceful shutdown after enabling Auto-cancel redundant workflows

Hi CircleCI Support,

I’m reaching out regarding an issue where Terraform jobs don’t gracefully shut down after a new commit when the Auto-Cancel Redundant Workflows feature is enabled. This issue is causing problems with our Terraform state, as the jobs are being abruptly terminated before they have a chance to finalize the state properly.

To address this, I attempted to capture and handle the termination signals by using a signal trap in the pipeline to catch TERM, KILL, and other signals. Despite this, I’m unsure about the exact timing needed to allow Terraform to complete its cleanup and finalize the state before the job is canceled.
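
For reference, the kind of trap described above could be sketched roughly as follows. This is only an illustration under several assumptions: `run_with_forwarding` is a hypothetical helper name, not a CircleCI or Terraform feature; it assumes the step's shell receives SIGTERM when the workflow is canceled; and it assumes Terraform reacts to the forwarded signal by finishing its current operation and writing state. SIGKILL cannot be trapped by any process, so only signals such as TERM and INT can be handled this way, and the sketch does not change how long the platform waits before killing the job.

```shell
#!/bin/sh
# Sketch only: run a command in the background, forward TERM/INT to it,
# and keep waiting until it has exited so it can finish its own cleanup.
# run_with_forwarding is a hypothetical helper name.

run_with_forwarding() {
  got_signal=0

  # Run in the background: a shell defers its trap handlers while a
  # foreground child is running, so the signal would not be forwarded in time.
  "$@" &
  child=$!

  # Forward the signal to the child. SIGKILL cannot be trapped.
  trap 'got_signal=1; kill -TERM "$child" 2>/dev/null' TERM INT

  # wait returns early when a trapped signal arrives, so wait again
  # until the child has really exited and its own status is available.
  wait "$child"
  status=$?
  while [ "$got_signal" -eq 1 ]; do
    got_signal=0
    wait "$child"
    status=$?
  done

  trap - TERM INT
  return "$status"
}

# Hypothetical usage inside a run step:
#   run_with_forwarding terraform apply -auto-approve
```

The helper returns the wrapped command's own exit status, so a step that was interrupted still reports whatever Terraform reported on its way out.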

It seems that the default auto-cancel behavior may not give Terraform enough time to cleanly shut down, resulting in broken states. I would like to pass the necessary time for graceful shutdown directly into the steps themselves, allowing Terraform to handle the termination more effectively and avoid state corruption.

Could you assist us with this or provide guidance on how to better manage the shutdown process for Terraform within this new auto-cancel workflow configuration?

There is some functionality coming out shortly that would let you run a “clean-up” job only on “cancel”. Would that be sufficient to handle the Terraform termination or does it need to happen in the context of the actual job that was cancelled?

The issue is that when we run a Terraform deploy step with auto-cancel workflows enabled, the job stops in the middle of the deployment, which leads to corruption of the state. So the cleanup needs to run in the context of the job itself.
How do you suggest I approach this?

Darn. I don’t think there’s an easy solution here if you still want to use the auto-cancel redundant workflows feature. Do you need to use that feature? It’s typically used for heavy iteration in a build & test motion, not really for deployments (which I’m assuming is what you’re using Terraform for?).

An option could be to turn off CircleCI’s “auto-cancel” setting and write your own logic: https://support.circleci.com/hc/en-us/articles/360058421851-Auto-cancel-workflow-via-run-step

Then you have full control over the auto-cancel behavior, which you should be able to fine-tune.