[Product Launch] Chunk Tasks - fixing flaky tests

luisejrobles · August 18, 2025, 8:37pm

Hello CircleCI community!

Now in beta: a new agentic capability that identifies and provides fixes for flaky tests in your CircleCI projects, helping you ship quickly with confidence by reducing time spent debugging intermittent failures.

Have feedback or feature requests? Submit them on our Ideas board where you can also see existing feature requests and vote on them.

Getting Started

Prerequisites:

An Anthropic or OpenAI API key to enable the agent to process and generate flaky test fixes using your existing model provider. Your source code is neither stored nor used for training purposes by CircleCI.

For OpenAI, make sure your org has gpt-5 model access and your organization is verified. You can read more at OpenAI Organization Verification Guide. If organization verification isn’t possible, read our FAQ section What if I can’t get my organization verified when using OpenAI?.
Test results stored on CircleCI: To enable CircleCI to detect test flakiness, you need to store your test results using the store_test_results step in your CircleCI YML configuration file. Learn more about collecting test data here.
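
As a rough illustration of the test-results prerequisite, the sketch below shows a job that writes a JUnit XML report and uploads it with store_test_results. The job name, image, directory name, and the use of pytest are assumptions chosen for the example, not requirements of Chunk; substitute your own test runner and report path.

```yaml
version: 2.1
jobs:
  test:                              # hypothetical job name
    docker:
      - image: cimg/python:3.12
    steps:
      - checkout
      - run:
          name: Install dependencies
          # Assumes pytest is listed in requirements.txt
          command: pip install -r requirements.txt
      - run:
          name: Run tests and write a JUnit XML report
          command: |
            mkdir -p test-results
            pytest --junitxml=test-results/junit.xml
      # Uploads the report directory so CircleCI can track per-test history
      - store_test_results:
          path: test-results
workflows:
  test:
    jobs:
      - test
```

Any runner that emits JUnit-style XML should work the same way, as long as the path given to store_test_results contains the report.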

Setup:

Navigate to the CircleCI web app → Your organization → Chunk Tasks (from the left-hand navigation) → Get Started
Install the CircleCI GitHub App in your desired GitHub Organization so the agent can create Pull Requests with recommended fixes.
Add your Anthropic or OpenAI API key.
Select the followed project where the task should be assigned.
Configure your preferred settings:
- Run frequency (daily/weekly/monthly)
  - Daily: Runs Sunday-Thursday at 22:00 UTC
  - Weekly: Runs on Sunday at 22:00 UTC
  - Monthly: Runs on the first day of the month at 22:00 UTC
- Maximum tests to fix per run
- Number of solutions to try per test
- Number of validation runs per test
- Maximum number of concurrent open PRs.
The agent creates an environment to run your tests by inferring setup from your repository. If you’d like more control, you can customize it with a .circleci/cci-agent-setup.yml file on your default branch. Learn more in our FAQ: Unable to run verification tests.

How It Works

When running, the agent identifies flaky tests based on the tests that are marked flaky in CircleCI’s Test Insights.
The agent generates potential fixes for the flakiness based on the Number of solutions to try per test setting that was configured during setup.
The agent validates solutions through multiple test runs to confirm the flakiness has been removed, based on the Number of validation runs per test setting that was configured during setup.
The agent opens a pull request with the proposed fix.

The Agent tasks list in the CircleCI web app → Your organization → Chunk Tasks tab shows one row for every test being analyzed.

If the agent does not run into an error while analyzing and attempting to fix a given test’s flakiness, it opens a PR with a proposed fix.

Each agent task will have two tabs:

Code Diff: the proposed code changes
Logs: the agent’s reasoning and analysis

If the agent lacks confidence in the fixes or runs into an error during execution, a PR is not created, but logs and analysis remain available for review.

Known limitations

Editing agent task configurations

Currently, there is no way to directly edit task configuration settings including post-run commands once an agent task is created. The workaround for now is to delete the Chunk task and recreate it:

- Navigate to Organization settings > Chunk settings > Delete the current agent task.
- Create a new agent task > customize your settings

OpenAI Zero Data Retention Compatibility Issue
If you’re using OpenAI as your model provider and see all Chunk tasks marked as “Not fixed” with “Could not diagnose a fix” messages and empty Logs tabs, the issue may be that your OpenAI account has Zero Data Retention enabled. Chunk does not yet support OpenAI accounts with Zero Data Retention.

Ad-hoc tasks

In the CircleCI web app, navigate to Organization settings > Chunk settings > “…” > Submit ad-hoc task. From a branch that already exists, you can ask Chunk to accomplish any task you’d like (e.g., “remove the outdated call-to-action from my web app’s home page”). It will push its changes to the branch that you select. For these tasks, Chunk runs in the environment that you define in your cci-agent-setup.yml file (read more about Chunk’s environment here).

Join the Beta

Join the waitlist to get access


No extra cost during beta. Uses compute credits and your AI provider tokens. This will be a paid feature after beta.

luisejrobles · September 5, 2025, 5:08pm

Chunk Tasks - Latest updates (09/05/2025)

We’ve been actively improving the agent based on your feedback. Here are the latest enhancements:

When Assigning an Agent task, you can now configure the Maximum number of concurrent open PRs for the project.

Screenshot: Chunk Task > Assign Task modal

Better User Experience

We’ve improved agent logs within Agent tasks to be more human-friendly and conversation-like. Logs now clearly distinguish between Assistant (instructions being executed by the agent) and User (output from the agent), making it easier to follow the agent’s workflow.
In the Agent task Logs tab, we’ve added an Expand/Collapse All toggle to streamline troubleshooting. This allows you to quickly expand all logs and use Cmd+F (or Ctrl+F) to search for specific words or commands.
We’ve enhanced PR bodies and run summaries with human-readable descriptions for better clarity.

Screenshot: Pull request summary generated by the agent
Agent-generated branches now use circleci/fix-flaky-test-<id> instead of the previous fix-flaky-test-<id> format, making them easier to identify and filter in your repository.
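
One way to take advantage of the shared branch prefix is a workflow branch filter. The sketch below is only an illustration: the job name, image, and placeholder command are assumptions, and the pattern simply matches any branch that starts with circleci/fix-flaky-test-, since the exact format of <id> is not specified here.

```yaml
version: 2.1
jobs:
  test:                              # hypothetical job name
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run: echo "replace with your test command"
workflows:
  agent-branches-only:               # hypothetical workflow name
    jobs:
      - test:
          filters:
            branches:
              # Regex filters are wrapped in slashes and must match the full branch name
              only: /circleci\/fix-flaky-test-.*/
```

Swapping only for ignore would instead exclude agent-generated branches from the workflow.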

Behind the Scenes

Enhanced Execution Environment: The agent now uses a machine executor for better performance and reliability when running tests.


luisejrobles · September 30, 2025, 11:49pm

Chunk - FAQs


Does CircleCI use my data to train the models?

Your source code is neither stored nor used for training purposes by CircleCI.

What data does Chunk access?

Chunk accesses historical build data and repository contents to identify and fix flaky tests. This is the same information that CircleCI already has access to through your existing CircleCI configuration.

Will my test results be shared with other customers?

Your usage data, including test results, will not be used for any other customer. Each customer’s data remains isolated and is only used to support their own Chunk tasks.

How long are agent logs stored?

We store agent logs for 90 days. This is a fixed retention period that applies to all organizations, regardless of your plan’s standard data retention policy. After 90 days, logs are automatically deleted to keep your workspace at optimal performance.

OpenAI organization verification required. Please verify your organization at…

When encountering the message:
OpenAI organization verification required. Please verify your organization at https://platform.openai.com/settings/organization/general and see our community forum for more debugging help
inside an agent task, it indicates that your OpenAI organization verification is still pending.

To fix this: In OpenAI Platform navigate to General > Organization settings and click ‘Verify Organization’ to follow the necessary steps to have your organization verified.


Additional help: OpenAI Organization Verification Guide

What if I can’t get my organization verified when using OpenAI?

If organization verification isn’t possible, you can bypass this requirement by adding an environment variable:

Go to Organization Settings > Contexts > circleci-agents
Add new Environment Variable:
1. Name: CCI_AGENT_OPENAI_MODEL
2. Value: gpt-5-nano


Invalid OpenAI model specified. Please check the model name and ensure it is available for your account.

When encountering the message:
Invalid OpenAI model specified. Please check the model name and ensure it is available for your account.
you need to make sure your organization has gpt-5 model access.

To verify this: In OpenAI Platform

Switch to the project you want to check (top-left dropdown).
Go to Settings → Limits in the left-hand menu.
- This page shows the models and rate limits for your project.
- If gpt-5 is listed, you have access. If not, that project doesn’t.
You can also check in the Playground:
- Select the project (top-left).
- Open the Model dropdown. Only available models will appear.


Action required - agent execution error

When encountering the message:
Action required - agent execution error
The agent ran into an error while executing this task. See our community forum for how to solve this error.
Email us at sebastian@circleci.com and we’ll help you figure it out.

Unable to run verification tests

Chunk runs in a Linux Machine VM with basic software installed by default. To verify that a proposed fix resolves flakiness, it re-runs the affected test several times. To do this, the agent may install additional software needed to set up the test environment, using clues from your .circleci/config.yml to determine how to run the tests.

You can view these attempts in the CircleCI web app by opening Chunk Tasks → selecting a task → Logs → Expand All, then searching for “run the command for each attempt.” This takes you to the sections where the agent tries to run the tests.

Improving verification success

Create an “agent environment” CircleCI YML file. This file lets you copy the environment-setup parts of your existing CircleCI config into a dedicated file for Chunk. Name the file cci-agent-setup.yml and ensure that it is present in your .circleci directory and on the default branch.

Chunk supports all standard CircleCI configuration options. This includes executors, resource classes, caching, contexts, environment variables, service containers, orbs, and everything else you’d use in a normal CircleCI pipeline. If it works in your .circleci/config.yml, it works in cci-agent-setup.yml. For a complete reference of available configuration options, see the CircleCI Configuration Reference.

Example cci-agent-setup.yml files:

Basic Python setup

version: 2.1
workflows:
  cci-agent-setup:
    jobs:
      - cci-agent-setup
jobs:
  cci-agent-setup:
    docker:
      - image: cimg/python:3.12
      - image: cimg/postgres:15.3
    steps:
      - checkout
      - run:
          name: Install dependencies
          command: |
            pip install -r requirements.txt

With Caching and Contexts

version: 2.1
workflows:
  cci-agent-setup:
    jobs:
      - cci-agent-setup:
          context: 
            - my-team-context  # Includes any secrets/env vars from this context
jobs:
  cci-agent-setup:
    docker:
      - image: cimg/node:18.0
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-dependencies-{{ checksum "package-lock.json" }}
      - run:
          name: Install dependencies
          command: npm install
      - save_cache:
          paths:
            - node_modules
          key: v1-dependencies-{{ checksum "package-lock.json" }}

With multiple services

version: 2.1
workflows:
  cci-agent-setup:
    jobs:
      - cci-agent-setup
jobs:
  cci-agent-setup:
    docker:
      - image: cimg/ruby:3.2
      - image: cimg/postgres:15.3
        environment:
          POSTGRES_USER: circleci
          POSTGRES_DB: test_db
      - image: redis:7.0
    steps:
      - checkout
      - run:
          name: Wait for DB
          command: dockerize -wait tcp://localhost:5432 -timeout 1m
      - run:
          name: Install dependencies
          command: bundle install
      - run:
          name: Setup database
          command: bundle exec rake db:setup

With custom resource class and machine executor

version: 2.1
workflows:
  cci-agent-setup:
    jobs:
      - cci-agent-setup
jobs:
  cci-agent-setup:
    machine:
      image: ubuntu-2204:2024.01.2
    resource_class: large
    steps:
      - checkout
      - run:
          name: Install dependencies
          command: |
            sudo apt-get update
            sudo apt-get install -y build-essential

Environment Variables & Contexts

Project environment variables: Chunk automatically has access to any environment variables you’ve configured at the project level in CircleCI. You don’t need to recreate or reference these; they’re already available.

Contexts: If you’re using CircleCI contexts to manage secrets or environment variables, include the context in your cci-agent-setup job (as shown in the caching example above). Chunk will have access to all variables from that context, so there is no need to recreate them manually.
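
To make the context behavior concrete, here is a minimal sketch of a setup job that consumes a secret from a context inside a step. The context name my-team-context is reused from the caching example; the NPM_TOKEN variable and the private-registry scenario are hypothetical and stand in for whatever secrets your own setup needs.

```yaml
version: 2.1
workflows:
  cci-agent-setup:
    jobs:
      - cci-agent-setup:
          context:
            - my-team-context        # variables in this context are exposed to the job
jobs:
  cci-agent-setup:
    docker:
      - image: cimg/node:18.0
    steps:
      - checkout
      - run:
          name: Install dependencies from a private registry
          command: |
            # NPM_TOKEN is a hypothetical variable assumed to be defined in my-team-context
            echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" > ~/.npmrc
            npm ci
```

Project-level environment variables can be referenced in the same way, without listing a context.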

Testing Your Environment Setup

To build & iterate on Chunk’s environment, navigate to Organization Settings → Chunk Tasks → Identify desired Agent Task → Select [ … ] → Select [ Chunk Environment ]. This page lets you run the contents of your cci-agent-setup.yml file on a specific branch and immediately see the results from those ad-hoc tasks. Use the Custom button to submit a task to Chunk and see the results.

Merge the cci-agent-setup.yml file to your default branch when the results on the environment setup page are satisfactory.

Additional Guidance for Chunk

To improve Chunk’s ability to run tests & produce fixes that are aligned with stylistic/architectural preferences, many users also include a markdown file (claude.md or agents.md) in the root of their repo with instructions for running tests. Chunk should pick this up automatically.

Changing Chunk’s model provider

Currently, Chunk can only have one model provider installed at a time. To change your model provider:

Navigate to Organization settings → Chunk tasks from the left navigation
Click Edit in Contexts
Select circleci-agents, scroll down to Environment variables and delete your current model provider API key
Click on Add environment variable and input your new API key information:
- Environment variable name: Enter OPENAI_API_KEY or ANTHROPIC_API_KEY (depending on your model provider)
- Value: Enter your model provider API key
- Click Add environment variable

Task Summary or Pull Request Body Too Long or Poorly Formatted

If you’re noticing that your Chunk Task responses appear incomplete or poorly formatted, this may indicate that your API key needs to be configured for a more capable model.

Identifying the Issue
When viewing a Chunk Task through the CircleCI UI → Chunk Tasks, you might observe these indicators of suboptimal model performance:

Inconsistent formatting: The task body lacks properly bolded section headers for Run Summary, Root Cause, Proposed Fix, and Verification
Interactive prompts: Chunk Task body ends with open-ended questions like:

“Would you like me to implement the robust wait pattern in the test now, and add a small helper for future tests? If yes, I’ll apply the changes and run the targeted tests.”

To guide you, here are some examples of how a Chunk Task body should look:

Root Cause
These formatting and content issues typically occur when Chunk is using a less powerful language model to analyze and propose fixes for flaky tests.

Resolution
To fix this, ensure your OpenAI organization is verified and has access to gpt-5.
For details on verification requirements, see “OpenAI organization verification required. Please verify your organization at…” in our FAQs.

Important: If your team previously overrode the model used by Chunk, you’ll need to remove that configuration to prevent using a lower-performance model:

Navigate to CircleCI web app > Organization Settings > Contexts > circleci-agents
Remove the CCI_AGENT_OPENAI_MODEL environment variable from Environment variables section

Start Task button disabled

We’ve noticed some users experiencing an issue where the Start Task button remains disabled in Chunk Tasks > Assign new task, even after filling in all required inputs.

Resolution

Look for a callout in the top navigation bar that says “Your GitHub identity is not verified” and click Authorize.

Select “Authorize CircleCI App” when asked
Try assigning a new Chunk Task again

This should resolve the issue and allow you to proceed with assigning a Chunk task to a project.

luisejrobles · October 16, 2025, 8:24pm

Chunk Tasks - Latest Updates (10/16/2025)


Better User Experience

Chunk now uses failed job step context to improve fix accuracy.
We’ve fixed an issue where the Chunk Tasks nav bar item wasn’t highlighting when selected, improving navigation clarity.
We’ve resolved a bug where pasting an API key would automatically close the setup modal, ensuring a smoother setup experience.
Chunk’s commits now include verified signatures. All commits created by Chunk are now properly signed and authored by circleci-app[bot]. This resolves issues where unsigned commits would require special handling or higher privileges to merge.
You can now guide Chunk with a custom instruction file. Create a fix-flaky-test.md file in your .circleci/ directory to provide specific guidance about how you want the agent to approach fixing flaky tests in your project. This gives you fine-grained control over the agent’s behavior and lets you encode your team’s testing best practices directly into the fix generation process. Example .circleci/fix-flaky-test.md file:


## Command Restrictions

- You MUST NOT use the `sleep()` command or `setTimeout()` for delays in any scripts
- You MUST NOT use `eval()` as it poses security risks
- Avoid using shell wildcards in destructive operations (e.g., `rm -rf *`)

## Code Style Preferences

- Prefer functional components over class components in React
- Use TypeScript `type` definitions instead of `interface` (this project enforces this via ESLint)
- Favor explicit error handling over try-catch-all patterns
- Use async/await syntax over Promise chains for readability

## Security Considerations

- Always flag use of `dangerouslySetInnerHTML` in React components
- Highlight any potential SQL injection vulnerabilities
- Point out hardcoded credentials or API keys
- Flag any use of `eval()` or `Function()` constructors

## Documentation Standards
- Complex algorithms MUST include explanatory comments

Enhanced Functionality

Chunk is now using the latest model from Anthropic by default: Claude Sonnet 4.5
The cci-agent-setup.yml configuration now works seamlessly with orbs and user-specified resource classes, giving you more flexibility in how you set up your agent environment.

Behind the Scenes

We’ve added file protection safeguards to ensure the agent respects and excludes sensitive files from commits
We’ve restored full detail in execution logs when using OpenAI as model provider. A regression that caused Chunk execution logs to show significantly less output has been resolved. When in CCI web app > Chunk Tasks > Chunk Task, Logs now provide complete visibility into the agent’s actions, making troubleshooting much easier when reviewing task history.

luisejrobles · October 29, 2025, 8:25pm

Chunk Tasks - Latest Updates (10/29/2025)

Better User Experience

We’ve improved visibility in Chunk Settings so you can now see all projects with scheduled tasks, whether you’re following them or not. Previously, the task configurations table only displayed projects you were actively following, which could make it harder to understand which projects had scheduled Chunk tasks configured across your organization. To delete tasks for both followed and unfollowed projects, you’ll need Project Admin or Organization Admin permissions.

For a complete reference of roles and permissions, see Roles and permissions.

We’ve streamlined Chunk’s environment customization with one-click setup file creation. When Chunk doesn’t find a cci-agent-setup.yml file in your repository, it will attempt to create one for you automatically. You can also manually generate one by navigating to Organization Settings → Chunk Tasks in the CircleCI web app, selecting the […] menu for your desired task, then going to Chunk Environment. From there, choose your branch and select [ Create File in GitHub ] to generate the file directly in your repository.

We’ve fixed a bug that removed the “More Details” section from Chunk’s pull request descriptions. This section now includes a direct link to view the full task execution in the CircleCI web app, making it easier to review Chunk’s work, examine logs, and understand the complete context behind each fix.

We’ve improved Chunk’s failure handling when both verification tests fail and pipeline monitoring encounters issues. Chunk now avoids opening pull requests when it can’t verify the fix worked, preventing potentially broken PRs from appearing in your repository.
Enhanced ad-hoc task capabilities. Chunk now commits changes directly from ad-hoc prompts. When you submit an ad-hoc task through Organization Settings > Chunk Settings, Chunk will push its changes to your selected branch, making it easier to iterate on custom tasks.

You’ll now see visual feedback in both Ad Hoc Tasks and the Chunk Environment results view as Chunk works through your custom instructions, making it easier to understand how long tasks will take.


Custom instruction files now supported in ad-hoc tasks and Chunk Environment. Chunk now automatically picks up guidance from agents.md, claude.md, and fix-flaky-tests.md files when running ad-hoc tasks and environment setup. Simply add these files to the root of your repository.
We’ve added a search filter to the branch dropdown in ad-hoc tasks. Instead of scrolling through all branches, you can now type to quickly find the branch you need, which makes it much easier to work with repositories that have many branches.


Chunk Tasks - Latest Updates (11/06/2025)


Chunk now validates changes by running your CI pipeline. For Ad-Hoc Tasks, after Chunk pushes changes to a branch, it triggers and monitors your CI pipeline to verify the changes. If the pipeline fails, Chunk will attempt to fix the issues and push updated changes.

This prevents the frustrating experience of receiving pull requests that fail basic checks like linting or formatting.

Note: Ad-hoc tasks currently update existing branches or create new ones

You can access Ad-Hoc Tasks by navigating to Organization Settings → Chunk Tasks, selecting the ‘⋮’ symbol in the Task configurations table, then Submit Ad Hoc Task.


Chunk Tasks - Latest Updates (11/18/2025)

We’ve improved Chunk’s logic to prevent re-opening PRs for tests that have already been successfully fixed. Once Chunk has addressed a flaky test and the fix has been merged, it will no longer attempt to fix that same test again, reducing unnecessary PRs and noise in your repository.
Low-credit warning for Anthropic users. Chunk now displays visual warnings throughout the UI when your Anthropic account credits are running low, helping you avoid unexpected task failures. You’ll see an “Insufficient Anthropic Credits” indicator on the Chunk Tasks dashboard, in Organization Settings, and on failed task detail pages. The warning includes clear steps to add credits at console.anthropic.com and clears automatically once your next scheduled run, ad-hoc task, or environment setup completes successfully with sufficient credits.
Chunk shows “Model restricted”. If you see a “Model restricted” chip or an error message stating “Cannot diagnose fix due to OpenAI Zero Data Retention policy” in your Chunk Tasks, your OpenAI organization has Zero Data Retention enabled, which prevents Chunk from analyzing your test data and generating fixes.
