Split by timings is not accurate



Hi all,

I’m having some issues with splitting by timing. I’ve seen other people get errors saying the timing data is not available, but that is not the case for me. I’m splitting tests by timing, but the timing data must be off, because some containers finish in 25 minutes while others take longer than 2 hours. Is there any way to look at the timing data to see if there is an issue with what has been saved?

Here is a copy of my splitting…
mvn -f rest/pom.xml -P common,stand-alone -Dtest=$(for file in $(circleci tests glob "rest/src/com/OURPACKAGE/**/*Tests*.java" "rest/src/com/OURPACKAGE/restTests/**Tests*.java" "rest/src/ADIFFERENTPACKAGE/**/*Tests*.java" | circleci tests split --split-by=timings); do basename "$file" | sed -e 's/\.java/,/'; done | tr -d '\r\n') -e test


We recently released timing-based test splitting in Workflows. The system will look through the most recent 50 builds for your timing data from a job with the same name as the current job.

The most common causes I see for the results being off are a) re-running failing tests and b) one file of tests taking far longer than any other file.

I recommend storing the tests in a variable to echo what you’re passing before running mvn.
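Something along these lines might work as a starting point. It is only a sketch: `to_test_list` is a helper name made up for this example, and the glob is shortened to one of the patterns from the original command. The CircleCI and Maven calls are left commented out so the helper can be tried on its own.

```shell
# Hypothetical helper: read file paths on stdin, print a comma-separated
# list of class names (file name without the .java suffix).
to_test_list() {
  tr -d '\r' | while IFS= read -r file; do
    [ -n "$file" ] || continue      # skip blank lines
    basename "$file" .java          # rest/src/.../FooTests.java -> FooTests
  done | paste -sd, -
}

# On CircleCI, capture the split first so it can be inspected:
# TEST_FILES=$(circleci tests glob "rest/src/com/OURPACKAGE/**/*Tests*.java" \
#   | circleci tests split --split-by=timings)
# echo "Files assigned to this container:"
# echo "$TEST_FILES"
# TESTS=$(printf '%s\n' "$TEST_FILES" | to_test_list)
# echo "Running: -Dtest=$TESTS"
# mvn -f rest/pom.xml -P common,stand-alone -Dtest="$TESTS" -e test
```

Comparing the echoed list across containers should show whether the slow containers were handed the long-running classes.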


I’m not using workflows, so I suppose my only job is the “Build” job. I am rerunning failed tests, but it isn’t substantially changing the test times. It seems like the splitting is completely random. When I ran at 16x parallelization, some containers finished in 20 minutes while some took 2.5 hours. The ones that took 2.5 hours had just as many test classes as the ones that took 20 minutes; the smaller, faster classes just happened to be lumped together.


Are you using store_test_results? You should also store_artifacts the timing output to take a look at what is going on. Are your tests taking different times to run?


I am using store_test_results. How would I store the test timing using store_artifacts? Are you referring to the same reports that I’m storing with store_test_results or some internal Circle-specific file?


Oh, that’s great. Will you update the various places in the documentation that still state timing-based splitting isn’t available for workflow-based builds?


Use both so you can analyze the output files for issues. Artifacts are available for download from the UI, but the test results are not accessible otherwise.


Yeah, https://github.com/circleci/circleci-docs/pull/1929


I’ve done that and the files look normal (to me), but if there’s an issue with Circle parsing the file, I’m not sure how I would spot that.


I always have to debug this issue manually, too. I understand the pain. Looking at the build-insights and comparing the longest container to an average one might lead you to the answer.

The most common thing I see is having a specific file that takes much longer than any other file to execute all the tests within it. One container might be running 1024 tests while another runs 18, and the former completes faster. (That’s a literal example of what I’ve seen)
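To illustrate why one slow file hurts so much, here is a rough sketch of a greedy longest-first assignment. This is not CircleCI's actual algorithm, which isn't described in this thread. `greedy_split` is a made-up name, and the input format (seconds, a tab, then the file name) is an assumption for the example.

```shell
# Hypothetical illustration: assign "seconds<TAB>file" lines to N containers,
# always giving the next-longest file to the least-loaded container.
greedy_split() {
  sort -rn | awk -F '\t' -v n="$1" '
    {
      best = 1
      for (i = 2; i <= n; i++)
        if (load[i] + 0 < load[best] + 0) best = i
      load[best] += $1
    }
    END {
      for (i = 1; i <= n; i++) printf "container %d: %d s\n", i, load[i]
    }'
}

# One 2-hour file plus three 1-minute files, split across 2 containers.
printf '60\tATests.java\n7200\tSeleniumTests.java\n60\tBTests.java\n60\tCTests.java\n' \
  | greedy_split 2
```

As long as files are the unit being split, no container can finish sooner than its slowest single file, so even a perfect split leaves the container with the 2-hour file far behind the rest.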


That makes sense for splitting by filesize, but when splitting by timing, shouldn’t cases like that be handled? For the record, that is very much the issue I have. We have Selenium tests that take ~10-100x longer than a non-UI test of a similar size.


We’ve been having similar problems trying to understand how circle attributes execution time to various files.

Not sure if you’ve discovered this since, but if you’re uploading your JUnit reports, you can see how Circle has mapped them by making an API call like:

curl "https://circleci.com/api/v1.1/project/<VCS>/<USER>/<REPO>/<BUILD_NUM>/tests?circle-token=<API_TOKEN>"

Which will return JSON like:

{
    "tests": [
        {
            "classname": "bar test",
            "file": "/bar/file.js",
            "name": "some test blah blah",
            "result": "success",
            "run_time": 0.814
        },
        {
            "classname": "foo test",
            "file": "/foo/file.js",
            "name": "doesn't send a followup text",
            "result": "success",
            "run_time": 0.009
        }
    ]
}

Which gives you some insight as to how circle decides to schedule your tests.
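If it helps, the per-test entries can be rolled up into a total per file to spot the outliers. A rough sketch, assuming the response has been flattened to one `file<TAB>run_time` line per test; `sum_by_file` is a made-up helper name, and the commented `jq` line assumes the results sit under a top-level `tests` array.

```shell
# Hypothetical helper: sum run_time per file from "file<TAB>run_time" lines,
# printing the slowest files first.
sum_by_file() {
  awk -F '\t' '
    { total[$1] += $2 }
    END { for (f in total) printf "%.3f\t%s\n", total[f], f }
  ' | sort -rn
}

# With the API response saved to tests.json (requires jq):
# jq -r '.tests[] | [.file, .run_time] | @tsv' tests.json | sum_by_file

# Stand-alone demo with made-up numbers:
printf '/bar/file.js\t0.814\n/foo/file.js\t0.009\n/bar/file.js\t0.186\n' | sum_by_file
```

A file whose total is far above the rest, or tests whose file field is missing, would be the first things to check.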

