Problem using CIRCLE_NODE_INDEX to split tests

Hello, I’m having trouble splitting my unit tests across containers using the CIRCLE_NODE_INDEX environment variable, following this documentation.

When I run the unit tests on one container (parallelism: 1), they take about 7 minutes.
When I do the same with four containers (parallelism: 4), I can see that several containers are created, but the execution time is the same on the first and second containers and shorter on the third and fourth (the third and fourth take more tests from cache).

I don’t fully understand what I need to do to make my tests run faster. Any input is welcome. Below is the bash script I use to run the tests with parallelism 4:

if [[ ${CIRCLE_NODE_INDEX} -eq 0 ]]; then

    ./gradlew --parallel \
        :module:test \
        :module:test \
        module \
        --continue \
        --stacktrace \
        --no-daemon

elif [[ ${CIRCLE_NODE_INDEX} -eq 1 ]]; then

    ./gradlew --parallel \
        :module:test \
        :module:test \
        :module:test \
        :module:test \
        module:test \
        --continue \
        --stacktrace \
        --no-daemon

elif [[ ${CIRCLE_NODE_INDEX} -eq 2 ]]; then

    ./gradlew --parallel \
        :module:test \
        :module:test \
        module:test \
        --continue \
        --stacktrace \
        --no-daemon

elif [[ ${CIRCLE_NODE_INDEX} -eq 3 ]]; then

    ./gradlew --parallel \
        :module:test \
        :module:test \
        module \
        --continue \
        --stacktrace \
        --no-daemon

fi
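
For comparison, here is a rough sketch of the same idea without one hardcoded branch per container. It assigns modules to containers round-robin using CIRCLE_NODE_INDEX and CIRCLE_NODE_TOTAL (both set by CircleCI). The module names and the tasks_for_node helper are hypothetical placeholders, not my real project. This only balances the number of modules per container, not how long each one takes, so it may well show the same uneven timings.

```bash
#!/usr/bin/env bash
# Sketch only: these module names are hypothetical placeholders.
MODULES=(":moduleA" ":moduleB" ":moduleC" ":moduleD" ":moduleE" ":moduleF")

# Print the Gradle test tasks assigned to node $1 out of $2 nodes.
# Module i goes to the node whose index equals i modulo the node count.
tasks_for_node() {
  local index="$1" total="$2" i
  for i in "${!MODULES[@]}"; do
    if (( i % total == index )); then
      printf '%s:test\n' "${MODULES[$i]}"
    fi
  done
}

# CircleCI sets both variables; fall back to a single node when run locally.
tasks_for_node "${CIRCLE_NODE_INDEX:-0}" "${CIRCLE_NODE_TOTAL:-1}"
```

The printed task list could then be passed to Gradle with the same flags as in the script above, for example ./gradlew --parallel $(tasks_for_node "$CIRCLE_NODE_INDEX" "$CIRCLE_NODE_TOTAL") --continue --stacktrace --no-daemon.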