Since they are all logged as taking 12:07, I assume it is 12:07 total, and they are waiting until they are all started. If that is the case, can you get the times for each one? I am guessing from your description that Postgres and Redis start quickly, and it is Elasticsearch that takes 12+ minutes. Is that right?
I’d assume as well that you are using the default command to start this image. Have a look at the configuration reference in the docs: you can add a `command` key to an image entry to supply a custom start command. I would guess that you can turn off the initial indexing there.
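For illustration only, something like this (the image tags and the `-E` setting are placeholders, not something I’ve checked against this image):

```yaml
version: 2
jobs:
  build:
    docker:
      - image: circleci/ruby:2.4     # primary container (placeholder)
      - image: elasticsearch:5.6     # secondary container (placeholder tag)
        # hypothetical override: `command` replaces the image's default
        # start command; the -E setting here is only a stand-in for
        # whatever option actually controls the behaviour you're after
        command: [elasticsearch, "-Esome.setting=some-value"]
```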
Good idea on trying to stop Elasticsearch from loading previous indexes, but it feels slightly wrong: no indexes should be there in the first place!
I had a quick search through the Elasticsearch settings and couldn’t find a way to prevent it from loading the indexes - it seems to be an essential part of their “initial recovery” feature.
I then looked more closely at the Elasticsearch Docker image and saw that it declares a VOLUME.
So my thinking is now to force the volume to an empty/new one!
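Something like this is what I have in mind (untested, and the exact way of passing `path.data` may depend on the image version):

```yaml
      - image: elasticsearch:5.6     # placeholder tag
        # point Elasticsearch's data path away from the baked-in VOLUME,
        # so every build should start from an empty directory
        # (assumes the entrypoint accepts -E settings on the command line)
        command: [elasticsearch, "-Epath.data=/tmp/fresh-es-data"]
```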
Here is the head of our .circleci/config.yml:
You’re not using a CircleCI image here, so if the container contains index data, it will also be in the image on Docker Hub. Isn’t this an upstream issue for that project?
Ah, are you running several jobs in a workflow? I would not have assumed that your secondary containers would cache data from one job to the next in a single workflow, but perhaps you can shed some light on how you are running this job?
If you mean previous runs of a single job not in a workflow, then yes, secondary containers should absolutely not be caching anything. They are meant to be fresh every time.
Can you dump the indexes to see what is in them? That might give you a clue as to when they are accumulated.
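For example, a throwaway step in the job (assuming Elasticsearch is on the default port 9200) would list them:

```yaml
      - run:
          name: List Elasticsearch indexes
          command: |
            # _cat/indices shows every index with its doc count and size
            curl -s "http://localhost:9200/_cat/indices?v"
```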
Thanks again, your reasoning is a great help!
> You’re not using a CircleCI image here, so if the container contains index data, it will also be in the image on Docker Hub. Isn’t this an upstream issue for that project?
Sorry, I didn’t specify. Yes, the container is not from CircleCI, but it’s a well-known public Docker image, and I can confirm there are no index files in that image. The files only appear in the CircleCI environment.
> are you running several jobs in a workflow?
Yes, 3 actually: a build and 2 deploys that depend on the build. The job that’s currently delayed is the build (the first one).
I’ll dig into where the index files come from, and if I can’t find another way, I’ll clean them up at the end of the build so the next build starts fresh.
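Probably with a final step roughly like this (untested, and it deletes every index wholesale, so only if nothing in there needs to survive):

```yaml
      - run:
          name: Drop all Elasticsearch indexes
          when: always   # run even if earlier steps fail
          command: |
            curl -s -XDELETE "http://localhost:9200/_all"
```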
Hmm, intriguing. I wonder if secondary containers are preserved across the lifetime of the workflow? I don’t use secondary containers, but I would have thought everything would be stopped and started with each job.
Still, I can understand the benefit of such an arrangement - it would allow the preservation of server state between related jobs.
Yeah, that’s a good idea.
Or, if you want to make sure ES is completely fresh each time, you could always drop it as a secondary container and just install it in the primary/build container. It’s a bit messier, but getting it from apt-get will guarantee it is as clean as possible.
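Roughly like this, assuming the primary image has sudo, wget and a Java runtime available (the 6.x repo is just an example version, adapted from Elastic’s standard apt instructions):

```yaml
      - run:
          name: Install and start Elasticsearch in the primary container
          command: |
            sudo apt-get update && sudo apt-get install -y apt-transport-https
            # add Elastic's signing key and apt repository
            wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
            echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-6.x.list
            sudo apt-get update && sudo apt-get install -y elasticsearch
            sudo service elasticsearch start
```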
They all say 12:07 because they were running for the duration of the build. If one were to die early, its timestamp would reflect that. It’s a shortcoming of the UI: while the build is still running it’s much clearer what is happening with those containers.
We always destroy containers after they run. There is no data being pulled into the container that isn’t committed into the image itself.