I’m having trouble with one of my integration tests. It connects to https://httpbin.org, but recently our tests have failed repeatedly because of connection issues to httpbin.org.
I decided to run a local httpbin on the CircleCI machine: gunicorn -D -b 127.0.0.13:18080 httpbin:app
I tested this first locally in docker.
The strange thing is that I’m also running a database for the integration tests, and that one has never given any problems, so it’s not a pure loopback interface issue. I did get better results using an explicit IP address in the 127.0.0.0/24 block than when using localhost.
Why would gunicorn or httpbin specifically have trouble when running in CircleCI, whereas the database does not?
Note: When I run with localhost instead of an IP address, I get this error: Cannot assign requested address
System.Net.Sockets.SocketException : Cannot assign requested address
Stack Trace:
at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(QueueItem queueItem)
at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.HttpConnectionWaiter`1.WaitForConnectionAsync(Boolean async, CancellationToken requestCancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
There are several articles that reference this error, but none of them seem pertinent:
- A Medium article whose author somehow thought localhost would refer to the host machine, instead of the docker container itself (Why would you even think that?)
- Another article that suggests something is wrong with the Docker routing.
I know that it’s not an issue with localhost not being resolvable, since I’ve been running the mentioned database for integration tests for years and accessing it on a localhost address. In fact, in the same run that outputs this error, that database is accessed successfully using localhost.
Ok, I’ve been trying some more things, and it seems that gunicorn doesn’t run properly in CircleCI somehow. This is what I run:
- run: gunicorn -D -b localhost:18080 httpbin:app --log-file httpbin.log
- run: dotnet test
- run:
    name: Print `gunicorn` log
    command: cat httpbin.log
    when: always
The final step’s output is:
cat: httpbin.log: No such file or directory
However, if I rerun the job with SSH enabled (it still fails the same way) and then manually execute those commands, everything works completely as intended. It even works when I precede every command by first running bash -eo pipefail to open a new bash shell, as CircleCI does.
What is the difference? Why doesn’t gunicorn seem to start on a normal CircleCI run, while it does work when I rerun the exact same command manually in the same context?
Ok, after many, many, attempts to figure out any logical reason, I found a weird workaround.
Gunicorn, whether daemonised or not, terminates when the step ends, unless the step is a multi-line command. When using multi-line YAML notation, it only terminates sometimes. This seems to be fixed by adding another command after gunicorn.
So after probably half a working day of messing around, the solution was to change this:
steps:
  - run: gunicorn -D -b localhost:18080 httpbin:app --log-file /var/log/httpbin.log
To this:
steps:
  - run: |
      gunicorn -D -b localhost:18080 httpbin:app --log-file /var/log/httpbin.log
      echo Gunicorn started in background
I thought this had something to do with the handling of SIGHUP, which can be sent to gunicorn to stop it. However, that doesn’t explain why it sometimes works when there’s no command after gunicorn.
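For comparison, CircleCI's run step also accepts a `background: true` attribute, which is meant for processes that should keep running while later steps execute. The fragment below is a minimal sketch of that approach, not something that was tried in this thread, so whether it avoids the termination described above is unverified. It drops `-D` so that gunicorn stays in the foreground of its own background step. The step names and the curl readiness check (which assumes curl is installed in the image) are illustrative assumptions.

```yaml
steps:
  - run:
      name: Start httpbin            # hypothetical step name
      # No -D here: the step itself is backgrounded by CircleCI.
      command: gunicorn -b localhost:18080 httpbin:app --log-file httpbin.log
      background: true
  - run:
      name: Wait for httpbin         # hypothetical step name; assumes curl is available
      # Retry until the server accepts connections, instead of racing the tests.
      command: curl --retry 10 --retry-delay 1 --retry-connrefused http://localhost:18080/get
  - run: dotnet test
```

The explicit wait step is there because a backgrounded server may not be listening yet when the next step begins.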
Unfortunately, with this set-up the tests became unacceptably slow. They now took over 25 minutes to complete what had previously taken less than 4 minutes.
When looking into a solution for that, I found a really cool, if somewhat poorly documented feature of CircleCI: You can run multiple containers!
This meant that all my experimenting was pointless, and my whole set-up could be simplified considerably, by running the official httpbin image in a secondary container. In fact, I could also run the database’s official image in a secondary container.
This greatly simplified both my set-up and the amount of maintenance I have to do because I don’t need to maintain the more complicated image that includes everything.
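A multi-container job of this kind can be sketched roughly as follows. This is an illustration under assumptions, not the configuration actually used here: the job name, the .NET SDK image tag, and the choice of `kennethreitz/httpbin` as the httpbin image are all assumed, and the database image is left out because the thread doesn't name it.

```yaml
version: 2.1
jobs:
  integration-test:                               # hypothetical job name
    docker:
      # The first image listed is the primary container; all steps run in it.
      - image: mcr.microsoft.com/dotnet/sdk:8.0   # assumption: any image with the .NET SDK
      # Every further image is started as a secondary container.
      - image: kennethreitz/httpbin               # assumption: httpbin image, serving on port 80
      # The database image would be listed here in the same way.
    steps:
      - checkout
      - run: dotnet test
workflows:
  test:
    jobs:
      - integration-test
```

With the docker executor, ports exposed by secondary containers are reachable from the primary container on localhost, so under these assumptions the tests would target http://localhost:80 rather than port 18080.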
Of course this didn’t solve my slow tests. After some quick investigation, it turned out that while gunicorn is highly performant under heavy load, it is not simply idle under no, or very little, load. In fact it probably consumes ½ to ⅔ of a large resource class CircleCI machine, just to exist.
Luckily httpbin is WSGI, so I went looking for a light-weight WSGI server. I found Bjoern, which seemed to fit the bill. There’s even a docker image for it. So finally I created my own docker image, which is just docker-python-bjoern with httpbin pre-installed and auto started.
After this whole rework my tests actually run approximately 25% faster than before. Apparently running the database in a separate image is much more performant than daemonising it in the primary image.
Thank you for this rundown of what you were experimenting with! I will pass this info on to the Docs team!