I recently dealt with an application composed of multiple services running in containers. Even though every part of the application is correctly split into a separate microservice, the independence of each service is not enforced.
This lack of independence has several drawbacks, one of which is that the containers must be started in a pre-defined order. Otherwise, some containers might be terminated due to an application error (the application breaks when an unexpected error occurs, e.g. when it relies on a linked service that is not yet ready to accept connections).
Not all applications suffer from this kind of problem: the application I was dealing with was not born with microservices in mind; it was rather split and converted into separate containers over its lifetime. It is surely not the only application with this particular limitation: other applications out there have been converted into a Franken-microservice-stein “monster”.
I am going to explore the possible workarounds for defining and following a startup order when launching containerized applications that span multiple containers.
Depending on the scenario, we might not want (or might not be able) to change the containers and the application itself, for several reasons:
- the complexity of the application
- whether the sources are available
- whether changes to the Dockerfiles are possible
- the time required to change the architecture of the application
Health check and restart policy
In the `docker-compose.yml` file, we can specify:
- `healthcheck`: it specifies the `test` (a command) used to check whether the container is working. The `test` is executed at intervals (`interval`) and retried a number of times (`retries`), with each attempt failing after the `timeout`:

```yaml
db:
  image: my-db-image
  container_name: db-management
  ports:
    - 31337:31337
  healthcheck:
    test: ["CMD", "curl", "-fk", "https://localhost:31337"]
    interval: 300s
    timeout: 400s
    retries: 10
```
- a `depends_on` field to state that the container must be started after its dependency has been started, and a `restart` policy to restart the container when it fails:

```yaml
web:
  image: my-web-image
  restart: on-failure
  depends_on:
    - db
  links:
    - db
```
What is happening here?
- `docker-compose` starts the services, bringing up the `db` container first (the `web` one depends on it)
- the `web` container is started shortly after; it does not wait for `db` to be ready, because `docker-compose` does not know what “ready” means for us. Until the `db` container is ready to accept connections, the `web` container keeps being restarted (`restart: on-failure`)
- the `db` service is marked as `healthy` as soon as `curl -fk https://localhost:31337` returns 0 (the `db-management` image ships with an HTTP controller that returns 0 only when the database is ready to accept connections). Marking the service as `healthy` means that the service is working as expected (because the `test` returns what we expect). When the service is no longer healthy, the container must be restarted, and other policies and actions might be introduced.
In the docker-compose file format < 3, `depends_on` could also wait for the health checks (through its `condition` form), but starting from version 3 of the specification, `depends_on` only accepts a list of other services.
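For reference, in the 2.1 file format the two mechanisms could be combined directly; a sketch based on the services above:

```yaml
version: "2.1"
services:
  db:
    image: my-db-image
    healthcheck:
      test: ["CMD", "curl", "-fk", "https://localhost:31337"]
      interval: 300s
      timeout: 400s
      retries: 10
  web:
    image: my-web-image
    depends_on:
      db:
        # wait until db's healthcheck passes before starting web
        condition: service_healthy
```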
This solution is not ideal, as the `web` container keeps being restarted until its dependency is satisfied: that can be a huge problem if we use that container to run tests, since a container exiting because of a failure can be mistaken for failed tests.
wait-for-it wrapper script
This approach is slightly better than the previous one, but it is still a workaround. We are going to use `docker-compose` and the `wait-for-it` script: in the `docker-compose.yml` file we insert a `depends_on` (as described in the previous section) and a `command` that wraps the real entrypoint with the script:

```yaml
db:
  container_name: db-management
  ports:
    - 31337:31337
  healthcheck:
    test: ["CMD", "curl", "-fk", "https://localhost:31337"]
    interval: 300s
    timeout: 400s
    retries: 10
web:
  image: my-web-image
  depends_on:
    - db
  links:
    - db
  command: ["./wait-for-it.sh", "db:31337", "--", "./webapp"]
```
The `wait-for-it` script waits for a `host:port` to be open (TCP only). Again, this does not guarantee that the application is ready to serve but, compared to the previous workaround, we are not restarting the `web` container until its dependency is ready.
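The core of `wait-for-it` can be sketched in a few lines of bash (the function name `wait_for_port` is mine; the real script adds argument parsing, a `--` separator to exec the wrapped command, and more):

```shell
#!/usr/bin/env bash
# Minimal sketch of what wait-for-it does: poll a TCP port, give up after a timeout.
# wait_for_port HOST PORT TIMEOUT_SECONDS -> 0 if the port opened, 1 otherwise.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-15}" i
  for ((i = 0; i < timeout; i++)); do
    # bash's /dev/tcp pseudo-device attempts a plain TCP connection:
    # success only means the port is open, not that the app is ready to serve
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage, mirroring the compose `command` above:
# wait_for_port db 31337 15 && exec ./webapp
```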
One drawback of this workaround is that it is invasive: it requires rebuilding the container image to add the `wait-for-it` script (you can use a multi-stage build to do so).
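A hypothetical multi-stage `Dockerfile` for that could look like the following (`my-web-image` is the image from the examples above; note that `wait-for-it` needs `bash` available in the final image):

```dockerfile
# Stage 1: fetch the script (any image with git works)
FROM alpine/git AS fetcher
RUN git clone https://github.com/vishnubob/wait-for-it.git /wfi

# Stage 2: the actual application image, with only the script copied in
FROM my-web-image
COPY --from=fetcher /wfi/wait-for-it.sh ./wait-for-it.sh
RUN chmod +x ./wait-for-it.sh
```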
Re-architect the application
This is not a workaround: it is rather the solution, and the best one we can achieve. It takes effort and it might cost a lot: the application architecture needs to be modified to make it resilient against failures. There are no general guidelines on how to successfully re-architect an application to be failproof and microservice-ready, even though I strongly suggest following the twelve guidelines listed on the Twelve-Factor App website.
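As a tiny illustration of what “resilient against failures” can mean at startup, a service could retry its dependency checks with exponential backoff instead of crashing at the first error. A sketch (`retry_with_backoff` and the wrapped command are hypothetical, not part of any tool mentioned above):

```shell
#!/usr/bin/env bash
# Retry a command with exponential backoff instead of failing at the first error.
# retry_with_backoff MAX_ATTEMPTS CMD... -> 0 on first success, 1 if all attempts fail.
retry_with_backoff() {
  local max_attempts="$1" delay=1 attempt
  shift
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    "$@" && return 0          # run the command; stop as soon as it succeeds
    sleep "$delay"
    delay=$((delay * 2))      # double the wait between attempts
  done
  return 1
}

# Usage: keep probing the database before serving traffic, e.g.
# retry_with_backoff 5 curl -fk https://db:31337 && exec ./webapp
```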