ZooKeeper Ensemble

Last Updated: Dec 01, 2020

Apache Kafka uses ZooKeeper to store cluster metadata. ZooKeeper is a distributed, open-source coordination service for distributed applications. ZooKeeper keeps track of the status of the Kafka cluster nodes, as well as Kafka topics, partitions, and related metadata.

For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should plan on deploying 2×F+1 machines.

For example, if one ZooKeeper server fails, another ZooKeeper server takes over. The same behavior applies to Kafka brokers, so the system as a whole is fault-tolerant.

Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures.

A deployment of six machines can still handle only two failures, because after three failures the remaining three machines are not a majority of six. For this reason, ZooKeeper deployments are usually made up of an odd number of machines.
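
The majority rule described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the product; the function names are hypothetical and chosen only for illustration. It assumes that a quorum is a strict majority of the configured ensemble size.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


def ensemble_size_for(failures: int) -> int:
    """Machines needed to tolerate the given number of failures (2*F + 1)."""
    return 2 * failures + 1


if __name__ == "__main__":
    # Compare odd and even ensemble sizes side by side.
    for n in (3, 4, 5, 6, 7):
        print(f"{n} machines: quorum of {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under these assumptions, five and six machines both tolerate two failures, which is why adding a sixth machine gains nothing in fault tolerance over five.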

Note:

POM HA deployment also supports Kafka HA deployment.

The proposed architecture shows ZooKeeper and Kafka deployed on three POM instances (without an external Kafka-ZooKeeper cluster):