[apache-zookeeper] Explaining Apache ZooKeeper

I am trying to understand ZooKeeper, how it works and what it does. Is there any application which is comparable to ZooKeeper?

If you know, then how would you describe ZooKeeper to a layman?

I have tried apache wiki, zookeeper sourceforge...but I am still not able to relate to it.

I just read thru http://zookeeper.sourceforge.net/index.sf.shtml, so aren't there more services like this? Is it as simple as just replicating a server service?

This question is related to apache-zookeeper distributed-computing

The answer is


My approach to understand zookeeper was, to play around with the CLI client. as described in Getting Started Guide and Command line interface

From this I learned that zookeeper's surface looks very similar to a filesystem and clients can create and delete objects and read or write data.

Example CLI commands

create /myfirstnode mydata
ls /
get /myfirstnode
delete /myfirstnode

Try yourself

How to spin up a zookeper environment within minutes on docker for windows, linux or mac:

One time set up:

docker network create dn

Run server in a terminal window:

docker run --network dn --name zook -d zookeeper
docker logs -f zookeeper

Run client in a second terminal window:

docker run -it --rm --network dn zookeeper zkCli.sh -server zook

See also documentation of image on dockerhub


I understand the ZooKeeper in general but had problems with the terms "quorum" and "split brain" so maybe I can share my findings with you (I consider myself also a layman).

Let's say we have a ZooKeeper cluster of 5 servers. One of the servers will become the leader and the others will become followers.

  • These 5 servers form a quorum. Quorum simply means "these servers can vote upon who should be the leader".

  • So the voting is based on majority. Majority simply means "more than half" so more than half of the number of servers must agree for a specific server to become the leader.

  • So there is this bad thing that may happen called "split brain". A split brain is simply this, as far as I understand: The cluster of 5 servers splits into two parts, or let's call it "server teams", with maybe one part of 2 and the other of 3 servers. This is really a bad situation as if both "server teams" must execute a specific order how would you decide wich team should be preferred? They might have received different information from the clients. So it is really important to know what "server team" is still relevant and which one can/should be ignored.

  • Majority is also the reason you should use an odd number of servers. If you have 4 servers and a split brain where 2 servers seperate then both "server teams" could say "hey, we want to decide who is the leader!" but how should you decide which 2 servers you should choose? With 5 servers it's simple: The server team with 3 servers has the majority and is allowed to select the new leader.

  • Even if you just have 3 servers and one of them fails the other 2 still form the majority and can agree that one of them will become the new leader.

I realize once you think about it some time and understand the terms it's not so complicated anymore. I hope this also helps anyone in understanding these terms.


I would suggest the following resources:

  1. The paper: https://pdos.csail.mit.edu/6.824/papers/zookeeper.pdf
  2. The lecture offered by MIT 6.824 from 36:00: https://youtu.be/pbmyrNjzdDk?t=2198

I would suggest watching the video, read the paper, and then watch the video again. It would be easier to understand if you know Raft beforehand.


Zookeeper is one of the best open source server and service that helps to reliably coordinates distributed processes. Zookeeper is a CP system (Refer CAP Theorem) that provides Consistency and Partition tolerance. Replication of Zookeeper state across all the nodes makes it an eventually consistent distributed service.

Moreover, any newly elected leader will update its followers with missing proposals or with a snapshot of the state, if the followers have many proposals missing.

Zookeeper also provides an API that is very easy to use. This blog post, Zookeeper Java API examples, has some examples if you are looking for examples.

So where do we use this? If your distributed service needs a centralized, reliable and consistent configuration management, locks, queues etc, you will find Zookeeper a reliable choice.


Zookeeper is a centralized open-source server for maintaining and managing configuration information, naming conventions and synchronization for distributed cluster environment. Zookeeper helps the distributed systems to reduce their management complexity by providing low latency and high availability. Zookeeper was initially a sub-project for Hadoop but now it's a top level independent project of Apache Software Foundation.

More Information