ZooKeeper Getting Started Guide (2024)

This document contains information to get you started quickly with ZooKeeper. It is aimed primarily at developers hoping to try it out, and contains simple installation instructions for a single ZooKeeper server, a few commands to verify that it is running, and a simple programming example. Finally, as a convenience, there are a few sections on more complicated installations, for example running replicated deployments and optimizing the transaction log. However, for complete instructions for commercial deployments, refer to the ZooKeeper Administrator's Guide.

Pre-requisites

See System Requirements in the Admin guide.

Download

To get a ZooKeeper distribution, download a recent stable release from one of the Apache Download Mirrors.

Standalone Operation

Setting up a ZooKeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration.

Once you've downloaded a stable ZooKeeper release, unpack it and cd to the root directory.

To start ZooKeeper you need a configuration file. Here is a sample; create it in conf/zoo.cfg:

tickTime=2000
dataDir=/var/zookeeper
clientPort=2181

This file can be called anything, but for the sake of this discussion call it conf/zoo.cfg. Change the value of dataDir to specify an existing (empty to start with) directory. Here are the meanings for each of the fields:

tickTime

the basic time unit in milliseconds used by ZooKeeper. It is used to do heartbeats and the minimum session timeout will be twice the tickTime.

dataDir

the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.

clientPort

the port to listen for client connections
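As a rough illustration of how these three fields fit together, the following Python sketch parses a zoo.cfg-style string and derives the minimum session timeout (twice the tickTime, as described above). The helper name parse_zoo_cfg is hypothetical and is not part of ZooKeeper; the sketch only assumes the simple key=value layout shown in the sample.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from a zoo.cfg-style string into a dict.

    Hypothetical helper for illustration; ZooKeeper itself does the parsing.
    Blank lines and lines starting with '#' are skipped.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


SAMPLE = """\
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
"""

cfg = parse_zoo_cfg(SAMPLE)
tick_ms = int(cfg["tickTime"])
# The minimum session timeout is twice the tickTime.
min_session_timeout_ms = 2 * tick_ms
print(cfg["dataDir"], cfg["clientPort"], min_session_timeout_ms)
# prints: /var/zookeeper 2181 4000
```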

Now that you have created the configuration file, you can start ZooKeeper:

bin/zkServer.sh start

ZooKeeper logs messages using log4j -- more detail available in the Logging section of the Programmer's Guide. You will see log messages coming to the console (default) and/or a log file depending on the log4j configuration.

The steps outlined here run ZooKeeper in standalone mode. There is no replication, so if the ZooKeeper process fails, the service will go down. This is fine for most development situations, but to run ZooKeeper in replicated mode, please see Running Replicated ZooKeeper.

Managing ZooKeeper Storage

For long-running production systems, ZooKeeper storage (dataDir and logs) must be managed externally. See the section on maintenance for more details.

Connecting to ZooKeeper

Once ZooKeeper is running, you have several options for connecting to it:

  • Java: Use

    bin/zkCli.sh -server 127.0.0.1:2181

    This lets you perform simple, file-like operations.

  • C: compile cli_mt (multi-threaded) or cli_st (single-threaded) by running make cli_mt or make cli_st in the src/c subdirectory in the ZooKeeper sources. See the README contained within src/c for full details.

    You can run the program from src/c using:

    LD_LIBRARY_PATH=. cli_mt 127.0.0.1:2181

    or

    LD_LIBRARY_PATH=. cli_st 127.0.0.1:2181

    This will give you a simple shell to execute file system like operations on ZooKeeper.

Once you have connected, you should see something like:

Connecting to localhost:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
Welcome to ZooKeeper!
JLine support is enabled
[zkshell: 0]

From the shell, type help to get a listing of commands that can be executed from the client, as in:

[zkshell: 0] help
ZooKeeper host:port cmd args
        get path [watch]
        ls path [watch]
        set path data [version]
        delquota [-n|-b] path
        quit
        printwatches on|off
        create path data acl
        stat path [watch]
        listquota path
        history
        setAcl path acl
        getAcl path
        sync path
        redo cmdno
        addauth scheme auth
        delete path [version]
        setquota -n|-b val path

From here, you can try a few simple commands to get a feel for this command line interface. First, issue the list command, ls /, which yields:

[zkshell: 8] ls /
[zookeeper]

Next, create a new znode by running create /zk_test my_data. This creates a new znode and associates the string "my_data" with the node. You should see:

[zkshell: 9] create /zk_test my_data
Created /zk_test

Issue another ls / command to see what the directory looks like:

[zkshell: 11] ls /
[zookeeper, zk_test]

Notice that the zk_test directory has now been created.

Next, verify that the data was associated with the znode by running the get command, as in:

[zkshell: 12] get /zk_test
my_data
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 5
mtime = Fri Jun 05 13:57:06 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0
dataLength = 7
numChildren = 0

We can change the data associated with zk_test by issuing the set command, as in:

[zkshell: 14] set /zk_test junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0
[zkshell: 15] get /zk_test
junk
cZxid = 5
ctime = Fri Jun 05 13:57:06 PDT 2009
mZxid = 6
mtime = Fri Jun 05 14:01:52 PDT 2009
pZxid = 5
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0
dataLength = 4
numChildren = 0

(Notice we did a get after setting the data and it did, indeed, change.)

Finally, let's delete the node by issuing:

[zkshell: 16] delete /zk_test
[zkshell: 17] ls /
[zookeeper]
[zkshell: 18]

That's it for now. To explore more, continue with the rest of this document and see the Programmer's Guide.

Programming to ZooKeeper

ZooKeeper has Java bindings and C bindings. They are functionally equivalent. The C bindings exist in two variants: single-threaded and multi-threaded. These differ only in how the messaging loop is done. For more information, see the Programming Examples in the ZooKeeper Programmer's Guide for sample code using the different APIs.

Running Replicated ZooKeeper

Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing. But in production, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file. The file is similar to the one used in standalone mode, but with a few differences. Here is an example:

tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

The new entry, initLimit, is a timeout ZooKeeper uses to limit the length of time the ZooKeeper servers in the quorum have to connect to a leader. The entry syncLimit limits how far out of date a server can be from a leader.

With both of these timeouts, you specify the unit of time using tickTime. In this example, the timeout for initLimit is 5 ticks at 2000 milliseconds a tick, or 10 seconds.
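The same arithmetic can be written out as a short Python sketch. The function name ticks_to_ms is made up for illustration; the values are the ones from the example configuration above.

```python
def ticks_to_ms(ticks, tick_time_ms):
    """Convert a limit expressed in ticks into milliseconds."""
    return ticks * tick_time_ms


TICK_TIME_MS = 2000  # tickTime from the example configuration

init_limit_ms = ticks_to_ms(5, TICK_TIME_MS)  # initLimit=5
sync_limit_ms = ticks_to_ms(2, TICK_TIME_MS)  # syncLimit=2

print(init_limit_ms / 1000, "seconds for initLimit")  # prints: 10.0 seconds for initLimit
print(sync_limit_ms / 1000, "seconds for syncLimit")  # prints: 4.0 seconds for syncLimit
```

By the same rule, syncLimit=2 in the example works out to 2 ticks at 2000 milliseconds a tick, or 4 seconds.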

The entries of the form server.X list the servers that make up the ZooKeeper service. When the server starts up, it knows which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII.
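A minimal Python sketch of creating and reading back such a myid file is shown below. The helper names write_myid and read_myid are hypothetical, and a temporary directory stands in for the real dataDir; in practice the file is often created by hand or by a provisioning script.

```python
import tempfile
from pathlib import Path


def write_myid(data_dir, server_id):
    """Write the server number as ASCII text to <data_dir>/myid."""
    path = Path(data_dir) / "myid"
    path.write_text(f"{server_id}\n", encoding="ascii")
    return path


def read_myid(data_dir):
    """Read the server number back from <data_dir>/myid."""
    return int((Path(data_dir) / "myid").read_text(encoding="ascii").strip())


# A temporary directory stands in for dataDir (an assumption for this sketch).
with tempfile.TemporaryDirectory() as data_dir:
    write_myid(data_dir, 2)  # this host would be server.2
    print(read_myid(data_dir))  # prints: 2
```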

Finally, note the two port numbers after each server name: "2888" and "3888". Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.

Note

If you want to test multiple servers on a single machine, specify the server name as localhost with unique quorum and leader election ports (for example 2888:3888, 2889:3889, and 2890:3890 in the example above) for each server.X in that server's config file. Separate dataDirs and distinct clientPorts are also necessary (in the above replicated example, running on a single localhost, you would still have three config files).
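To make the note concrete, here is a hedged Python sketch that generates the text of three such config files for a single-machine test. The function name local_ensemble_configs and the per-server directory names (server1, server2, server3 under /var/zookeeper) are assumptions made for this example, not ZooKeeper conventions.

```python
def local_ensemble_configs(n=3, base_client_port=2181,
                           base_quorum_port=2888, base_election_port=3888,
                           data_root="/var/zookeeper"):
    """Return {server_id: config text} for n servers on one localhost.

    Hypothetical helper: every server gets its own dataDir and clientPort,
    and all config files share the same server.X list with unique ports.
    """
    server_lines = [
        f"server.{i}=localhost:{base_quorum_port + i - 1}:{base_election_port + i - 1}"
        for i in range(1, n + 1)
    ]
    configs = {}
    for i in range(1, n + 1):
        lines = [
            "tickTime=2000",
            f"dataDir={data_root}/server{i}",  # directory naming is an assumption
            f"clientPort={base_client_port + i - 1}",
            "initLimit=5",
            "syncLimit=2",
            *server_lines,
        ]
        configs[i] = "\n".join(lines) + "\n"
    return configs


for server_id, text in local_ensemble_configs().items():
    print(f"--- config for server {server_id} ---")
    print(text)
```

Each server would also need its own myid file in its dataDir, as described earlier.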

Other Optimizations

There are a couple of other configuration parameters that can greatly increase performance:

  • To get low latencies on updates it is important to have a dedicated transaction log directory. By default transaction logs are put in the same directory as the data snapshots and myid file. The dataLogDir parameter indicates a different directory to use for the transaction logs.

  • [tbd: what is the other config param?]

FAQs

Why did ZooKeeper fail to start?

Typically, ZooKeeper election failure is caused by a misconfigured myid. Use the resolution in Misconfigured ZooKeeper myid to address the election failure. If the problem persists and further diagnosis is needed, contact Apigee Edge Support.

How do you check whether ZooKeeper has started?

  1. The ZooKeeper process runs on infra VMs. ...
  2. To start the ZooKeeper service, use the command: /usr/share/zookeeper/bin/zkServer.sh start.
  3. To check whether the process is running: ps -ef | grep zookeeper.
  4. Error logs can be checked on the infra nodes: /var/log/zookeeper/zookeeper.log. ...
  5. Check the free memory: free -mh.

What is the minimum quorum for ZooKeeper?

ZooKeeper Quorum

ZooKeeper needs a strict majority of servers to form a consensus when votes happen. Therefore a ZooKeeper deployment typically has 1, 3, 5, 7, or in general 2N+1 servers. This allows 0, 1, 2, 3, or in general N servers, respectively, to go down without making the cluster unusable.
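The majority rule can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are made up for illustration.

```python
def majority(servers):
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers can go down while a strict majority remains."""
    return servers - majority(servers)


for n in (1, 3, 5, 7):
    print(n, majority(n), tolerated_failures(n))
# prints:
# 1 1 0
# 3 2 1
# 5 3 2
# 7 4 3
```

Note that an even-sized group gains nothing in fault tolerance: tolerated_failures(4) is 1, the same as for 3 servers, which is why odd sizes are the usual choice.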

What is the tick time in ZooKeeper?

tickTime: the basic time unit in milliseconds used by ZooKeeper. It is used to do heartbeats, and the minimum session timeout will be twice the tickTime.

Is ZooKeeper going away?

Note: In Apache Kafka, ZooKeeper has been marked as deprecated since the 3.5.0 release. ZooKeeper is planned to be removed in Apache Kafka 4.0.

What replaced ZooKeeper?

Kafka Raft (KRaft) protocol. Motivated by the complexity of running Kafka with ZooKeeper, a Kafka Improvement Proposal (KIP) was submitted to replace ZooKeeper with a self-managed metadata quorum.

What happens if ZooKeeper fails?

If one of the ZooKeeper nodes fails, the following occurs: the other ZooKeeper nodes detect the failure to respond, and a new ZooKeeper leader is elected if the failed node was the current leader. If multiple nodes fail and ZooKeeper loses its quorum, it drops into read-only mode and rejects requests for changes.

How do you start ZooKeeper from the command line?

  1. A series of tools for ZooKeeper. Scripts. ...
  2. start the server. ./zkServer.sh start.
  3. start the server in the foreground for debugging. ./zkServer.sh start-foreground.
  4. stop the server. ./zkServer.sh stop.
  5. restart the server. ./zkServer.sh restart.
  6. show the status,mode,role of the server. ...
  7. Deprecated. ...
  8. print the parameters of the start-up.

What problem does ZooKeeper solve?

ZooKeeper is an open-source Apache project that provides a centralized service for configuration information, naming, synchronization, and group services over large clusters in distributed systems. The goal is to make these systems easier to manage, with more reliable propagation of changes.

How do you start a ZooKeeper quorum?

With each node configured to work as a cluster, you are ready to start a quorum. In this step, you will start the quorum on each node and then test your cluster by creating sample data in ZooKeeper. To start a quorum node, first change to the /opt/zookeeper directory on each node: cd /opt/zookeeper.

What is the difference between leader and follower in ZooKeeper?

In a multi-node ZooKeeper installation, one of the nodes is designated as the leader. All other ZooKeeper nodes are designated as followers. While reads can happen from any ZooKeeper node, all write requests get forwarded to the leader. For example, a new Message Processor is added to Edge.

Does ZooKeeper use TCP?

Clients connect to a single ZooKeeper server. The client maintains a TCP connection through which it sends requests, gets responses, gets watch events, and sends heart beats. If the TCP connection to the server breaks, the client will connect to a different server.
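Because the client connection is plain TCP, a quick reachability check of the client port can be sketched with Python's standard socket module. The helper can_connect is hypothetical, and it only tests that something accepts TCP connections on the port; it does not speak the ZooKeeper protocol or confirm that the server is healthy.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:2181 matches the clientPort used in the sample configuration.
    print(can_connect("127.0.0.1", 2181))
```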

What is the sync limit in ZooKeeper?

syncLimit. Maximum number of ticks for the followers to wait to synchronize with the leader before the followers time out. Default is 5.

Why was ZooKeeper replaced with KRaft?

KRaft is based on the Raft consensus protocol and addresses several potential failures in large Kafka clusters. Moving away from ZooKeeper simplifies Kafka's architecture and removes the need for deploying two separate distributed systems to be fully operational.

Article information

Author: Rueben Jacobs
