kafka-topics utility to create Kafka topics and to fetch metainformation about existing topicsEither choose between the following options:
Apache Kafka is state of the art messaging broker for high throughput applications. Thanks for simple but elegant architecture, Kafka cluster is extremely reliable and scalable.
Key concepts we need in order to understand Kafka:
Server part of kafka is called a broker. Multiple brokers can work in unison to form a cluster. Instead of using built-in solution for cluster coordination (as is the case with Cassandra, MongoDb or ElasticSearch to name a few), Kafka relies on ZooKeeper for node communication and coordination. Keep in mind that for production purposes you would need minimum of three ZooKeeper instances for reliability guarantees.
Producers send messages directly to one of the nodes in Kafka cluster. Consumers are slightly more complicated, since the can form a consumer group. Consumer group is cluster of consumer nodes that will receive all events in a topic. One consumer group is usually responsible for one concern. As an example, let's assume we have a topic containing system metrics information. One consumer group in that case might store this events in database to be displayed on a dashboard, while second one is performing real-time analysis and sends notifications when values cross certain threshold.
Setting up separate Zookeeper node is very easy. In fact, Zookeper and sample config file for it is included in default Kafka distribution. Starting zookeeper requires only a few steps:
curl http://apache.lauf-forum.at/kafka/1.0.0/kafka_2.11-1.0.0.tgz -o kafka_2.11-1.0.0.tgz
tar -xzf kafka_2.11-1.0.0.tgz
cd kafka_2.11-1.0.0
$ bin/zookeeper-server-start.sh config/zookeeper.properties
This command will give you output lots of diagnostic information. To verify Zookeeper node is working properly open another terminal tab, navigate to Kafka directory and use ./bin/zookeeper-shell.sh utility:
$ ./bin/zookeeper-shell.sh localhost:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
At this point, we are ready to start Kafka "cluster" containing single broker instance:
$ bin/kafka-server-start.sh config/server.properties
Now let's create a new topic to play with:
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tutorial
Created topic "tutorial".
$ bin/kafka-topics.sh --list --zookeeper localhost:2181
tutorial
$ bin/kafka-topics.sh --describe --topic tutorial --zookeeper localhost:2181
Topic:tutorial PartitionCount:1 ReplicationFactor:1 Configs:
Topic: tutorial Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Let's describe each argument in details:
--create and --list are actions we want to perform. Other actions are --alter, --delete and --describe.--zookeper is na address of zookeeper instance to usecases--replication-factor describes how many copies of each message should exist in a cluster. Higher number gives better consistensy and availability guarantees, but requires more disk space.--partitions specifies how many partitions are present in a topic. Higher partition number increases the maximal size of consumer group, but might increase latency and in some circumstances increases CPU and memory usage on consumers.--topic name of the topicOnce topic is created with desired configuration, we can start sending and recieving messages through this topic. Kafka libraries exist for all popular programming languages, but for educational purpose of this tutorial we can just use kafka-console-consumer.sh and kafka-console-producer.sh utilites packaged with Kafka distribution.
Open two new terminal windows, navigate to Kafka directory in both of them and in one of them execute following command to run producer:
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic tutorial
while in the second one execute command to run consumer:
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic tutorial
Now whatever you type in kafka-console-producer will be transfered through kafka and will be delivered to kafka-console-consumer window.
We have learned high level principles of Kafka architecture, started local instance of Kafka broker and transferred message from console producer to console consumer. I hope this tutorial was interesting as well as educational.