Cyberithub

How to Install Apache Kafka on Ubuntu 18.04

In this article, I will take you through the steps to install Apache Kafka on Ubuntu 18.04. Kafka was originally developed at LinkedIn and became an open source project in 2011. Since then it has evolved into a widely used real-time data pipeline platform. Kafka provides a distributed streaming platform that you can use to store and process streams of records in a fault-tolerant way.

Apache Kafka Tutorial

In this Kafka tutorial, I will cover the following topics:

  • Kafka Architecture
  • Install Apache Kafka Server
  • Prerequisites
  • Update System
  • Apache Kafka Download
  • Extract Tar File
  • Start Zookeeper Service
  • Start Kafka As a Service
  • Create a Topic
  • Send Messages using Kafka Console Producer
  • Receive Messages using Kafka Console Consumer
  • Kafka Delete Topic (optional)
  • How to find out Kafka Version
  • What is the Default Kafka Port?

Kafka Cluster Architecture

[Figure: Kafka cluster architecture]

Some Terminology and Concepts:

Producer: A producer writes data to a topic. It only needs to connect to one broker; Kafka takes care of routing the data to the right broker.

Consumer: A consumer reads data from a topic. It also only needs to connect to one broker; Kafka takes care of pulling the data from the right broker.

Broker: A broker is a Kafka server. It stores data and tells clients which broker in the cluster to send data to or pull data from.

Topic: A topic is a named stream of records. Producers write the data they generate to a topic, and consumers read that data from it.

Partition: A partition is an ordered, append-only subdivision of a topic. Each partition is stored on and managed by a broker.

Offset: An offset is a unique, sequential ID for a message within a partition.

Consumer Groups: A consumer group is a set of consumers acting as a single logical unit.
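
To make the partition and offset ideas more concrete, here is a toy sketch in shell. It models a single partition as a plain text file in which each line is a record and the zero-based line number plays the role of the offset. The helper names append_record and read_from_offset and the temporary file are hypothetical and exist only for illustration; Kafka's real storage format is different.

```shell
#!/usr/bin/env bash
# Toy model only: one partition represented as an append-only text file.
# Kafka's on-disk format is different; the names here are illustrative.
partition_file="$(mktemp)"

# Append one record to the end of the partition.
append_record() {
  printf '%s\n' "$1" >> "$partition_file"
}

# Print every record starting at the given zero-based offset.
read_from_offset() {
  tail -n "+$(( $1 + 1 ))" "$partition_file"
}

append_record "Hello"
append_record "from CyberITHub"

# A consumer that has already processed offset 0 resumes at offset 1.
read_from_offset 1
```

Reading from offset 0 instead returns both records, which is roughly what the --from-beginning option of the console consumer does later in this article.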

Install Apache Kafka Server

Before going through the steps to install the Apache Kafka server, make sure the following prerequisites are met.

Step 1: Prerequisites

a) You need a running Ubuntu 18.04 machine on which to install the Apache Kafka server.

b) At least 4GB of memory is required for a single-node setup. A multi-node setup running heavy processing will require more.

c) A Java runtime (Java 8 or later) must be installed, because both Kafka and ZooKeeper run on the JVM.
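
The prerequisites can be checked with a short script before you continue. The sketch below is only one possible approach: the helper name meets_memory_requirement and the 3500 MiB threshold are assumptions chosen for illustration, since a machine sold as "4GB" usually reports slightly less usable memory.

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check. The 3500 MiB threshold is an assumption:
# a machine sold as "4GB" usually reports slightly less in MemTotal.
min_mem_mib=3500

# Succeeds if total_kib (as listed in /proc/meminfo) is at least min_mib MiB.
meets_memory_requirement() {
  local total_kib="$1" min_mib="$2"
  [ "$total_kib" -ge $(( min_mib * 1024 )) ]
}

if [ -r /proc/meminfo ]; then
  total_kib="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  total_kib="${total_kib:-0}"
  if meets_memory_requirement "$total_kib" "$min_mem_mib"; then
    echo "Memory OK: $(( total_kib / 1024 )) MiB"
  else
    echo "Warning: only $(( total_kib / 1024 )) MiB of memory detected"
  fi
fi

# Kafka and ZooKeeper run on the JVM, so a Java runtime must be on the PATH.
if command -v java >/dev/null 2>&1; then
  echo "Java found: $(command -v java)"
else
  echo "Java not found; install a JDK or JRE before starting Kafka"
fi
```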

Step 2: Update System

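Refresh the package index before proceeding with the Kafka installation. Note that apt-get update only refreshes the list of available packages; run apt-get upgrade afterwards if you also want to upgrade the installed packages.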

root@znode1:~# apt-get update
Hit:1 http://in.archive.ubuntu.com/ubuntu bionic InRelease
Get:2 http://in.archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:3 http://in.archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:4 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Hit:5 http://ppa.launchpad.net/ansible/ansible/ubuntu bionic InRelease
Get:6 http://in.archive.ubuntu.com/ubuntu bionic-updates/main amd64 DEP-11 Metadata [294 kB]
................................................................................................................

Step 3: Download and Install Kafka

Download Apache Kafka using wget. At the time of writing, kafka_2.11-2.4.0 (Kafka 2.4.0 built for Scala 2.11) is the latest release. You can pick a different release from the Apache Kafka download page depending on your requirements.

root@znode1:~# wget http://mirrors.estointernet.in/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
--2020-01-01 21:14:48-- http://mirrors.estointernet.in/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
Resolving mirrors.estointernet.in (mirrors.estointernet.in)... 103.97.84.254, 2403:8940:2::f
Connecting to mirrors.estointernet.in (mirrors.estointernet.in)|103.97.84.254|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 70057083 (67M) [application/octet-stream]
Saving to: ‘kafka_2.11-2.4.0.tgz’

kafka_2.11-2.4.0.tgz 100%[=====================================================================================>] 66.81M 5.99MB/s in 10s

2020-01-01 21:14:59 (6.58 MB/s) - ‘kafka_2.11-2.4.0.tgz’ saved [70057083/70057083]
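
The release file name follows the pattern kafka_<Scala version>-<Kafka version>.tgz. Mirrors tend to drop older releases over time, so the sketch below builds a URL for archive.apache.org instead, assuming that site's usual directory layout. The helper names kafka_archive_name and kafka_archive_url are hypothetical.

```shell
#!/usr/bin/env bash
# Build the release file name: kafka_<scala version>-<kafka version>.tgz
kafka_archive_name() {
  local scala_version="$1" kafka_version="$2"
  printf 'kafka_%s-%s.tgz\n' "$scala_version" "$kafka_version"
}

# Build a download URL, assuming the directory layout of archive.apache.org.
kafka_archive_url() {
  local scala_version="$1" kafka_version="$2"
  printf 'https://archive.apache.org/dist/kafka/%s/%s\n' \
    "$kafka_version" "$(kafka_archive_name "$scala_version" "$kafka_version")"
}

# Print the URL for the release used in this article.
kafka_archive_url 2.11 2.4.0
```

The printed URL can then be passed to wget, for example: wget "$(kafka_archive_url 2.11 2.4.0)"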

Step 4: Extract Tar File

Extract the downloaded Kafka package into the current directory using the tar -xvf command.

root@znode1:~# tar -xvf kafka_2.11-2.4.0.tgz

Then move the extracted directory into the /opt directory.

root@znode1:~# mv kafka_2.11-2.4.0/ /opt

Step 5: Start Zookeeper Service

Go to the Kafka directory and start the ZooKeeper service in the background using the command below.

root@znode1:~# cd /opt/kafka_2.11-2.4.0/
root@znode1:/opt/kafka_2.11-2.4.0# bin/zookeeper-server-start.sh config/zookeeper.properties &
[2020-01-02 07:42:55,381] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-01-02 07:42:55,389] WARN config/zookeeper.properties is relative. Prepend ./ to indicate that you're sure! (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-01-02 07:42:55,399] INFO clientPortAddress is 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-01-02 07:42:55,400] INFO secureClientPort is not set

.......................................................................................................................

To stop the ZooKeeper server, run the following command:

root@znode1:/opt/kafka_2.11-2.4.0# bin/zookeeper-server-stop.sh

Step 6: Start Kafka as a Service

Once the ZooKeeper service has started successfully, you can go ahead and start the Kafka service in the background.

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-server-start.sh config/server.properties &

[2020-01-02 07:43:28,733] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2020-01-02 07:43:29,451] INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
[2020-01-02 07:43:29,453] INFO starting (kafka.server.KafkaServer)
[2020-01-02 07:43:29,455] INFO Connecting to zookeeper on localhost:2181 (kafka.server.KafkaServer)
[2020-01-02 07:43:29,483] INFO [ZooKeeperClient Kafka server] Initializing a new session to localhost:2181. (kafka.zookeeper.ZooKeeperClient)

.............................................................................................................
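
Because both services were started in the background, it can be useful to confirm that they are accepting connections before moving on. The sketch below probes the two ports used in this article (2181 for ZooKeeper and 9092 for Kafka). The helper names port_open and wait_for_port are hypothetical, and the approach assumes bash with /dev/tcp support and the coreutils timeout command. An open port is only a rough signal and does not guarantee that the broker has finished starting.

```shell
#!/usr/bin/env bash
# Succeeds if a TCP connection to host:port can be opened within 2 seconds.
# Relies on bash's /dev/tcp redirection and the coreutils timeout command.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Retry port_open once per second, up to max_attempts times.
wait_for_port() {
  local host="$1" port="$2" max_attempts="$3" attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    if port_open "$host" "$port"; then
      return 0
    fi
    attempt=$(( attempt + 1 ))
    sleep 1
  done
  return 1
}

# Single probe of each service; 2181 and 9092 are the ports used in this article.
for port in 2181 9092; do
  if port_open localhost "$port"; then
    echo "port $port: accepting connections"
  else
    echo "port $port: not reachable"
  fi
done
```

In a script, a call such as wait_for_port localhost 9092 30 would wait up to about 30 seconds for the broker before continuing.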

To stop the Kafka server, run the following command:

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-server-stop.sh
[2020-01-02 08:32:14,118] INFO Terminating process due to signal SIGTERM (org.apache.kafka.common.utils.LoggingSignalHandler)
[2020-01-02 08:32:14,120] INFO [KafkaServer id=0] shutting down (kafka.server.KafkaServer)
[2020-01-02 08:32:14,121] INFO [KafkaServer id=0] Starting controlled shutdown (kafka.server.KafkaServer)
[2020-01-02 08:32:14,171] INFO [KafkaServer id=0] Controlled shutdown succeeded (kafka.server.KafkaServer)

.........................................................................................................................

Step 7: Create a Topic

Once both services have started successfully, you can create your first topic using the command below. A topic is a named stream of records that producers write to and consumers subscribe to. The --partitions option sets how many partitions the topic has, and --replication-factor sets how many copies of each partition are kept; on a single-node setup the replication factor can only be 1. Note that Kafka 3.0 and later removed the --zookeeper option from this tool in favor of --bootstrap-server.

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstTopic
Created topic FirstTopic.

Verify that the topic exists using the command below:

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-topics.sh --list --zookeeper localhost:2181
FirstTopic
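
Kafka restricts topic names to ASCII letters, digits, '.', '_' and '-', with a maximum length of 249 characters, and it rejects the names "." and "..". If you generate topic names in a script, a check like the following sketch can catch bad names before calling kafka-topics.sh. The function valid_topic_name is a hypothetical helper and is not part of Kafka.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the argument is a legal Kafka topic name.
valid_topic_name() {
  local name="$1"
  # Must be non-empty and at most 249 characters long.
  [ -n "$name" ] && [ "${#name}" -le 249 ] || return 1
  # The names "." and ".." are not allowed.
  [ "$name" != "." ] && [ "$name" != ".." ] || return 1
  # Reject any character outside letters, digits, '.', '_' and '-'.
  case "$name" in
    *[!A-Za-z0-9._-]*) return 1 ;;
  esac
  return 0
}

for candidate in "FirstTopic" "bad topic" "orders.v1"; do
  if valid_topic_name "$candidate"; then
    echo "'$candidate' is a valid topic name"
  else
    echo "'$candidate' is not a valid topic name"
  fi
done
```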

Step 8: Send Data using Kafka Console Producer

It is now time to send data to your topic using the console producer. Open a new shell and type a few messages, one per line. When you are done, you can either exit the producer with Ctrl+C or leave it running.

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstTopic
>Hello
>from CyberITHub
>

Step 9: Receive Data using Kafka Console Consumer

You can simultaneously open another shell and run the command below to consume the data sent by the producer. The --from-beginning option makes the consumer read the topic from the earliest available message rather than only new ones.

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstTopic --from-beginning
Hello
from CyberITHub

Step 10: Kafka Delete Topic(Optional)

If you want to delete a topic, you can do so with the command below.

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic FirstTopic

[2020-01-02 08:26:17,170] INFO [Log partition=FirstTopic-0, dir=/tmp/kafka-logs] Deleting segments List(LogSegment(baseOffset=0, size=157, lastModifiedTime=1577932597085, largestTime=1577932596081)) (kafka.log.Log)
[2020-01-02 08:26:17,175] INFO Deleted log /tmp/kafka-logs/FirstTopic-0.4f1f7a0b4a4c461eb5dc638e7a7886cb-delete/00000000000000000000.log.deleted. (kafka.log.LogSegment)
[2020-01-02 08:26:17,185] INFO Deleted offset index /tmp/kafka-logs/FirstTopic-0.4f1f7a0b4a4c461eb5dc638e7a7886cb-delete/00000000000000000000.index.deleted. (kafka.log.LogSegment)
[2020-01-02 08:26:17,193] INFO Deleted time index /tmp/kafka-logs/FirstTopic-0.4f1f7a0b4a4c461eb5dc638e7a7886cb-delete/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)
[2020-01-02 08:26:17,204] INFO Deleted log for partition FirstTopic-0 in /tmp/kafka-logs/FirstTopic-0.4f1f7a0b4a4c461eb5dc638e7a7886cb-delete. (kafka.log.LogManager)
Topic FirstTopic is marked for deletion.
[2020-01-02 08:26:19,855] INFO [GroupMetadataManager brokerId=0] Group console-consumer-28412 transitioned to Dead in generation 4 (kafka.coordinator.group.GroupMetadataManager)

.............................................................................................................................................

How to find out Kafka Version?

To check the Kafka version, run the command below:

root@znode1:/opt/kafka_2.11-2.4.0# bin/kafka-topics.sh --version
2.4.0 (Commit:77a89fcf8d7fa018)

What is the default Kafka Port?

By default, Kafka brokers listen on TCP port 9092.
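
The port comes from the listeners setting in config/server.properties; when that line is commented out, the broker falls back to 9092. The sketch below extracts the port from such a file. The function listener_port is a hypothetical helper that only handles simple entries of the form listeners=PROTOCOL://host:port, so treat it as a starting point rather than a full parser.

```shell
#!/usr/bin/env bash
# Read broker properties on stdin and print the port of the first listener.
# Falls back to 9092 when no uncommented listeners= line is present.
listener_port() {
  local port
  port="$(sed -n 's|^listeners=[A-Z_]*://[^:,]*:\([0-9]*\).*|\1|p' | head -n 1)"
  echo "${port:-9092}"
}

# Example input: the listeners line is commented out, so the default applies.
printf '#listeners=PLAINTEXT://:9092\nbroker.id=0\n' | listener_port

# Example input: an explicit listener on a non-default port.
printf 'listeners=PLAINTEXT://localhost:9093\n' | listener_port
```

Against a real installation you would run it as listener_port < config/server.properties from the Kafka directory.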

Reference: Kafka Documentation