Simplifying data pipelines with Apache Kafka
When you hear the terms producer, consumer, topic, broker, and cluster used together to describe a messaging system, something is brewing in the Kafka pipelines. Get connected and learn what that is, and what it means!
Many Big Data use cases have one thing in common – the use of Apache Kafka somewhere in the mix. Whether the distributed, partitioned, replicated commit log service is being used for messaging, website activity tracking, stream processing, or something else, there’s no denying it is a hot technology. In this course, you will learn how Kafka is used in the real world, along with its architecture and main components. You will quickly get up and running, producing and consuming messages using both the command line tools and the Java APIs. You will also get hands-on experience connecting Kafka to Spark and working with Kafka Connect.
Course Syllabus
Lesson 1 – Introduction to Apache Kafka
What Kafka is and why it was created
The Kafka Architecture
The main components of Kafka
Some of the use cases for Kafka
Lesson 2 – Kafka Command Line
The contents of Kafka’s bin directory
How to start and stop Kafka
How to create new topics
How to use the Kafka command line tools to produce and consume messages (a sketch follows this list)
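As a taste of the command line material, here is a minimal sketch of a local Kafka session. The topic name test-topic is made up for illustration, and the exact flags vary by Kafka version – older releases use --zookeeper and --broker-list where newer ones use --bootstrap-server:

    # Start ZooKeeper, then a Kafka broker, using the bundled configs (separate terminals)
    bin/zookeeper-server-start.sh config/zookeeper.properties
    bin/kafka-server-start.sh config/server.properties

    # Create a topic and confirm it exists
    bin/kafka-topics.sh --create --topic test-topic --partitions 1 \
        --replication-factor 1 --bootstrap-server localhost:9092
    bin/kafka-topics.sh --list --bootstrap-server localhost:9092

    # Produce messages (one per line, Ctrl-C to stop), then read them back
    bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
    bin/kafka-console-consumer.sh --topic test-topic --from-beginning \
        --bootstrap-server localhost:9092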
Lesson 3 – Kafka Producer Java API
The Kafka producer client
Some of the KafkaProducer configuration settings and what they do
How to create a Kafka producer using the Java API and send messages both synchronously and asynchronously (see the sketch below)
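To preview the producer API, here is a minimal sketch using the KafkaProducer client; the broker address and the topic name test-topic are assumptions for illustration:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class SimpleProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Synchronous send: get() blocks until the broker acknowledges the record
                RecordMetadata meta =
                    producer.send(new ProducerRecord<>("test-topic", "key1", "hello")).get();
                System.out.printf("Wrote to %s-%d at offset %d%n",
                    meta.topic(), meta.partition(), meta.offset());

                // Asynchronous send: the callback runs when the send completes or fails
                producer.send(new ProducerRecord<>("test-topic", "key2", "world"),
                    (metadata, exception) -> {
                        if (exception != null) exception.printStackTrace();
                        else System.out.println("Async write at offset " + metadata.offset());
                    });
            }
        }
    }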
Lesson 4 – Kafka Consumer Java API
The Kafka consumer client
Some of the KafkaConsumer configuration settings and what they do
How to create a Kafka consumer using the Java API (see the sketch below)
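And a matching sketch of the consumer side, again assuming a local broker and made-up topic and group names; note that poll(Duration) is the newer signature, while older clients use poll(long):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SimpleConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "demo-group");        // consumers sharing a group.id split the partitions
            props.put("auto.offset.reset", "earliest"); // start from the beginning if no offset is committed
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("test-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                    }
                }
            }
        }
    }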
Lesson 5 – Kafka Connect and Spark Streaming
Kafka Connect and how to use a pre-built connector (a configuration sketch follows this list)
Some of the components of Kafka Connect
How to use Kafka and Spark Streaming together, as sketched below
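For the Kafka Connect portion, a pre-built connector is driven by a small properties file and run with the standalone worker script. A minimal sketch, assuming the FileStreamSource connector that ships with Kafka; the file path and topic name are made up:

    # file-source.properties: tail a file, publishing each new line to a topic
    name=local-file-source
    connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
    tasks.max=1
    file=/tmp/access.log
    topic=connect-test

It would be launched with the worker script from Kafka's bin directory:

    bin/connect-standalone.sh config/connect-standalone.properties file-source.properties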
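For the Spark Streaming side, here is a sketch in Java, assuming the spark-streaming-kafka-0-10 integration artifact is on the classpath (the exact API differs across Spark and Kafka versions); it simply counts the records arriving on the assumed test-topic in each micro-batch:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaSparkDemo {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("KafkaSparkDemo").setMaster("local[2]");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "localhost:9092");
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "spark-demo");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("test-topic"), kafkaParams));

            // Count the records in each micro-batch as a minimal sanity check
            stream.map(ConsumerRecord::value).count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }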