Apache Kafka® concepts#

A comprehensive glossary of essential Apache Kafka® terms and their meanings.

Broker#

A server that runs Apache Kafka and is responsible for storing, processing, and delivering messages. Brokers are typically deployed as a cluster for scalability and reliability. Each broker runs as its own process but cooperates with the other brokers in the cluster, and is distinct from companion tools such as Apache Kafka Connect.

Consumer#

An application that reads data from Apache Kafka, often processing or acting upon it. Various tools used with Apache Kafka ultimately function as either a producer or a consumer when communicating with Apache Kafka.

Consumer groups#

Consumer groups let an application scale beyond a single instance. Multiple instances of the application join the same group and coordinate to share the work: each consumer in the group is assigned different partitions, so the workload is distributed evenly.
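As a rough illustration of how partitions might be shared among the instances in one group, the sketch below hands them out round-robin. The function name `assign_partitions` and the instance names are hypothetical, and Kafka's real assignment strategies are configurable and more involved than this.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions across the consumers of one group, round-robin.

    Illustrative only: not Kafka's actual assignor implementation.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        consumer = consumers[index % len(consumers)]
        assignment[consumer].append(partition)
    return assignment


# Hypothetical group: three application instances, one six-partition topic.
assignment = assign_partitions(range(6), ["app-1", "app-2", "app-3"])
print(assignment)
```

In this model every partition is owned by exactly one instance in the group, which is what lets the instances work in parallel without reading the same messages twice.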

Event-driven architecture#

Application architecture centered around responding to and processing events.

Event#

A single discrete unit of data in Apache Kafka, consisting of a value (the message body) and often a key (an identifier, commonly also used to choose the partition) and headers (metadata about the message).
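The shape of an event can be sketched as a small data class. This is a minimal model for illustration, not the record type of any particular Kafka client; the class name `Event` and the sample values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Illustrative model of an event: a value plus optional key and headers."""
    value: bytes                                  # the message body
    key: Optional[bytes] = None                   # optional identifier
    headers: Tuple[Tuple[str, bytes], ...] = ()   # optional metadata pairs


# Hypothetical sensor reading with a key and one header.
reading = Event(
    value=b'{"temperature": 21.5}',
    key=b"sensor-42",
    headers=(("content-type", b"application/json"),),
)

# An event with only a value is also valid in this model.
bare = Event(value=b"hello")
```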

Kafka node#

See Broker

Kafka server#

See Broker

Message#

See Event

Partitioning#

A method used by Apache Kafka to split a topic across multiple servers. Each partition has one server acting as its leader, which shards the data across the cluster while preserving message order within each partition.
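A minimal sketch of key-based partitioning, assuming a simple hash-then-modulo rule: events with the same key always land in the same partition, so their relative order is kept. The CRC32 hash here is a stand-in chosen for the example, and `choose_partition` is a hypothetical helper; real Kafka clients use their own partitioner.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index (illustrative hash, not Kafka's)."""
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Append events in production order; each partition is its own ordered log.
events = [(b"sensor-1", "a"), (b"sensor-2", "b"), (b"sensor-1", "c")]
for key, value in events:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))

# All events for sensor-1 sit in one partition, in the order they were sent.
target = choose_partition(b"sensor-1", NUM_PARTITIONS)
sensor_1_values = [v for k, v in partitions[target] if k == b"sensor-1"]
```

Ordering is only guaranteed inside a single partition in this model; events with different keys may end up in different partitions and have no defined order relative to each other.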

Producer#

An application that writes data into Apache Kafka without concern for the data’s consumers. The data can range from well-structured to simple text, often accompanied by metadata.

Pub/sub#

A publish-subscribe messaging architecture in which publishers broadcast messages and every listening subscriber receives them, unlike point-to-point systems, where each message is delivered to a single receiver.
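The pattern can be sketched in memory, with no broker involved. The `subscribe` and `publish` helpers below are hypothetical names for this example and are not part of any Kafka API.

```python
from collections import defaultdict

# Topic name -> list of subscriber callbacks (in-memory stand-in for a broker).
subscribers = defaultdict(list)


def subscribe(topic, callback):
    """Register a callback to receive every message published to a topic."""
    subscribers[topic].append(callback)


def publish(topic, message):
    """Deliver a message to all current subscribers of the topic."""
    for callback in subscribers[topic]:
        callback(message)


# Two independent subscribers listen to the same topic.
dashboard, archive = [], []
subscribe("sensor-readings", dashboard.append)
subscribe("sensor-readings", archive.append)

publish("sensor-readings", "21.5")
```

Both subscribers receive the same message, and the publisher never needs to know who is listening.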

Queueing#

A messaging model in which messages are received in the order they were produced. Apache Kafka maintains a watermark (known as an offset) for each consumer group in every partition to track the most recent message read.
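The position tracking described above (Kafka calls the position an offset) can be modelled with an append-only list and one stored position per reader. This is a simplified sketch with hypothetical names (`PartitionLog`, `poll`, the group names); it covers a single partition and ignores failures and commit timing.

```python
class PartitionLog:
    """Append-only log for one partition; a list index acts as the offset."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just written

    def read_from(self, offset):
        return self.records[offset:]


log = PartitionLog()
for record in ["r0", "r1", "r2"]:
    log.append(record)

# Reader name -> next offset to read (the stored position).
committed = {}


def poll(group):
    """Return unread records for a group and advance its stored position."""
    start = committed.get(group, 0)
    batch = log.read_from(start)
    committed[group] = start + len(batch)
    return batch


first = poll("billing")   # reads everything written so far
log.append("r3")
second = poll("billing")  # resumes after the last record it read
other = poll("audit")     # a different reader starts from the beginning
```

Because each reader keeps its own position, reading does not remove records from the log, and several readers can consume the same data at their own pace.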

Record#

See Event

Replication#

Apache Kafka’s mechanism for copying data across multiple servers, so that data is preserved even if a server fails. The number of copies (the replication factor) is configurable per topic.
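To see why replication protects against a server failure, the sketch below places copies of each partition on different servers and then removes one server. The placement rule is a simple rotation invented for this example (`place_replicas` is a hypothetical helper), not Kafka's actual replica assignment.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Assign each partition to `replication_factor` distinct brokers.

    Illustrative rotation only; real clusters use their own placement logic.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the broker count")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


brokers = ["broker-1", "broker-2", "broker-3"]
placement = place_replicas(num_partitions=3, brokers=brokers, replication_factor=2)

# Simulate losing one broker: every partition still has a surviving copy.
failed = "broker-2"
surviving = {
    p: [b for b in replicas if b != failed]
    for p, replicas in placement.items()
}
```

With a replication factor of 2 in this model, any single server can fail without a partition losing all of its copies; a factor of 1 would offer no such protection.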

Topic#

A logical channel in Apache Kafka into which messages are organized. Topics have human-readable names, such as sensor-readings or kubernetes-logs.