gRPC vs Kafka

February 16, 2024

Overview

A comparison of gRPC and Kafka is really a comparison of gRPC's streaming features with Kafka. gRPC's unary RPC follows the Request/Response paradigm, whether it is called synchronously or asynchronously, while Kafka and gRPC Streaming both lend themselves to Event-Driven Architecture. A Request/Response journey typically expects an immediate response, whereas an Event-Driven journey is typically send-and-forget, with durations anywhere from milliseconds to days in distributed systems. Acknowledgement of a triggered Event-Driven journey is still common, as are follow-up events sent in response.

High-Level Guidance

If latency is the most important factor, choose gRPC Streaming: gRPC Streaming doesn’t require anything like Kafka’s intermediary message broker between a client and server. Its streams are direct between client and server, minimizing latency.
If reliability out-of-the-box is the most important factor, choose Kafka: gRPC Streaming doesn’t provide application-level guaranteed delivery on its own. In contrast, Kafka not only supports delivery guarantees at the application level (at-least-once in typical configurations), but also persists messages on its distributed message broker, for 7 days by default (the broker setting log.retention.hours defaults to 168).

For more detail, continue reading:

Comparison

Architectural Diagrams

gRPC Streaming

Kafka

Advanced Trade-Offs

Conclusion

Comparison

Criteria	gRPC Streaming	Kafka
Use Case	- Direct communication between services. - Less resource overhead. - Technical aptitude needs to be high to handle all edge cases.	- Indirect communication between services. - More resource overhead. - Relatively less technical aptitude necessary for edge cases due to broker-based handling.
Latency	Low latency due to direct communication without intermediary hops (often single-digit ms).	Generally low (single-digit ms), but can increase if producers outpace consumers.
Throughput	Dependent on infrastructure and design. Harder to scale seamlessly.	Extremely high due to distributed brokers and partitioning.
Guaranteed Delivery	TCP guarantees delivery at the transport level, but application-level delivery guarantees must be implemented.	Available out-of-the-box: messages are persisted on the broker, and consumers commit offsets so they can resume where they left off (typically at-least-once).
Scalability	Some support with client-side load balancing. Server-side needs additional load balancers or custom orchestration.	Highly scalable (brokers, partitions) with built-in persistence (7 days of retention by default).
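
To make the partitioning idea in the Throughput and Scalability rows concrete, here is a minimal sketch of key-based partitioning in plain Python. It does not use a Kafka client: the function names and the CRC32 hash are assumptions chosen for illustration (Kafka's Java client uses a different hash). The point is only that records with the same key land on the same partition, so partitions can be consumed in parallel while per-key ordering is preserved.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32, for illustration only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_to_partitions(records, num_partitions):
    """Group (key, value) records by partition, keeping arrival order within each."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


# Hypothetical events keyed by user ID.
events = [("user-a", "login"), ("user-b", "login"), ("user-a", "logout")]
layout = assign_to_partitions(events, num_partitions=3)
```

Because the hash is stable, every event for "user-a" ends up in one partition, in the order it was produced.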

Note: Learn more about Throughput vs Latency.

Architectural Diagrams

Below are simplified diagrams illustrating the difference between the direct (gRPC Streaming) and broker-mediated (Kafka) approaches.

gRPC Streaming Flow

+----------+     (1) Request Stream    +----------+
| gRPC     |-------------------------> | gRPC     |
| Client   |                           | Server   |
+----------+                           +----------+
         ^                                |
         |      (2) Response Stream       |
         +--------------------------------+

Direct communication between client and server.
Low latency but ephemeral connections; if either side fails, the stream is lost.
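
Since a lost stream has to be handled by the application, the following sketch simulates one common approach: the client keeps unacknowledged messages in a buffer and resends them after reconnecting, while the server ignores sequence numbers it has already seen. Everything here (FlakyServer, StreamBroken, send_with_resume) is a hypothetical stand-in written in plain Python. It is not the gRPC API, and a real client would also back off between attempts.

```python
from collections import deque


class StreamBroken(Exception):
    """Raised by the simulated transport when the connection drops."""


class FlakyServer:
    """Stand-in for the remote peer: stores messages and drops the stream once."""

    def __init__(self, fail_at_seq):
        self.fail_at_seq = fail_at_seq
        self.failed = False
        self.seen = set()      # sequence numbers already stored
        self.received = []     # payloads, stored at most once each

    def receive(self, seq, payload):
        if seq not in self.seen:           # deduplicate resent messages
            self.seen.add(seq)
            self.received.append(payload)
        if seq == self.fail_at_seq and not self.failed:
            self.failed = True
            # The message was stored, but the acknowledgement never arrives.
            raise StreamBroken(f"stream dropped at seq {seq}")


def send_with_resume(server, payloads, max_attempts=3):
    """Send payloads in order; after a broken stream, resend from the first unacked one."""
    pending = deque(enumerate(payloads))   # unacknowledged (seq, payload) pairs
    for attempt in range(1, max_attempts + 1):
        try:
            while pending:
                seq, payload = pending[0]
                server.receive(seq, payload)   # returns only when acknowledged
                pending.popleft()
            return attempt
        except StreamBroken:
            continue   # a real client would re-open the stream here
    raise RuntimeError("gave up after repeated stream failures")


server = FlakyServer(fail_at_seq=2)
attempts = send_with_resume(server, ["a", "b", "c", "d"])
```

In this run the stream breaks once, the client resends message 2 on the second attempt, and the server's sequence-number check keeps it from being stored twice.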

Kafka Message Flow

+---------+  (1) Publish Messages  +--------+
| Producer| ---------------------> | Kafka  | -- (2) Stores Messages
| (Client)|                        | Broker |
+---------+                        +--------+
                                        \
                                         \
                                          v (3) Consume Messages
                                      +----------+
                                      | Consumer |
                                      +----------+

Indirect communication with a broker in between.
Messages persist for a configurable retention period; if a consumer is offline, it can catch up later.
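
The catch-up behaviour described above can be sketched with an in-memory, append-only log. This is a simplified model under stated assumptions: TopicLog and its methods are hypothetical names, it stands in for a single partition, and it ignores retention, replication, and networking. It only shows how stored messages plus a committed offset let a consumer resume where it left off.

```python
class TopicLog:
    """Append-only in-memory log standing in for one topic partition."""

    def __init__(self):
        self._records = []
        self._committed = {}   # consumer group -> next offset to read

    def publish(self, value):
        """Append a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return (offset, value) pairs starting at the group's committed offset."""
        start = self._committed.get(group, 0)
        window = self._records[start:start + max_records]
        return list(enumerate(window, start=start))

    def commit(self, group, next_offset):
        """Record the offset the group should resume from."""
        self._committed[group] = next_offset


log = TopicLog()
for event in ["created", "paid", "shipped"]:
    log.publish(event)

# The consumer handles two records, commits, then goes offline.
batch = log.poll("billing", max_records=2)
log.commit("billing", batch[-1][0] + 1)

# More messages arrive while the consumer is away.
log.publish("delivered")
log.publish("reviewed")

# Back online: polling resumes at the committed offset, so nothing is missed.
caught_up = log.poll("billing")
```

A second consumer group polling the same log would start from offset 0, independently of "billing".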

gRPC Streaming

gRPC is touched on in gRPC vs REST. When comparing it to REST, we typically compare its unary RPC feature. However, when comparing it to Kafka, we compare its streaming features. There are three different streaming modes:

Server Streaming RPC: Server sends a stream of messages in response to a client's request.
Client Streaming RPC: Client sends a stream of messages to the server.
Bidirectional Streaming RPC: A two-way stream where both sides can send messages independently.
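
The shape of the three modes can be illustrated with plain Python iterators. These functions involve no network and no gRPC library. They are a sketch, with made-up names and payloads, of what "a stream in" and "a stream out" mean for each mode.

```python
from typing import Iterable, Iterator


def server_streaming(request: int) -> Iterator[int]:
    """One request in, a stream of responses out (here: a countdown)."""
    for n in range(request, 0, -1):
        yield n


def client_streaming(requests: Iterable[int]) -> int:
    """A stream of requests in, one response out (here: their sum)."""
    return sum(requests)


def bidirectional_streaming(requests: Iterable[int]) -> Iterator[int]:
    """Responses are produced while requests are still arriving (here: a running total)."""
    total = 0
    for value in requests:
        total += value
        yield total


countdown = list(server_streaming(3))
total = client_streaming(iter([1, 2, 3]))
running = list(bidirectional_streaming(iter([1, 2, 3])))
```

The bidirectional function yields a response after each incoming value, so the caller can read the first result before the input stream has finished.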

Kafka

Note that Kafka and Kafka Streams operate within the same infrastructure framework, but serve different purposes:

Kafka: A distributed streaming platform that enables you to reliably stream messages from producers to consumers via an intermediary message broker in a highly fault-tolerant manner.
Kafka Streams: Uses the same Kafka infrastructure, but provides a higher-level client library for stream processing (for example, filtering, joining, and aggregating topics) that handles many technical details under the hood.

Advanced Trade-Offs

Below is a second table that delves deeper into operational and developer-experience aspects.

Criteria	gRPC Streaming	Kafka
Processing Model	Streams are ephemeral; the client and server must remain connected for continuous streaming.	Messages are persisted in topics for a configurable retention time; consumers can catch up any time.
Failure Handling	If the client or server fails, the stream terminates. Developer must handle retries, buffering, etc.	Broker retains messages; if a consumer fails, it can resume from the last committed offset after recovery.
Operational Complexity	Requires advanced code-based logic for complex retries, deduplication, etc.	Requires dedicated infrastructure (brokers, plus ZooKeeper or KRaft controllers), but simplifies client retry logic.
Security & Encryption	Typically runs over HTTP/2 with TLS. Configuration is relatively straightforward per service.	TLS (configured as SSL) for encryption in flight. Encryption at rest is not built in and is usually handled at the disk or volume level. Requires more config, but well-documented.
Cost Considerations	Potentially lower infra cost (no broker), but higher dev cost for advanced features.	Higher infra cost for brokers and storage, but possibly lower dev overhead for reliability features.
Idempotency	Must be handled at the application level when re-sending messages.	Consumer logic can handle repeated messages by checking offsets or unique IDs.
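
The Failure Handling and Idempotency rows interact: if a consumer crashes after handling a message but before committing its offset, the message is delivered again on restart. The sketch below simulates that case in plain Python and shows deduplication by unique ID. The Handler class, the consume function, and the event fields are assumptions made for illustration, not part of any Kafka client.

```python
class Handler:
    """Applies payment events to a balance, optionally skipping IDs it has seen."""

    def __init__(self, dedupe=True):
        self.dedupe = dedupe
        self.seen_ids = set()
        self.balance = 0

    def handle(self, event):
        if self.dedupe and event["id"] in self.seen_ids:
            return   # duplicate delivery: the effect was already applied
        self.seen_ids.add(event["id"])
        self.balance += event["amount"]


def consume(records, handler, committed, crash_after=None):
    """Handle records starting at `committed`, committing after each one.

    `crash_after` simulates a crash after handling that offset but before
    committing it. Returns the last committed offset.
    """
    for offset in range(committed, len(records)):
        handler.handle(records[offset])
        if offset == crash_after:
            return committed   # the side effect happened, the commit did not
        committed = offset + 1
    return committed


records = [
    {"id": "evt-1", "amount": 10},
    {"id": "evt-2", "amount": 20},
    {"id": "evt-3", "amount": 30},
]

handler = Handler(dedupe=True)
offset_after_crash = consume(records, handler, committed=0, crash_after=1)
final_offset = consume(records, handler, committed=offset_after_crash)
```

After the restart, evt-2 is delivered a second time because its offset was never committed. With deduplication enabled the balance still reflects each event once. With dedupe=False, the same sequence would count evt-2 twice.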

Conclusion

It’s unusual to introduce gRPC Streaming into a distributed system where Kafka (or another message broker) is already widely used, unless a use case arises where extremely low latency is of utmost importance (e.g., video/audio chat, real-time collaboration). That said, gRPC Streaming is the better option if most of the following are true:

The technical aptitude exists.
The feature’s scale is not extremely high.
The feature’s traffic is not highly variable.
The overhead of introducing Kafka is too high.

In contrast, if your system demands guaranteed message persistence, high or highly variable throughput, and simpler consumer failover, Kafka is typically the better choice.

Quick Decision Flowchart

                      +-----------------------------------+
                      |   Do you need very low latency?   |
                      +-----------------------------------+
                                /              \
                               / YES            \ NO
                              v                  v
+--------------------------------------+   +--------------------------------------+
|  Do you also need guaranteed         |   | Is guaranteed message persistence    |
|  message persistence out-of-the-box? |   | or replay crucial for your use case? |
+---------+--------------+-------------+   +----------+--------------+------------+
       NO |              | YES                     NO |              | YES
          v              v                            v              v
Use gRPC Streaming    Use Kafka             Use gRPC Streaming    Use Kafka
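
The flowchart can also be written as a small function. This sketch mirrors the chart's branches literally. The function and parameter names are made up for illustration. Note that, as drawn, both latency branches end in the same persistence question, so in this simplified chart the persistence answer decides the outcome.

```python
def recommend(needs_very_low_latency: bool, needs_persistence_or_replay: bool) -> str:
    """Follow the flowchart: ask about latency first, then about persistence."""
    if needs_very_low_latency:
        # Left branch: is guaranteed persistence out-of-the-box also needed?
        if needs_persistence_or_replay:
            return "Kafka"
        return "gRPC Streaming"
    # Right branch: is guaranteed persistence or replay crucial?
    if needs_persistence_or_replay:
        return "Kafka"
    return "gRPC Streaming"


choice = recommend(needs_very_low_latency=True, needs_persistence_or_replay=False)
```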
