PySpark with Kafka
Start ZooKeeper and the Kafka Broker
Create a Topic and Start a Producer
Read from a Kafka Topic
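A minimal sketch of this step, assuming a SparkSession that was started with the spark-sql-kafka-0-10 connector package available. The broker address localhost:9092 and the topic name demo-topic are placeholders, and the helper names kafka_source_options and read_kafka_stream are made up for illustration.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # host:port list of brokers
        "subscribe": topic,                            # topic(s) to read from
        "startingOffsets": starting_offsets,           # "earliest", "latest", or JSON
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame backed by the Kafka topic.

    `spark` is an existing SparkSession (assumed to have the Kafka
    connector on its classpath).
    """
    return spark.readStream.format("kafka").options(**options).load()


# Placeholder broker address and topic name (assumptions, not from the source).
options = kafka_source_options("localhost:9092", "demo-topic", "earliest")
```

With a live session, read_kafka_stream(spark, options) would return a DataFrame with the Kafka source columns key, value, topic, partition, offset, timestamp, and timestampType.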
Process Data
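One possible processing step is sketched below. Kafka hands Spark the key and value columns as binary, so they are usually cast to strings before any further work. The helper names decode_records and count_by_value, and the choice of counting records per value, are illustrative assumptions rather than the processing used in the original walkthrough.

```python
# Kafka delivers key and value as binary columns, so cast them to strings
# and keep the metadata columns that identify each record.
DECODE_EXPRS = [
    "CAST(key AS STRING) AS key",
    "CAST(value AS STRING) AS value",
    "topic",
    "partition",
    "offset",
    "timestamp",
]


def decode_records(df):
    """Cast the binary Kafka payload to strings on a Kafka-sourced DataFrame."""
    return df.selectExpr(*DECODE_EXPRS)


def count_by_value(decoded_df):
    """Example transformation (assumed): count how often each value occurs."""
    return decoded_df.groupBy("value").count()
```

Both helpers work on batch and streaming DataFrames alike. Note that a streaming aggregation such as count_by_value generally needs the complete or update output mode when it is written out.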
Output Sink
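As a rough sketch, the console sink is a convenient way to inspect a streaming result while developing. The helper name write_to_console and the optional checkpoint directory are assumptions for illustration; other sinks (files, Kafka, foreach) follow the same writeStream pattern.

```python
def write_to_console(df, output_mode="append", checkpoint_dir=None):
    """Start a streaming query that prints each micro-batch to the console.

    `df` is a streaming DataFrame. Use output_mode="complete" or "update"
    when `df` contains an aggregation.
    """
    writer = (
        df.writeStream
        .format("console")
        .outputMode(output_mode)
        .option("truncate", "false")  # print full column contents
    )
    if checkpoint_dir is not None:
        # Checkpointing lets the query resume from its last committed offsets.
        writer = writer.option("checkpointLocation", checkpoint_dir)
    return writer.start()
```

The returned StreamingQuery keeps running in the background; calling awaitTermination() on it blocks the driver until the query stops.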
Specify Offsets for Reading Data (Batch Processing)
Syntax
{"topic":{"partition":offset}} (a JSON string; topic names and partition numbers are quoted keys)
Offset -2: Earliest Message
Offset -1: Latest Message
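The offset syntax above can be sketched as a small helper that builds the JSON string and passes it to a batch read. The names offsets_json and read_kafka_batch, the topic demo-topic, and the example offsets are hypothetical; only the option names and the -2 / -1 sentinels come from the Kafka source itself.

```python
import json

EARLIEST = -2  # sentinel offset: earliest message in the partition
LATEST = -1    # sentinel offset: latest message in the partition


def offsets_json(topic, partition_offsets):
    """Render {partition: offset} for one topic as the JSON string Spark expects."""
    return json.dumps({topic: {str(p): o for p, o in partition_offsets.items()}})


def read_kafka_batch(spark, bootstrap_servers, topic, starting, ending):
    """Batch-read a bounded offset range from a Kafka topic.

    `spark` is an existing SparkSession with the Kafka connector available.
    """
    return (
        spark.read.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", starting)
        .option("endingOffsets", ending)
        .load()
    )


# Hypothetical two-partition topic: start at offset 5 on partition 0 and at
# the earliest message on partition 1, then read up to the latest on both.
starting = offsets_json("demo-topic", {0: 5, 1: EARLIEST})
ending = offsets_json("demo-topic", {0: LATEST, 1: LATEST})
```

According to the Spark Kafka integration guide, a batch query does not accept -1 (latest) in startingOffsets or -2 (earliest) in endingOffsets, which is why the sentinels are split across the two options here.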
Structured Streaming + Kafka Integration Guide - Spark 3.5.0 Documentation