How does Spark Structured Streaming pull data when consuming from Kafka?
```scala
import java.util.concurrent.TimeUnit

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.streaming.{OutputMode, StreamingQuery, Trigger}

val spark: SparkSession = SparkSession.builder()
  .appName("spark-streaming-kafka")
  .master("local[*]")
  .getOrCreate()

val kafkaDF: DataFrame = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "test-topic")
  .option("startingOffsets", "earliest")             // option names are case-sensitive
  .option("kafka.group.id", "spark-streaming-kafka") // Kafka params need the "kafka." prefix; Spark 3.0+
  .option("maxOffsetsPerTrigger", "1")               // cap the records read per micro-batch
  .load()

val query: StreamingQuery = kafkaDF                  // was "ds", which was never defined
  .writeStream
  .option("checkpointLocation", "file:///spark_tutorial/cp")
  .format("console")
  .outputMode(OutputMode.Append())
  .trigger(Trigger.ProcessingTime(5, TimeUnit.SECONDS))
  .start()

query.awaitTermination()
```

I have the following questions about the code above, which simulates a real-life scenario:

1. Is Spark constantly consuming Kafka data, even without triggering batch calculations? […]
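For reference, here is how I have been trying to observe this, using a `StreamingQueryListener` to print each micro-batch's Kafka offset range (a minimal sketch, assuming the `spark` session above; the listener must be registered before `start()`):

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener.{QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}

// Logs the offset range of every completed micro-batch, so you can see
// whether the committed offsets advance only at each 5-second trigger.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    event.progress.sources.foreach { source =>
      println(s"batch=${event.progress.batchId} " +
        s"startOffset=${source.startOffset} endOffset=${source.endOffset}")
    }
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
})
```

With `maxOffsetsPerTrigger` set to 1, I would expect each logged batch to advance by a single offset, but this does not tell me whether the executors prefetch from Kafka between triggers.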