1

When I try integrate Kafka with spark it fails.

Here is my code:

from pyspark.sql import SparkSession
from pyspark.streaming import StreamingContext
import findspark
import kafka

import json

if __name__ == "__main__":

    findspark.init("D:\\spark-3.0.1-bin-hadoop2.7\\")

    spark = (
        SparkSession.builder.appName("Kafka Pyspark Streaming Learning").config(
            "spark.jars.packages", "org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0").master("local[*]").getOrCreate()
    )

    ssc = StreamingContext(spark, 10)

    df = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option(
        "subscribe", "registered_user").option("startingOffsets", "earliest").load()

    print(df.show())

    ssc.start()
    ssc.awaitTermination()

Here is the error traceback:

Ivy Default Cache set to: C:\Users\BERNARD JOSHUA\.ivy2\cache
The jars for the packages stored in: C:\Users\BERNARD JOSHUA\.ivy2\jars
:: loading settings :: url = jar:file:/D:/spark-3.0.1-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-sql-kafka-0-10_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-37ff0b88-bb9a-4566-ac9e-d00b45eb6881;1.0
        confs: [default]
        found org.apache.spark#spark-sql-kafka-0-10_2.12;3.3.0 in central
        found org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.3.0 in central
        found org.apache.kafka#kafka-clients;2.8.1 in central
        found org.lz4#lz4-java;1.8.0 in central
        found org.xerial.snappy#snappy-java;1.1.8.4 in central
        found org.slf4j#slf4j-api;1.7.32 in central
        found org.apache.hadoop#hadoop-client-runtime;3.3.2 in central
        found org.spark-project.spark#unused;1.0.0 in central
        found org.apache.hadoop#hadoop-client-api;3.3.2 in central
        found commons-logging#commons-logging;1.1.3 in central
        found com.google.code.findbugs#jsr305;3.0.0 in central
        found org.apache.commons#commons-pool2;2.11.1 in central
:: resolution report :: resolve 653ms :: artifacts dl 22ms
        :: modules in use:
        com.google.code.findbugs#jsr305;3.0.0 from central in [default]
        commons-logging#commons-logging;1.1.3 from central in [default]
        org.apache.commons#commons-pool2;2.11.1 from central in [default]
        org.apache.hadoop#hadoop-client-api;3.3.2 from central in [default]
        org.apache.hadoop#hadoop-client-runtime;3.3.2 from central in [default]
        org.apache.kafka#kafka-clients;2.8.1 from central in [default]
        org.apache.spark#spark-sql-kafka-0-10_2.12;3.3.0 from central in [default]
        org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.3.0 from central in [default]
        org.lz4#lz4-java;1.8.0 from central in [default]
        org.slf4j#slf4j-api;1.7.32 from central in [default]
        org.spark-project.spark#unused;1.0.0 from central in [default]
        org.xerial.snappy#snappy-java;1.1.8.4 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   12  |   0   |   0   |   0   ||   12  |   0   |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-37ff0b88-bb9a-4566-ac9e-d00b45eb6881
        confs: [default]
        0 artifacts copied, 12 already retrieved (0kB/19ms)
22/09/10 00:17:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.12-3.3.0.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.spark_spark-sql-kafka-0-10_2.12-3.3.0.jar not found     
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.spark_spark-token-provider-kafka-0-10_2.12-3.3.0.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.spark_spark-token-provider-kafka-0-10_2.12-3.3.0.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.kafka_kafka-clients-2.8.1.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.kafka_kafka-clients-2.8.1.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/com.google.code.findbugs_jsr305-3.0.0.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\com.google.code.findbugs_jsr305-3.0.0.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.commons_commons-pool2-2.11.1.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.commons_commons-pool2-2.11.1.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.spark-project.spark_unused-1.0.0.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.hadoop_hadoop-client-runtime-3.3.2.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.hadoop_hadoop-client-runtime-3.3.2.jar not found        
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.lz4_lz4-java-1.8.0.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.lz4_lz4-java-1.8.0.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.xerial.snappy_snappy-java-1.1.8.4.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.xerial.snappy_snappy-java-1.1.8.4.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.slf4j_slf4j-api-1.7.32.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.slf4j_slf4j-api-1.7.32.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.hadoop_hadoop-client-api-3.3.2.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\org.apache.hadoop_hadoop-client-api-3.3.2.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Failed to add file:///C:/Users/BERNARD%20JOSHUA/.ivy2/jars/commons-logging_commons-logging-1.1.3.jar to Spark environment
java.io.FileNotFoundException: Jar C:\Users\BERNARD%20JOSHUA\.ivy2\jars\commons-logging_commons-logging-1.1.3.jar not found
        at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1833)
        at org.apache.spark.SparkContext.addJar(SparkContext.scala:1887)
        at org.apache.spark.SparkContext.$anonfun$new$11(SparkContext.scala:490)
        at org.apache.spark.SparkContext.$anonfun$new$11$adapted(SparkContext.scala:490)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:490)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/C:/Users/BERNARD%20JOSHUA/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.12-3.3.0.jar does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1534)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1498)
        at org.apache.spark.SparkContext.$anonfun$new$12(SparkContext.scala:494)
        at org.apache.spark.SparkContext.$anonfun$new$12$adapted(SparkContext.scala:494)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:494)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)
22/09/10 00:17:02 WARN MetricsSystem: Stopping a MetricsSystem that is not running

I also have all the dependencies installed in my .ivy2 file: enter image description here

1 Answer 1

1

Firstly, Spark might have issues with you having spaces in your Windows user name... The error includes saying path C:\Users\BERNARD%20JOSHUA does not exist, which is "true" because of the space being URI encoded.

But, you're running spark-3.0.1, so you should be using 3.0.1 of spark-sql-kafka-0-10 rather than 3.3.0.

Alternatively, upgrade Spark.

See example I wrote here - https://github.com/OneCricketeer/docker-stacks/blob/master/hadoop-spark/spark-notebooks/kafka-sql.ipynb

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.