
To start things off, I created a jar file using this: How to build jars from IntelliJ properly?

My jar file's path is:

out/artifacts/sparkProgram_jar/sparkProgram.jar

My Spark program, in general, reads a table from MongoDB, transforms it using Spark's MLlib, and writes it to MySQL. Here is my build.sbt file:

name := "sparkProgram"

version := "0.1"

scalaVersion := "2.12.4"
val sparkVersion = "3.0.0"
val postgresVersion = "42.2.2"

resolvers ++= Seq(
  "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven",
  "Typesafe Simple Repository" at "https://repo.typesafe.com/typesafe/simple/maven-releases",
  "MavenRepository" at "https://mvnrepository.com"
)

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion,
  // logging
  "org.apache.logging.log4j" % "log4j-api" % "2.4.1",
  "org.apache.logging.log4j" % "log4j-core" % "2.4.1",
  "org.mongodb.spark" %% "mongo-spark-connector" % "2.4.1",

  //"mysql" % "mysql-connector-java" % "5.1.12",
  "mysql" % "mysql-connector-java" % "8.0.18"
)

My main class is in the package com.testing, in a Scala object named

mainObject
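
For context, here is a minimal sketch of the shape of that object (the Mongo URI, JDBC URL, table name, and credentials below are placeholders, not my real ones):

package com.testing

import org.apache.spark.sql.SparkSession

object mainObject {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparkProgram")
      .master("local")
      // Placeholder URI -- points the connector at the source collection
      .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/testDb.testCollection")
      .getOrCreate()

    // Read the MongoDB collection into a DataFrame via mongo-spark-connector
    val df = spark.read.format("mongo").load()

    // ... MLlib transformations happen here ...

    // Write the transformed DataFrame to MySQL over JDBC; all values are placeholders
    df.write
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/testDb")
      .option("driver", "com.mysql.cj.jdbc.Driver")
      .option("dbtable", "results")
      .option("user", "user")
      .option("password", "password")
      .mode("append")
      .save()

    spark.stop()
  }
}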

When I run the following spark-submit command

spark-submit --master local --class com.testing.mainObject
--packages mysql:mysql-connector-java:8.0.18,org.mongodb.spark:mongo-spark-connector_2.12:2.4.1 out/artifacts/sparkProgram_jar/sparkProgram.jar

I receive this error

Error: Missing application resource.

Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Options:
  ...

zsh: command not found: --packages

And when I attempt to run spark-submit without the --packages option (just to check what would happen), I receive this error.

command:

spark-submit --master local --class com.testing.mainObject out/artifacts/sparkProgram_jar/sparkProgram.jar

error: Error: Failed to load class com.testing.mainObject

I've used spark-submit before and it worked (a couple of months back). I'm not sure why it's giving me an error now. My MANIFEST.MF is the following:

Manifest-Version: 1.0
Main-Class: com.testing.mainObject

1 Answer
My answer so far was to first build the jar file differently (IntelliJ creation).

File -> Project Structure -> Project Settings -> Artifacts -> Jar; however, instead of extracting to the jar, I clicked on

Copy to Output and link to manifest

From there, I ran a spark-submit command that did not have the --packages option in it:

spark-submit --class com.testing.mainObject --master local out/artifacts/sparkProgram_jar/sparkProgram.jar
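
As a sanity check for the Failed to load class error, it's also worth confirming that the class and the manifest actually made it into the jar. For example (using the jar path from above):

jar tf out/artifacts/sparkProgram_jar/sparkProgram.jar | grep mainObject
unzip -p out/artifacts/sparkProgram_jar/sparkProgram.jar META-INF/MANIFEST.MF

The first command should list com/testing/mainObject.class, and the second should print a manifest containing the Main-Class line shown in the question.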

Also, be careful about spacing when copying and pasting into your terminal; whitespace and hidden line breaks can give you weird errors (see the example below).
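
For instance, the zsh: command not found: --packages error in the question is exactly what zsh prints when the second line of a pasted command gets executed as a separate command. Keeping the whole invocation on one line, or ending each continued line with a backslash, avoids that (same command as in the question):

spark-submit --master local --class com.testing.mainObject \
  --packages mysql:mysql-connector-java:8.0.18,org.mongodb.spark:mongo-spark-connector_2.12:2.4.1 \
  out/artifacts/sparkProgram_jar/sparkProgram.jar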

From there, I had another error, which is described in this issue: https://github.com/Intel-bigdata/HiBench/issues/466. The solution is in the comments:

"This seems to happen with hadoop 3. I solved it removing a hadoop-hdfs-2.4.0.jar that was in the classpath."
