0

I'm trying to run a Python script with PySpark on the Cloudera VM.

First I start pyspark with:

$ which pyspark
$ pyspark

After launching Spark, I tried:

$ spark-submit /home/cloudera/test.py

This gives me "name 'spark' is not defined".

$ ./bin/spark-submit /home/cloudera/test.py

This gives me "SyntaxError: invalid syntax".

I know there are many similar questions online, but I still can't figure it out. Can someone please help?

2 Answers

1

Run spark-submit from the operating system shell on the cluster itself. You do not need to start pyspark first.

If you want to run the code interactively (typing it line by line or copying and pasting), use pyspark instead.
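
A likely cause of the "name 'spark' is not defined" error is that the pyspark shell creates the `spark` variable for you, while a script run through spark-submit has to create it itself. Here is a minimal sketch of such a script, assuming Spark 2.0 or later (where `SparkSession` exists; older versions would create a `SparkContext` instead). The application name and the tiny `range` job are placeholders, not the contents of the asker's test.py:

```python
# test.py: a hypothetical stand-in for the script in the question.
APP_NAME = "test"  # assumed application name, chosen for illustration


def main():
    # Imported lazily so a missing PySpark installation gives a readable message.
    try:
        from pyspark.sql import SparkSession
    except ImportError:
        print("PySpark is not importable; run this file through spark-submit.")
        return None

    # spark-submit does not predefine `spark`, so the script builds it itself.
    spark = SparkSession.builder.appName(APP_NAME).getOrCreate()
    try:
        count = spark.range(5).count()  # tiny job to confirm the session works
        print("rows:", count)
        return count
    finally:
        spark.stop()  # release the session when the job is done


if __name__ == "__main__":
    main()
```

The file would then be submitted from the OS prompt, not from inside the pyspark shell, using the same command as in the question: spark-submit /home/cloudera/test.py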

0

Check that Spark is installed as expected by invoking spark-shell. Then open the PySpark shell and test the contents of your test.py file there. Once that succeeds, try spark-submit.
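
As a quick first check before launching anything, you could confirm that the three launchers are on your PATH at all. This is only a sketch, assuming Python 3 is available on the VM; `find_spark_commands` is a hypothetical helper name, and it tells you where each command lives, not whether Spark itself works:

```python
import shutil

# Launchers named in the answers above.
SPARK_COMMANDS = ("spark-shell", "pyspark", "spark-submit")


def find_spark_commands(commands=SPARK_COMMANDS):
    """Map each launcher name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    for name, path in find_spark_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If spark-submit shows up here, plain `spark-submit` should work from any directory, whereas `./bin/spark-submit` only resolves when the current directory is the Spark installation directory.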
