
It's my first time using Apache Spark with Python (PySpark), and I was trying to run the Quick Start examples, but when I run this line:

>>> textFile = spark.read.text("README.md")

it gives me the following error (I'm pasting just the first part because I think it's the most important):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/daniele/Scaricati/spark/python/pyspark/sql/readwriter.py", line 311, in text
    return self._df(self._jreader.text(self._spark._sc._jvm.PythonUtils.toSeq(paths)))
  File "/home/daniele/Scaricati/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/home/daniele/Scaricati/spark/python/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
  File "/home/daniele/Scaricati/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o22.text.
: java.lang.reflect.InaccessibleObjectException: Unable to make field private transient java.lang.String java.net.URI.scheme accessible: module java.base does not "opens java.net" to unnamed module @779d0812
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:335)

Can someone help me solve this? Sorry if my post isn't very clear, it's my first one on this forum. Thanks to everyone who tries to help, Daniele.

  • What is your Java version? Support for Java 7 was removed as of Spark 2.2.0. Commented Nov 7, 2017 at 22:49
  • openjdk version "9-Ubuntu" OpenJDK Runtime Environment (build 9-Ubuntu+0-9b161-1) OpenJDK 64-Bit Server VM (build 9-Ubuntu+0-9b161-1, mixed mode) Commented Nov 8, 2017 at 0:21
  • Can you check if you get the same error when reading another text file (use the full absolute path to make sure it is correct)? Also try loading a Parquet file. If the error persists, there might be a problem with your Spark/Hadoop installation. Commented Nov 8, 2017 at 7:07
  • I tried to read another file (using the full path) and got the same error; I don't know what a Parquet file is. To install Spark I did roughly this: installed the latest Java version ($ sudo apt-get install openjdk-9-jre), then downloaded Apache Spark ("Pre-built for Apache Hadoop 2.7 and later"). Did I skip something important? Commented Nov 8, 2017 at 10:35
  • I think the problem comes from the fact that Spark does not support Java 9 (it probably will in Spark 3.x). Try installing Java 8 instead and setting all the necessary environment variables (JAVA_HOME, JRE_HOME); a quick check of what Spark currently sees is sketched below. Commented Nov 8, 2017 at 11:22
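
A quick way to confirm the diagnosis from the comments above is to print the JAVA_HOME that Spark will inherit and the version of whatever java binary is first on the PATH. This is a minimal sketch (note that java -version writes to stderr, hence the redirect):

    import os
    import subprocess

    # JAVA_HOME may be unset; Spark then falls back to the `java` found on PATH.
    print("JAVA_HOME =", os.environ.get("JAVA_HOME", "<not set>"))

    # `java -version` prints to stderr, so merge stderr into the captured output.
    out = subprocess.check_output(["java", "-version"], stderr=subprocess.STDOUT)
    print(out.decode())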

1 Answer

The issue is that your Spark version and your Java version are incompatible. To resolve it, do the following:

  1. Check your PySpark version (the startup banner prints the Spark version):

    pyspark

  2. Check which Java version your PySpark version requires (for example, PySpark 2.4.6 needs Java 8: https://spark.apache.org/docs/2.4.6/)

  3. Check which Java versions are installed (the command below is macOS-specific):

    /usr/libexec/java_home -V

  4. If the required Java version is not installed, install it (e.g. brew install adoptopenjdk8 on macOS with Homebrew)

  5. Change your JAVA_HOME to point to the correct version (a Python alternative is sketched after this list). Example:

    export JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

  6. Confirm the version:

    java -version
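
If you cannot (or would rather not) export JAVA_HOME in your shell profile, the same switch from step 5 can be made inside Python, as long as it happens before the SparkSession (and with it the JVM) is created. The path below is just the macOS AdoptOpenJDK 8 location used above; substitute whatever /usr/libexec/java_home -V reports on your machine:

    import os

    # Must run before the first SparkSession is created; once the JVM has
    # been launched, changing JAVA_HOME has no effect.
    os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("quickstart").getOrCreate()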

After this you should be able to run the quick-start commands as expected:

textFile = spark.read.text("README.md")
textFile.show()
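
If the read succeeds, the remaining quick-start calls from the Spark docs should work too, for example:

    textFile.count()   # number of rows, i.e. lines in README.md
    textFile.first()   # first row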