I am trying to read multiple parquet files from GCS in a Dataproc Spark job:

```python
df = spark.read.option("mergeSchema", "true").parquet(remote_path)
```
The above code throws the following error:

```
org.apache.spark.SparkException: Failed merging schema of file gs://x/2023-04-03T11:33:15.parquet
org.apache.spark.SparkException: Failed to merge fields 'group_size__c' and 'group_size__c'. Failed to merge incompatible data types double and string
```
To work around this, I changed the code to pass an explicit schema in which the 'group_size__c' column is declared as a string:

```python
df = spark.read.schema(schema).parquet(remote_path)
```
This read no longer throws any error. But when I try to print the distinct values of 'group_size__c' with

```python
df = df.withColumn("group_size__c", col("group_size__c").cast("string"))
LOG.info(df.select("group_size__c").distinct().show())
```
it throws:

```
java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainDoubleDictionary
```
What might be causing this error? I have tried disabling dictionary encoding when building the session, but it does not solve the problem:

```python
spark = SparkSession.builder.config("parquet.enable.dictionary", "false").getOrCreate()
```