I am reading data from a CSV file and creating a DataFrame from it, but when I try to access the data in the DataFrame I get a TypeError.

fields = [StructField(field_name, StringType(), True) for field_name in schema.split(',')]
schema = StructType(fields)

input_dataframe = sql_context.createDataFrame(input_data_1, schema)

print input_dataframe.filter(input_dataframe.diagnosis_code == '11').count()

Neither 'unicode' nor 'str' works with the Spark DataFrame. I get the following TypeError:

TypeError: StructType can not accept object in type <type 'unicode'>

I tried encoding to UTF-8 as shown below, but I still get the error, now complaining about 'str' instead:

input_data_2 = input_data_1.map(lambda x: x.encode("utf-8"))
input_dataframe = sql_context.createDataFrame(input_data_2, schema)

print input_dataframe.filter(input_dataframe.diagnosis_code == '410.11').count()

I also tried parsing the CSV directly as UTF-8 or unicode using the parameter use_unicode=True/False.

1 Answer

Reading between the lines, you are

reading data from a CSV file

and get

TypeError: StructType can not accept object in type <type 'unicode'>

This happens because each element you pass is a plain string, not an object compatible with a struct (such as a tuple, list, or Row). You are probably passing data like:

input_data_1 = sc.parallelize(["1,foo,2", "2,bar,3"])

and a schema like:

schema = "x,y,z"

fields = [StructField(field_name, StringType(), True) for field_name in schema.split(',')]
schema = StructType(fields)

and expecting Spark to figure out the rest. It doesn't work that way. You could split each line yourself:

input_dataframe = sqlContext.createDataFrame(input_data_1.map(lambda s: s.split(",")), schema)

but honestly, just use Spark's CSV reader:

spark.read.schema(schema).csv("/path/to/file")
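If you do stay with the manual split, keep in mind that a plain split(",") breaks as soon as a quoted field contains a comma, which leaves the row with more fields than the schema has. A minimal sketch of a safer per-line parser, using Python's standard csv module, might look like this. The function name parse_csv_line and the sample lines are made up for illustration, and the sketch assumes Python 3 and that no field contains an embedded newline:

```python
import csv


def parse_csv_line(line):
    """Split one CSV line into a list of fields, honouring quoted commas."""
    # csv.reader expects an iterable of lines, so wrap the single line in a list.
    return next(csv.reader([line]))


# Hypothetical sample lines; the second has a comma inside a quoted field.
lines = ['1,foo,2', '2,"bar, baz",3']

# Each row keeps exactly three fields, matching a three-column schema.
rows = [parse_csv_line(line) for line in lines]

# The naive split cuts the quoted field in two, so the second row gets four fields.
naive = [line.split(",") for line in lines]
```

In the RDD version this would presumably be used as input_data_1.map(parse_csv_line) in place of the lambda. The built-in CSV reader handles quoting for you, which is one more reason to prefer it.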

1 Comment

I get pyspark.sql.utils.IllegalArgumentException: 'Unsupported class file major version 55' when I try spark.read.schema. I am reading from a directory that contains partitioned data in multiple gzipped .csv files.
