
I am trying to check if a DataFrame is empty in PySpark using the line below:

print(df.head(1).isEmpty)

But I am getting an error:

AttributeError: 'list' object has no attribute 'isEmpty'

I checked that my object really is a DataFrame using type(df), and it is <class 'pyspark.sql.dataframe.DataFrame'>.


5 Answers


I used df.first() == None to evaluate whether my Spark DataFrame is empty.


1 Comment

df.first() is None would be the preferred syntax.

When you do head(1), it returns a list, which is why you get that error.

You can just call df.isEmpty() instead.

2 Comments

Now getting 'DataFrame' object has no attribute 'isEmpty'.
df.first() == None worked for me.

df.head(1) returns a list corresponding to the first row of df.

You can check whether this list is empty ([]) with a boolean condition, as in:

if df.head(1):
    print("there is something")
else:
    print("df is empty")

Output (for an empty df): df is empty

Empty lists are implicitly "falsy" (they evaluate to False in a boolean context).

For a fuller explanation, see the Python docs on truth value testing.
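The truthiness rule itself can be checked in plain Python, without Spark (the Row strings below just stand in for what head(1) would return):

```python
# An empty list is falsy: this is what df.head(1) returns for an empty DataFrame
rows = []
print(bool(rows))   # False

# A non-empty list is truthy: what head(1) returns when the DataFrame has rows
rows = ["Row(id=1)"]
print(bool(rows))   # True
```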



Another way to do this would be to check whether df.count() == 0.

1 Comment

This could cause unnecessary computation on Spark, especially for large datasets.

df.isEmpty() works for everyone starting from Spark version 3.3.0. Earlier Spark versions do not support the isEmpty method in PySpark.

1 Comment

For versions earlier than 3.3.0, you can do df.rdd.isEmpty() instead.
