I have a PySpark dataframe with a column that contains a JSON string. It looks like below:
+--------------------------------------------------------------+
|col                                                           |
+--------------------------------------------------------------+
|{"fields":{"list1":[{"list2":[{"list3":[{"type":false}]}]}]}} |
+--------------------------------------------------------------+
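For reference, the dataframe can be rebuilt like this (one string column named col holding the raw JSON); this is just a minimal reproduction using a local SparkSession:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# one row whose single column is the JSON document as a plain string
df = spark.createDataFrame(
    [('{"fields":{"list1":[{"list2":[{"list3":[{"type":false}]}]}]}}',)],
    ['col'],
)
df.show(truncate=False)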
I wrote UDFs to parse the JSON, count the entries whose type matches phone, and return the count in a new column of the df:
import json

from pyspark.sql import functions as F, types as t

def item_count(json, type):
    # walk fields -> list1 -> list2 -> list3 and count entries with the wanted type
    count = 0
    for i in json.get("fields", {}).get("list1", []):
        for j in i.get("list2", []):
            for k in j.get("list3", []):
                count += k.get("type", None) == type
    return count

def item_phone_count(json):
    return item_count(json, False)

df2 = df\
    .withColumn('item_phone_count', (F.udf(lambda j: item_phone_count(json.loads(j)), t.StringType()))('col'))
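For what it's worth, calling the helper directly on the parsed sample row returns 1 (the single entry whose type is false), so the plain-Python logic seems fine for rows shaped exactly like the example:

import json

parsed = json.loads('{"fields":{"list1":[{"list2":[{"list3":[{"type":false}]}]}]}}')
print(item_phone_count(parsed))  # prints 1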
But I got the error:
AttributeError: 'NoneType' object has no attribute 'get'
Any idea what's wrong?
One of the objects that item_count() calls .get() on is None, but there is no way to figure out which one from the information you've posted. Please post the full error traceback and a minimal reproducible example with enough information so that someone else can reproduce your error.
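If the real data contains JSON nulls (for example "fields": null, or a null element inside one of the lists), json.loads turns them into None, and the next .get() call on that None raises exactly this AttributeError. A hedged sketch of a None-tolerant variant, assuming nulls are the cause (parameters renamed so they no longer shadow the json module and the type builtin):

def item_count(doc, wanted):
    # treat missing keys and explicit JSON nulls the same way: as empty containers
    count = 0
    for i in ((doc or {}).get("fields") or {}).get("list1") or []:
        for j in (i or {}).get("list2") or []:
            for k in (j or {}).get("list3") or []:
                count += (k or {}).get("type") == wanted
    return count

Note also that the UDF returns an integer count, so t.IntegerType() is likely a better declared return type than t.StringType().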