I am trying to filter my DataFrame inside an if statement and store the value of a field from the matching rows in a variable.
Here is my code:
if df_table.filter(col(field).contains("val")):
    id_2 = df_table.select(another_field)
    print(id_2)
    # Recursive call with new variable
The problem: the if filtering appears to work, but id_2 holds the column name and type, whereas I want the value stored in that field. The output of this code is:
DataFrame[ID_1: bigint]
DataFrame[ID_2: bigint]
...
If I call collect() like this: id_2 = df_table.select(another_field).collect()
I get this: [Row(ID_1=3013848), Row(ID_1=319481), Row(ID_1=391948)...] which looks like a list of every ID.
I also tried: id_2 = df_table.select(another_field).filter(col(field).contains("val"))
but I still get the same result as in the first attempt.
On each iteration of my loop, I would like id_2 to take a single value from the field I am filtering on. Like:
3013848
319481
...
But not a list of every matching value in my DataFrame.
Any idea how I could get that into my variable?
Thank you for helping.
.collect. If you would like more in-depth support, please provide a small reproducible example with your desired output.
if df_table.filter(col(field).contains("val")), but in order to get a list of only the IDs (and not Row objects), try using a list comprehension:
result = [i[0] for i in id_2]
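To illustrate the list comprehension suggested above, here is a minimal sketch that runs without a Spark session. It relies on the fact that a PySpark Row is a tuple subclass, so a namedtuple is used as a stand-in. The column name ID_2 and the three sample values are only illustrative, and the PySpark line in the comment is an assumption about how the collected list might be produced, not a tested fix for the original code.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row (which is a tuple subclass), so that this
# sketch runs without Spark. The column name "ID_2" is illustrative.
Row = namedtuple("Row", ["ID_2"])

# Hypothetical result of something like:
#   df_table.filter(col(field).contains("val")).select(another_field).collect()
id_2 = [Row(3013848), Row(319481), Row(391948)]

# Unpack each Row into its bare value (index 0 is the only selected column).
result = [row[0] for row in id_2]

# Each iteration now sees one plain value instead of a Row or a DataFrame.
for value in result:
    print(value)
```

With a real DataFrame, the same comprehension should apply to whatever collect() returns, and each value in result can then be passed on to the recursive call.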