I am trying to print a threshold for each value in a dataframe using PySpark. Below is the R code I wrote, but I need the equivalent in PySpark and cannot figure out how to do it. Any help will be greatly appreciated!

The values dataframe looks something like this:

vote
0.3
0.1
0.23
0.45
0.9
0.80
0.36
# loop through all link weight values, from the lowest to the highest
for (i in 1:nrow(values)){
  # print status
  print(paste0("Iterations left: ", nrow(values) - i, "   Threshold: ", values[i, w_vote]))
}

This is what I have tried in PySpark, but I am stuck here:

for row in values.collect():
     print('iterations left:',row - i, "Threshold:', ...)

1 Answer

Every language and tool handles this differently. Below is an answer that follows the approach you tried:

from pyspark.sql import SparkSession

# Reuse the active session, or create one if none exists
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([
[0.3],
[0.1],
[0.23],
[0.45],
[0.9],
[0.80],
[0.36]
], ["vote"])

values = df.collect()
total_values = len(values)

# collect() does not return rows in sorted order, so sorted() is used here to order them by the vote column, ascending.
# If you would rather not sort in Python, sort in Spark instead with df.sort("vote", ascending=True).collect()
# enumerate() supplies the index of each row.

for index, row in enumerate(sorted(values, key=lambda x: x.vote)):
    print('iterations left:', total_values - (index + 1), "Threshold:", row.vote)

iterations left: 6 Threshold: 0.1
iterations left: 5 Threshold: 0.23
iterations left: 4 Threshold: 0.3
iterations left: 3 Threshold: 0.36
iterations left: 2 Threshold: 0.45
iterations left: 1 Threshold: 0.8
iterations left: 0 Threshold: 0.9

Using collect() is discouraged on large datasets: it pulls every row into the driver, which can exhaust the driver's memory and crash your program.
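If the dataframe is too large to collect, one possible alternative is to let Spark do the sort and the count, and stream rows to the driver with toLocalIterator(), which fetches one partition at a time. This is only a sketch: the helper names threshold_lines and print_thresholds are made up for illustration, and it assumes df is a Spark DataFrame with a numeric vote column. Every row still passes through the driver, so printing a line per row remains slow on very large data.

```python
def threshold_lines(sorted_votes, total):
    """Yield one status line per vote; expects votes in ascending order."""
    for index, vote in enumerate(sorted_votes, start=1):
        yield f"iterations left: {total - index} Threshold: {vote}"


def print_thresholds(df):
    """Print status lines for a Spark DataFrame with a numeric `vote` column.

    Hypothetical helper: `df` is assumed to be a pyspark.sql.DataFrame.
    """
    total = df.count()          # counted by Spark, no rows are collected
    ordered = df.sort("vote")   # ascending sort performed by Spark
    # toLocalIterator() brings rows to the driver one partition at a time
    votes = (row.vote for row in ordered.toLocalIterator())
    for line in threshold_lines(votes, total):
        print(line)
```

With the df defined above, calling print_thresholds(df) should print the same lines as the collect() version.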

2 Comments

Thanks Rakesh, it's working. But my dataframe is huge, so how should I handle it if using collect() is not recommended?
@Tilo It depends. What do you want to achieve with it?
