| match_id | player_id | team | win |
| 0 | 1 | A | A |
| 0 | 2 | A | A |
| 0 | 3 | B | A |
| 0 | 4 | B | A |
| 1 | 1 | A | B |
| 1 | 4 | A | B |
| 1 | 8 | B | B |
| 1 | 9 | B | B |
| 2 | 8 | A | A |
| 2 | 4 | A | A |
| 2 | 3 | B | A |
| 2 | 2 | B | A |
I have a DataFrame that looks like the one above.
I need to create a map of (key, value) pairs such that, for every pair of players, it holds
(k=>(player_id_1, player_id_2), v=> 1 ), if player_id_1 wins against player_id_2 in a match
and
(k=>(player_id_1, player_id_2), v=> 0 ), if player_id_1 loses against player_id_2 in a match
I will thus have to iterate through the entire DataFrame, comparing each player id to every other one based on the other 3 columns.
I am planning to achieve this as follows.
1. Group by match_id.
2. In each group, for a player_id, check the following against the other player_ids:
   a. If match_id is the same and team is different, then
      if team = win, (k=>(player_id_1, player_id_2), v=> 1 ), else (team != win) (k=>(player_id_1, player_id_2), v=> 0 )
For example, after partitioning by match, consider the first match (match_id 0). player_id 1 needs to be compared to player_ids 2, 3 and 4. While iterating, the record for player_id 2 will be skipped, as the team is the same. For player_id 3 the team is different, so team and win will be compared. As player_id 1 was in team A, player_id 3 was in team B, and team A won, the key-value pair formed would be
((1,3),1)
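To make the plan above concrete, here is a minimal sketch of the per-match pairing logic using plain Scala collections rather than Spark itself. The `Result` case class and the `PairOutcomes` object are hypothetical names introduced only for illustration; they simply mirror the four columns of the table.

```scala
// Hypothetical row type mirroring the four columns of the table.
case class Result(matchId: Int, playerId: Int, team: String, win: String)

object PairOutcomes {
  // For one match, pair every player with every opponent (a player on a
  // different team) and emit 1 if the first player's team won, otherwise 0.
  def pairsForMatch(rows: Seq[Result]): Seq[((Int, Int), Int)] =
    for {
      p <- rows
      q <- rows
      if p.team != q.team // teammates (and the player itself) are skipped
    } yield ((p.playerId, q.playerId), if (p.team == p.win) 1 else 0)

  // Group by match id first, so players are only compared within a match.
  def allPairs(rows: Seq[Result]): Seq[((Int, Int), Int)] =
    rows.groupBy(_.matchId).values.flatMap(pairsForMatch).toSeq
}
```

The result is kept as a sequence of pairs rather than a `Map`, because the same key can occur in more than one match: in the table, players 2 and 4 are opponents in both match 0 and match 2, with different outcomes. On an RDD of such rows, the same per-match function could presumably be applied after `groupBy` on the match id followed by `flatMap`, but that depends on how the data is loaded.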
I have a fair idea of how to achieve this in imperative programming, but I am really new to Scala and functional programming, and I can't work out how, while iterating through every row, to create a (key, value) pair for one field based on checks on the other fields.
I tried my best to explain the problem. Please do let me know if any part of my question is unclear. I would be happy to explain the same. Thank you.
P.S: I am using Spark 1.6
((1,3),1) would also generate ((2,4),1), or do you just want to skip the second player ids altogether?