I have one DataFrame (DF1) with the first and last game dates for each NBA team, and another (DF2) with each team's ELO before and after each game. I would like to add two columns to DF1 holding the team's ELO on the first and last dates: ELO1 for the dates in the first column and ELO2 for the dates in the second column. Even better would be a way to put the difference between the two ELOs directly into one column, since that is what I'll be computing eventually.

DF1:

          first        last
team
ATL  2017-10-18  2018-04-10
BOS  2017-10-17  2018-04-11
BRK  2017-10-18  2018-04-11
CHI  2017-10-19  2018-04-11
[...]

DF2:

            date team   ELO_before    ELO_after
65782 2017-10-18  ATL  1648.000000  1650.308911
65783 2017-10-17  BOS  1761.000000  1753.884111
65784 2017-10-18  BRK  1427.000000  1439.104231
65785 2017-10-19  CHI  1458.000000  1464.397752
65786 2018-04-10  ATL  1406.000000  1411.729285
[...]

Thanks in advance!

Edit - The resulting data frame I want would look like:

DF3:

          first        last   ELO_before             ELO_after
team
ATL  2017-10-18  2018-04-10  1648.000000           1411.729285
BOS  2017-10-17  2018-04-11  1761.000000  [Elo2 for last game]
BRK  2017-10-18  2018-04-11  1427.000000  [Elo2 for last game]
CHI  2017-10-19  2018-04-11  1458.000000  [Elo2 for last game]
  • Can you edit the question with a sample of the resulting DataFrame you want? Commented Jul 7, 2018 at 7:24
  • @nijm just added the resulting DataFrame I'm looking for. Thanks in advance! Commented Jul 7, 2018 at 7:45

1 Answer

You can do this with two calls to pandas.DataFrame.merge, one for each date column:

import pandas as pd

# frames from the question
df1 = pd.DataFrame(data={
  'team': ['ATL', 'BOS', 'BRK', 'CHI'],
  'first': ['2017-10-18', '2017-10-17', '2017-10-18', '2017-10-19'],
  'last': ['2018-04-10', '2018-04-11', '2018-04-11', '2018-04-11']
}).set_index('team')

df2 = pd.DataFrame(data={
  'date': ['2017-10-18', '2017-10-17', '2017-10-18', '2017-10-19', '2018-04-10'],
  'team': ['ATL', 'BOS', 'BRK', 'CHI', 'ATL'],
  'ELO_before': [1648.0, 1761.0, 1427.0, 1458.0, 1406.0],
  'ELO_after': [1650.308911, 1753.884111, 1439.104231, 1464.397752, 1411.729285]
})

# merge on (team, first) to pick up ELO_before, then on (team, last) to pick up ELO_after
df1.reset_index(inplace=True)
df3 = df1.merge(
  df2.drop('ELO_after', axis=1),
  how='left', left_on=['team', 'first'], right_on=['team', 'date']
).drop('date', axis=1)
df3 = df3.merge(
  df2.drop('ELO_before', axis=1),
  how='left', left_on=['team', 'last'], right_on=['team', 'date']
).drop('date', axis=1)

# calculate the differences
df3['ELO_difference'] = df3['ELO_after'] - df3['ELO_before']
df3.set_index('team', inplace=True)
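
If you would rather not merge twice, a lookup-based variant might also work. This is only a sketch, not a tested drop-in: it assumes each (team, date) pair appears at most once in df2 (reindex raises on duplicate labels) and that the date columns in both frames have the same dtype. The names elo, first_keys and last_keys are made up for illustration.

```python
import pandas as pd

# sample frames from the question
df1 = pd.DataFrame({
    "team": ["ATL", "BOS", "BRK", "CHI"],
    "first": ["2017-10-18", "2017-10-17", "2017-10-18", "2017-10-19"],
    "last": ["2018-04-10", "2018-04-11", "2018-04-11", "2018-04-11"],
}).set_index("team")

df2 = pd.DataFrame({
    "date": ["2017-10-18", "2017-10-17", "2017-10-18", "2017-10-19", "2018-04-10"],
    "team": ["ATL", "BOS", "BRK", "CHI", "ATL"],
    "ELO_before": [1648.0, 1761.0, 1427.0, 1458.0, 1406.0],
    "ELO_after": [1650.308911, 1753.884111, 1439.104231, 1464.397752, 1411.729285],
})

# Index the ratings by (team, date); assumes these pairs are unique in df2.
elo = df2.set_index(["team", "date"])

# Build the (team, date) keys to look up for the first and last game of each team.
first_keys = pd.MultiIndex.from_arrays([df1.index, df1["first"]])
last_keys = pd.MultiIndex.from_arrays([df1.index, df1["last"]])

df3 = df1.copy()
# reindex returns NaN for any (team, date) pair that is missing from df2.
df3["ELO_before"] = elo["ELO_before"].reindex(first_keys).to_numpy()
df3["ELO_after"] = elo["ELO_after"].reindex(last_keys).to_numpy()
df3["ELO_difference"] = df3["ELO_after"] - df3["ELO_before"]

print(df3)
```

With the five sample rows above, only ATL has a row for its last date, so the other teams get NaN in ELO_after and ELO_difference, just as they would with the left merges.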