I have a pandas Series of keys and would like to create a DataFrame by looking up the matching rows in another DataFrame.
For example:
import pandas

data_df = pandas.DataFrame({'key' : ['a','b','c','d','e','f'],
                            'value1': [1.1,2,3,4,5,6],
                            'value2': [7.1,8,9,10,11,12]
                            })
keys = pandas.Series(['a','b','a','c','e','f','a','b','c'])
data_df
#   key  value1  value2
# 0   a     1.1     7.1
# 1   b     2.0     8.0
# 2   c     3.0     9.0
# 3   d     4.0    10.0
# 4   e     5.0    11.0
# 5   f     6.0    12.0
I would like to get a result like this:
result
  key  value1  value2
0   a     1.1     7.1
1   b     2.0     8.0
2   a     1.1     7.1
3   c     3.0     9.0
4   e     5.0    11.0
5   f     6.0    12.0
6   a     1.1     7.1
7   b     2.0     8.0
8   c     3.0     9.0
One way I have successfully done this is:
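def append_to_series(key):
    # Take the first row of data_df whose 'key' matches
    new_series = data_df[data_df['key'] == key].iloc[0]
    return new_series

# Apply the lookup to each key; one boolean scan of data_df per key
pandas.DataFrame(keys.apply(append_to_series))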
However, this function is very slow and not clean. Is there a way to do this more efficiently?
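For comparison, one vectorized alternative is a left merge, which avoids scanning data_df once per key. This is only a sketch: it assumes each key appears at most once in data_df (duplicates there would produce extra rows), and the column name 'key' given to the Series is my choice so that it matches data_df. Keys missing from data_df would come back as NaN rows rather than raising an error.

```python
import pandas as pd

data_df = pd.DataFrame({
    'key': ['a', 'b', 'c', 'd', 'e', 'f'],
    'value1': [1.1, 2, 3, 4, 5, 6],
    'value2': [7.1, 8, 9, 10, 11, 12],
})
keys = pd.Series(['a', 'b', 'a', 'c', 'e', 'f', 'a', 'b', 'c'])

# Turn the Series into a one-column frame named 'key' (name chosen to match data_df),
# then left-merge so the row order of `keys` is preserved and repeated keys repeat rows.
result = keys.to_frame(name='key').merge(data_df, on='key', how='left')
print(result)
```

With how='left', the output follows the order of keys, so repeated keys such as 'a' simply repeat the corresponding row of data_df.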