I have two dataframes. The first one holds cell names and some values for each cell:
cell_df:
cell_name  cell_values
abc1b      (h 1, a 2, a4)
adc2g      (h 2, a 4, a5)
daf1g      (h 3, a 7, a2)
adg2d      (h 1, a 4, a4)
And the other one:
record_df:
record_id record_values
1 start abc1b 1 2 , daf1g 3 5
2 start adc2g 6 7 , adg2d 6 5
3 start abc1b 10 13 , adc2g 2 3
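What I need, for each cell_name that appears in record_values, is to insert the string "from" before the first number and the string "to" between the two numbers, and then to put that cell's cell_values right after the second number, before the following comma (or at the end of the string for the last cell).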
Desired output:
record_id record_values
1 start abc1b from 1 to 2 (h 1, a 2, a4), daf1g from 3 to 5 (h 3, a 7, a2)
2 start adc2g from 6 to 7 (h 2, a 4, a5), adg2d from 6 to 5 (h 1, a 4, a4)
3 start abc1b from 10 to 13 (h 1, a 2, a4), adc2g from 2 to 3 (h 2, a 4, a5)
I think I got that with my code below, but it takes a huge amount of time to run (a few minutes), even though the dataframe has just 80 rows.
import re
for cn, cv in cell_df[['cell_name', 'cell_values']].values:
    # \1 and \2 are the two numbers; re.escape guards against regex metacharacters in the name
    record_df['record_values'] = record_df['record_values'].apply(lambda x: re.sub(r"%s\s+(\d+)\s+(\d+)" % re.escape(cn), r"%s from \1 to \2 %s" % (cn, cv), x))
So, the question is: is there any way to speed that up? Maybe a whole different approach?
I am using Python 2.7
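One different approach I could imagine, as a rough sketch only and not benchmarked: instead of running one re.sub per cell over the whole column, compile a single pattern that matches any cell name and look the values up in a dict from a replacement callback, so each string is scanned once. The names `cell_values`, `pattern` and `annotate` are made up for illustration, the sample data is hard-coded from the tables above instead of coming from the dataframes, and it assumes every cell name is followed by exactly two integers. It should run on Python 2.7 as well as 3.

```python
import re

# Hard-coded copy of cell_df as a plain dict (assumption: names are unique).
cell_values = {
    "abc1b": "(h 1, a 2, a4)",
    "adc2g": "(h 2, a 4, a5)",
    "daf1g": "(h 3, a 7, a2)",
    "adg2d": "(h 1, a 4, a4)",
}

# One alternation of all escaped names, compiled once. Longest names go first
# so that a name which is a prefix of another one cannot shadow it.
names = sorted(cell_values, key=len, reverse=True)
pattern = re.compile(
    r"\b(%s)\s+(\d+)\s+(\d+)(?:\s+(?=,))?" % "|".join(re.escape(n) for n in names)
)

def annotate(text):
    """Hypothetical helper: turn 'name a b' into 'name from a to b (values)'."""
    def repl(match):
        name, first, second = match.groups()
        return "%s from %s to %s %s" % (name, first, second, cell_values[name])
    # The optional tail of the pattern swallows whitespace before a comma,
    # so the values end up directly in front of the comma.
    return pattern.sub(repl, text)

# Hard-coded copy of the record_values column.
records = [
    "start abc1b 1 2 , daf1g 3 5",
    "start adc2g 6 7 , adg2d 6 5",
    "start abc1b 10 13 , adc2g 2 3",
]
annotated = [annotate(r) for r in records]
```

On the real data, the dict could presumably be built with dict(zip(cell_df['cell_name'], cell_df['cell_values'])) and the function applied with record_df['record_values'].apply(annotate), but I have not timed this against the loop above.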