I have this DataFrame:
import pandas as pd

dd = pd.DataFrame({
    'text': ["random text with pol # as 111 and ein no as 101",
             "random text with pol # as 222",
             "random text with ein # as 333 and ein no as 303"],
    'label': [[[26, 29, "pol"], [44, 47, "ein"]],
              [[26, 29, "pol"]],
              [[26, 29, "ein"], [44, 47, "ein"]]],
})
which gives this output:
                                              text                           label
0  random text with pol # as 111 and ein no as 101  [[26, 29, pol], [44, 47, ein]]
1                    random text with pol # as 222                 [[26, 29, pol]]
2  random text with ein # as 333 and ein no as 303  [[26, 29, ein], [44, 47, ein]]
I want this output:
                                              text                           label  \
0  random text with pol # as 111 and ein no as 101  [[26, 29, pol], [44, 47, ein]]
1                    random text with pol # as 222                 [[26, 29, pol]]
2  random text with ein # as 333 and ein no as 303  [[26, 29, ein], [44, 47, ein]]

   pol ein_1 ein_2
0  111   101
1  222
2        333   303
I want to create columns dynamically from the label column. Each entry in label is a list of lists, where each inner list contains start_index, end_index and label_type. Slicing the string in the text column with the start and end index gives the actual value for that label.
For example, for the text "random text with pol # as 222" the label is [[26, 29, "pol"]],
so pol = text[26:29], which is 222.
So I have to create a column named pol and give it the value 222.
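To make the slicing concrete, here is a minimal sketch for a single row. The helper name extract_spans is hypothetical, made up only for illustration; it is not part of pandas.

```python
def extract_spans(text, spans):
    """Return (label_type, substring) pairs, one per [start, end, label_type] span."""
    return [(label_type, text[start:end]) for start, end, label_type in spans]


text = "random text with pol # as 111 and ein no as 101"
spans = [[26, 29, "pol"], [44, 47, "ein"]]

pairs = extract_spans(text, spans)
print(pairs)  # [('pol', '111'), ('ein', '101')]
```

Each pair holds the label type and the value it points at, which is the information I would like to end up in separate columns.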
So far I could come up with this:
dd["pol"] = dd.apply(lambda row: row.text[row.label[0][0]:row.label[0][1]], axis=1)
This only works if the data is static, i.e. every label type occurs exactly once in every row and always at the same position in the list.
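A small reproduction of that limitation, assuming the same sample data as above. The variable first_types is introduced here only for comparison: it holds the label type of the first span in each row.

```python
import pandas as pd

dd = pd.DataFrame({
    "text": [
        "random text with pol # as 111 and ein no as 101",
        "random text with pol # as 222",
        "random text with ein # as 333 and ein no as 303",
    ],
    "label": [
        [[26, 29, "pol"], [44, 47, "ein"]],
        [[26, 29, "pol"]],
        [[26, 29, "ein"], [44, 47, "ein"]],
    ],
})

# Same idea as my attempt: always slice with the first span and call the result "pol".
dd["pol"] = dd.apply(
    lambda row: row["text"][row["label"][0][0]:row["label"][0][1]], axis=1
)

# Label type of the first span in each row, for comparison.
first_types = [spans[0][2] for spans in dd["label"]]

print(dd["pol"].tolist())  # values that ended up under "pol"
print(first_types)         # what those values are actually labelled as
```

In the last row the first span is labelled ein, so 333 ends up under pol even though it is not a pol value, and the second span of each row is never read at all.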