1

I created a map where they key is a string and the value is a tuple. I also have a dataframe that looks like this

d = {'comments' : ['This is a bad website', 'The website is slow']}
df = pd.DataFrame(data = d)

The maps value for This is a bad website contains something like this

[("There isn't enough support to our site",
  'Staff Not Onsite',
  0.7323943661971831),
 ('I would like to have them on site more frequently',
  'Staff Not Onsite',
  0.6875)]

What I want to do now is create 6 new columns inside the data frame using the first two tuple entries in the map.

So what I would want is something like this

d = {'comments' : ['This is a bad website', 'The website is slow'],
     'comment_match_1' : ['There isn't enough support to our site', ''],
     'Negative_category_1' : ['Staff Not Onsite', ''],
     'Score_1' : [0.7323, 0],
     'comment_match_2' : ['I would like to have them on site more frequently', ''],
     'Negative_category_2' : ['Staff Not Onsite', ''],
     'Score_2' : [0.6875, 0]}
df = pd.DataFrame(data = d)

Any suggestions on how to achieve this are greatly appreciated.

Here is how I generated the map for reference

d = {}
a = []
for x, y in zip(df['comments'], df['negative_category']):
    for z in unlabeled_df['comments']:
        a.append((x, y, difflib.SequenceMatcher(None, x, z).ratio()))
        d[z] = a

Thus when I execute this line of code

d['This is a bad website']

I get

[("There isn't enough support to our site",
  'Staff Not Onsite',
  0.7323943661971831),
 ('I would like to have them on site more frequently',
  'Staff Not Onsite',
  0.6875), ...]
2
  • How does the mapping dictionary looks like? Can you give an example with key:value pair? Commented Feb 25, 2021 at 16:57
  • 1
    @ShubhamSharma I added an example, the ... indicates that there is a lot more tuples. I am just interested in the first two. Commented Feb 25, 2021 at 17:05

1 Answer 1

1

You can recreate a mapping dictionary by flattening the values corresponding to each of the key in the dictionary, then with the help of Series.map substitute the values in the column comments from mapping dictionary, finally create new dataframe from these substituted values and join this new dataframe with the comments column:

mapping = {k: np.hstack(v) for k, v in d.items()}
df.join(pd.DataFrame(df['comments'].map(mapping).dropna().tolist()))

                comments                                       0                 1                   2                                                  3                 4       5
0  This is a bad website  There isn't enough support to our site  Staff Not Onsite  0.7323943661971831  I would like to have them on site more frequently  Staff Not Onsite  0.6875
1    The website is slow                                     NaN               NaN                 NaN                                                NaN               NaN     NaN
Sign up to request clarification or add additional context in comments.

4 Comments

Could we only take the first two tuples in the mapping though? There was a lot more than just two but I am only interested in the first two.
@justanewb If you want only first two tuples you can slice them before using hstack for example you could do np.hstack(v[:2])
Thanks, this is close to what I need but the thing is I want to do this for each comment in the dataframe, does that make sense?
In other words say that there was a value for The website is slow then I want to take its associated first two tuples and put them into the 0 1 2 3 4 5 columns

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.