2

I have a dataframe column that has the occasional tuple inserted into it. I would like to join all of these tuples into one string separated by ','.

Example:

Data      People
 A        XYZ
 B        ABX,LMN
 C        ('OPP', 'GGG')
 D        OAR

I am only trying to 'target' the tuple here and convert it to a string giving the following dataframe:

Data      People
 A        XYZ
 B        ABX,LMN
 C        OPP,GGG
 D        OAR

df['People'] = df['People'].apply(','.join)

I tried this, but it ends up inserting commas in between every character in all of the 'OK' strings.
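
The behaviour described above follows from how `str.join` works: it iterates over whatever it is given, and iterating over a string yields its individual characters. A minimal illustration in plain Python (no pandas involved, and the variable names are only for demonstration):

```python
# str.join iterates over its argument.
joined_tuple = ','.join(('OPP', 'GGG'))  # iterates over two elements
joined_string = ','.join('XYZ')          # iterates over three characters

print(joined_tuple)   # OPP,GGG
print(joined_string)  # X,Y,Z
```

So any fix has to apply the join only to the values that are actually tuples.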

1 Comment

I'd look into first fixing this upstream: what monstrosity of a process causes data to be written like this? Fix that first. No need for hacky solutions, especially when dealing with mixed object columns. (Commented Jun 25, 2019 at 17:59)

3 Answers

3

If you must, you can do something like below.

df['People'] = df['People'].apply(lambda x: ', '.join(x) if isinstance(x, tuple) else x)

Output:

  Data  People
0   A   XYZ
1   B   ABX, LMN
2   C   OPP, GGG
3   D   OAR

5 Comments

This gives a TypeError: 'type' object is not subscriptable
Probably better to have ', '.join(x) to deal with tuples with any number of elements
Also probably better to use ', '.join(str(x)) in case any of the elements is an integer
@TPereira, that will separate out each character in a string.
@MaxB, have you used tuple as a variable name in your script?
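
To pick up the point raised in the comments about non-string elements: a minimal sketch, assuming a tuple may contain integers, is to convert each element (rather than the whole value) to a string before joining. The helper name `join_if_tuple` and the sample data, including the integer `7`, are hypothetical.

```python
import pandas as pd

def join_if_tuple(value, sep=','):
    """Join tuple elements into one string; return anything else unchanged."""
    if isinstance(value, tuple):
        # str() is applied per element, so plain strings are never split up
        return sep.join(str(item) for item in value)
    return value

# Hypothetical sample data: one tuple that mixes a string and an integer
df = pd.DataFrame({
    'Data': ['A', 'B', 'C', 'D'],
    'People': ['XYZ', 'ABX,LMN', ('OPP', 7), 'OAR'],
})

df['People'] = df['People'].apply(join_if_tuple)
print(df['People'].tolist())  # ['XYZ', 'ABX,LMN', 'OPP,7', 'OAR']
```

Because the `isinstance` check comes first, the existing strings pass through untouched, which avoids the character-splitting problem mentioned above.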
1

This works, though it may not be the most elegant solution:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['AA', 'ABC, LMN', ('XYZ', 'PQR'), 'OLA']})

# Output

    A   B
0   1   AA
1   2   ABC, LMN
2   3   (XYZ, PQR)
3   4   OLA

df['B'].apply(lambda x: ','.join(x) if isinstance(x, tuple) else x)

# Output

0          AA
1    ABC, LMN
2     XYZ,PQR
3         OLA
Name: B, dtype: object

0

You can avoid apply by using map to build a boolean mask that is True where the value is a tuple. Use this mask to select the rows holding tuples and call str.join directly on them.

m = df.People.map(type).eq(tuple)
df.loc[m, 'People'] = df.loc[m, 'People'].str.join(',')


Out[2206]:
  Data   People
0    A      XYZ
1    B  ABX,LMN
2    C  OPP,GGG
3    D      OAR
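
For anyone who wants to run the mask approach end to end, here is a self-contained sketch. The sample frame is rebuilt by hand from the question, which is an assumption about how the real data is stored.

```python
import pandas as pd

# Sample frame rebuilt from the question; real data may be stored differently.
df = pd.DataFrame({
    'Data': ['A', 'B', 'C', 'D'],
    'People': ['XYZ', 'ABX,LMN', ('OPP', 'GGG'), 'OAR'],
})

# True only on rows whose value is exactly a tuple
m = df['People'].map(type).eq(tuple)

# Join the tuple rows; rows holding plain strings are never touched
df.loc[m, 'People'] = df.loc[m, 'People'].str.join(',')

print(df['People'].tolist())  # ['XYZ', 'ABX,LMN', 'OPP,GGG', 'OAR']
```

Note that comparing `type(x)` to `tuple` matches exact tuples only; tuple subclasses such as namedtuples would need an `isinstance` check instead.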
