
I'd like to create new columns in my dataframe from the unique values in another column. For example:

Column 1 has the following values:

Apple
Apple
Banana
Strawberry
Strawberry
Strawberry

When I check the unique values in Column 1, the output is:

Apple
Banana
Strawberry

Now I want to use these three values to create columns named "Apple", "Banana", and "Strawberry", and I want the code to be dynamic so that it adapts to however many unique values are present in Column 1.

I'm new to Python, so any help will be appreciated!

So far, I've been getting that output by manually creating new columns in the dataset. I need this to happen automatically, depending on the unique values in Column 1.

  • Please provide minimal reproducible code in text format (no screenshots). Commented Nov 23, 2022 at 16:28
  • If this is a pandas dataframe, please add that tag to your question Commented Nov 23, 2022 at 16:33
  • Does this answer your question? Pandas Python : how to create multiple columns from a list (I know this question asks about adding columns from a list, but the idea is the same for any iterable) Commented Nov 23, 2022 at 16:34
  • Here's an example of the data and code. My original column ('Rating') has two values, "Agree" and "Disagree". I'm manually creating new columns like this:
        data['Agree'] = np.where(data['Rating'] == 'Agree', 1, 0)
        data['Disagree'] = np.where(data['Rating'] == 'Disagree', 1, 0)
        data['Total'] = data[['Agree', 'Disagree']].sum(axis=1)
    I want to do the same without doing it manually, irrespective of how many unique values are present in the 'Rating' column. Commented Nov 23, 2022 at 16:48

1 Answer

Extract the unique values, then iterate over them to create the columns and fill in the data.

Here I only put boolean values, based on whether each row's col1 value matches the new column's name:

import pandas as pd
df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

data:

        col1
0      apple
1      apple
2     banana
3  pineapple
4     banana
5      apple

transform:

unique_col1_val = df["col1"].unique().tolist()
for u in unique_col1_val:
    # Decide how to fill each new column; here it is a bool that is True
    # where the col1 value matches the new column's name.
    df[u] = df["col1"] == u
    # For an int (1/0) instead of a bool, use:
    # df[u] = (df["col1"] == u).astype(int)

In [72]: df
Out[72]:
        col1  apple  banana  pineapple
0      apple   True   False      False
1      apple   True   False      False
2     banana  False    True      False
3  pineapple  False   False       True
4     banana  False    True      False
5      apple   True   False      False

Using df[u] = (df["col1"] == u).astype(int) instead:

        col1  apple  banana  pineapple
0      apple      1       0          0
1      apple      1       0          0
2     banana      0       1          0
3  pineapple      0       0          1
4     banana      0       1          0
5      apple      1       0          0
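
As an alternative sketch (a different technique from the loop above, not something the thread itself uses), pandas also provides pd.get_dummies, which builds one indicator column per unique value in a single call. This assumes the same df as above; the names dummies and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

# One indicator column per unique value in col1;
# dtype=int gives 1/0 instead of True/False
dummies = pd.get_dummies(df["col1"], dtype=int)

# Attach the indicator columns next to the original column
result = pd.concat([df, dummies], axis=1)
print(result)
```

Both approaches adapt to however many unique values col1 contains; the loop is easier to customise if the new columns should hold something other than a match indicator.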

3 Comments

Thanks so much, this is exactly what I was looking for. May I ask: instead of "True" and "False", how can I assign 1 and 0?
Try .astype(int).
(By the way, you can mark the answer as accepted if it solved your problem.)
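
Putting the .astype(int) suggestion together with the 'Rating' example from the question comments, a minimal sketch might look like the following. The sample rows are made up for illustration; only the 'Rating' column name, its "Agree"/"Disagree" values, and the 'Total' column come from the thread.

```python
import pandas as pd

# Hypothetical sample data; only the column name and its values come from the thread
data = pd.DataFrame({"Rating": ["Agree", "Disagree", "Agree", "Agree"]})

# Build one 1/0 column per unique rating, however many there are
ratings = data["Rating"].unique().tolist()
for r in ratings:
    data[r] = (data["Rating"] == r).astype(int)

# Row-wise sum over the generated columns, as in the manual version
data["Total"] = data[ratings].sum(axis=1)
print(data)
```

Because the column list is taken from unique(), the same code keeps working if a third rating value shows up later.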
