3

I want to add a column to a pandas DataFrame that repeats a short sequence of int (or str) values until it fills every row.

This is the pandas DataFrame:

import pandas as pd

df = [{"us": "t1"},
{"us": "t2"},
{"us": "t3"},
{"us": "t4"},
{"us": "t5"},
{"us": "t6"},
{"us": "t7"},
{"us": "t8"},
{"us": "t9"},
{"us": "t10"},
{"us": "t11"},
{"us": "t12"}
    ]
df = pd.DataFrame(df)
df

I want to add a column built from a list of int or str, like these:

list_int = [1, 2, 3, 4]

list_str = ['one','two','three','four']

Of course df['list_int'] = list_int does not work, because the list length (4) does not match the number of rows (12).

The output should be this:

    us   list_int  list_str
0   t1      1        one
1   t2      2        two
2   t3      3        three
3   t4      4        four
4   t5      1        one
5   t6      2        two
6   t7      3        three
7   t8      4        four
8   t9      1        one
9   t10     2        two
10  t11     3        three
11  t12     4        four

4 Answers

4

You can use np.tile (with numpy imported as np):

df['list_int'] = np.tile(list_int, len(df)//len(list_int) + 1)[:len(df)]

or simply

df['list_int'] = np.tile(list_int, len(df)//len(list_int))

if len(df) is divisible by len(list_int).
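To make the non-divisible case concrete, here is a minimal sketch that uses a hypothetical 14-row frame (not the 12-row frame from the question). It tiles the list enough times to cover every row and slices off the surplus; the `reps` name is only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 14 rows, which is not a multiple of 4.
df = pd.DataFrame({"us": [f"t{i}" for i in range(1, 15)]})
list_int = [1, 2, 3, 4]

# Ceiling division gives just enough repetitions to cover every row.
reps = -(-len(df) // len(list_int))

# Tile the list, then trim the result to exactly len(df) values.
df["list_int"] = np.tile(list_int, reps)[: len(df)]
```

The last two rows receive 1 and 2, so the cycle is simply cut off where the frame ends.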

3 Comments

Can we use tile with np.column_stack((list_int, list_str)) to get both columns repeated? I don't know how to tile a 2D array.
@anky_91 np.tile(np.array([list_int, list_str]), 2) works as expected.
Thanks, so in total: np.column_stack(np.tile(np.array([list_int, list_str]), len(df) // len(list_int))) :)
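A hedged sketch of the 2D idea from these comments, assuming the frame length is a multiple of the list length. One caveat worth showing: np.array on a mix of int and str lists produces a string array, so the integer row has to be converted back. The names `reps` and `stacked` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"us": [f"t{i}" for i in range(1, 13)]})
list_int = [1, 2, 3, 4]
list_str = ["one", "two", "three", "four"]

# Number of whole repetitions (assumes len(df) is a multiple of len(list_int)).
reps = len(df) // len(list_int)

# Stacking an int list with a str list coerces everything to strings;
# tiling a (2, 4) array repeats it along the last axis, giving shape (2, 12).
stacked = np.tile(np.array([list_int, list_str]), reps)

df["list_int"] = stacked[0].astype(int)  # convert the string digits back to int
df["list_str"] = stacked[1]
```

Tiling each list separately avoids the string round trip; the stacked form is mainly useful when all columns share one dtype.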
1

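Let us try something new: np.put. It repeats the values when they are shorter than the index list. Recent pandas versions no longer let np.put write into a Series, so fill plain arrays first:

# np.put needs a real ndarray; shorter value lists are repeated as necessary
int_col = np.empty(len(df), dtype=int)
str_col = np.empty(len(df), dtype=object)
np.put(int_col, np.arange(len(df)), list_int)
np.put(str_col, np.arange(len(df)), list_str)
df['list_int'] = int_col
df['list_str'] = str_col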


df
Out[83]: 
     us list_int list_str
0    t1        1      one
1    t2        2      two
2    t3        3    three
3    t4        4     four
4    t5        1      one
5    t6        2      two
6    t7        3    three
7    t8        4     four
8    t9        1      one
9   t10        2      two
10  t11        3    three
11  t12        4     four

0

Multiplying a list by a number n repeats the list n times.

2 * ['one', 'two', 'three'] 

equals

['one', 'two', 'three', 'one', 'two', 'three']

So if your data frame is df and your snippet is s = ['one', 'two', 'three'], you can build the column from the whole repetitions plus the leftover items:

col = (len(df) // len(s)) * s + s[:len(df) % len(s)]
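As a worked sketch of the list-multiplication idea, assuming a hypothetical 11-row frame so that the remainder is non-zero: divmod yields the number of whole repetitions and the count of leftover items in one call. The names `full` and `rest` are illustrative.

```python
import pandas as pd

# Hypothetical frame with 11 rows: 3 whole cycles of 3 items plus 2 leftovers.
df = pd.DataFrame({"us": [f"t{i}" for i in range(1, 12)]})
s = ["one", "two", "three"]

# full = number of complete repetitions, rest = items needed from a partial cycle.
full, rest = divmod(len(df), len(s))

# Repeat the whole list, then append the first `rest` items.
col = full * s + s[:rest]
df["list_str"] = col
```

This stays in pure Python lists, so it needs no NumPy import and keeps the original element types.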

0

Try this. It is not the most general option (it assumes the default 0..n-1 integer index and hard-codes the list length 4), but it works:

df["list_str"]=np.array(list_str)[df["us"].index.values%4]
df["list_int"]=np.array(list_int)[df["us"].index.values%4]

output


    us  list_str    list_int
0   t1  one         1
1   t2  two         2
2   t3  three       3
3   t4  four        4
4   t5  one         1
5   t6  two         2
6   t7  three       3
7   t8  four        4
8   t9  one         1
9   t10 two         2
10  t11 three       3
11  t12 four        4
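A slightly more general variant of the same modulo trick, sketched under the assumption that the frame may have a non-default index: it indexes by row position rather than by index label and takes the cycle length from the list itself. The letter index and the `positions` name are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-numeric index, where index.values % 4 would fail.
df = pd.DataFrame(
    {"us": [f"t{i}" for i in range(1, 13)]},
    index=list("abcdefghijkl"),
)
list_int = [1, 2, 3, 4]
list_str = ["one", "two", "three", "four"]

# Row positions 0..n-1 wrapped around the length of the list.
positions = np.arange(len(df)) % len(list_str)

# Fancy indexing picks the cyclic values; arrays are assigned by position.
df["list_str"] = np.array(list_str)[positions]
df["list_int"] = np.array(list_int)[positions]
```

Because the arrays are assigned positionally, the result does not depend on what the index labels are.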
