Adding values in column

Question

I have a DataFrame df and I want to append '/' characters to the cast and genres columns so that each cell contains exactly three '/'.

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45

The output should look like this:

id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b//      23
2   Minions    a/b/c/    a/b/c/     55
3   Mission    a/b//     a///       67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a///      a/b/c/     45

Share the code you've written and explain what's wrong with it. That shows your effort.
– E.Praneeth, Commented Jul 8, 2019 at 13:07
This looks like an assignment/homework question. You should try it yourself first, then ask when you get stuck.
– WhySoSerious, Commented Jul 8, 2019 at 13:09

yatu · Accepted Answer · 2019-07-08 13:25:41Z

Here's one approach defining a custom function:

def add_values(df, *cols):
    for col in cols:
        # amount of "/" to add at each row
        c = df[col].str.count('/').rsub(3)
        # translate the above to as many "/" as required
        ap = [i * '/' for i in c.tolist()]
        # Add the above to the corresponding column
        df[col] = [i + j for i,j in zip(df[col], ap)]
    return df

add_values(df, 'cast', 'genres')

   id     movie     cast   genres  runtime
0   1   Furious  a/b/c/d    a/b//       23
1   2   Minions   a/b/c/   a/b/c/       55
2   3   Mission    a/b//     a///       67
3   4  Kingsman  a/b/c/d  a/b/c/d       23
4   5  StarWars     a///   a/b/c/       45
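
The same counting idea can also be written without the intermediate Python lists. This is only a sketch and not part of the original answer: the helper name pad_slashes and its target parameter are made up for illustration, and the clip(lower=0) call is an added assumption so that cells already holding more than three slashes are left unchanged.

```python
import pandas as pd

def pad_slashes(series, target=3):
    # number of "/" still missing in each cell; never negative
    missing = series.str.count('/').rsub(target).clip(lower=0)
    # build the padding string per row and append it to the original value
    return series + missing.map(lambda n: '/' * n)

# sample data copied from the question
df = pd.DataFrame({
    'cast': ['a/b/c/d', 'a/b/c', 'a/b', 'a/b/c/d', 'a'],
    'genres': ['a/b', 'a/b/c', 'a', 'a/b/c/d', 'a/b/c'],
})

for col in ['cast', 'genres']:
    df[col] = pad_slashes(df[col])

print(df)
```

Unlike add_values, this version returns a new Series per column, so the caller decides whether to overwrite the original data.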

edited Jul 8, 2019 at 13:25

answered Jul 8, 2019 at 13:17


1 Comment

Arsh Singh Over a year ago

Thanks, this is the perfect solution which I want.

Adam.Er8 · Accepted Answer · 2019-07-08 13:16:01Z

You can split by /, pad the resulting list with empty strings until it has 4 elements, and then join with / again.

Use .apply to change the values in the entire column.

Try this:

import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO("""id  movie      cast      genres  runtime
1   Furious    a/b/c/d   a/b        23
2   Minions    a/b/c     a/b/c      55
3   Mission    a/b       a          67
4   Kingsman   a/b/c/d   a/b/c/d    23
5   Star Wars  a         a/b/c      45"""), sep=r"\s\s+", engine="python")


def pad_cells(value):
    parts = value.split("/")
    parts += [""] * (4 - len(parts))
    return "/".join(parts)


df["cast"] = df["cast"].apply(pad_cells)
df["genres"] = df["genres"].apply(pad_cells)

print(df)
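
One caveat: pad_cells assumes every cell is a string, so a missing value (NaN) would raise an AttributeError on .split. The following is a hedged sketch of a more defensive variant. The name pad_cells_safe and the width parameter are hypothetical and not from the answer above; passing missing values through unchanged is an assumption about what is wanted.

```python
import pandas as pd

def pad_cells_safe(value, width=4):
    # leave missing values (None/NaN) untouched instead of raising
    if pd.isna(value):
        return value
    parts = str(value).split("/")
    # add empty parts until there are `width` of them;
    # cells that already have more parts are returned unchanged
    parts += [""] * (width - len(parts))
    return "/".join(parts)

print(pad_cells_safe("a/b"))
print(pad_cells_safe("a/b/c/d/e"))
print(pad_cells_safe(None))
```

It is used the same way as the original, for example df["cast"].apply(pad_cells_safe).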

Light Yagami · Accepted Answer · 2019-07-08 13:26:09Z

Use this function on each element in each column to update them.

def update_string(string):
    total_occ = 3  # required number of '/' characters
    for element in string:  # for each character,
        if element == "/":  # if it is '/', decrease 'total_occ'
            total_occ -= 1
    for i in range(total_occ):  # append the remaining '/' at the end
        string += "/"
    return string

x = "a/b"    
print(update_string(x))

Output is:

a/b//
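
The answer does not show how to run the function over the DataFrame columns. One way to do it, sketched here under the assumption that df holds the cast and genres columns from the question, is Series.map:

```python
import pandas as pd

def update_string(string):
    total_occ = 3  # required number of '/' characters
    for element in string:
        if element == "/":
            total_occ -= 1
    for _ in range(total_occ):  # append the missing '/' at the end
        string += "/"
    return string

# sample data copied from the question
df = pd.DataFrame({
    "cast": ["a/b/c/d", "a/b/c", "a/b", "a/b/c/d", "a"],
    "genres": ["a/b", "a/b/c", "a", "a/b/c/d", "a/b/c"],
})

# call update_string once per cell in each column
for col in ["cast", "genres"]:
    df[col] = df[col].map(update_string)

print(df)
```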

edited Jul 8, 2019 at 13:26

answered Jul 8, 2019 at 13:15


Zaraki Kenpachi · Accepted Answer · 2019-07-08 13:26:11Z

Here you go:

=^..^=

import pandas as pd
from io import StringIO

# create raw data
raw_data = StringIO("""
id movie cast genres runtime
1 Furious a/b/c/d a/b 23
2 Minions a/b/c a/b/c 55
3 Mission a/b a 67
4 Kingsman a/b/c/d a/b/c/d 23
5 Star_Wars a a/b/c 45
""")

# load data into data frame
df = pd.read_csv(raw_data, sep=' ')

# iterate over rows and append the missing '/' characters
for index, row in df.iterrows():
    count_character_cast = row['cast'].count('/')
    if count_character_cast < 3:
        # DataFrame.set_value was removed in pandas 1.0; .at is the replacement
        df.at[index, 'cast'] = row['cast'] + '/' * (3 - count_character_cast)

    count_character_genres = row['genres'].count('/')
    if count_character_genres < 3:
        df.at[index, 'genres'] = row['genres'] + '/' * (3 - count_character_genres)

print(df)

Output:

   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d    a/b//       23
1   2    Minions   a/b/c/   a/b/c/       55
2   3    Mission    a/b//     a///       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star_Wars     a///   a/b/c/       45
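
Whichever approach is used, the result can be checked by counting the slashes in every cell. This is a small sketch, assuming the padded values shown in the output above; the variable names counts and all_padded are illustrative only.

```python
import pandas as pd

# padded values as shown in the output table
df = pd.DataFrame({
    'cast': ['a/b/c/d', 'a/b/c/', 'a/b//', 'a/b/c/d', 'a///'],
    'genres': ['a/b//', 'a/b/c/', 'a///', 'a/b/c/d', 'a/b/c/'],
})

# count '/' per cell, column by column
counts = df[['cast', 'genres']].apply(lambda s: s.str.count('/'))
# True only if every cell holds exactly three slashes
all_padded = bool(counts.eq(3).all().all())
print(all_padded)
```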

RomanPerekhrest · Accepted Answer · 2019-07-08 13:32:44Z

Short solution using itertools and the DataFrame.applymap function (renamed to DataFrame.map in pandas 2.1):

In [217]: df
Out[217]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d      a/b       23
1   2    Minions    a/b/c    a/b/c       55
2   3    Mission      a/b        a       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars        a    a/b/c       45

In [218]: from itertools import chain, zip_longest

In [219]: def ensure_slashes(x):
     ...:     return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
     ...: 
     ...: 

In [220]: df[['cast','genres']] = df[['cast','genres']].applymap(ensure_slashes)

In [221]: df
Out[221]: 
   id      movie     cast   genres  runtime
0   1    Furious  a/b/c/d    a/b//       23
1   2    Minions   a/b/c/   a/b/c/       55
2   3    Mission    a/b//     a///       67
3   4   Kingsman  a/b/c/d  a/b/c/d       23
4   5  Star Wars     a///   a/b/c/       45

The crucial function to apply is:

def ensure_slashes(x):
    return ''.join(chain.from_iterable(zip_longest(x.split('/'), '///', fillvalue='')))
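
To see why this works, it helps to trace the intermediate values for a single cell. The snippet below is only an illustration of the one-liner above, using the sample value 'a/b'; the variable names pairs and flat are not from the answer.

```python
from itertools import chain, zip_longest

x = 'a/b'

# pair each item with one of the three slashes; the shorter side is
# filled with '' so neither items nor slashes are dropped
pairs = list(zip_longest(x.split('/'), '///', fillvalue=''))
print(pairs)   # [('a', '/'), ('b', '/'), ('', '/')]

# flatten the pairs into a single sequence of strings
flat = list(chain.from_iterable(pairs))
print(flat)    # ['a', '/', 'b', '/', '', '/']

# join everything back into one string
print(''.join(flat))   # a/b//
```

With four items such as 'a/b/c/d', the fourth item is paired with the fill value '' instead of a slash, so the string comes back unchanged.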

Kaies LAMIRI · Accepted Answer · 2019-07-08 13:39:09Z

OK, so the idea is to create a function that does the necessary work and apply it to the wanted columns.

The function splits the cell on its slashes, pads the resulting list with empty strings up to four items, and zips it with a constant separator list: three slashes followed by an empty string, so that a fourth item is kept rather than cut off.

The result is the concatenation of the elements of this zip, and voilà, it works :)

import pandas as pd

df = pd.DataFrame({
                    'id': [1, 2, 3, 4, 5], 
                    'movie': ['furious', 'Minions', 'mission', 'Kingsman', 'star Wars'], 
                    'cast': ['a/b/c/d', 'a/b/c', 'a/b', 'a/b/c/d', 'a'], 
                    'genres': ['a/b', 'a/b/c', 'a', 'a/b/c/d', 'a/b/c'],
                    'runtime': [23, 55, 67, 23, 45], 
                    })

def slash_func(x):
    # three slashes plus an empty tail, so a fourth item is not dropped by zip
    slash_list = ['/'] * 3 + ['']
    # split on '/' so that multi-character items survive intact
    list_ = str(x).split('/')

    # pad with empty strings up to four items
    for i in range(4 - len(list_)):
        list_.append('')
    output_list = [v[0] + v[1] for v in zip(list_, slash_list)]

    return ''.join(output_list)


df['cast'] = df['cast'].apply(slash_func)
df['genres'] = df['genres'].apply(slash_func)

Output :

id  movie       cast     genres   runtime
1   furious     a/b/c/d  a/b//    23
2   Minions     a/b/c/   a/b/c/   55
3   mission     a/b//    a///     67
4   Kingsman    a/b/c/d  a/b/c/d  23
5   star Wars   a///     a/b/c/   45
