Write a python code to create a new csv column from existing column using conditional statements

Question

I'm new to python coding and wanted to create a new column based on the condition statement on existing columns.

I use python 2.7 version and run the code on CentOs.

import pandas as pd                                                     
file1 = pd.read_csv("/root/Documents/temp_file_{}.csv".format(timestr))
file1['FileName'] = ''
file1['FileName'] = file1['FileType'].apply(lambda x: df['Path'].str.extract('[^/]+$', expand=False) if x=='f' else '')
file1.to_csv('/root/Documents/temp1_file_{}.csv'.format(timestr),index = False)

Below is my CSV file:

FileType,Path

d,/

f,/documents/csv/.zip

d,/documents/images

d,/hive/28374849

f,/hadoop/jdjdjd/dnejfn.img

Required CSV file:

FileType,Path,FileName

d,/,

f,/documents/csv/.zip,.zip

d,/documents/images,

d,/hive/28374849,

f,/hadoop/jdjdjd/dnejfn.img,dnejfn.img

I want to create a new column FileName where the data in that column should be extracted from Path column only when column Filetype=='f' else the data in that column should be a null value or no data

Chris · Accepted Answer · 2019-05-28 04:31:27Z

1

Using numpy.where with pandas.Series.str.rsplit:

import numpy as np
import pandas as pd

df['FileName'] = np.where(df['FileType'].eq('f'),df['Path'].str.rsplit('/').str.get(-1), '')

Output:

  FileType                       Path    FileName
0        d                          /            
1        f        /documents/csv/.zip        .zip
2        d          /documents/images            
3        d             /hive/28374849            
4        f  /hadoop/jdjdjd/dnejfn.img  dnejfn.img

answered May 28, 2019 at 4:31

Chris

29.8k3 gold badges34 silver badges56 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

WGS Over a year ago

numpy is not needed but this is the solution to go for anyway.

ashish14 · Accepted Answer · 2019-05-28 04:40:53Z

Try using this approach to get the last part from the paths. Split the string from the rightmost side one time using "/" as a separator and fetch the last element (which is basically the filename) using the negative indexing:


file1['FileName'] = ''
file1['FileName'] = file1.apply(lambda x: x['Path'].rsplit("/", 1)[-1]  if x["FileType"]=='f' else '' , axis=1)

file1

FileType    Path    FileName
0   d   /   
1   f   /documents/csv/.zip .zip
2   d   /documents/images   
3   d   /hive/28374849  
4   f   /hadoop/jdjdjd/dnejfn.img   dnejfn.img

using numpy's where which is faster than the apply function because of the vectorization:

file1['FileName'] = np.where(file1["FileType"]=='f', file1['Path'].str.rsplit("/", n=1).str[-1], '')

FileType    Path    FileName
0   d   /   
1   f   /documents/csv/.zip .zip
2   d   /documents/images   
3   d   /hive/28374849  
4   f   /hadoop/jdjdjd/dnejfn.img   dnejfn.img

Collectives™ on Stack Overflow

Write a python code to create a new csv column from existing column using conditional statements

2 Answers 2

1 Comment

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Related