Adding column in pandas python

Question

After adding a column to a DataFrame, I am not getting the proper output file. Here is my input file:

   Security Wise Delivery Position - Compulsory Rolling Settlement
   10,MTO,01022018,592287763,0001583
   Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
   Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
   20,1,20MICRONS,EQ,53466,27284,51.03
   20,2,3IINFOTECH,EQ,7116046,3351489,47.10
   20,3,3MINDIA,EQ,2613,1826,69.88
   20,4,5PAISA,EQ,8463,5230,61.80
   20,5,63MOONS,EQ,324922,131478,40.46

Expected output

 20,1,20MICRONS,EQ,53466,27284,51.03,01022018
 20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
 20,3,3MINDIA,EQ,2613,1826,69.88,01022018
 20,4,5PAISA,EQ,8463,5230,61.80,01022018
 20,5,63MOONS,EQ,324922,131478,40.46,01022018

My code

 import pandas as pd
 df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', sep='\t',skiprows=1)
 df=df.iloc[1]
 l1=list(str(df).split(","))
 l2=l1[2]
 df2=pd.read_csv('C:/Working/dalal/MTO_11052018.DAT',sep='\t',skiprows=3)
 df2['Trans_dt']=df2.apply(lambda row:[l2],axis=1)
 df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT',sep=',')

I am not getting the expected output. Please help with this.

Why are you using sep='\t' if the columns in your file are comma separated?
– SpghttCd, Commented May 18, 2018 at 4:40
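
To illustrate the point of this comment, here is a minimal sketch of what the separator choice does to comma-separated rows. The `sample` string is just two data rows copied from the question, and the names `tab_df` and `comma_df` are made up for the illustration.

```python
from io import StringIO

import pandas as pd

# Two data rows from the question's file (comma separated, no tabs).
sample = (
    "20,1,20MICRONS,EQ,53466,27284,51.03\n"
    "20,2,3IINFOTECH,EQ,7116046,3351489,47.10\n"
)

# With sep='\t' no separator is ever found, so each line becomes one field.
tab_df = pd.read_csv(StringIO(sample), sep="\t", header=None)

# With the default sep=',' the seven fields are split into seven columns.
comma_df = pd.read_csv(StringIO(sample), header=None)

print(tab_df.shape)    # (2, 1)
print(comma_df.shape)  # (2, 7)
```

With a single text column, there is no third field to pick out of the second line, which is presumably why the question's code falls back to splitting a string by hand.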

jezrael · Accepted Answer · 2018-05-18 05:53:42Z

I think you need header=1 to use the second row as the column names, nrows=0 to read no data rows, and usecols=[2] to read only the third column:

from io import StringIO

import pandas as pd

temp = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46"""
# after testing, replace StringIO(temp) with 'C:/Working/dalal/MTO_11052018.DAT'
# (pd.compat.StringIO no longer exists in current pandas; io.StringIO does the same job)
a = pd.read_csv(StringIO(temp), nrows=0, header=1, usecols=[2]).columns
print (a)
Index(['01022018'], dtype='object')

Then read the data rows and assign the new column:

# after testing, replace StringIO(temp) (needs `from io import StringIO`) with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=3).assign(Trans_dt=a[0])
print (df)
    Record Type   ...    Trans_dt
20            1   ...    01022018
20            2   ...    01022018
20            3   ...    01022018
20            4   ...    01022018
20            5   ...    01022018

[5 rows x 7 columns]

df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT')
# if the column names need to be removed
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', header=None)

Or, similarly, if you need a default RangeIndex:

# after testing, replace StringIO(temp) (needs `from io import StringIO`) with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=3).rename_axis('val').reset_index().assign(Trans_dt=a[0])
print (df)
   val    ...     Trans_dt
0   20    ...     01022018
1   20    ...     01022018
2   20    ...     01022018
3   20    ...     01022018
4   20    ...     01022018

[5 rows x 8 columns]

If the column names are not important:

# after testing, replace StringIO(temp) (needs `from io import StringIO`) with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=4, header=None).assign(Trans_dt=a[0])
print (df)
    0  1           2   3        4        5      6  Trans_dt
0  20  1   20MICRONS  EQ    53466    27284  51.03  01022018
1  20  2  3IINFOTECH  EQ  7116046  3351489  47.10  01022018
2  20  3     3MINDIA  EQ     2613     1826  69.88  01022018
3  20  4      5PAISA  EQ     8463     5230  61.80  01022018
4  20  5     63MOONS  EQ   324922   131478  40.46  01022018

And finally, write the result:

df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False)
# if the column names need to be removed
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False, header=None)
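
One detail worth checking against the expected output: with default type inference, a value such as 47.10 is parsed as a float and written back as 47.1. The sketch below is one possible way around that, assuming it is acceptable to treat every field as text via dtype=str. The hard-coded `trans_dt` stands in for `a[0]` from the first snippet, and `data` is a shortened copy of the sample file; both are only for illustration.

```python
from io import StringIO

import pandas as pd

data = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88"""

trans_dt = "01022018"  # stand-in for a[0] obtained earlier

# dtype=str keeps every field exactly as written in the file (e.g. '47.10').
df_str = pd.read_csv(StringIO(data), skiprows=4, header=None, dtype=str)
df_str = df_str.assign(Trans_dt=trans_dt)

# to_csv() without a path returns the CSV text, which is handy for checking.
out_lines = df_str.to_csv(index=False, header=False).splitlines()
print(out_lines[1])  # 20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
```

If the numeric columns are needed as numbers later on, passing float_format='%.2f' to to_csv() is another option, at the cost of forcing two decimals on every float column.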

SpghttCd · Accepted Answer · 2018-05-18 04:51:16Z


The parameters you pass to read_csv() don't match the layout of your text file. You will get your expected output in df by calling:

df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', skiprows=6, header=None)

Line 2 could be imported like this:

df_tr = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', skiprows=1, nrows=1, header=None)
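
A caveat when pulling the date out of that row: with default type inference the field 01022018 is read as an integer, so the leading zero is lost. The following is a small sketch of that effect and of one way to avoid it with dtype=str. The `sample` string is the first three lines of the question's file, and `as_int` and `trans_dt` are names chosen for the illustration.

```python
from io import StringIO

import pandas as pd

sample = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
"""

# Default inference: the third field becomes the integer 1022018.
as_int = pd.read_csv(StringIO(sample), skiprows=1, nrows=1, header=None).iloc[0, 2]

# dtype=str keeps the field as the text '01022018'.
df_tr = pd.read_csv(StringIO(sample), skiprows=1, nrows=1, header=None, dtype=str)
trans_dt = df_tr.iloc[0, 2]

print(as_int)    # 1022018
print(trans_dt)  # 01022018
```

The string can then be attached to the data rows with something like df['Trans_dt'] = trans_dt, since assigning a scalar fills the whole column.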

edited May 18, 2018 at 4:51

answered May 18, 2018 at 4:44


4 Comments

Gower Over a year ago

How can I load the l2 value? l2 holds the transaction date, which is in row 2.

SpghttCd Over a year ago

Note that I also changed the import of your df2, because there are several newlines in your file, which make it hard to detect the column names automatically.

jezrael Over a year ago

No problem, nice day :)

SpghttCd Over a year ago

Thanks, Same to you!
