After adding a column to a DataFrame, I am not getting the expected output file. Here is my input file:

   Security Wise Delivery Position - Compulsory Rolling Settlement
   10,MTO,01022018,592287763,0001583
   Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
   Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
   20,1,20MICRONS,EQ,53466,27284,51.03
   20,2,3IINFOTECH,EQ,7116046,3351489,47.10
   20,3,3MINDIA,EQ,2613,1826,69.88
   20,4,5PAISA,EQ,8463,5230,61.80
   20,5,63MOONS,EQ,324922,131478,40.46

Expected output:

 20,1,20MICRONS,EQ,53466,27284,51.03,01022018
 20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
 20,3,3MINDIA,EQ,2613,1826,69.88,01022018
 20,4,5PAISA,EQ,8463,5230,61.80,01022018
 20,5,63MOONS,EQ,324922,131478,40.46,01022018

My code

 import pandas as pd
 df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', sep='\t',skiprows=1)
 df=df.iloc[1]
 l1=list(str(df).split(","))
 l2=l1[2]
 df2=pd.read_csv('C:/Working/dalal/MTO_11052018.DAT',sep='\t',skiprows=3)
 df2['Trans_dt']=df2.apply(lambda row:[l2],axis=1)
 df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT',sep=',')

I am not getting the expected output. Please help with this.

  • Something is up with row:[l2] Commented May 18, 2018 at 4:28
  • Why are you using sep='\t' if the columns in your file are comma separated? Commented May 18, 2018 at 4:40
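
The separator question raised in the comment can be checked directly. This is a minimal sketch, assuming a two-row excerpt of the input held in a hypothetical `sample` string instead of the real file: with `sep='\t'` every line stays a single field, while the default comma separator splits each data row into seven fields.

```python
from io import StringIO

import pandas as pd

# Hypothetical excerpt of the input file (only two data rows kept)
sample = (
    "Security Wise Delivery Position - Compulsory Rolling Settlement\n"
    "10,MTO,01022018,592287763,0001583\n"
    "Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>\n"
    "Record Type,Sr No,Name of Security,Quantity Traded,"
    "Deliverable Quantity(gross across client level),"
    "% of Deliverable Quantity to Traded Quantity\n"
    "20,1,20MICRONS,EQ,53466,27284,51.03\n"
    "20,2,3IINFOTECH,EQ,7116046,3351489,47.10\n"
)

# The text contains no tab characters, so each line becomes one wide column
tab_split = pd.read_csv(StringIO(sample), sep="\t", skiprows=3)

# Skipping the four leading lines and splitting on commas gives seven fields
comma_split = pd.read_csv(StringIO(sample), skiprows=4, header=None)

print(tab_split.shape)    # (rows, columns) when splitting on tabs
print(comma_split.shape)  # (rows, columns) when splitting on commas
```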

2 Answers

1

I think you need header=1 to use the second row as the column names, nrows=0 to read no data rows, and usecols=[2] to read only the third column:

from io import StringIO  # pd.compat.StringIO is not available in recent pandas versions
import pandas as pd

temp=u"""Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46"""
# after testing, replace 'StringIO(temp)' with 'C:/Working/dalal/MTO_11052018.DAT'
a = pd.read_csv(StringIO(temp), nrows=0, header=1, usecols=[2]).columns
print (a)
Index(['01022018'], dtype='object')
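
Since only one field of one line is needed here, the same value can also be fetched without a DataFrame. The following is a sketch of that alternative using the standard `csv` module, not the approach used in this answer; the function name `read_trade_date` and the two-line `header_lines` string are hypothetical stand-ins for the real file.

```python
import csv
from io import StringIO


def read_trade_date(handle):
    """Return the third field of the second line of an open text file."""
    reader = csv.reader(handle)
    next(reader)            # line 1: report title, ignored
    return next(reader)[2]  # line 2: 10,MTO,<trade date>,...


# Hypothetical stand-in for the first two lines of the .DAT file
header_lines = (
    "Security Wise Delivery Position - Compulsory Rolling Settlement\n"
    "10,MTO,01022018,592287763,0001583\n"
)

trade_date = read_trade_date(StringIO(header_lines))
print(trade_date)
```

With the real file, the handle would come from `open(path, newline="")` instead of `StringIO`. The value stays a string, so the leading zero is kept.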

Then read the data rows and assign the new column:

from io import StringIO  # pd.compat.StringIO is not available in recent pandas versions

# after testing, replace 'StringIO(temp)' with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=3).assign(Trans_dt=a[0])
print (df)
    Record Type   ...    Trans_dt
20            1   ...    01022018
20            2   ...    01022018
20            3   ...    01022018
20            4   ...    01022018
20            5   ...    01022018

[5 rows x 7 columns]

df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT')
# if the column names should not be written, pass header=None
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', header=None)

Or, similarly, if you need the default RangeIndex:

from io import StringIO  # pd.compat.StringIO is not available in recent pandas versions

# after testing, replace 'StringIO(temp)' with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=3).rename_axis('val').reset_index().assign(Trans_dt=a[0])
print (df)
   val    ...     Trans_dt
0   20    ...     01022018
1   20    ...     01022018
2   20    ...     01022018
3   20    ...     01022018
4   20    ...     01022018

[5 rows x 8 columns]

If the column names are not important:

from io import StringIO  # pd.compat.StringIO is not available in recent pandas versions

# after testing, replace 'StringIO(temp)' with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=4, header=None).assign(Trans_dt=a[0])
print (df)
    0  1           2   3        4        5      6  Trans_dt
0  20  1   20MICRONS  EQ    53466    27284  51.03  01022018
1  20  2  3IINFOTECH  EQ  7116046  3351489  47.10  01022018
2  20  3     3MINDIA  EQ     2613     1826  69.88  01022018
3  20  4      5PAISA  EQ     8463     5230  61.80  01022018
4  20  5     63MOONS  EQ   324922   131478  40.46  01022018

And finally, write the file:

df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False)
# if the column names should not be written, pass header=None
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False, header=None)
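
Putting the steps of this answer together, a sketch of the whole round trip might look as follows. It assumes a two-row excerpt in a hypothetical `SAMPLE` string and a hypothetical helper `append_trade_date`, and it adds one thing the answer does not use: `dtype=str`, so that values such as `47.10` are written back exactly as they were read instead of being reformatted as floats.

```python
from io import StringIO

import pandas as pd

# Hypothetical excerpt of the input file (only two data rows kept)
SAMPLE = (
    "Security Wise Delivery Position - Compulsory Rolling Settlement\n"
    "10,MTO,01022018,592287763,0001583\n"
    "Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>\n"
    "Record Type,Sr No,Name of Security,Quantity Traded,"
    "Deliverable Quantity(gross across client level),"
    "% of Deliverable Quantity to Traded Quantity\n"
    "20,1,20MICRONS,EQ,53466,27284,51.03\n"
    "20,2,3IINFOTECH,EQ,7116046,3351489,47.10\n"
)


def append_trade_date(text):
    """Return the data rows as CSV text with the trade date appended."""
    # Second line as header, no data rows, third column only:
    # the single column name is the trade date, kept as a string
    trade_date = pd.read_csv(
        StringIO(text), header=1, nrows=0, usecols=[2]
    ).columns[0]

    # Skip the four leading lines; dtype=str keeps every value as written
    data = pd.read_csv(StringIO(text), skiprows=4, header=None, dtype=str)
    data["Trans_dt"] = trade_date

    # With no path given, to_csv returns the CSV text
    return data.to_csv(index=False, header=False)


result_lines = append_trade_date(SAMPLE).splitlines()
for line in result_lines:
    print(line)
```

To write a file instead of returning text, the path can be passed as the first argument of `to_csv`, as in the calls above.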

0

The parameters in your read_csv() calls don't match your text file. You get the expected data in df by calling:

df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', skiprows=6, header=None)

The second line could be read like this:

df_tr = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', skiprows=1, nrows=1, header=None)
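
To get the trade date out of that one-row frame, something along these lines may work. It is a sketch under the assumption that the date is the third field of the second line, using a hypothetical `sample` string in place of the file. Note that without `dtype=str` pandas parses the field as an integer, so the leading zero of `01022018` is lost.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the first lines of the .DAT file
sample = (
    "Security Wise Delivery Position - Compulsory Rolling Settlement\n"
    "10,MTO,01022018,592287763,0001583\n"
    "Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>\n"
)

# Default type inference turns the third field into an integer
df_tr = pd.read_csv(StringIO(sample), skiprows=1, nrows=1, header=None)
as_number = df_tr.iloc[0, 2]

# Reading everything as text keeps the value exactly as written
df_tr_text = pd.read_csv(
    StringIO(sample), skiprows=1, nrows=1, header=None, dtype=str
)
as_text = df_tr_text.iloc[0, 2]

print(as_number)
print(as_text)
```

The text value can then be assigned to the data frame, for example with `df['Trans_dt'] = as_text`.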

4 Comments

How can I load the l2 value? l2 contains the transaction date, which is in row 2.
Note that I also changed the import of your df2, because there are several newlines in your file, which make it hard to detect the column names automatically.
No problem, nice day :)
Thanks, Same to you!
