I have a list of strings whose values are separated by '|'. I would like to parse it into a pandas DataFrame.

import pandas as pd    
arr = ['19345360853|5264654|100530|2017-01-07', '19345360853|13518371|100530|2018-10-08']
pd.DataFrame([{'Id': item.split('|')[0] ,'Code_A': item.split('|')[1] , 'Code_B': item.split('|')[2],'Reg_Date': item.split('|')[3]} for item in arr ])

I would like the pandas DataFrame to have the following schema:

'Id'        string
'Code_A'    string
'Code_B'    string
'Reg_Date'  date

The resulting pandas DataFrame would look similar to this (image: result dataframe).

Any help is appreciated.

  • Python does not have arrays; Python has lists. That said, why not just split the strings by the | separator? Commented Jan 29, 2019 at 21:04

1 Answer

First, convert the list of strings to a two-dimensional list:

arr = [a.split("|") for a in arr]

Second, convert it to a pandas DataFrame:

data = pd.DataFrame(arr,columns=['Id','Code_A','Code_B','Reg_Date'])

            Id    Code_A  Code_B    Reg_Date
0  19345360853   5264654  100530  2017-01-07
1  19345360853  13518371  100530  2018-10-08

Finally, convert the Reg_Date column to a datetime type using astype:

data = pd.DataFrame(arr, columns=['Id', 'Code_A', 'Code_B', 'Reg_Date'])
data['Reg_Date'] = data['Reg_Date'].astype('datetime64[ns]')
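
For reference, the steps above can be put together as one runnable sketch. It uses pd.to_datetime with an explicit format instead of astype; that is a substitution, not part of the original answer, chosen because it fails loudly on malformed dates. The variable names rows, columns and df are illustrative only.

```python
import pandas as pd

arr = [
    '19345360853|5264654|100530|2017-01-07',
    '19345360853|13518371|100530|2018-10-08',
]
columns = ['Id', 'Code_A', 'Code_B', 'Reg_Date']

# Step 1: split each string on '|' to get one list of fields per row.
rows = [item.split('|') for item in arr]

# Step 2: build the DataFrame; every column holds strings at this point.
df = pd.DataFrame(rows, columns=columns)

# Step 3: parse Reg_Date into a datetime column (assumes YYYY-MM-DD input).
df['Reg_Date'] = pd.to_datetime(df['Reg_Date'], format='%Y-%m-%d')

print(df.dtypes)
```

Id, Code_A and Code_B stay as strings because they are never converted. Note that pandas stores dates as datetime64 values; if plain Python date objects are needed, df['Reg_Date'].dt.date returns them, at the cost of an object-typed column.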

3 Comments

How to get the Reg_Date column as date?
@KeerikkattuChellappan Use astype; I have edited the answer.
Just to note, you can turn [a.split("|") for a in arr] into a generator expression so that the intermediate list does not take up as much space in memory.
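
A minimal sketch of the generator suggestion in the last comment, with illustrative names (rows, df_gen). As far as I know, pandas materializes a generator into a list internally when building the frame, so the saving is limited to not keeping a separate named copy of the split rows; treat any memory benefit as an assumption to measure rather than a guarantee.

```python
import pandas as pd

arr = [
    '19345360853|5264654|100530|2017-01-07',
    '19345360853|13518371|100530|2018-10-08',
]

# Parentheses instead of brackets: rows are produced lazily, one at a time.
rows = (item.split('|') for item in arr)

# The DataFrame constructor consumes the generator directly.
df_gen = pd.DataFrame(rows, columns=['Id', 'Code_A', 'Code_B', 'Reg_Date'])
print(df_gen)
```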
