I have the following caret-delimited (^) CSV file (it needs to stay in this format):

HEADER^20181130
[Col1]^[Col2]^[Col3]^[Col4]^[Col5]
The^quick^"bro,wn"^fox^jumped
over^the^fat^lazy^dog
m1213^4,12r4^fr,34^,56,gt^12fr,12fr
Trailer^N

and I need to read the file while preserving the order of the headers so that the output matches the following:

[image: desired output, not preserved]

However, when I try:

df = pd.read_csv(source_file, header=[0,1], sep=r"[| ^]", engine='python')

I get:

[image: resulting DataFrame, not preserved]

and if I try:

df = pd.read_csv(source_file, header=[1], sep=r"[| ^]",engine='python')

I just get:

[image: resulting DataFrame, not preserved]

Is there any way to import this file with both header rows? Bonus points if we can remove the opening and closing brackets from the header without removing them elsewhere in the file.

Note: I have sep=r"[| ^]" because the file could be delimited with pipes as well.
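As a quick check of what that pattern matches, here is a minimal sketch using only the standard re module. The sample string is made up for illustration and is not from the file. Inside a character class, a caret that is not the first character is literal, so the class matches a pipe, a space, or a caret; the space may or may not be intended.

```python
import re

# Hypothetical sample line mixing all three characters in the class.
sample = "a^b c|d"

# Pattern from the question: splits on pipe, space, or caret.
with_space = re.split(r"[| ^]", sample)

# Assumed alternative without the space: splits on pipe or caret only.
without_space = re.split(r"[|^]", sample)

print(with_space)     # ['a', 'b', 'c', 'd']
print(without_space)  # ['a', 'b c', 'd']
```

If fields can contain spaces, the second pattern is probably the safer choice.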

1 Answer

To keep both header rows, I suggest creating a pd.MultiIndex from the first two rows of your data.

To do that, first import the data without a header:

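import numpy as np
import pandas as pd

# Read every row as data; the regex separator requires the python engine
df = pd.read_csv('~/Desktop/stackoverflow_data.csv', sep=r"[| ^]", header=None, engine='python')
# Move the fields that were parsed as index levels back into regular columns
# (they show up as level_0, level_1, level_2 in the output below)
df.reset_index(inplace=True)
# Replace any missing values (e.g. None) with NaN
df.fillna(np.nan, inplace=True)
df.head()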

Output:

  level_0   level_1   level_2       0          1
0  HEADER  20181130       NaN     NaN        NaN
1  [Col1]    [Col2]    [Col3]  [Col4]     [Col5]
2     The     quick  "bro,wn"     fox     jumped
3    over       the       fat    lazy        dog
4   m1213    4,12r4     fr,34  ,56,gt  12fr,12fr

Then zip the first two rows into tuples (stripping the square brackets from the second row along the way) and create a MultiIndex object from them:

cols = tuple(zip(df.iloc[0], df.iloc[1].apply(lambda x: x[1:-1])))

header = pd.MultiIndex.from_tuples(cols, names=['Lvl_1', 'Lvl_2'])

# delete the header rows and assign new header
df.drop([0,1], inplace=True)
df.columns = header

df.head()

This is the output:

Lvl_1   HEADER 20181130       NaN                   
Lvl_2     Col1     Col2      Col3    Col4       Col5
2          The    quick  "bro,wn"     fox     jumped
3         over      the       fat    lazy        dog
4        m1213   4,12r4     fr,34  ,56,gt  12fr,12fr
5      Trailer        N       NaN     NaN        NaN
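
A variation on the same idea, as a minimal sketch rather than a drop-in replacement: parse the two header lines by hand and let read_csv skip them, so the result does not depend on how pandas handles the short first row. The in-memory string stands in for the file, and the SEP pattern (pipe or caret, no space) and the empty-string padding of the top level are assumptions of this sketch, not part of the code above.

```python
import io
import re

import pandas as pd

# Sample data copied from the question; a real file would be opened instead.
raw = """HEADER^20181130
[Col1]^[Col2]^[Col3]^[Col4]^[Col5]
The^quick^"bro,wn"^fox^jumped
over^the^fat^lazy^dog
m1213^4,12r4^fr,34^,56,gt^12fr,12fr
Trailer^N
"""

SEP = r"[|^]"  # assumption: fields are separated by a pipe or a caret only

lines = raw.splitlines()
# First header row: ['HEADER', '20181130']
top = re.split(SEP, lines[0])
# Second header row with the surrounding square brackets removed
second = [name.strip("[]") for name in re.split(SEP, lines[1])]
# Pad the shorter top row with empty strings (assumption; the code above ends up with NaN)
top += [""] * (len(second) - len(top))

# Skip the two header lines and read the remaining rows as plain data
df = pd.read_csv(io.StringIO(raw), sep=SEP, header=None, skiprows=2, engine="python")
df.columns = pd.MultiIndex.from_arrays([top, second], names=["Lvl_1", "Lvl_2"])

print(df)
```

Because the brackets are stripped only from the second header line, any brackets in the data rows are left untouched. The Trailer row is still read as data here and would need to be dropped separately if it is not wanted.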