I have a CSV file where the data is as follows:

    Col1    Col2    Col3
v1  5       9       5
v2  6       10      6
    Col1    Col2    Col3
x1  2       4       6
x2  1       2       10
x3  10      2       1
    Col1    Col2    Col3
y1  9       2       7

That is, three tables with the same headers are stacked on top of each other. I am trying to remove the repeated header rows in a Pythonic way and get the following result:

    Col1    Col2    Col3
v1  5       9       5
v2  6       10      6
x1  2       4       6
x2  1       2       10
x3  10      2       1
y1  9       2       7

I am not sure how to proceed.

2 Answers

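You can read the data and then drop the rows whose values are identical to the column names:

import pandas as pd

df = pd.read_csv('file.csv')

# keep only the rows where at least one value differs from the column names
df = df[df.ne(df.columns).any(axis=1)]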

Output:

   Col1 Col2 Col3
v1    5    9    5
v2    6   10    6
x1    2    4    6
x2    1    2   10
x3   10    2    1
y1    9    2    7
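
Because the repeated header rows are read in as data, the columns still hold strings after the filter. Here is a minimal sketch of converting them back to numbers afterwards. It assumes a comma-separated file whose header line starts with an empty field and whose row labels are read with index_col=0; the inline `raw` text is a made-up stand-in for file.csv, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv. The comma-separated layout with an
# empty leading header field is an assumption, not taken from the question.
raw = """,Col1,Col2,Col3
v1,5,9,5
v2,6,10,6
,Col1,Col2,Col3
x1,2,4,6
x2,1,2,10
x3,10,2,1
,Col1,Col2,Col3
y1,9,2,7
"""

# Read the row labels (v1, v2, ...) as the index.
df = pd.read_csv(io.StringIO(raw), index_col=0)

# Drop the rows whose values repeat the column names.
df = df[df.ne(df.columns).any(axis=1)]

# The repeated headers caused every column to be read as strings,
# so convert each column back to a numeric dtype.
df = df.apply(pd.to_numeric)
```

If only some columns are numeric, pd.to_numeric can be applied to just those columns instead of the whole frame.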
2 Comments

What does the 1 in any(1) refer to?
It is the axis argument: any(axis=1) checks along each row for a True value.
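
To make the row-wise check concrete, here is a small sketch on a made-up two-column frame (the data and the names `differs` and `keep` are illustrative only). It splits the one-liner into its two steps so the intermediate boolean mask is visible.

```python
import pandas as pd

# Small hypothetical frame in which the middle row repeats the column names.
df = pd.DataFrame(
    {"Col1": ["5", "Col1", "2"], "Col2": ["9", "Col2", "4"]},
    index=["v1", "dup", "x1"],
)

# True wherever a cell differs from the name of its own column.
differs = df.ne(df.columns)

# axis=1 reduces across the columns of each row: a row is kept
# if at least one of its cells differs from the header.
keep = differs.any(axis=1)

filtered = df[keep]
```

With the default axis=0 the reduction would instead run down each column and return one value per column, which cannot be used to select rows.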
An alternative solution is to detect the repeated header rows first, and then use the skiprows=... argument in read_csv().

This has the downside of reading the data twice, but has the advantage that it allows read_csv() to automatically parse the correct datatypes, and you won't have to cast them afterwards using astype().

This example uses a hard-coded column name to recognise the header rows, but a more advanced version could determine the header from the first row and then detect repeats of that.

import pandas as pd

# read the file once to detect the repeated header rows
header_rows = []
header_start = "Col1"
with open('file.csv') as f:
    for i, line in enumerate(f):
        if line.startswith(header_start):
            header_rows.append(i)

# the first (real) header row should always be detected
assert header_rows and header_rows[0] == 0

# skip all header rows except for the first one (the real one)
df = pd.read_csv('file.csv', skiprows=header_rows[1:])

Output:

   Col1 Col2 Col3
v1    5    9    5
v2    6   10    6
x1    2    4    6
x2    1    2   10
x3   10    2    1
y1    9    2    7
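
The "more advanced version" mentioned above could look roughly like this. It is only a sketch: the function name read_without_repeated_headers is made up, and it assumes that every repeated header is identical to the first line of the file and that no field contains an embedded line break.

```python
import pandas as pd


def read_without_repeated_headers(path, **read_csv_kwargs):
    """Hypothetical helper: read a CSV whose header line recurs in the body.

    Assumes each repeat matches the first line exactly (ignoring the
    trailing newline) and that no quoted field spans several lines.
    """
    with open(path) as f:
        # Line 0 is taken to be the real header.
        header = f.readline().rstrip("\n")
        # Collect the 0-based line numbers of later lines that repeat it.
        repeats = [
            i
            for i, line in enumerate(f, start=1)
            if line.rstrip("\n") == header
        ]
    # Skip only the repeats, so read_csv can infer the dtypes itself.
    return pd.read_csv(path, skiprows=repeats, **read_csv_kwargs)
```

Any extra keyword arguments are passed straight through to read_csv(), for example read_without_repeated_headers('file.csv', index_col=0).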
