0

I have a sheet that looks like this.

Fleet Risk Control Communication Interpersonal relationships Demographic Demographic
Q_21086 Q_21087 Q_21088 AGE GENDER
1 3 4 27 Male

What I'm trying to achieve is where there is a row with 'Q_' inside of it, merge that cell with the top row and return a new dataframe.

So the existing data above would become something like this:

Fleet Risk Control - Q_21086 Communication - Q_21087 Interpersonal relationships - Q_21088
1 3 4

I honestly have no idea where to even begin with something like this.

1
  • Can there only be 1 or 0 rows with "Q_"? Commented Dec 17, 2020 at 11:15

2 Answers 2

2

You could try this one. This is for input:

import pandas as pd

df = pd.DataFrame({'Fleet Risk Control': ['Q_21086', 1],
              'Communication': ['Q_21087', 3],
              'Interpersonal relationships': ['Q_21088', 4],
              'Demographic': ['AGE', 27],
              'Demographic 2': ['Gender', 'Male']})

Now concat the header line with the first line of df:

df.columns = df.columns + ' - ' + df.iloc[0, :]

Extract every line without the first and dropping the last both columns

df = df.iloc[1:, :-2]
Sign up to request clarification or add additional context in comments.

Comments

0
# rename columns
df.columns = [x + ' - ' + y if y.startswith('Q_') else x for x, y in zip(df.columns, df.iloc[0])]

#drop not matching columns
to_drop = [c for c, _ in df.iloc[0].apply(lambda x: not x.startswith('Q_')).items() if _]
df.drop(to_drop, axis=1)[1:]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.