How to manipulate column header strings in a DataFrame

Question

How can I remove the substring "test_" from column headers? Imagine the DataFrame has many columns, so df.rename(columns={"test_Stock B":"Stock B"}) is not the solution I am looking for.


import pandas as pd

data = {'Stock A':[1, 1, 1, 1],
           'test_Stock B':[3, 3, 4, 4],
           'Stock C':[4, 4, 3, 2],
           'test_Stock D':[2, 2, 2, 3],
           }

df = pd.DataFrame(data)

# expected result
expected_data = {'Stock A':[1, 1, 1, 1],
           'Stock B':[3, 3, 4, 4],
           'Stock C':[4, 4, 3, 2],
           'Stock D':[2, 2, 2, 3],
           }

df_expected = pd.DataFrame(expected_data)

I expect all column headers to be labeled "Stock x" instead of "test_Stock x". Thank you for the ideas!

Celius Stingher · Accepted Answer · 2022-03-23 13:49:17Z

3

You can redefine the columns via list comprehension with:

df.columns = [x.replace("test_","") for x in df]

This outputs:

   Stock A  Stock B  Stock C  Stock D
0        1        3        4        2
1        1        3        4        2
2        1        4        3        2
3        1        4        2        3
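
If you would rather not assign to df.columns in place, the same replacement can probably be passed to DataFrame.rename as a function instead of a dict, which avoids listing every column. This is a sketch of a variant, not the method given in the answer above; the name renamed is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Stock A': [1, 1, 1, 1],
    'test_Stock B': [3, 3, 4, 4],
    'Stock C': [4, 4, 3, 2],
    'test_Stock D': [2, 2, 2, 3],
})

# rename accepts a callable that is applied to every column label.
# It returns a new DataFrame and leaves df itself unchanged.
renamed = df.rename(columns=lambda name: name.replace('test_', ''))

print(list(renamed.columns))  # ['Stock A', 'Stock B', 'Stock C', 'Stock D']
```

This form chains well with other DataFrame methods, since nothing is mutated.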

Kawish Qayyum · Accepted Answer · 2022-03-23 13:51:07Z

1

You can clean your data before converting it to the dataframe using this code:

cleaned_data = {k.replace('test_', ''): v for k,v in data.items()}
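
For completeness, here is a minimal end-to-end sketch of that idea, assuming data is the original dictionary from the question (with the "test_" keys) and not the expected one.

```python
import pandas as pd

# The raw dictionary from the question, before any cleaning.
data = {
    'Stock A': [1, 1, 1, 1],
    'test_Stock B': [3, 3, 4, 4],
    'Stock C': [4, 4, 3, 2],
    'test_Stock D': [2, 2, 2, 3],
}

# Strip "test_" from every key, keeping the values untouched.
cleaned_data = {k.replace('test_', ''): v for k, v in data.items()}

df = pd.DataFrame(cleaned_data)
print(df.columns.tolist())  # ['Stock A', 'Stock B', 'Stock C', 'Stock D']
```

One caveat: if two keys become identical after cleaning (for example 'Stock B' and 'test_Stock B'), the later one silently overwrites the earlier one in the dictionary.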

jezrael · Accepted Answer · 2022-03-23 13:51:29Z

0

If you need to extract the "Stock x" part, use Series.str.extract, which the column index also exposes through .str:

# if an uppercase letter must follow "Stock" + whitespace
df.columns = df.columns.str.extract(r'(Stock\s+[A-Z])', expand=False)
# if anything may follow "Stock" + whitespace
#df.columns = df.columns.str.extract(r'(Stock\s+.*)', expand=False)
print (df)
   Stock A  Stock B  Stock C  Stock D
0        1        3        4        2
1        1        3        4        2
2        1        4        3        2
3        1        4        2        3

Or, if you only need to remove test_, use Series.str.replace:

df.columns = df.columns.str.replace('test_', '')

Ibrahim Halici · Accepted Answer · 2022-03-23 13:52:36Z

0

import pandas as pd

data = {'Stock A':[1, 1, 1, 1],
           'test_Stock B':[3, 3, 4, 4],
           'Stock C':[4, 4, 3, 2],
           'test_Stock D':[2, 2, 2, 3],
           }

df = pd.DataFrame(data)

df.columns = [x.replace('test_','') for x in df.columns]

Output of print(df):

   Stock A  Stock B  Stock C  Stock D
0        1        3        4        2
1        1        3        4        2
2        1        4        3        2
3        1        4        2        3

fokkerplanck · Accepted Answer · 2022-03-23 14:15:09Z

You can use a regular expression (see the Python documentation for the re module) to replace or remove the prefix "test_". The column headers can be treated either as a Python list or as a pandas Index with string methods. In either case, the substitution is applied to each column header in turn.

Option A

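pandas has a collection of string processing methods that you can access via the str attribute of a Series. The column headers are an Index, which exposes the same str accessor, so you can replace the desired pattern with:

# regex=True is needed in pandas 2.0+, where the pattern is otherwise treated as a literal string
df.columns = df.columns.str.replace(r'^test_', '', regex=True)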

Option B

The re module can be used to replace the desired pattern by calling re.sub on each column header in a list comprehension.

import re
df.columns = [re.sub(r'^test_', '', col) for col in df.columns]
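
The ^ anchor is what distinguishes both options from a plain substring replacement: it removes "test_" only at the start of a header. The sketch below illustrates the difference on a made-up header, 'Stock test_E', which does not appear in the question. regex=True is passed explicitly because pandas 2.0 and later treat the pattern as a literal string by default.

```python
import re
import pandas as pd

# 'Stock test_E' is a hypothetical header with "test_" in the middle.
cols = pd.Index(['test_Stock B', 'Stock test_E'])

# Plain substring replacement removes every occurrence of "test_".
plain = cols.str.replace('test_', '', regex=False)

# The anchored pattern removes "test_" only when it is a prefix.
anchored = cols.str.replace(r'^test_', '', regex=True)

# Same anchored substitution with the standard-library re module.
with_re = [re.sub(r'^test_', '', col) for col in cols]

print(list(plain))     # ['Stock B', 'Stock E']
print(list(anchored))  # ['Stock B', 'Stock test_E']
print(with_re)         # ['Stock B', 'Stock test_E']
```

Which behavior is right depends on the data: for the headers in the question, both give the same result.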
