Remove part of the column name of a dataframe using a regular expression in Python

Question

I have a dataframe "counts" and I would like to rename its second column using a regular expression, because I have multiple files whose column names carry this extra path information. Currently I have:

| GeneID |  /home/rmachado/Biotec/ARJNA231684/mapa_fin_starterar/SRR1212121_mapped.bamAligned.sortedByCoord.out.bam   |
| -------- | -------------- |
|  Ciclev10010164m.g.v1.0    | 2            |
|  Ciclev10007306m.g.v1.0    | 647            |
|  Ciclev10009318m.g.v1.0   | 39            |
|  Ciclev...   | ...           |
|  Ciclev10007306m.g.v1.0    | 112            |

I tried the following code without success:

for col in counts1:
  counts1.rename(columns={col:col.upper().replace("/home/rmachado/Biotec/ARJNA231684/mapa_fin_starterar/SRR1212121_mapped.bamAligned.sortedByCoord.out.bam","SRR[\d]{6}")},inplace=True)

How can I obtain a df with the following format?

| GeneID |  SRR1212121   |
| -------- | -------------- |
|  Ciclev10010164m.g.v1.0    | 2            |
|  Ciclev10007306m.g.v1.0    | 647            |
|  Ciclev10009318m.g.v1.0   | 39            |
|  Ciclev...   | ...           |
|  Ciclev10007306m.g.v1.0    | 112            |

Can you post the data as a dictionary?

– Himanshu Poddar, Commented Jul 26, 2022 at 15:56

mozway · Accepted Answer · 2022-07-26 16:02:04Z

3

You could try:

df.columns = df.columns.str.extract(r'((?<=/)SRR\d+|^[^/]+$)', expand=False)

regex:

(?<=/)SRR\d+  # match SRR + digits if preceded by "/"
^[^/]+$       # else match full string if it doesn't contain "/"
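As a minimal sketch of how this behaves, the snippet below rebuilds a two-row version of the table from the question (the frame is a stand-in for illustration, not the asker's real data) and applies the same pattern to its column names. With `expand=False` and a single capture group, `str.extract` returns an Index that can be assigned straight back to `columns`.

```python
import pandas as pd

# Stand-in for the asker's "counts" dataframe: only two rows, for illustration.
long_name = (
    "/home/rmachado/Biotec/ARJNA231684/mapa_fin_starterar/"
    "SRR1212121_mapped.bamAligned.sortedByCoord.out.bam"
)
counts = pd.DataFrame({
    "GeneID": ["Ciclev10010164m.g.v1.0", "Ciclev10007306m.g.v1.0"],
    long_name: [2, 647],
})

# One capture group with two alternatives:
#   (?<=/)SRR\d+  -> "SRR" plus digits, only when directly preceded by "/"
#   ^[^/]+$       -> otherwise the whole name, if it contains no "/"
pattern = r'((?<=/)SRR\d+|^[^/]+$)'

counts.columns = counts.columns.str.extract(pattern, expand=False)

print(list(counts.columns))  # ['GeneID', 'SRR1212121']
```

Note that a column name containing "/" but no "SRR" followed by digits matches neither alternative, so it would come back as NaN; it is worth checking the result before overwriting the columns of a real file.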



Rodrigo Machado Over a year ago

Thank you mozway! That's the answer that I was looking for :D
