Split a string based on a pattern in Python

Question

I am trying to strip a pattern from my string and keep only the part I want to store.

example                                return

2022_09_21_PTE_Vendor                  PTE
2022_09_21_SSS_01_Vendor               SSS_01
2022_09_21_OOS_market                  OOS

What I tried:

fileName = "2022_09_21_PTE_Vendor"
newFileName = fileName.strip(re.split('[0-9]','_Vendor.xlsx'))

Looking at your profile, I see you have never accepted an answer to any of your questions. Give it some time until a few answers have come in, try them, and reply to the users to say how it went; you can then accept one of them. You can also go back to your older questions and accept an answer wherever one applies. Cheers and happy learning.
– RavinderSingh13, Commented Oct 12, 2022 at 17:44

RavinderSingh13 · Accepted Answer · 2022-10-12 17:39:16Z

2

Using the sub function from Python's re module, please try the following code, written and tested in Python 3 with the first sample shown.

import re
fileName = "2022_09_21_PTE_Vendor"

print(re.sub(r'^\d{4}(?:_\d{2}){2}_(.*?)_.+$', r'\1', fileName))  # PTE

Explanation of the regex, piece by piece:

^\d{4}   ## From the start of the value, match 4 digits.
(?:      ## Open a non-capturing group.
_\d{2}   ## Match an underscore followed by 2 digits.
){2}     ## Close the non-capturing group and match it exactly 2 times.
_        ## Match a single underscore.
(.*?)    ## Capturing group with a lazy match: capture as little as possible, up to the next underscore.
_.+$     ## Match that underscore and everything after it, to the end of the value.
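As a quick check, the sketch below runs this pattern over all three sample names from the question. Because the capturing group is lazy, it stops at the first underscore after the date, so the second sample yields SSS rather than SSS_01. The greedy variant shown next to it is an assumption for comparison, not part of the answer above; it anchors on the last underscore instead.

```python
import re

samples = [
    "2022_09_21_PTE_Vendor",
    "2022_09_21_SSS_01_Vendor",
    "2022_09_21_OOS_market",
]

# Pattern from the answer: lazy group, stops at the first underscore after the date.
lazy = re.compile(r'^\d{4}(?:_\d{2}){2}_(.*?)_.+$')

# Assumed variant (not from the answer): greedy group, stops at the last underscore.
greedy = re.compile(r'^\d{4}(?:_\d{2}){2}_(.*)_[^_]+$')

lazy_results = [lazy.sub(r'\1', name) for name in samples]
greedy_results = [greedy.sub(r'\1', name) for name in samples]

print(lazy_results)    # ['PTE', 'SSS', 'OOS']
print(greedy_results)  # ['PTE', 'SSS_01', 'OOS']
```

Which variant is appropriate depends on whether the wanted part can itself contain underscores, as SSS_01 does in the question.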

edited Oct 12, 2022 at 17:39

answered Oct 12, 2022 at 17:33


Barmar · Accepted Answer · 2022-10-12 17:17:12Z

2

Use a regular expression replacement, not split.

newFileName = re.sub(r'^\d{4}_\d{2}_\d{2}_(.+)_[^_]+$', r'\1', fileName)

^\d{4}_\d{2}_\d{2}_ matches the date at the beginning. [^_]+$ matches the part after the last _. And (.+) captures everything between them, which is copied to the replacement with \1.
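One thing to keep in mind is that re.sub returns the input unchanged when the pattern does not match, so an unexpected file name passes through silently. A minimal sketch of the same pattern wrapped in a function that returns None in that case follows; the name extract_label is hypothetical and used only for illustration.

```python
import re

# Same pattern as above: date prefix, captured middle, last underscore-delimited part.
PATTERN = re.compile(r'^\d{4}_\d{2}_\d{2}_(.+)_[^_]+$')


def extract_label(file_name):
    """Return the text between the date and the last underscore, or None.

    extract_label is a hypothetical helper name used only for this sketch.
    """
    match = PATTERN.match(file_name)
    return match.group(1) if match else None


print(extract_label("2022_09_21_PTE_Vendor"))     # PTE
print(extract_label("2022_09_21_SSS_01_Vendor"))  # SSS_01
print(extract_label("2022_09_21_OOS_market"))     # OOS
print(extract_label("notes.txt"))                 # None
```

Returning None makes the "no match" case explicit, so the caller can decide whether to skip the file or raise an error.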

answered Oct 12, 2022 at 17:17


robbamyers · Accepted Answer · 2022-10-12 17:31:13Z

0

Assuming that the date characters at the beginning are always "YYYY_MM_DD" you could do something like this:

fileName = "2022_09_21_SSS_01_Vendor"
fileName = fileName.lstrip()[11:]  # Removes the date portion
fileName = fileName.rstrip()[:fileName.rfind('_')]  # Finds the last underscore and removes it and everything after it
print(fileName)
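If you would rather not count characters, the same idea can be sketched with split and rsplit. This variant is an assumption, not part of the answer above: it relies on the date always being three underscore-separated fields rather than on its exact width.

```python
file_name = "2022_09_21_SSS_01_Vendor"

# Split off the three date fields; maxsplit=3 leaves the remainder in one piece.
rest = file_name.split("_", 3)[3]   # 'SSS_01_Vendor'

# Cut at the last underscore only and keep what comes before it.
label = rest.rsplit("_", 1)[0]      # 'SSS_01'

print(label)
```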

edited Oct 12, 2022 at 17:31

answered Oct 12, 2022 at 17:30


1 Comment

AlexK Over a year ago

Inline comments need to be preceded with a #, not slashes.

rekin · Accepted Answer · 2022-10-12 17:19:55Z

-1

This should work:

newFileName = fileName[11:].rsplit("_", 1)[0]  # maxsplit=1 cuts only at the last underscore
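The maxsplit argument to rsplit decides how much of the name survives. Here is a small sketch of the difference, using the second sample from the question:

```python
name = "2022_09_21_SSS_01_Vendor"[11:]  # 'SSS_01_Vendor'

# No limit: split at every underscore.
all_parts = name.rsplit("_")      # ['SSS', '01', 'Vendor']

# Limit of 1: split only at the last underscore.
last_only = name.rsplit("_", 1)   # ['SSS_01', 'Vendor']

print(all_parts[0])   # SSS
print(last_only[0])   # SSS_01
```

Without a limit, index 0 keeps only SSS; with a limit of 1, the whole SSS_01 part is preserved.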

edited Oct 12, 2022 at 17:19

answered Oct 12, 2022 at 17:19

