Dynamically Removing string with regex python

Question

I am having trouble removing the end of strings. I tried .partition without success, and I am now trying regex, also without success. All the strings follow the format some random words **X*.* Some more words, where * is a digit and X is a literal X, for example 21X2.5. Everything after this dynamic string should be removed. I am trying to use re.sub('\d\d\X\d.\d', string). Can someone point me in the right direction with the regex and how to split the string?

The expected output should read: some random words 21X2.5

Thanks!

What is your expected output? Do you want to replace 21X2.5 with something else, or remove the end of the strings?
– Kedar, Commented Mar 19, 2015 at 3:35

Vinod Sharma · Accepted Answer · 2015-03-19 03:47:06Z

2

Use the following regex:

re.search(r"(.*?\d\dX\d\.\d)", "some random words 21X2.5 Some more words").groups()[0]

Output:

'some random words 21X2.5'

edited Mar 19, 2015 at 3:47

answered Mar 19, 2015 at 3:41


1 Comment

Janne Karila Over a year ago

Instead of groups()[0] you could use group(), and the parentheses are not needed in the pattern.
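A minimal sketch of that suggestion, with the parentheses dropped and group() used in place of groups()[0]. The helper name truncate_after_code and the fallback of returning the input unchanged when nothing matches are assumptions for illustration, not part of the answer above.

```python
import re

# Lazy prefix, then two digits, a literal X, one digit, a literal period, one digit.
PATTERN = re.compile(r".*?\d\dX\d\.\d")

def truncate_after_code(text):
    """Return text up to and including the first NNXN.N code.

    Hypothetical helper: if no code is found, the text is returned unchanged.
    """
    match = PATTERN.search(text)
    # group() with no arguments returns the whole match, so no capture group is needed.
    return match.group() if match else text

print(truncate_after_code("some random words 21X2.5 Some more words"))
```

Checking the match object before calling group() avoids an AttributeError on strings that do not contain the pattern, since re.search returns None in that case.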

Jason B · Accepted Answer · 2015-03-19 03:49:23Z

0

Your regex is not correct. The biggest problem is that you need to escape the period. Otherwise, the regex treats the period as a wildcard that matches any character. To match just that pattern, you can use something like:

re.findall(r'[\d]{2}X\d\.\d', 'asb12X4.4abc')

[\d]{2} matches a sequence of two digits, X matches the literal X, \d matches a single digit, \. matches the literal ., and \d matches the final digit.

This will match and return only 12X4.4.
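To see why the escape matters, here is a small sketch comparing the two patterns. The sample string is made up for illustration; it contains one well-formed code and one lookalike with a hyphen where the period should be.

```python
import re

unescaped = r"\d\dX\d.\d"   # the bare period matches any character except a newline
escaped = r"\d\dX\d\.\d"    # the escaped period matches only a literal "."

# Hypothetical sample: "12X4.4" is a real code, "34X5-6" is not.
sample = "asb12X4.4abc 34X5-6"

print(re.findall(unescaped, sample))  # also picks up the lookalike
print(re.findall(escaped, sample))    # only the real code
```

Separately from the pattern, note that re.sub takes three required arguments, re.sub(pattern, repl, string), so the call in the question is also missing the replacement.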

It sounds like you instead want to remove everything after the matched expression. To get your desired output, you can do something like:

re.split(r'(.*?[\d]{2}X\d\.\d)', 'some random words 21X2.5  Some more words')[1]

which will return some random words 21X2.5. Because the pattern is wrapped in a capture group, re.split includes the captured text in its result list. Index 1 is therefore everything up to and including the matched code, and the rest of the string is discarded.
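A quick sketch of what re.split produces here may make the [1] index less mysterious. The variable names are illustrative only.

```python
import re

text = "some random words 21X2.5  Some more words"

# With a capture group in the pattern, re.split keeps the captured text
# in the result list, between the pieces on either side of the match.
parts = re.split(r"(.*?\d{2}X\d\.\d)", text)
print(parts)

# parts[0] is the (empty) text before the match, parts[1] is the captured
# prefix ending in the code, parts[2] is the tail that gets thrown away.
kept = parts[1]
print(kept)
```

If the string contains no code at all, re.split returns a one-element list and indexing [1] raises IndexError, so this approach assumes every input matches.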

Let me know if this works.

edited Mar 19, 2015 at 3:49

answered Mar 19, 2015 at 3:38


1 Comment

beepboop Over a year ago

Thanks Jason! worked like a charm - I wasn't sure about the regex.

Alex Martelli · Accepted Answer · 2015-03-19 03:42:10Z

0

To remove everything after the pattern, i.e., do exactly as you say...:

s = re.sub(r'(\d\dX\d\.\d).*', r'\1', s)

Of course, if you mean something other than what you said, something different will be needed! E.g., if you want to also remove the pattern itself, not just (as you said) what's after it:

s = re.sub(r'\d\dX\d\.\d.*', r'', s)

and so forth, depending on what, exactly, are your specs!-)
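For comparison, a small sketch running both substitutions on the sample string from the question. The variable names keep_code and drop_code are made up for illustration.

```python
import re

s = "some random words 21X2.5 Some more words"

# Keep the code: capture it and put it back with \1, dropping only the tail.
keep_code = re.sub(r"(\d\dX\d\.\d).*", r"\1", s)

# Drop the code too: replace the code and everything after it with nothing.
drop_code = re.sub(r"\d\dX\d\.\d.*", "", s)

print(repr(keep_code))
print(repr(drop_code))  # note the trailing space left before the removed code
```

By default . does not match a newline, so on multi-line strings the .* stops at the end of the line containing the code; passing flags=re.DOTALL would make it consume the rest of the text.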

answered Mar 19, 2015 at 3:42

