2

I want to extract a number of exactly 5 digits from a string.

If I try

\d{5}

this works for "12345" or "a12345a", but it also matches "12345" in the string "123456" which I don't want.

I can try

\d{5}\D

but then the string "12345a" will be matched in "a12345a". Is there a way to get just the number?

4
  • The proper way to post the last bit is as an answer, not part of the question. Commented May 27, 2013 at 19:50
  • Did you test out your regex? It doesn't appear as though it'd match 12345d... Commented May 27, 2013 at 19:52
  • I don't have enough reputation! Commented May 27, 2013 at 19:56
  • ^D means "start of string, followed by the letter D". Commented May 27, 2013 at 20:12

3 Answers

2

To match a number of exactly five digits, even if it's surrounded by letters, use the regex

(?<!\d)\d{5}(?!\d)

This matches five digits (\d{5}) which are neither preceded ((?<!\d)) nor followed ((?!\d)) by a digit.
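As a rough sketch, here is how that pattern behaves in Python's `re` module on the three strings from the question. The helper name `find_five_digit_numbers` is made up for illustration, and other regex engines may differ in lookbehind support.

```python
import re

# Five digits that are neither preceded nor followed by another digit.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)")

def find_five_digit_numbers(text):
    """Return every run of exactly five digits found in text (hypothetical helper)."""
    return FIVE_DIGITS.findall(text)

print(find_five_digit_numbers("12345"))    # ['12345']
print(find_five_digit_numbers("a12345a"))  # ['12345']
print(find_five_digit_numbers("123456"))   # []
```

Because the lookarounds are zero-width, the match itself contains only the five digits, so no capture group is needed.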

Word boundaries (\b) won't work here because they wouldn't allow 12345 to match in a12345a.
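A small Python check of that point, assuming the `re` module's definition of `\b` (letters, digits and underscore all count as word characters, so there is no boundary between a letter and a digit):

```python
import re

# Same five digits, but delimited by word boundaries instead of lookarounds.
WORD_BOUNDED = re.compile(r"\b\d{5}\b")

print(WORD_BOUNDED.findall("a12345a"))    # []: no boundary between 'a' and '1'
print(WORD_BOUNDED.findall("x 12345 y"))  # ['12345']: spaces create boundaries
```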

See a demo on regex101.com.


4 Comments

What if I can't assert? Is it impossible?
@AlexChap: What do you mean? I thought it was your aim to assert that no letter or digit follows after the 5-digit match?
Oh, I thought you meant that it only works if I assert "mentally". Anyway, my whole point was that there might be a letter after the number, like 12345a, but it should get only 12345. The same goes for a12345a, a12345 and "12345". I hope you understand.
Then your question is very misleading. You wrote "get a 5 digit number which doesn't end with a letter". I hope I understood now what you meant. Editing my answer...
1

(\d{5})[^a-zA-Z] is the correct way.

(\d{5}) captures the 5 digits, and [^a-zA-Z] says the next character can't be a letter.

EDIT: For the sake of clarity: \b(\d{5})\b is used when you want 5-digit numbers that are surrounded by boundaries (tokens like ' , . " and of course the space).

5 Comments

The new version should do the trick (or was my other solution what you meant?)
Also, note that [^a-zA-Z] does not even begin to cover the definition of "letter" unless you're restricting the input to 7 bit ASCII.
Yours doesn't, because it still gives me a match in 123456, and I want only 5 digits!
If you look at the second part of the answer, it says that way is used when you need exactly 5 numbers. The first part is an exact answer to the question, so 5 numbers not followed by a letter. (But I know that in your case you wanted the second part ;) )
The first regex fails if the 5-digit number is at the end of the string. [^a-zA-Z] says that there must be a next character (which can't be a letter).
0
/\d{5}[^\d]/

This matches a sequence of five digits followed by a non-digit character.
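To see what "followed by a non-digit character" means in practice, here is a minimal Python sketch comparing this pattern with a lookaround version on a few sample strings. The helper `first_match` and the constant names are hypothetical, chosen only for this example.

```python
import re

CONSUMING = re.compile(r"\d{5}[^\d]")           # the pattern from this answer
LOOKAROUND = re.compile(r"(?<!\d)\d{5}(?!\d)")  # lookaround variant, for comparison

def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None

for sample in ("a12345a", "12345", "123456a"):
    print(sample, first_match(CONSUMING, sample), first_match(LOOKAROUND, sample))
# a12345a 12345a 12345
# 12345 None 12345
# 123456a 23456a None
```

The consuming pattern includes the trailing character in the match, finds nothing when the digits end the string, and can start in the middle of a longer digit run, since nothing checks the character before the five digits.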
