1

I'm trying to match a string with an optional part in the middle.

Example strings are:

20160131_abc.pdf
20160131_abc_xx.pdf
20160131_def.pdf

The result should include the name of the file, without the optional _xx.

What I have so far:

/[0-9]{8}_(abc(_xx)?|def)\.pdf/i

This mostly works, but for the second string the captured name is abc_xx, and I only want the abc part. Is there a way to keep the optional subgroup out of the capture?

3
  • What is the problem if you simply replace "_xx" without a regex? And then use split. Commented Apr 14, 2016 at 9:38
  • Exactly my thought. Can't my_string = my_string.replace("_xx", ""); do the job? Commented Apr 14, 2016 at 9:41
  • I found a way, please check: [0-9]{8}_(abc|def(?=\.pdf))(?:_xx)?\.pdf Commented Apr 14, 2016 at 9:42

2 Answers

1

You can restrict the def alternative with a (?=\.pdf) lookahead, which requires .pdf to follow def immediately, and move the optional group (?:_xx)? out of the capture group so that it sits just before \.pdf:

[0-9]{8}_(abc|def(?=\.pdf))(?:_xx)?\.pdf

See the regex demo

Explanation:

  • [0-9]{8} - 8 digits
  • _ - underscore
  • (abc|def(?=\.pdf)) - Capture group 1 matching abc or def (def is only matched if .pdf follows it immediately)
  • (?:_xx)? - optional, non-capturing _xx part; it is included in the overall match but not in capture group 1, and because of the lookahead on def it can only appear after abc
  • \.pdf - literal .pdf substring
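
As a quick check of the explanation above, here is a minimal sketch that runs the pattern against the three strings from the question. The helper name extractName and the fourth sample (20160131_def_xx.pdf) are illustrative additions, and the i flag is carried over from the question's original pattern.

```javascript
// Pattern from the answer; the lookahead stops "_xx" from following "def".
const re = /[0-9]{8}_(abc|def(?=\.pdf))(?:_xx)?\.pdf/i;

// Hypothetical helper: returns capture group 1, or null when nothing matches.
function extractName(fileName) {
  const match = re.exec(fileName);
  return match ? match[1] : null;
}

const samples = [
  '20160131_abc.pdf',
  '20160131_abc_xx.pdf',
  '20160131_def.pdf',
  '20160131_def_xx.pdf', // not in the question; added to show the rejection
];

const results = samples.map(extractName);
console.log(results); // [ 'abc', 'abc', 'def', null ]
```

Capture group 1 never contains _xx, and the def_xx variant is rejected outright because the lookahead fails before the optional group is tried.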

2 Comments

Note that only checking the dot in the lookahead suffices.
Yes, it is possible to omit pdf in the lookahead. [0-9]{8}_(abc|def(?=\.))(?:_xx)?\.pdf is enough, but the gain is tiny.
0

You can use non-capturing groups in the regex and then "implode" the match results:

// Groups 1-3 capture the parts to keep; (?:_xx)? is matched but not captured.
var re = /([0-9]{8}_)(abc|def)(?:_xx)?(\.pdf)/;
var tests = [
  '20160131_abc.pdf',
  '20160131_abc_xx.pdf',
  '20160131_def.pdf'
];
var container = document.getElementById('container');
tests.forEach(function (test) {
  var match = test.match(re);
  if (!match) { return; } // skip strings that do not match
  // Drop match[0] (the full match) and join the captured groups.
  var fileName = match.slice(1).join('');
  container.innerHTML += "test:" + test + " → " + fileName + "<br/>";
});

See fiddle

1 Comment

This also matches 20160131_def_xx.pdf, which it shouldn't.
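
One possible way to address that, sketched here as an assumption rather than as the answerer's own fix, is to combine the "join the captured groups" idea with the (?=\.pdf) lookahead from the other answer. The names strictRe and normalizedFileName are hypothetical.

```javascript
// Same three capture groups as the answer above, plus the (?=\.pdf) lookahead
// so that "def" cannot be followed by "_xx".
const strictRe = /([0-9]{8}_)(abc|def(?=\.pdf))(?:_xx)?(\.pdf)/;

// Hypothetical helper: joins the captured pieces, or returns null on no match.
function normalizedFileName(input) {
  const match = input.match(strictRe);
  return match ? match.slice(1).join('') : null;
}

const normalized = [
  '20160131_abc.pdf',
  '20160131_abc_xx.pdf',
  '20160131_def.pdf',
  '20160131_def_xx.pdf',
].map(normalizedFileName);

console.log(normalized);
// [ '20160131_abc.pdf', '20160131_abc.pdf', '20160131_def.pdf', null ]
```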
