I am trying to figure this out, and if there is another method aside from regex, I am open to it.

I need to split strings that follow a pattern similar to the following:

  • One has spaces between the dash and the other does not.
  • Sometimes there may be 3 periods and sometimes 4.
  • Between the periods will always be numbers which may vary such as 1.111.1

    1. 1.1.1-50
    2. 1.1.1 - 50
    3. 1.1.1- 50
    4. 1.1.1 -50

The above should output to:

  • string1: 1.1.
  • string2: 1
  • string3: 50

I can't figure out how to select just the number between the last period and the dash, select the number after the dash, and ignore any whitespace.

Update: Complete and Working Code

Utilized the information provided by hakre and Niels and created the following code:

I'm not sure if my code is optimized, but this is basically what I needed to accomplish.

<form action="" method="post">
    <p>
        <strong>Records Range:</strong> <input type="text" name="records_range" size="30" maxlength="22" />
        <br />
        <strong>Internal ID:</strong> <input type="text" name="internal_id"  size="40" />
        <select name="id_options">
            <option value="default_internal_id">Default Internal ID</option>
            <option value="new_internal_id">New Internal ID</option>
        </select>
        <br />
        <input type="submit" value="Generate" />
    </p>
</form>

<?php 

    $id_options = NULL;                         
    if (isset($_POST['records_range'])) {   
        $id_options = $_POST['id_options'];
        $internal_id = strip_tags(trim(($_POST['internal_id'])));
        $records_range = strip_tags(trim($_POST['records_range']));
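        // Only continue when the range matches the expected pattern.
        if (preg_match('~^((?:\d+\.){2,3})(\d+)\s?-\s?(\d+)$~', $records_range, $record_segments)) {
            $range_prefix = $record_segments[1];
            $range_start = $record_segments[2];
            $range_end = $record_segments[3];
            echo "<p><strong>Record Data Generated For:</strong> ".$range_prefix.$range_start." - ".$range_end."</p>";
        } else {
            // No match: fall through to the example text in the default branch below.
            $id_options = NULL;
        }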
    }


    switch ($id_options){
        case 'default_internal_id':         
        echo "<textarea cols=\"65\" rows=\"10\">";

        // start output
        while($range_start <= $range_end){

            if($range_start < $range_end){
                echo "EUI-ZQ50-N-".$range_prefix.$range_start."\n";
            }

            else{
                echo "EUI-ZQ50-N-".$range_prefix.$range_start;
            }

            $range_start++;
        }
        echo "</textarea>";
        break;  

        case 'new_internal_id':         
        echo "<textarea cols=\"65\" rows=\"10\">";

        // start output
        while($range_start <= $range_end){

            if($range_start < $range_end){
                echo $internal_id." ".$range_prefix.$range_start."\n";
            }

            else{
                echo $internal_id." ".$range_prefix.$range_start;
            }

            $range_start++;
        }
        echo "</textarea>";
        break;
        default:
         echo "<h4>Example:</h4>";
         echo "<p><strong>Records Range</strong>: 1.22.333.444-500 = 1.22.333.444 <strong>THROUGH</strong> 500</p>";
    }

?>      
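
The two while loops above differ only in the text placed before each record number, so the generation step can be sketched as one small function. This is only a sketch: `build_record_lines` and its parameters are hypothetical names that are not part of the code above, and it assumes the start and end values have already been extracted and are whole numbers.

```php
<?php

// Hypothetical helper (not part of the original code): returns one output
// line per record number from $start through $end, inclusive.
function build_record_lines(string $line_prefix, string $range_prefix, int $start, int $end): array
{
    $lines = [];
    for ($n = $start; $n <= $end; $n++) {
        // e.g. "EUI-ZQ50-N-" . "1.1." . 1
        $lines[] = $line_prefix . $range_prefix . $n;
    }
    return $lines;
}

// implode() puts "\n" between lines only, so no special case is needed
// to avoid a trailing newline after the last record.
echo implode("\n", build_record_lines('EUI-ZQ50-N-', '1.1.', 1, 3));
```

For the "New Internal ID" option, the first argument would be the internal ID followed by a space instead of the fixed `EUI-ZQ50-N-` text.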
  • You haven't shown any code, so the expected output is relative; one could just hard-code it. Don't get me wrong, the description is verbose; it's just missing the example code that shows how far you've come or the context where this is to be used. E.g., do you need an array of integers? Is the input already in a string, or is it an array as well? These things are easily shown with a few lines of code, which will improve all the answers you get here. Commented May 7, 2013 at 8:39
  • If you are asking how to write a regex for Notepad++ instead, you should say so. That might explain why you don't have any code example. If so, Notepad++ uses a Perl-like regular expression flavor, as PHP does, but with a different library. Regular expressions given in the answers might therefore not work for you, because it's not clear from your question what you are asking about. Commented May 7, 2013 at 9:17
  • My apologies I didn't know that php and notepad++ processed the regex differently and my apologies for the lack of details. My post now shows what I was trying to accomplish. Commented May 10, 2013 at 9:04
  • I meant Notepad++ regex in search and replace inside that editor. If you have PHP code in the editor and you execute it, that is a different matter. I was just wondering because you had not added any code, but your update has made it clear that this is about PHP. Commented May 10, 2013 at 11:47

2 Answers


Create three groups that match your parts.

The first group explicitly says that the digits+dot part needs to repeat two or three times:

  ~^((?:\d+\.){2,3})(\d+)\s?-\s?(\d+)$~
    `-------1------´`-2-´       `-3-´
            ^         ^           `--- end number
            |         | 
            |    middle number
            |       
first two/three incl. the dot

Everything outside the three groups, like the spaces and the dash, is matched but not captured, which could also be described as "ignored".

I hope this is helpful and shows a bit of how it works.
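
To make the three groups easier to use afterwards, here is a minimal sketch of the same pattern with named groups. The function name `split_record_range` and the group names `prefix`, `start` and `end` are assumptions chosen for illustration; they do not come from the question.

```php
<?php

// Hypothetical helper: returns the three parts as an array,
// or null when the input does not match the pattern.
function split_record_range(string $input): ?array
{
    // Same pattern as above, with (?<name>...) naming each capturing group.
    $pattern = '~^(?<prefix>(?:\d+\.){2,3})(?<start>\d+)\s?-\s?(?<end>\d+)$~';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $input, $m) !== 1) {
        return null;
    }

    return [
        'prefix' => $m['prefix'],      // e.g. "1.1."
        'start'  => (int) $m['start'], // e.g. 1
        'end'    => (int) $m['end'],   // e.g. 50
    ];
}

var_export(split_record_range('1.1.1 - 50'));
```

Returning null on a failed match lets the calling code decide what to show for invalid input instead of reading array offsets that were never set.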

Interactive Code-Example:

<?php

$strings = [
    '1.1.1-50',
    '1.1.1 - 50',
    '1.1.1- 50',
    '1.1.1 -50',
    '1.1.1.1 -50',
];

foreach($strings as $subject)
{
    $pattern = '~^((?:\d+\.){2,3})(\d+)\s?-\s?(\d+)$~';
    $result  = preg_match($pattern, $subject, $matches);

    printf("%s -> %s\n", var_export($subject, 1), var_export($matches, 1));
}
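
Since the question is also open to methods other than regex, the same split can be sketched with plain string functions. This is only an illustration under stated assumptions: `split_range_without_regex` is a hypothetical name, the input is assumed to contain exactly one dash, and unlike the pattern above it does not check that the prefix consists only of digits and dots.

```php
<?php

// Hypothetical helper: splits "1.1.1 - 50" into prefix, start and end
// without a regular expression. Returns null when the shape is wrong.
function split_range_without_regex(string $input): ?array
{
    // Split on the dash; anything other than two parts is rejected.
    $parts = explode('-', $input);
    if (count($parts) !== 2) {
        return null;
    }

    // trim() removes the optional spaces around the dash.
    $left = trim($parts[0]);
    $end  = trim($parts[1]);

    // The last dot separates the prefix from the start number.
    $last_dot = strrpos($left, '.');
    if ($last_dot === false) {
        return null;
    }
    $prefix = substr($left, 0, $last_dot + 1); // keeps the trailing dot
    $start  = substr($left, $last_dot + 1);

    // Both numbers must be non-empty and consist of digits only.
    if (!ctype_digit($start) || !ctype_digit($end)) {
        return null;
    }

    return [$prefix, $start, $end];
}

var_export(split_range_without_regex('1.1.1 -50'));
```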
7 Comments

You're kind of overcomplicating it with the lookahead - the use case just describes match 1 as 'everything up to the last number', which could include more than 2 periods, which you can use with greediness as I did. But your solution is incorrect since it requires 2 periods.
@Niels: There is no lookahead in the answer (just a non-capturing group, if you mean the (?:[...])). But you are right that OP wrote that there are sometimes 4 periods, so I need to change the answer to 2,3 for the first part. (I guess this is what OP meant; I normally read periods as dots, but that might just be my bad English.)
What goes for your English goes for my regex terminology at times ;) Dot and period are synonyms though in nearly all cases.
Yes, I thought I was not too far off with what I wrote, but languages are learned by using them so it's good to admit mistakes and talk/write about it :)
Mine is more flexible in the dot count (also works for 1.1.1.1.1.1.1 or 1.1), and I suspect a tad faster because I omit the group reference. Don't know if that'll be really noticeable though. Biggest optical difference is that @hakre uses \d (any digit character) while I prefer to use the more verbose but equivalent [0-9] notation, but they're internally treated identically by the regexp engine, that's just different programming styles.

Off the top of my head, this should work in preg_match:

/^([0-9\\.]+?)([0-9]+) ?\- ?([0-9]+)$/
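
As a quick way to check the pattern above in PHP itself rather than in an editor, here is a minimal sketch that runs it against a few sample strings. The sample list and the helper name `match_range` are assumptions added for illustration.

```php
<?php

// Hypothetical helper: returns the three captured parts, or null on no match.
function match_range(string $subject): ?array
{
    // Inside a single-quoted PHP string, \\ becomes one backslash,
    // so the character class the regex engine sees is [0-9\.].
    $pattern = '/^([0-9\\.]+?)([0-9]+) ?\- ?([0-9]+)$/';

    if (preg_match($pattern, $subject, $m) !== 1) {
        return null;
    }
    // Drop $m[0] (the full match) and keep only the three groups.
    return array_slice($m, 1);
}

foreach (['1.1.1-50', '1.111.1 - 1', '1.111.111- 1'] as $subject) {
    printf("%s -> %s\n", $subject, json_encode(match_range($subject)));
}
```

Because the first group is lazy and the whole pattern is anchored, the engine lets the first group grow only until the rest of the pattern can match, which leaves the digits after the last dot for the second group.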

5 Comments

Thanks for such a quick response. I tried that expression within Notepad++, and it selects all the 1's in 1.1.1-1, as long as they are single digits. For 1.111.1-1 or 1.111.111-1, it does not select the 111, just the 1's. Very interesting though, as I understand less than I thought. :)
Should've specified that in the cases ;) Use /^([0-9\\.]+?)([0-9]+?) ?\- ?([0-9]+)$/ instead, just tested it and catches these cases too (toggled greediness on the first match).
@Damainman: Take care that Notepad++ might handle regular expressions differently than PHP. Just saying, if you ask for a PHP regex here, try it in PHP code.
Verified my solution with an online preg_match validator and works fine.
My apologies, I didn't know that Notepad++ and PHP processed the regex differently. I am new to this. Thanks for the preg_match validator link, Niels! I +1'd your post.
