I'm trying to write a regular expression that extracts the strings containing a number greater than 125 from a given input string.

var input = "YH300s, H900H, 234, 90.5, +12D, 48E, R180S, 190A, 350A, J380S";

Please view the link for further reference to my script/data example.

DonotFiddle_Regex example

Here is my current expression attempt (*):

Regex.Matches(input, @"(?!.*\..*)[^\s\,]*([2-5][\d]{2,})[^\s\,]*")

From the above expression, the only output is 350A, J380S.

However I would like to extract the following output from the input string (see link above for further reference):

YH300s, H900H, R180S, 190A, 350A, J380S

Any guidance on where I may be going wrong would be very much appreciated. Apologies in advance if my context is not clear, as I am still a novice at writing regular expressions.

  • I've added some of the essential information for the question into the question itself. While providing an example on another site is great, it's also good to put enough information into the actual question so that it doesn't depend on an external site to be helpful. Commented May 28, 2015 at 13:13
  • Have a look here: regular-expressions.info/numericranges.html Commented May 28, 2015 at 13:14
  • Does it have to be solved 100% with regular expressions? Commented May 28, 2015 at 13:15
  • @Ryan, not necessarily. I am open to other suggestions. Thank you for your reply. Commented May 28, 2015 at 13:16

4 Answers

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        // an example of input
        var input = "YH300s, H900H, 234, 90.5, +12D, 48E, R180S, 190A, 350A, J380S";

        var parts = input.Split(new[]{", "}, StringSplitOptions.RemoveEmptyEntries);
        // regex for numbers (including negative and floating-point);
        // \d*\.?\d+ keeps "90.5" as one number instead of splitting it into "90" and ".5"
        var regex = new Regex(@"-?\d*\.?\d+");

        foreach(var part in parts)
        {
            // there can be many matches, e.g. "A100B1111" => "100" and "1111"            
            foreach(Match m in regex.Matches(part))
            {
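                // parse with the invariant culture so "." is always the decimal separator
                if (double.Parse(m.Value, System.Globalization.CultureInfo.InvariantCulture) > 125)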
                {
                    Console.WriteLine(part);
                    break;
                }
            }                   
        }           
    }
}

output

YH300s
H900H
234
R180S
190A
350A
J380S

14 Comments

Dangit that's what I was gonna say.
That's utterly genius. Thank you very much for your time and help.
@user3070072, see the update. I added + to the regex; it is important in order to get the full number.
Thank you for the update. I can use your answer as a guide if I need to add anything further. This is a great help and saves so much time. Thanks a million for the update.
This solution has 1 major flaw: it will treat negative numbers as positive ones. My solution takes care of them. Try with -1234F, it will be output with this code.

You can use the following regex (for numbers greater than 125) if you don't want to process the matches:

(?!.*\..*)[^\s\,]*(12[6-9]|1[3-9]\d|[2-9]\d{2}|\d{4,})[^\s\,]*
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Code:

var input = "YH300s, H900H, 234, 90.5, +12D, 48E, R180S, 190A, 350A, J380S";

// 12[6-9] rather than 12[5-9], so that 125 itself is not treated as greater than 125
foreach (Match match in Regex.Matches(input, @"(?!.*\..*)[^\s\,]*(12[6-9]|1[3-9]\d|[2-9]\d{2}|\d{4,})[^\s\,]*"))
{
    Console.WriteLine(match.Value);
}

See Demo on Fiddle
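
One thing to watch in the pattern above: the lookahead `(?!.*\..*)` is not limited to the current token. It fails at every position that has a dot anywhere later in the input, so tokens that come before `90.5` (here `YH300s`, `H900H` and `234`) cannot match. Below is a minimal sketch of a token-scoped variant. The class and method names are hypothetical, and it makes three assumptions that are not in the original answer: a token starts right after whitespace or a comma, 125 itself is excluded (`12[6-9]`), and four-or-more-digit runs must not start with a zero. Signs are ignored, so `-300` would still be treated as 300.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenRegexSketch
{
    // Hypothetical pattern, not the original answer's:
    // (?<![^\s,])    the match must begin at the start of a token
    // (?![^\s,]*\.)  reject the token only if the token itself contains a dot
    // alternation    126-129, 130-199, 200-999, or 4+ digits with no leading zero
    private static readonly Regex TokenPattern = new Regex(
        @"(?<![^\s,])(?![^\s,]*\.)[^\s,]*(?:12[6-9]|1[3-9]\d|[2-9]\d{2}|[1-9]\d{3,})[^\s,]*");

    // Returns every token that contains a number greater than 125.
    public static string[] Extract(string input)
    {
        return TokenPattern.Matches(input)
                           .Cast<Match>()
                           .Select(m => m.Value)
                           .ToArray();
    }

    public static void Main()
    {
        var input = "YH300s, H900H, 234, 90.5, +12D, 48E, R180S, 190A, 350A, J380S";
        // prints: YH300s, H900H, 234, R180S, 190A, 350A, J380S
        Console.WriteLine(string.Join(", ", Extract(input)));
    }
}
```

Note that `234` is still returned. If purely numeric tokens should be skipped, as the desired output in the question suggests, that needs an extra condition.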


You can make it even shorter with LINQ and a regex that would take care of double values (with - and +), even on OSes with a comma set as decimal separator:

var input = "YH300s, H900H, 234, 90.5, +12D, 48E, R180S, 190A, 350A, J380S";
var results = input.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                   .Where(p => p.Any(n => Char.IsDigit(n)) &&
                               double.Parse(Regex.Match(p, @"[-+]?\d+(?:\.\d+)?").Value,
                                            System.Globalization.CultureInfo.GetCultureInfo("en-us")) > 125)
                   .ToList();

The p.Any(n => Char.IsDigit(n)) part checks if we have any digits inside, then we match numbers with [-+]?\d+(?:\.\d+)? regex, and parse them as double values for further comparison.


In your regex you wrote that the numbers should start with a digit between 2 and 5, but your desired result includes H900H, R180S and 190A, whose numbers don't begin with one of those digits. So your pattern does not match the result you actually want.

If you want to match strings with starting digits between 2-5 maybe try: @"\w*[2-5]\d{2,}\w*"
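
To make the range point concrete, here is a rough sketch that checks a digit pattern against the plain comparison `n > 125` for every integer up to a limit. The class name, the `Widened` pattern and the `CountMismatches` helper are hypothetical and not taken from any answer here; both patterns are anchored with `^` and `$` so that they must describe the whole number.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RangeCheckSketch
{
    // The question's digit pattern, anchored to the whole number.
    public static readonly Regex Original = new Regex(@"^[2-5]\d{2,}$");

    // Hypothetical widened pattern: 126-129, 130-199, 200-999,
    // or four or more digits without a leading zero.
    public static readonly Regex Widened =
        new Regex(@"^(?:12[6-9]|1[3-9]\d|[2-9]\d{2}|[1-9]\d{3,})$");

    // Counts the integers in [0, limit] for which the pattern's verdict
    // differs from the numeric comparison n > 125.
    public static int CountMismatches(Regex pattern, int limit)
    {
        int mismatches = 0;
        for (int n = 0; n <= limit; n++)
        {
            bool matched = pattern.IsMatch(n.ToString(CultureInfo.InvariantCulture));
            if (matched != (n > 125))
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine(Original.IsMatch("180"));           // False: 1 is not in [2-5]
        Console.WriteLine(Original.IsMatch("900"));           // False: 9 is not in [2-5]
        Console.WriteLine(CountMismatches(Widened, 100000));  // 0
    }
}
```

A check like this only covers non-negative integers written without leading zeros; signs, decimals and surrounding letters still have to be handled by the rest of the expression.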
