
I'm receiving my data as a string over a serial port. That part is working fine. The data is in the following format:

Distance run: 36in.
Direction in degrees: 275

Total of person founds:11
New Person found in:
Lat/Long: 18.38891, -66.12174
Date: 5/4/2013  Time: 19:13:35.0

Total of person founds:12
New Person found in:
Lat/Long: 18.38891, -66.12175
Date: 5/4/2013  Time: 19:13:37.0


Distance run: 15in.
Direction in degrees: 215

Total of person founds:13
New Person found in:
Lat/Long: 18.38891, -66.12174
Date: 5/4/2013  Time: 19:13:39.0


Distance run: 30in.
Direction in degrees: 180

But this can vary a little, because between each block of persons found (which includes its position in lat/long, date and time) there can be another distance run and direction in degrees.

I've tried Regex, but I'm not sure how to use it well. I do have a regular expression that extracts only the numbers:

var xnumbers = Regex.Split(strFileName, @"[^0-9\.\-]+").Where(c => c != "." && c.Trim() != "");

What I want is to extract each specific value. For example, take the distance run, whose first value is 36in, and store it, and so on. Then get the direction in degrees and store it in another variable, and finally get the lat & long and store them in another variable. I need these values to create a List so I can later use that data and plot it. I already have the drawing part.

I tried this:

I know this pattern only allows for a distance of 2 digits, but that value can have 1 to 3 digits (example: Distance Run: 1in or Distance run: 219 in)

string pattern = @"^Distance Run:(?<distance>.{2}),Direction in degrees:,(?<degress>. {3}),Lat/Long:(\+|\-)(?<latlong>.{18})$";

string distance = string.Empty;
string degrees = string.Empty;
string latlong = string.Empty;


Regex regex = new Regex(pattern);

if (regex.IsMatch(strFileName)) // strFileName is the string with the data
{
    Match match = regex.Match(strFileName);
    foreach (Capture capture in match.Groups["distance"].Captures)
        distance = capture.Value;
    foreach (Capture capture in match.Groups["degree"].Captures)
        degrees = capture.Value;
    foreach (Capture capture in match.Groups["Lat/Long"].Captures)
        latlong = capture.Value;
}

But it is not working. I would appreciate any help and advice. Thanks in advance.

  • I believe you are pushing beyond the casual use of regex. Have you considered rolling your own parser instead? The grammar is simple enough, and simple parsing is a skill that should be in every developer's toolkit. Commented May 5, 2013 at 20:27
  • Can you please give some example code that does these operations? Thanks in advance. Commented May 6, 2013 at 2:49
  • With apologies to Fermat: I have found a perfectly elegant means of presenting that to you; unfortunately the margin of this comment is too small to contain it. Commented May 6, 2013 at 3:04
  • Can you send it through e-mail? Commented May 6, 2013 at 3:25
  • No; I am not your personal code-writing service. My last comment was a joke and I apologize if you did not get it. (Google Fermat's Last Theorem if you are interested.) I submitted a comment to try and steer you in a useful direction. Do with that what you will. There are a plethora of open source parsing tools and tutorials on the web. Commented May 6, 2013 at 3:29

2 Answers


You can define a variable repetition by passing two values within {}, e.g. \d{1,3} matches 1 to 3 digits.

But overall, I'd probably use fewer regular expressions for this, given that the format is rather easy to parse:
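As a minimal sketch of that quantifier applied to the question's distance lines (the names DistanceExample, DistancePattern and ParseDistance are illustrative and not part of the original code):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DistanceExample
{
    // \d{1,3} accepts one to three digits; \s* tolerates "219 in" as well as "36in".
    private static readonly Regex DistancePattern =
        new Regex(@"Distance run:\s*(?<distance>\d{1,3})\s*in", RegexOptions.IgnoreCase);

    // Returns the distance in inches, or null when the line is not a distance line.
    public static int? ParseDistance(string line)
    {
        Match match = DistancePattern.Match(line);
        return match.Success ? int.Parse(match.Groups["distance"].Value) : (int?)null;
    }

    public static void Main()
    {
        Console.WriteLine(ParseDistance("Distance run: 36in."));
        Console.WriteLine(ParseDistance("Distance Run: 1in"));
        Console.WriteLine(ParseDistance("Distance run: 219 in"));
    }
}
```

RegexOptions.IgnoreCase is used here because the question writes both "Distance run" and "Distance Run".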

  • First of all I'd loop through all lines as an array of strings.
  • If one line is empty, ignore it; if two consecutive lines are empty, a new dataset is created.
  • If the line includes a :, split it at the first one (the time value contains more colons): the left part becomes the key, the right part the value.
  • If the line doesn't include a : it's either ignored or an error flag raised (or exception thrown or whatever).
  • You can then use a simple string match (faster than a regular expression) to determine the meaning of the "key".
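  • Another regular expression (or just something like Double.Parse()) then extracts the value from the right. Keep in mind that Double.Parse() throws a FormatException when the text contains anything that is not part of the number (such as the unit in "36in."), so strip those characters first or use Double.TryParse() and check its result.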
  • The more complex entries (like coordinates or time stamp; or the entries where the unit is important) can then be parsed using a simple regular expression.
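For the more complex entries, a rough sketch could look like the following. The class and method names are hypothetical, the inputs are the value parts left after splitting at the first colon, and the date is assumed to be month/day/year with exactly one fractional digit of seconds, which the sample data does not confirm.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ValueParsers
{
    // Two signed decimal numbers separated by a comma, e.g. "18.38891, -66.12174".
    private static readonly Regex LatLongPattern =
        new Regex(@"^\s*(?<lat>-?\d+(?:\.\d+)?)\s*,\s*(?<lon>-?\d+(?:\.\d+)?)\s*$");

    // Value of the "Date" line, e.g. "5/4/2013  Time: 19:13:35.0".
    private static readonly Regex DateTimePattern =
        new Regex(@"^\s*(?<date>\d{1,2}/\d{1,2}/\d{4})\s+Time:\s*(?<time>\d{1,2}:\d{2}:\d{2}\.\d)\s*$");

    public static bool TryParseLatLong(string value, out double lat, out double lon)
    {
        lat = 0;
        lon = 0;
        Match m = LatLongPattern.Match(value);
        if (!m.Success) return false;
        // InvariantCulture keeps '.' as the decimal separator on every machine.
        lat = double.Parse(m.Groups["lat"].Value, CultureInfo.InvariantCulture);
        lon = double.Parse(m.Groups["lon"].Value, CultureInfo.InvariantCulture);
        return true;
    }

    public static bool TryParseTimestamp(string value, out DateTime timestamp)
    {
        timestamp = default(DateTime);
        Match m = DateTimePattern.Match(value);
        if (!m.Success) return false;
        string combined = m.Groups["date"].Value + " " + m.Groups["time"].Value;
        // Assumption: month comes first ("M/d/yyyy").
        return DateTime.TryParseExact(combined, "M/d/yyyy H:mm:ss.f",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out timestamp);
    }

    public static void Main()
    {
        double lat, lon;
        DateTime when;
        if (TryParseLatLong(" 18.38891, -66.12174", out lat, out lon))
            Console.WriteLine("lat/long parsed");
        if (TryParseTimestamp(" 5/4/2013  Time: 19:13:35.0", out when))
            Console.WriteLine("timestamp parsed");
    }
}
```

Returning a bool instead of throwing lets the caller skip or flag malformed lines, which matters for data arriving over a serial port.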

Simplified code (a sketch; fileContents is assumed to hold the received text):

string[] lines = fileContents.Split('\n'); // split the content into separate lines

bool wasEmpty = false;
foreach (string rawLine in lines) {
    // the foreach variable is read-only, so trim into a new variable
    string line = rawLine.Trim(); // remove leading/trailing whitespace (including '\r')
    if (line.Length == 0) { // line is empty
        if (wasEmpty) { // last line was empty, too
            // init a new dataset
        }
        else
            wasEmpty = true;
        continue; // skip to next entry
    }
    wasEmpty = false;
    // split at the first ':' only, so values such as "19:13:35.0" stay intact
    string[] content = line.Split(new[] { ':' }, 2);
    if (content.Length != 2) // no ':' in this line
        continue; // skip
    // content[0] now has the "key" (like "Lat/Long")
    // content[1] now has the "value" (like " 18.38891, -66.12175")
    // both can be evaluated using regular expressions
}
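To show how the loop could feed the List the question asks for, here is one possible sketch. The Leg and Sighting types and the LogParser class are invented for illustration, and attaching each sighting to the most recent "Distance run" entry is an assumption about what the data means.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public sealed class Sighting
{
    public double Latitude;
    public double Longitude;
}

public sealed class Leg
{
    public int DistanceInches;
    public int DirectionDegrees;
    public List<Sighting> Sightings = new List<Sighting>();
}

public static class LogParser
{
    // Reads the leading digits of a value such as "36in." or "275".
    private static bool TryLeadingInt(string value, out int number)
    {
        int end = 0;
        while (end < value.Length && char.IsDigit(value[end])) end++;
        return int.TryParse(value.Substring(0, end), out number);
    }

    public static List<Leg> Parse(string text)
    {
        var legs = new List<Leg>();
        Leg current = null;
        foreach (string rawLine in text.Split('\n'))
        {
            // Split at the first ':' only; lines without one are skipped.
            string[] parts = rawLine.Trim().Split(new[] { ':' }, 2);
            if (parts.Length != 2) continue;
            string key = parts[0].Trim();
            string value = parts[1].Trim();
            int number;

            if (key.Equals("Distance run", StringComparison.OrdinalIgnoreCase))
            {
                if (!TryLeadingInt(value, out number)) continue;
                current = new Leg { DistanceInches = number };
                legs.Add(current);
            }
            else if (current != null &&
                     key.Equals("Direction in degrees", StringComparison.OrdinalIgnoreCase))
            {
                if (TryLeadingInt(value, out number)) current.DirectionDegrees = number;
            }
            else if (current != null &&
                     key.Equals("Lat/Long", StringComparison.OrdinalIgnoreCase))
            {
                string[] c = value.Split(',');
                double lat, lon;
                if (c.Length == 2 &&
                    double.TryParse(c[0], NumberStyles.Float, CultureInfo.InvariantCulture, out lat) &&
                    double.TryParse(c[1], NumberStyles.Float, CultureInfo.InvariantCulture, out lon))
                {
                    current.Sightings.Add(new Sighting { Latitude = lat, Longitude = lon });
                }
            }
            // Other keys ("Total of person founds", "Date", ...) are ignored in this sketch.
        }
        return legs;
    }

    public static void Main()
    {
        string sample = "Distance run: 36in.\nDirection in degrees: 275\n\n" +
                        "Lat/Long: 18.38891, -66.12174\n\nDistance run: 15in.\n" +
                        "Direction in degrees: 215\n";
        foreach (Leg leg in Parse(sample))
            Console.WriteLine(leg.DistanceInches + " in at " + leg.DirectionDegrees +
                              " degrees, sightings: " + leg.Sightings.Count);
    }
}
```

Plain string comparison on the key replaces the large anchored pattern, so a block that is missing or repeated does not stop the other lines from being read.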

3 Comments

Last night I was working on this, but it is not working 100%. When i = 0, it enters the first if and does the operation fine. Then it is supposed to go through the other else ifs, compare, do nothing, then increment i and evaluate again. This time it is supposed to enter the first else if, but it does not. Any suggestions?
for (int i = 0; i < distance.Count(); i++)
{
    if (distance[i] == "Distancia recorrida")
    {
        var num = Regex.Split(distance.ElementAt(i+1), @"[^0-9\.\-]+").Where(c => c != "." && c.Trim() != "");
        x = double.Parse(num.ElementAt(0));
        i = i + 1;
    }
    else if (distance[i] == "Direccion en grados")
    {
        var num = Regex.Split(distance.ElementAt(i+1), @"[^0-9\.\-]+").Where(c => c != "." && c.Trim() != "");
        z = double.Parse(num.ElementAt(0));
    }
    else if (distance[i] == "Persona encontrada en")
    {
        var num = Regex.Split(distance.ElementAt(i+1), @"[^0-9\.\-]+").Where(c => c != "." && c.Trim() != "");
        latlong = num.ElementAt(0);
    }
}
Comments are a poor place for code like this. Add it to your original question if it is related; otherwise create a new question.

In your regular expression, the name of the second part is degress. In code, you're using degree instead.

In your regular expression, the name of the second part is latlong. In code, you're using Lat/Long instead.

So no, as is, you won't be able to get those two groups.
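A small sketch of the same point, with the group names in the pattern and in the lookup kept identical. The class and method names are made up for illustration, and the pattern is a simplified one that follows the line layout of the sample data rather than the pattern from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupExample
{
    // Multiline makes ^ and $ match at line boundaries inside the whole text.
    private static readonly Regex Pattern = new Regex(
        @"^Distance run:\s*(?<distance>\d{1,3})\s*in\.?\s*\n" +
        @"Direction in degrees:\s*(?<degrees>\d{1,3})\s*$",
        RegexOptions.Multiline | RegexOptions.IgnoreCase);

    public static bool TryExtract(string data, out string distance, out string degrees)
    {
        distance = string.Empty;
        degrees = string.Empty;
        Match match = Pattern.Match(data);
        if (!match.Success) return false;
        // The names here must be spelled exactly as in the pattern above.
        distance = match.Groups["distance"].Value;
        degrees = match.Groups["degrees"].Value;
        return true;
    }

    public static void Main()
    {
        string data = "Distance run: 36in.\nDirection in degrees: 275\n\nLat/Long: 18.38891, -66.12174\n";
        string distance, degrees;
        if (TryExtract(data, out distance, out degrees))
            Console.WriteLine(distance + " in, " + degrees + " degrees");

        // A misspelled name does not throw; it yields an unsuccessful, empty group.
        Match m = Pattern.Match(data);
        Console.WriteLine(m.Groups["degree"].Success);
    }
}
```

Because a wrong group name fails silently with an empty value, checking Group.Success during debugging is a quick way to spot this kind of mismatch.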

1 Comment

It was a typo I made when translating from Spanish to English. In the code it is correct.
