I have a string in the form:

var dummyString = $@"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";

What I would like to do is extract the location/address from this string. I can easily find the index of LOCATION:, but I can't think of an efficient way to find the index where the address should end. The easiest option is to iterate over a list of state codes and look up the index of each one, but that is not a very efficient way of handling it.

The solution I have in mind is to take a list of US state codes and find the index of the first match of any state code after the LOCATION: substring, matching it together with the surrounding whitespace so that I find a complete state code rather than part of a longer word.

public static readonly List<string> USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };

Any idea on how to proceed from here?

The output I want is:

BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ

This problem is part of a bigger piece of logic in which I use a regex to find the index of the zip code (5 digits) as the terminator. In some cases, however, the zip code may be missing from the address (user error), and I still have to be able to extract the address.

  • "but can't think of an efficient solution for the index where I should terminate the string": what about the index of "BASED ON:"? Commented Apr 10, 2020 at 9:38
  • I could loop over each state code in the list and try to find the index of that code in the string, but that requires iterating over every item, which is not very elegant. Commented Apr 10, 2020 at 9:41
  • Would probably need a regex. Commented Apr 10, 2020 at 9:41
  • Do SIGNED APPLICATION AND AFFIDAVIT REQUIRED LOCATION: and BASED ON: VACANT LAND always stay the same in the input string? Commented Apr 10, 2020 at 9:42
  • No, those values vary; only LOCATION: is present every time. Commented Apr 10, 2020 at 9:43

1 Answer

You may use

var dummyString = @"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
var USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };
var result = Regex.Match(dummyString, $@"LOCATION:\s*(.*?\b(?:{string.Join("|", USStateCodes)}))\b")?.Groups[1].Value;

See the C# demo, result output: BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ.

The resulting pattern is

LOCATION:\s*(.*?\b(?:AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY))\b

See the regex demo.

Details

  • LOCATION: - a fixed starting string
  • \s* - 0+ whitespaces
  • (.*?\b(?:{string.Join("|", USStateCodes)})) - Group 1 (the result will be captured in the group):
    • .*? - any 0 or more chars other than newline chars (use RegexOptions.Singleline to match newlines, too), as few as possible
    • \b - a word boundary
    • (?:{string.Join("|", USStateCodes)}) - creates an alternation group with the state codes (like (?:AL|AK|AS|...|WY)) and matches any one of the alternatives
  • \b - a word boundary.
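
To make the breakdown above easier to try out, here is a minimal sketch that wraps the same pattern in a reusable helper. The class name LocationExtractor and the method TryExtractLocation are hypothetical and not part of the answer. The optional trailing 5-digit ZIP is also an assumption, added only because the question mentions that a zip code may or may not follow the state code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, shown only to illustrate the pattern explained above.
public static class LocationExtractor
{
    private static readonly string[] StateCodes =
    {
        "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL",
        "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH",
        "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM",
        "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC",
        "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY"
    };

    // Built once and reused. Regex.Escape changes nothing for two-letter codes,
    // but keeps the alternation safe if the list is ever edited.
    // Assumption: "(?:\s+\d{5}\b)?" keeps a 5-digit ZIP when one follows the state code.
    private static readonly Regex LocationPattern = new Regex(
        $@"LOCATION:\s*(.*?\b(?:{string.Join("|", StateCodes.Select(Regex.Escape))})\b(?:\s+\d{{5}}\b)?)");

    // Returns true and sets 'location' to Group 1 when the pattern matches;
    // otherwise returns false and sets 'location' to an empty string.
    public static bool TryExtractLocation(string input, out string location)
    {
        Match match = LocationPattern.Match(input);
        location = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

Calling LocationExtractor.TryExtractLocation(dummyString, out var location) on the string from the question should yield the same text as the one-liner above. One caveat with this approach: several state codes (IN, OR, ME, and others) are also ordinary English words, and because .*? is lazy the match stops at the first whole-word code it meets, so an address that contains such a word before the real state code would be cut short.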

3 Comments

Very close but it seems to be missing the state code.
No, it is there. See the demo.
You are a life saver!
