I have strings like the ones below. The first few characters are a symbol that can be one to a few letters long, but could contain unusual characters like "/". The next six characters are always a date in YYMMDD form, where YY, MM, and DD are integers zero-padded on the left, as shown. This is followed by a single character that is always 'C' or 'P', and finally a double.

AAPL220819C152.5
AAPL220819P195
AAPL220902P187.5
AAPL220819C155
AAPL220930C180

What is a regular expression that parses these strings into their constituent parts,

Symbol,
Date,
COP,
Strike

fast?

So the expected output would be:

"AAPL220819C152.5" {Symbol = "AAPL", Date = 2022-08-19, COP = "C", Strike = 152.5 }
"AAPL220819P195"   {Symbol = "AAPL", Date = 2022-08-19, COP = "P", Strike = 195.0}

I have seen similar posts here, but I don't understand them well enough to adapt them.

  • No unfortunately not. Commented Aug 12, 2022 at 19:51
  • can you show what the expected outcome would look like from one of your examples? Commented Aug 12, 2022 at 19:52
  • See Original Post for example of expected output Commented Aug 12, 2022 at 19:55

1 Answer

Try this:


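        using System;
        using System.Text.RegularExpressions;

        class Program
        {
            static void Main(string[] args)
            {
                TestParsingRegex("AAPL220819C152.5", "AAPL220819P195", "AAPL220902P187.5", "AAPL220819C155", "AAPL220930C180");
            }

            private static void TestParsingRegex(params string[] strings)
            {
                // Groups: symbol, six-digit date, C or P, then the rest (the strike).
                var regex = new Regex(@"([A-Z]+)(\d{6})([CP]{1})(.*)");
                foreach (var s in strings)
                {
                    var match = regex.Match(s);
                    // Groups[0] is the whole match; Groups[1] to Groups[4] are the captured parts.
                    foreach (Group g in match.Groups)
                    {
                        Console.WriteLine(g.Value);
                    }
                }
            }
        }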

It should produce the following output:

AAPL220819C152.5
AAPL
220819
C
152.5
AAPL220819P195
AAPL
220819
P
195
AAPL220902P187.5
AAPL
220902
P
187.5
AAPL220819C155
AAPL
220819
C
155
AAPL220930C180
AAPL
220930
C
180

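Notice that the first group printed for each input is the entire matched string; the four captured parts follow it.

The regex uses capture groups to split the string, like so:

([A-Z]+) one or more uppercase letters, up to the next group (a symbol containing other characters, such as "/", will not match this group as written)

(\d{6}) exactly six digits

([CP]{1}) exactly one C or P character (the {1} is redundant; [CP] alone means the same)

(.*) everything else, which here is the strike as text; it is not checked to be a number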
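The groups above give back strings, while the question asks for a typed date and a numeric strike, and mentions symbols that may contain characters like "/". A minimal sketch of one way to get there follows. The `OptionContract` and `OptionContractParser` names are hypothetical, and the pattern is a variant of the one above, not the original: it uses a lazy `.+?` for the symbol, anchors the match with `^` and `$`, and requires the strike to look like a number. It assumes the symbol itself never ends in six digits followed by C or P. The regex is created once and compiled, which should help if many strings are parsed.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical container for the parsed parts; field names mirror the question.
public sealed class OptionContract
{
    public OptionContract(string symbol, DateTime date, char cop, double strike)
    {
        Symbol = symbol;
        Date = date;
        Cop = cop;
        Strike = strike;
    }

    public string Symbol { get; }
    public DateTime Date { get; }
    public char Cop { get; }
    public double Strike { get; }
}

public static class OptionContractParser
{
    // Built once and reused. The lazy ".+?" lets the symbol contain any
    // characters; the anchors force the rest of the string to fit the format.
    private static readonly Regex Pattern = new Regex(
        @"^(?<symbol>.+?)(?<date>\d{6})(?<cop>[CP])(?<strike>\d+(?:\.\d+)?)$",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    public static OptionContract Parse(string text)
    {
        var match = Pattern.Match(text);
        if (!match.Success)
        {
            throw new FormatException("Not a recognized option string: " + text);
        }

        // "yyMMdd" reads the two-digit year, month, and day; an impossible
        // date such as month 13 makes ParseExact throw a FormatException.
        var date = DateTime.ParseExact(
            match.Groups["date"].Value, "yyMMdd", CultureInfo.InvariantCulture);

        // Invariant culture so "." is always the decimal separator.
        var strike = double.Parse(
            match.Groups["strike"].Value, CultureInfo.InvariantCulture);

        return new OptionContract(
            match.Groups["symbol"].Value, date, match.Groups["cop"].Value[0], strike);
    }
}
```

Calling `OptionContractParser.Parse("AAPL220819C152.5")` returns an object whose `Symbol`, `Date`, `Cop`, and `Strike` properties hold the four parts. Named groups are used here only so the code does not depend on group positions; numbered groups as in the answer above would work equally well.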
