
Say I have this string:

var results = 
[{\r\n    \"ninja\": \"Leonardo - $0.99\",\r\n    \"data\": [[1336485655241,0.99],[1336566333236,0.99],[1336679536073,0.99],[1336706394834,0.99],[1336774593068,0.99],[1366284992043,0.99]]},
\r\n{\r\n    \"ninja\": \"Donatello - $0.25\",\r\n    \"data\": [[1361061084420,0.23],[1366102471587,0.25],[1366226367262,0.25],[1366284992043,0.25]]},
\r\n{\r\n    \"ninja\": \"Raphael - $0.15\",\r\n    \"data\": [[1327305600000,0.15], [1365583220422,0.15],[1365669396241,0.15],[1365669396241,0.15],[1365753433493,0.15],[1366284992043,0.15]]},
\r\n\r\n{\r\n    \"ninja\": \"Michelangelo - $0.14\",\r\n    \"data\": [1366284992043,0.14]]};

I wanted to build a dictionary that would store the names of the ninjas and their price, so that I would have:

Key \ Value

Leonardo \ 0.99

Donatello \ 0.25

Raphael \ 0.15

Michelangelo \ 0.14

I have been reading a LOT about regex over the last few days, and I still don't understand how it works. So far I have this line of code:

var dictNinjas = Regex.Matches(priceListValue, @"\*(\w+)=(a-zA-Z)|\*(\$(0-9))").Cast<Match>()
                                        .ToDictionary(x => x.Groups[0].Value,
                                                      x => x.Groups[1].Value);

My understanding was that it would first seek all words with letters a-zA-Z, then all values located right after the $ symbol. The | symbol is the grouping, so the first parameters was group 0 and the second parameter would be group 1. But this does not work.

Can anyone help me out? I'm trying hard to understand how to make this work, thank you.

  • Where is this string coming from? Commented Apr 19, 2013 at 16:00
  • Your string looks like a JSON string. Shouldn't you use a JSON deserializer? Commented Apr 19, 2013 at 16:00
  • @DGibbs Parsing through a html document using html agility pack, a string I got from a node. Commented Apr 19, 2013 at 16:01
  • @HerveS It looks like JSON to me; check out Json.NET. Commented Apr 19, 2013 at 16:03
  • I see a couple of things in the regex that need changing: 1. In two places you have escaped the * character, so \* matches a literal asterisk instead of acting as a quantifier. If you meant "zero or more \ characters", you need \\* instead of \*. 2. (0-9) matches the string "0-9", but it looks like you want any digit; use [0-9] instead (or [0-9]* or [0-9]+). The same goes for (a-zA-Z). There may be something else, but those are the first things I saw. Commented Apr 19, 2013 at 16:10
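The JSON suggestion from the comments above could look roughly like the sketch below. It assumes the string has already been unescaped into well-formed JSON (the sample in the question is not valid as posted) and uses System.Text.Json rather than the Json.NET library mentioned in the comments. The class name NinjaJson and the method ParsePrices are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class NinjaJson
{
    // Hypothetical helper: reads each "ninja" label such as "Leonardo - $0.99"
    // and splits it into a name and a price.
    public static Dictionary<string, decimal> ParsePrices(string json)
    {
        var prices = new Dictionary<string, decimal>();
        using JsonDocument doc = JsonDocument.Parse(json);

        foreach (JsonElement item in doc.RootElement.EnumerateArray())
        {
            string label = item.GetProperty("ninja").GetString() ?? "";

            // Assumes the label always has the form "<name> - $<price>".
            string[] parts = label.Split(new[] { " - $" }, StringSplitOptions.None);
            if (parts.Length == 2 &&
                decimal.TryParse(parts[1], NumberStyles.Number,
                                 CultureInfo.InvariantCulture, out decimal price))
            {
                prices[parts[0]] = price;
            }
        }
        return prices;
    }

    public static void Main()
    {
        // Shortened, well-formed stand-in for the string in the question.
        string json = @"[
            { ""ninja"": ""Leonardo - $0.99"", ""data"": [[1366284992043, 0.99]] },
            { ""ninja"": ""Michelangelo - $0.14"", ""data"": [[1366284992043, 0.14]] }
        ]";

        foreach (var pair in ParsePrices(json))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

With a deserializer, only the "Name - $price" label still needs splitting, so no regex is required for the surrounding structure.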

2 Answers


Groups[0].Value is the whole match, so you need Groups[1] and Groups[2]:

var dictNinjas = Regex.Matches(str, @"""(\w+) - \$([\d.]+)").Cast<Match>()
                                    .ToDictionary(x => x.Groups[1].Value,
                                                  x => x.Groups[2].Value);

Groups[1].Value refers to the content captured by the first () in the regex, and Groups[2].Value to the content captured by the second.

I am not sure why you have a = in your regex, but it looks like you have misunderstood something along the way.
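For a version that can be run end to end, here is a small sketch built around the same pattern. The sample string is a shortened stand-in for the one in the question, and converting the captured price to decimal is an extra step not in the original code. The names NinjaRegex and ExtractPrices are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class NinjaRegex
{
    // Same pattern as above: a double quote, a word, " - $", then digits and dots.
    private const string Pattern = @"""(\w+) - \$([\d.]+)";

    // Hypothetical helper: group 1 becomes the key, group 2 the price.
    public static Dictionary<string, decimal> ExtractPrices(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .ToDictionary(
                        m => m.Groups[1].Value,
                        // Invariant culture so "0.99" parses the same on every machine.
                        m => decimal.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture));
    }

    public static void Main()
    {
        string sample =
            "[{\"ninja\": \"Leonardo - $0.99\"}, {\"ninja\": \"Donatello - $0.25\"}]";

        foreach (var pair in ExtractPrices(sample))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Note that ToDictionary throws if the same name is matched twice, so this assumes each ninja appears only once in the input.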


12 Comments

([\d.]+) could match things like 1..231.21399.. but as long as the inputs are carefully typed it should be fine.
Actually, this works fine! So let me summarize: first, @MikeM, you are right: it's my first try with regex and my understanding is close to nothing. So now, I think that this looks for any complete word (the \w looks for words, though I don't know exactly what the + does), then looks for a $ sign and any numbers (the \d) located right after it. Right?
@Izzy. Yes, I agree that something like (\d+(?:\.\d+)?) would be better.
@HerveS Check out zytrax.com/tech/web/regex.htm, it will help a load (has a neat little tool to test your RE too)
@HerveS. Where is CCG? I am not sure what you mean, but if you change (\w+) to (.+?) it will match several words not just one.
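The two refinements discussed in these comments, a stricter number and multi-word names, might be combined as in the sketch below. One deliberate change: instead of (.+?), it uses ([^"]+?), because a lazy .+? that starts at the quote before ninja can run across into the next quoted value. That substitution, and the names StricterNinjaRegex and ExtractPrices, are assumptions of this sketch and not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StricterNinjaRegex
{
    // "          opening double quote
    // ([^"]+?)   group 1: one or more non-quote characters, as few as possible
    //  - \$      literal " - $"
    // (\d+(?:\.\d+)?)  group 2: digits, optionally followed by a dot and more digits
    public const string Pattern = @"""([^""]+?) - \$(\d+(?:\.\d+)?)";

    // Hypothetical helper returning name -> price text.
    public static Dictionary<string, string> ExtractPrices(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);
    }

    public static void Main()
    {
        // Made-up sample with a two-word name.
        string sample = "{\"ninja\": \"Master Splinter - $1.50\"}";

        foreach (var pair in ExtractPrices(sample))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Because nothing anchors the end of the number, a malformed price such as 1..23 would still match, but only up to its first valid part (1).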

Firstly:

so the first parameters was group 0 and the second parameter would be group 1

  • Group 0 is the whole matched string
  • Group 1 is the first capturing group, counted by the position of its opening bracket.

Don't worry, it's a common mistake to make.
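A quick way to see the numbering is to dump every group of a single match. In this sketch the helper GroupValues and the class name are made up for illustration. The second pattern adds a nested group to show that groups are numbered by where their opening bracket appears.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupNumbering
{
    // Hypothetical helper: returns the value of every group of the first match,
    // index 0 being the whole match.
    public static string[] GroupValues(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Groups.Cast<Group>().Select(g => g.Value).ToArray();
    }

    public static void Main()
    {
        string input = "Leonardo - $0.99";

        // Two groups side by side.
        string[] flat = GroupValues(input, @"(\w+) - \$(\d+\.\d{2})");
        // flat[0] = "Leonardo - $0.99", flat[1] = "Leonardo", flat[2] = "0.99"

        // An outer group wrapping the first one: the outer group opens first,
        // so it becomes group 1 and the name moves to group 2.
        string[] nested = GroupValues(input, @"((\w+) - )\$(\d+\.\d{2})");
        // nested[1] = "Leonardo - ", nested[2] = "Leonardo", nested[3] = "0.99"

        for (int i = 0; i < flat.Length; i++)
            Console.WriteLine($"flat[{i}] = '{flat[i]}'");
        for (int i = 0; i < nested.Length; i++)
            Console.WriteLine($"nested[{i}] = '{nested[i]}'");
    }
}
```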

This site has a very handy regex tester tool as well as lots of RE info. Just remember that when you put your regular expression into C# you may need some extra escaping: in a regular string literal every backslash has to be doubled, and in a verbatim (@"...") string a double quote has to be written as "".
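To make the escaping point concrete, here is a small sketch that writes one pattern both ways: as a verbatim string and as a regular string literal. The leading double quote in the pattern is added here only to show how quotes are escaped, and the class and field names are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapingDemo
{
    // Verbatim string: backslashes are kept as typed, a double quote is written "".
    public const string Verbatim = @"""(\w+) - \$(\d+\.\d{2})";

    // Regular string: every backslash is doubled, a double quote is written \".
    public const string Regular = "\"(\\w+) - \\$(\\d+\\.\\d{2})";

    public static void Main()
    {
        // Both literals produce the same pattern text for the regex engine.
        Console.WriteLine(Verbatim == Regular);

        Match match = Regex.Match("\"Leonardo - $0.99\"", Verbatim);
        Console.WriteLine(match.Groups[1].Value);
        Console.WriteLine(match.Groups[2].Value);
    }
}
```

A pattern copied from an online tester can usually be pasted into a verbatim string unchanged, apart from doubling any double quotes it contains.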

For example, I plug in (\w+) - \$(\d+\.\d{2}) as my RE string and get:

First match: Leonardo - $0.99 at position 24
Backreferences: $1 = Leonardo $2 = 0.99
Additional matches:
Found: Donatello - $0.25 at position 217
Found: Raphael - $0.15 at position 369
Found: Michelangelo - $0.14 at position 566
