I'm currently struggling to parse a C Output.map file using regex. I'm treating each line separately. A single line could look like this:
__func_name |00010d88| T | FUNC|00000010| |.text
Expected Output:
1) "__func_name"
2) "00010d88"
3) "T"
4) "FUNC"
5) "00000010"
6) (empty string)
7) ".text"
8) (empty string)
However, the amount of whitespace between the fields varies. Another line could look like this:
__func_name2|0007bb7c| T | FUNC|00000034| |.text sourcefile.c:49
1) "__func_name2"
2) "0007bb7c"
3) "T"
4) "FUNC"
5) "00000034"
6) (empty string)
7) ".text"
8) "sourcefile.c:49"
As you can see, not only does the amount of whitespace vary, but the source file is also listed. I tried to solve this problem using regexr. My regex basically needs to match the following groups:
An alphanumeric string
A (hex) number
A single letter
A string
A (hex) number
An optional string
Another optional string
Each group is separated by a | character.
I tried the regex below. Although it is incomplete, regexr tells me that I'm only matching the first group.
Could you help me to figure out what's wrong with my regex?
([__A-Za-z0-9])\w+|((([\|]{1})&[0-9a-h]&([\|]{1})))\w+|([A-Z])\w+
You can try a live demo here: https://regexr.com/4gpvf
Edit: Expected Outputs added
| is being used as a delimiter. Wouldn't it be far simpler to split by that, then trim each resulting string? The last segment would be .text sourcefile.c:49, and that can be easily parsed with a much simpler regex.

string[] single_element = single_line.Split((char)('|')); ?

single_line.Split('|'). I would not remove empty columns if you want to preserve the column indices.
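
The split-and-trim idea from the comments could look roughly like the sketch below. It is only a sketch under some assumptions: every line has exactly seven '|'-separated columns, symbol names never contain '|', and the section name (.text) contains no whitespace. The names MapLineParser, Parse and LastColumn are made up for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and member names are illustrative only.
public static class MapLineParser
{
    // Last column: a section name without whitespace, optionally followed
    // by whitespace and a source location such as "sourcefile.c:49".
    private static readonly Regex LastColumn =
        new Regex(@"^(\S*)(?:\s+(.*))?$");

    public static string[] Parse(string line)
    {
        // Split on the delimiter and keep empty entries so that the
        // column indices stay stable.
        string[] parts = line.Split('|');
        if (parts.Length != 7)
        {
            throw new FormatException(
                "Expected 7 '|'-separated columns, got " + parts.Length);
        }

        var fields = new string[8];

        // Columns 1 to 6 only need their surrounding whitespace removed.
        for (int i = 0; i < 6; i++)
        {
            fields[i] = parts[i].Trim();
        }

        // Column 7 is split into the section and the optional source location.
        Match m = LastColumn.Match(parts[6].Trim());
        fields[6] = m.Groups[1].Value;
        // Value is an empty string when the optional group did not match.
        fields[7] = m.Groups[2].Value;

        return fields;
    }
}
```

Calling MapLineParser.Parse(single_line) on either sample line from the question should give the eight fields listed under the expected output, with the eighth field empty when no source file is present. The regex only has to deal with the last column, so the varying whitespace around the other columns is handled entirely by Trim().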