I have a string "<wpf><xaml><wpf-controls>".
I need the strings inside the tags as an array of strings.
How do I get this?
You really don't want to parse XML with a regular expression. Use an XML parser like Nokogiri or some specialised library for XAML. But please, don't parse XML with regexes. – Holger Just, May 15, 2013 at 10:17
2 Answers
The regex for this problem is really simple: /<(.*?)>/
For the array part, I would refer to the answer on how to use a one-line regular expression to get the matched content.
EDIT:
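for an array of the insides of the tags, use "<wpf><xaml><wpf-controls>".scan(/(?:<(.*?)>)*/)
The (?: .. ) groups the tag together and the * says we want 0 or more of that group :)
Be aware that this pattern still contains a capture group, so scan still returns nested arrays, and a repeated capture group keeps only its last iteration. It therefore does not yield a flat array of all three names.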
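As a minimal sketch of how String#scan treats capture groups: the variable names and the lookaround pattern below are illustrative assumptions, not something taken from the answer. With a capture group, scan returns one sub-array per match. With no capture group at all, it returns the matched text directly, so lookarounds that leave out the angle brackets give a flat array.

```ruby
input = "<wpf><xaml><wpf-controls>"

# One capture group: String#scan returns one sub-array per match.
nested = input.scan(/<(.*?)>/)
p nested  # => [["wpf"], ["xaml"], ["wpf-controls"]]

# No capture group (assumed alternative): the lookbehind and lookahead
# require the angle brackets without including them in the match, so
# scan returns the tag names themselves as a flat array of strings.
flat = input.scan(/(?<=<)[^<>]+(?=>)/)
p flat    # => ["wpf", "xaml", "wpf-controls"]
```

The lookaround form is harder to read than the plain group, so post-processing the nested result may still be the clearer choice.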
4 Comments
Ashwin Yaprala
With "<wpf><xaml><wpf-controls>".scan(/<(.*?)>/) I am getting an array of arrays; I need only an array of strings.
B8vrede
In that case use
"<wpf><xaml><wpf-controls>".scan(/(?:<(.*?)>)*/). The (?: .. ) groups the tag together and the * says we want 0 or more of that group :)
jethroo
rubular.com is a nice way to work out your regexes: you can provide a test string and try your regex against it while editing, which is a great help for understanding what's going on.
Tom De Leu
I find this answer pretty hard to read; see my answer for a way to do it without making the regex itself more complex.
'<wpf><xaml><wpf-controls>'.scan(/<(.*?)>/).map(&:first)
2 Comments
Stephan
Can you explain your answer?
Tom De Leu
String#scan returns an array of arrays if your regex contains groups. In this case there is exactly one group, so the result array will be [["wpf"],["xaml"],["wpf-controls"]]. So get all the strings from each sub-array via Array#first. You could also use Array#flatten instead, but my solution would also work if there were multiple groups in the regex.
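The difference between Array#first and Array#flatten can be sketched as follows. The two-group pattern is a hypothetical example made up for illustration (group 1 is the tag name, group 2 its first character); it is not part of the original question.

```ruby
input = '<wpf><xaml><wpf-controls>'

# One group: map(&:first) and flatten give the same result.
one_group = input.scan(/<(.*?)>/)
p one_group.map(&:first)  # => ["wpf", "xaml", "wpf-controls"]
p one_group.flatten       # => ["wpf", "xaml", "wpf-controls"]

# Hypothetical pattern with two groups: each sub-array now has two entries.
two_groups = input.scan(/<((\w)[^>]*)>/)
p two_groups
# => [["wpf", "w"], ["xaml", "x"], ["wpf-controls", "w"]]

# map(&:first) keeps only the first group of every match...
names = two_groups.map(&:first)
p names      # => ["wpf", "xaml", "wpf-controls"]

# ...while flatten mixes the captures of all groups together.
flattened = two_groups.flatten
p flattened  # => ["wpf", "w", "xaml", "x", "wpf-controls", "w"]
```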