0

I have a string:

string s= "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";

Formatted, it looks like this:

<tr>
    <td>abc</td>
    <td>1</td>
    <td>def</td>
</tr>
<tr>
    <td>aaa</td>
    <td>2</td>
    <td>bbb</td>
</tr>

Now I want to get the values "1" and "2". How do I do this? I tried converting the string to XML, but without success.

  • A valid XML document must have a single root node. Wrap your string in one before converting. Commented Jun 15, 2017 at 7:29
  • That fails because the real string contains some special symbols, for example: <tr><td>1</td><td align='center'><i class='cls'></i></td><td><a href='test.aspx?id=1&ct=0&lt=2'style='color:#4169E1'>abc</a></td><td>1</td><td><span style='display:none;'>xxxx</span>xxxx</td><td>def</td></tr> Commented Jun 15, 2017 at 7:31
  • Can you give me all your string? Commented Jun 15, 2017 at 7:47
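The root-node suggestion in the first comment can be sketched as follows. This is a minimal illustration, not the commenter's code: the class name XmlCellDemo and the helper SecondCells are hypothetical, and it assumes the fragment is well-formed XML once wrapped. It uses only System.Xml types that exist in .NET Framework 2.0.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class XmlCellDemo
{
    // Hypothetical helper: returns the text of the second <td> in every <tr>.
    public static List<string> SecondCells(string rows)
    {
        XmlDocument doc = new XmlDocument();
        // The fragment has two top-level <tr> elements, so wrap it in a single root.
        doc.LoadXml("<root>" + rows + "</root>");

        List<string> values = new List<string>();
        // XPath positions are 1-based: td[2] is the second cell of each row.
        foreach (XmlNode td in doc.SelectNodes("/root/tr/td[2]"))
        {
            values.Add(td.InnerText);
        }
        return values;
    }

    static void Main()
    {
        string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        foreach (string value in SecondCells(s))
        {
            Console.WriteLine(value); // prints 1, then 2
        }
    }
}
```

This only works for well-formed input. The longer string quoted in the second comment contains bare & characters and two attributes with no whitespace between them, so LoadXml would throw an XmlException on it; that kind of markup needs an HTML parser or prior clean-up.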

5 Answers

2

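You can use HTML Agility Pack to achieve this:

// Requires: using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(s); // HtmlDocument has no Parse method; LoadHtml parses an HTML string

// Text of every <td> cell, in document order
IEnumerable<string> cells = doc.DocumentNode.Descendants("td").Select(td => td.InnerText);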

1 Comment

I'm using .NET Framework 2.0, so this may not be supported.
1
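string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
// Strip the row tags and the closing cell tags, leaving "<td>abc<td>1<td>def<td>aaa<td>2<td>bbb"
s = s.Replace("<tr>","").Replace("</tr>","").Replace("</td>","");
// Split on the opening cell tag; val[0] is the empty string before the first "<td>"
string[] val = s.Split(new string[] { "<td>" }, StringSplitOptions.None);

string one = val[2]; // "1"
string two = val[5]; // "2"

I hope this works for you.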


0
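// Requires: using System.Linq; using System.Text.RegularExpressions;
// Captures the text between each <td> and the nearest </td> (lazy match)
Regex regex = new Regex("<td>(.*?)<\\/td>");
var matches = regex.Matches("<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>");
// values holds every cell in order: "1" is at index 1 and "2" is at index 4
var values = matches.Cast<Match>().Select(m => m.Groups[1].Value).ToList();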
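Since the asker mentions .NET Framework 2.0 in a comment above, and LINQ and var only arrived with .NET 3.5 / C# 3, here is a sketch of the same regex idea with a plain loop. The class name RegexCellDemo and the helper Cells are hypothetical, and the pattern is loosened to `<td[^>]*>` on the assumption that some cells carry attributes, as in the asker's longer sample.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexCellDemo
{
    // Hypothetical helper: returns the inner text of every <td>, in document order.
    public static List<string> Cells(string html)
    {
        List<string> values = new List<string>();
        // <td[^>]*> also accepts attributes; (.*?) lazily captures up to the next </td>.
        foreach (Match m in Regex.Matches(html, "<td[^>]*>(.*?)</td>", RegexOptions.IgnoreCase))
        {
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    static void Main()
    {
        string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        List<string> values = Cells(s);
        Console.WriteLine(values[1]); // second cell of the first row: 1
        Console.WriteLine(values[4]); // second cell of the second row: 2
    }
}
```

Note that the captured group is the raw content of the cell, so a cell containing nested markup (a span or a link, for instance) comes back with those inner tags still in it.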


0
            string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";

            // Find the first run of digits, then keep asking for the next one.
            // Checking match.Success avoids showing an empty message box after the last number.
            System.Text.RegularExpressions.Match match = System.Text.RegularExpressions.Regex.Match(s, @"\d+");
            while (match.Success)
            {
                MessageBox.Show(match.Value);
                match = match.NextMatch();
            }

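The regex matches every number in the string, and the while loop goes through all of them. Do whatever you want instead of MessageBox.Show and you're good to go. Note that this picks up every run of digits, including any that appear inside attributes or other cells.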


0

Good day Brom

This might not be the solution you were looking for, but it should be of some help.

I would use this regex to match all the tags:

(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+

Example:

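  string input = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
  string replacement = "#";

  // Verbatim string (@"...") so that \/ reaches the regex engine; in a regular
  // C# string literal, \/ is an unrecognized escape sequence and does not compile.
  string pattern = @"(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+";

  RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Multiline;

  Regex rgx = new Regex(pattern, options);

  // Every run of adjacent tags collapses into a single delimiter
  string result = rgx.Replace(input, replacement);
  // result == "#abc#1#def#aaa#2#bbb#"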

This regex matches the tags either as runs of adjacent tags or individually. You can then replace each match with a delimiter such as a pipe "|" or "#" and split on that. I hope this helps.
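The split step described above could look roughly like this. It is only a sketch: the class name SplitCellDemo and the helper Cells are hypothetical, and it assumes the cell text never contains the "#" delimiter itself.

```csharp
using System;
using System.Text.RegularExpressions;

static class SplitCellDemo
{
    // Hypothetical helper: replaces every run of tags with '#', then splits on '#'.
    public static string[] Cells(string input)
    {
        string pattern = @"(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+";
        string delimited = Regex.Replace(input, pattern, "#", RegexOptions.IgnoreCase);
        // RemoveEmptyEntries drops the empty strings produced by the leading and trailing '#'.
        return delimited.Split(new char[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        string input = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        string[] cells = Cells(input);
        Console.WriteLine(cells[1]); // 1
        Console.WriteLine(cells[4]); // 2
    }
}
```

One caveat with RemoveEmptyEntries: an empty cell would also be dropped, which shifts the indices of the cells after it. If empty cells are possible, splitting with StringSplitOptions.None and skipping the first and last entries keeps the positions stable.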

Kind Regards

P.S. Regex explanation: the pipes act as OR operators between these four alternatives.

(<\/[a-z]*>)+(<[a-z]*>)+ // Closing tag(s) that are followed by opening tag(s)
(<[a-z]*>)+(<\/[a-z]*>)+ // Opening tags followed by closing tags
(<[a-z]*>)+ // one or more opening tags
(<\/[a-z]*>)+ // one or more closing tags    

1 Comment

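Also, just to mention: this regex works on HTML/XML tags whose names consist only of letters and that have no attributes. Tags with attributes (such as <td align='center'>) and self-closing tags (such as <br/>) are not matched by [a-z]* followed directly by >.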
