I have the following content stored inside a database column:

<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>

It is displayed using the following: <%= m.article %>

What I would like to do is strip out the <p> tags, show only the first 40 characters or so, and end with ... so that I end up with something like:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat...

I remember reading that using Html.Encode would strip the HTML tags, but I'm not 100% sure about that, and I'm not sure how to combine it with truncating the content to 40 characters. If someone could help me with this, that'd be awesome. Thanks.

  • are your values always in the format <p>blah</p><p>blah</p><p>blah</p> or is there a more general requirement? e.g. more complex html? Commented Nov 2, 2010 at 11:27
  • For the time being just like that. Commented Nov 2, 2010 at 11:31

4 Answers

Off the top of my head, something like this?

public static string EncodeAndTrimText(this HtmlHelper helper, string text)
{
   // HtmlEncode escapes markup such as <p>; it does not remove it
   string result = HttpUtility.HtmlEncode(text ?? string.Empty);
   if (result.Length > 40)
      return result.Substring(0, 40) + "...";
   return result;
}

HttpUtility.HtmlEncode will only encode the <p> tag, not remove it. If you want the tags removed entirely, strip both the opening and closing tag before encoding: text = text.Replace("<p>", string.Empty).Replace("</p>", string.Empty);

Usage:

<%= Html.EncodeAndTrimText(Model.SomeProperty) %>

8 Comments

That's how I'd do it too, although depending on requirements you may want to detect if you're cutting a word in half and trim it before the start of the word at the last space.
Where would that first part of code go? Inside the controller for that view?
The string "(39 characters)&" breaks this, as the output would be "(39 characters)&..." - which is invalid HTML.
@Cameron - it's a custom HTML helper, basically an extension method. Put it in a file somewhere in your web application. (ie HtmlExtensions.cs)
@Travis Gockel - are you sure? I was under the impression the HttpUtility.HtmlEncode would replace &, with &amp;, which is actually valid html. am i wrong?
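
One way around the concern raised in the comments above (a cut that lands inside an encoded entity such as &amp;) is to shorten the raw text first and encode afterwards. The following is only a sketch under assumptions: the class and method names are made up for illustration, and it uses System.Net.WebUtility.HtmlEncode instead of HttpUtility.HtmlEncode so that it runs without a reference to System.Web.

```csharp
using System;
using System.Net;

public static class TextSnippets
{
    // Hypothetical helper: shortens the raw text first, then HTML-encodes the result.
    public static string TruncateThenEncode(string text, int maxLength)
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        bool wasCut = text.Length > maxLength;
        string shortened = wasCut ? text.Substring(0, maxLength) : text;

        // Encoding after the cut means an entity such as &amp; is always emitted whole.
        string encoded = WebUtility.HtmlEncode(shortened);

        // Only add the ellipsis when something was actually removed.
        return wasCut ? encoded + "..." : encoded;
    }
}
```

With this ordering the 40-character limit applies to the visible text, so the encoded string can be longer than 40 characters when the text contains characters that need escaping.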
<%= String.Format("{0:40}...", m.article.Replace("<p>", string.Empty).Replace("</p>", string.Empty)) %>

4 Comments

what if part m.article has "<p>" or "</p>" in the text instead just of at the beginning and end?
@SquidScareMe m.article is a single string, the two replace methods chained together will strip out all the <p> and </p> tags wherever they occur in the string. Any line breaks remaining in the string will be ignored by the browser when rendered.
Yes, that was my point. Your solution would work in most practical cases. However, if the string is supposed to have a <p> tag it will be stripped out as well. Perhaps regex would be a slightly better solution?
@SquidScareMe If you want the <p> tag you wouldn't be stripping them out. If you want to wrap the resulting string in a <p> element just alter the format string to this "<p>{0:40}...</p>", you've sanitised the input and then added back the paragraph when you have a clean string.
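
Worth noting for anyone trying this: a format item such as {0:40} does not shorten a string argument. String does not implement IFormattable, so the part after the colon is ignored and the whole value is written out. Below is a minimal sketch of the same replace-then-shorten idea using Substring. The class and method names are hypothetical and not part of any framework.

```csharp
using System;

public static class ArticlePreview
{
    // Hypothetical helper: removes <p> and </p>, then keeps at most maxLength characters.
    public static string StripParagraphsAndShorten(string article, int maxLength)
    {
        string text = (article ?? string.Empty)
            .Replace("<p>", string.Empty)
            .Replace("</p>", string.Empty);

        // Substring throws if the string is shorter than the requested length, so check first.
        return text.Length > maxLength
            ? text.Substring(0, maxLength) + "..."
            : text;
    }
}
```

In the view this could be called as <%= ArticlePreview.StripParagraphsAndShorten(m.article, 40) %>. The result is not HTML-encoded here, so any markup other than <p> would still be rendered.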
I have some functions that do what you want. They use a regular expression to strip the HTML and then truncate at a word boundary. Here is an example program:

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {    
            string s = @"<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>";    

            Console.WriteLine(Truncate(StripHTML(s), 40) + "...");
            Console.ReadLine();
        }    

        public static string StripHTML(string html)
        {
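            // Remove anything that looks like a tag; the lazy quantifier stops at the first '>'
            return System.Text.RegularExpressions.Regex.Replace(html, @"<.*?>", string.Empty, System.Text.RegularExpressions.RegexOptions.IgnoreCase);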
        }

        public static string Truncate(string input, int length)
        {
            if (input.Length <= length)
            {
                return input;
            }

            char[] TextEnds = { ' ', '\n', '\r', '\0', '\f', '\t', '\v' };

            // Take one extra character so we can tell whether the cut lands on whitespace
            string temp = input.Substring(0, length + 1);
            string temp2 = temp.TrimEnd(TextEnds);

            if (temp2 == temp)
            {
                // We truncated in the middle of a word: back up to the last whitespace.
                // If there is none, fall back to a hard cut at the requested length.
                int lastBreak = temp.LastIndexOfAny(TextEnds);
                temp2 = lastBreak > 0
                    ? temp.Substring(0, lastBreak).TrimEnd(TextEnds)
                    : temp.Substring(0, length);
            }

            // Otherwise the cut landed on whitespace and temp2 already ends on a whole word
            return temp2;
        }
    }
}


I accomplished this by adding a property to my model class, for example:

namespace InventorySystem.Models
{
    public class InventoryItem
    {
       public int Id { get; set; }
       public string Notes { get; set; }
       // ...
       public string NotesTruncated
       {
           get
           {
               //you could add some additional code here to remove the <p>
               return (Notes.Length > 50) ? Notes.Substring(0, 50) + "..." : Notes;
           }
       }
    }
}
