How can I remove all HTML markup, such as <p> tags or &nbsp; entities, from a string? I used the code below, but it's not working.

var content = "<p>&nbsp;&nbsp;test result</p><br/>"; // My String

content.replacingOccurrences(of: "<[^>]+>", with: "", options: String.CompareOptions.regularExpression, range: nil)

However, it does not remove all of the HTML from the string.

3 Comments
  • And what's the result? Is "&nbsp;" really part of a tag? In your regex, where does it appear? That's supposed to be a space. Commented Jan 26, 2017 at 13:34
  • @Larme, I doubt the OP wrote the regex - I suspect that he picked it up as a piece of magic code that "removes HTML tags"... Commented Jan 26, 2017 at 13:37
  • @Grimxn I don't doubt that. But I think the comments area is a good place to point out where the issue could be, giving hints and ideas, and it's up to the author to do some research on their end. Commented Jan 26, 2017 at 13:39

7 Answers

21
var content = "<p>&nbsp;&nbsp;test result</p><br/>"; // My String

let a = content.replacingOccurrences(of: "<[^>]+>", with: "", options: String.CompareOptions.regularExpression, range: nil)

a will be: &nbsp;&nbsp;test result

let b = a.replacingOccurrences(of: "&[^;]+;", with: "", options: String.CompareOptions.regularExpression, range: nil)

b will now be: test result

This also removes other entities such as &lt; (it deletes them; it does not decode them). There is no magic: find out what you need to match, then write the proper regex.
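
If the entities should be decoded rather than dropped, one option is a small lookup table. The following is a minimal sketch, assuming only a handful of named entities matter; the `strippingHTML` name and the entity table are hypothetical and not part of the answer above:

```swift
import Foundation

extension String {
    /// Hypothetical helper: strips tags, then decodes a few common entities.
    func strippingHTML() -> String {
        // Same tag pattern as above: "<", one or more non-">" characters, ">".
        let withoutTags = replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)

        // Illustrative, deliberately incomplete entity table.
        // "&nbsp;" is mapped to a regular space here (an assumption; it is really U+00A0).
        // "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;" and not "<".
        let entities: [(String, String)] = [
            ("&nbsp;", " "),
            ("&lt;", "<"),
            ("&gt;", ">"),
            ("&quot;", "\""),
            ("&amp;", "&")
        ]

        var result = withoutTags
        for (entity, character) in entities {
            result = result.replacingOccurrences(of: entity, with: character)
        }
        return result
    }
}

let content = "<p>&nbsp;&nbsp;test result</p><br/>"
print(content.strippingHTML())
```

Numeric entities such as &#39; are not handled by this table; text that contains them needs either more entries or a real HTML parser.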

15

Tested with Swift 4. This removes all HTML tags and decodes entities, and it gives more stable results:

extension String {
    public var withoutHtml: String {
        guard let data = self.data(using: .utf8) else {
            return self
        }

        let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]

        guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
            return self
        }

        return attributedString.string
    }
}

1 Comment

Beware. There are known issues with this approach. forums.developer.apple.com/thread/115405
13

For this we can use the following extension:

extension String {
    var withoutHtmlTags: String {
        return self
            .replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
            .replacingOccurrences(of: "&[^;]+;", with: "", options: .regularExpression, range: nil)
    }
}

8

Use the following extension, tested in a Playground with Swift 3.0:

extension String {
    var withoutHtmlTags: String {
      return self.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
    }
}

Usage

let result = "<strong>HTML</strong> Tags <em>Contain</em> <img /> <a href=\"\">String</a>".withoutHtmlTags
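
Stripping tags this way leaves the whitespace around them in place, so the sample string ends up with two spaces where `<img />` used to be. Here is a small sketch of one way to tidy that up, assuming collapsed whitespace is acceptable for your use case (the `collapsingWhitespace` name is hypothetical):

```swift
import Foundation

extension String {
    /// Hypothetical helper: collapses runs of whitespace and trims both ends.
    var collapsingWhitespace: String {
        // "\\s+" matches one or more whitespace characters, including newlines.
        let collapsed = replacingOccurrences(of: "\\s+", with: " ", options: .regularExpression, range: nil)
        return collapsed.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

// Strip the tags with the same pattern as the extension above.
let stripped = "<strong>HTML</strong> Tags <em>Contain</em> <img /> <a href=\"\">String</a>"
    .replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)

print(stripped.collapsingWhitespace)
```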

4

Try to build an attributed string:

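// Swift 3 syntax; in Swift 4 and later the option is [.documentType: NSAttributedString.DocumentType.html]
if let data = content.data(using: .utf8) {
    let options = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType] as [String : Any]
    // try? avoids a crash if the HTML cannot be parsed; content is left unchanged in that case
    if let attrStr = try? NSAttributedString(data: data, options: options, documentAttributes: nil) {
        content = attrStr.string
    }
}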

2

I used extensions on String and Data. First I convert the HTML to an NSAttributedString, and then I convert that to a plain String.

extension String {
    var htmlToAttributedString: NSAttributedString? {
        return Data(utf8).htmlToAttributedString
    }

    var htmlToString: String {
        return htmlToAttributedString?.string ?? ""
    }
}

extension Data {
    var htmlToAttributedString: NSAttributedString? {
        // Converts html to a formatted string.
        do {
            return try NSAttributedString(data: self, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print("error:", error)
            return nil
        }
    }
    var htmlToString: String {
        return htmlToAttributedString?.string ?? ""
    }
}

Example:

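let html = "<div><p>Example</p></div>"
html.htmlToString // htmlToString is a property, not a method; returns the text "Example" (NSAttributedString may append a trailing newline)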

1

Add this extension:

extension String {
    func removeHTMLTag() -> String {
        return self.replacingOccurrences(of: "<[^>]+>", with: "", options: String.CompareOptions.regularExpression, range: nil)
    }
}

and use it like this:

let htmlString: String = "<div> <p>I cannot understand </p> </div>"

htmlString.removeHTMLTag() // " I cannot understand  " (the whitespace between the tags is kept)
