0

I get HTML strings from an API like these:

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
let b: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
let c: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a><a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"

let splitedArray1: [String] = a.componentsSeparatedByString("?????") // which separator string is best here?
let splitedArray2: [String] = b.componentsSeparatedByString("?????") // which separator string is best here?
let splitedArray3: [String] = c.componentsSeparatedByString("?????") // which separator string is best here?

I want to separate the links from the text and get data like the following:

print(splitedArray1) //["https://www.google.com.tw","https://www.google.com.tw"]
print(splitedArray2) //["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","Hello Tim"]
print(splitedArray3) //["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","https://www.google.com.tw","https://www.google.com.tw "]

4 Answers

1

Possible solution: use NSAttributedString, then enumerate the link attribute (NSLinkAttributeName). If a range has no link attribute, there was no link tag there, so you keep just the string; otherwise, you add the link, then the string.

Quickly written in Playground:

import UIKit // the HTML importer for NSAttributedString needs UIKit (or AppKit on macOS)

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
let b: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
let c: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a><a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"

let values: [String] = [a, b, c]

for aHTMLString in values
{
    let attributedString = try! NSAttributedString.init(data: aHTMLString.data(using: .utf8)!,
                                                        options: [.documentType: NSAttributedString.DocumentType.html],
                                                        documentAttributes: nil)
    var retValues = [String]()
    attributedString.enumerateAttribute(.link,
                                        in: NSRange(location: 0, length: attributedString.length), // NSRange counts UTF-16 units, not Characters
                                        options: [],
                                        using: { (attribute, range, pointerStop) in
                                            if let attribute = attribute as? URL
                                            {
                                                retValues.append(attribute.absoluteString)
                                            }
                                            let subString = (attributedString.string as NSString).substring(with: range)
                                            retValues.append(subString)
    })

    print("*** retValues: \(retValues)")
}

let targetResult1 = ["https://www.google.com.tw","https://www.google.com.tw"]
let targetResult2 = ["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","Hello Tim"]
let targetResult3 = ["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","https://www.google.com.tw","https://www.google.com.tw "]
print("targetResult1: \(targetResult1)")
print("targetResult2: \(targetResult2)")
print("targetResult3: \(targetResult3)")

Output:

*** retValues: ["https://www.google.com.tw/", "https://www.google.com.tw "]
*** retValues: ["myappname://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
*** retValues: ["myappname://app/user/aa3b77411825b88b318d77gg", "@Tim ", "https://www.google.com.tw/", "https://www.google.com.tw "]
targetResult1: ["https://www.google.com.tw", "https://www.google.com.tw"]
targetResult2: ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
targetResult3: ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "https://www.google.com.tw", "https://www.google.com.tw "]

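There are small differences: I copied your "target" values (splitedArray), and the first one is missing the trailing space in its last element. My code also tends to add a final "/" to links and, as the output shows, the URL scheme comes out lowercased.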

1

I've created this extension to get the URL.

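import Foundation

extension String {
    // Takes the last piece after splitting on ">", then returns the part of it
    // before the first "<" (the link text, for a single <a> element).
    func getUrl() -> String? {
        let rss = self.split { (char) -> Bool in
            return char == ">"
        }
        if let final = rss.last?.split(separator: "<"), let first = final.first {
            return String(first)
        }
        return nil
    }

    // Takes the text after the last `="` and returns it up to the next `"`
    // (the href value when href is the last attribute).
    var hrefUrl: String {
        let matchString = "=\""
        let arrComponents = self.components(separatedBy: matchString)
        if let first = arrComponents.last, let str = first.split(separator: "\"").first {
            return String(str)
        }
        return ""
    }
}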

Usage:

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
a.getUrl()  // Optional("https://www.google.com.tw ") (the link text, trailing space included)

//or

a.hrefUrl // "https://www.google.com.tw" (the href value)

1 Comment

Is this Swift 4? I tested it in a Swift 3 playground and "let rss = self.split ......." gives an error: String doesn't have a split function.

0

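Simple solution without libraries: use String.replacingOccurrences(of:with:) to replace the markup pieces (the opening <a href=", the "> that ends the tag, and the closing </a>) with a separator character such as "|", then use components(separatedBy: "|") (componentsSeparatedByString("|") in Swift 2) to get your components.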
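
A minimal sketch of that idea, assuming the tags always look exactly like the ones in the question (a single href attribute, double quotes) and that the separator never appears in the text itself. The helper name splitAnchors is hypothetical, not part of any library:

```swift
import Foundation

// Hypothetical helper: swaps the anchor markup for a separator, then splits on it.
// Assumes tags of the exact form <a href="...">...</a> and that the separator
// does not occur in the content.
func splitAnchors(_ html: String, separator: String = "|") -> [String] {
    let flattened = html
        .replacingOccurrences(of: "<a href=\"", with: separator)
        .replacingOccurrences(of: "\">", with: separator)
        .replacingOccurrences(of: "</a>", with: separator)
    // Drop the empty pieces left where two separators touch.
    return flattened.components(separatedBy: separator).filter { !$0.isEmpty }
}

let sampleB = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
print(splitAnchors(sampleB))
// ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
```

Unlike the NSAttributedString approach, this keeps the URL exactly as written (no added "/" and no lowercased scheme), but it breaks as soon as the markup deviates from the assumed shape.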

Comments

0

Use a regular expression to extract the URL. Here is a code snippet:

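        import Foundation

        let text = "<a href=\"https://www.google.com\">"

        // Capture group 1 is the value of the href attribute.
        let regex = try! NSRegularExpression(pattern: "<a[^>]+href=\"(.*?)\"[^>]*>")
        // NSRange is measured in UTF-16 units, so use the NSString length.
        let range = NSMakeRange(0, (text as NSString).length)
        let matches = regex.matches(in: text, range: range)
        for match in matches {
            // range(at:) is the Swift 4+ name; in Swift 3 it is rangeAt(_:).
            let strURL = (text as NSString).substring(with: match.range(at: 1))
            print(strURL)
        }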

1 Comment

But my text is <a href=\"google.com.tw\">google.com.tw </a>, and I want to get an array, not a single string.
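
One way to get an array out of the regular-expression approach is to add a second capture group for the link text and append both groups for every match. This is only a sketch: the helper name linkParts is hypothetical, and the pattern does not capture text outside the <a> tags (such as "Hello Tim" in the question's second string).

```swift
import Foundation

// Hypothetical helper: returns [href, linkText, href, linkText, ...].
// Text outside <a>...</a> is not captured by this pattern.
func linkParts(in html: String) -> [String] {
    // Group 1: href value. Group 2: text between the opening and closing tag.
    let pattern = "<a[^>]+href=\"(.*?)\"[^>]*>(.*?)</a>"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let source = html as NSString
    let fullRange = NSRange(location: 0, length: source.length)
    var parts = [String]()
    for match in regex.matches(in: html, options: [], range: fullRange) {
        parts.append(source.substring(with: match.range(at: 1)))
        parts.append(source.substring(with: match.range(at: 2)))
    }
    return parts
}

let sampleC = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a><a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
print(linkParts(in: sampleC))
// ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "https://www.google.com.tw", "https://www.google.com.tw "]
```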
