
I am currently getting an error response in HTML format. It is of type string.

"<!DOCTYPE html>\r\n
<html>
  <head>
    <title>Data already exists</title>
  </head>
</html>"

I want to retrieve the content inside the <title>, which in the example above is "Data already exists". Can anybody suggest an appropriate regular expression to capture that text?

Any help is appreciated!

  • I really appreciate everyone's suggestion and thanks for taking time to share the knowledge. You guys are awesome. Commented Aug 29, 2012 at 14:07

3 Answers


First, you can do it without regex, by creating a dummy element to inject the HTML:

var s = "your_html_string";
var dummy = document.createElement("div");
dummy.innerHTML = s;
var title = dummy.getElementsByTagName("title")[0].textContent; // textContent is the standard property and works on detached elements

But if you really insist on using regex:

var s = "your_html_string";
var title = s.match(/<title>([^<]+)<\/title>/)[1];

Here's a DEMO illustrating both approaches.
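
As a rough sketch, the regex approach can be made a little more defensive. The helper name `extractTitle` is made up for illustration, and the code assumes the response contains a single, simple `<title>` element. It tolerates attributes on the tag, mixed case, and line breaks inside the title, and it returns null instead of throwing when nothing matches:

```javascript
// Hypothetical helper (not from the original answer): pull the text of the
// first <title> element out of an HTML string.
function extractTitle(html) {
  // <title\b[^>]*>  opening tag, optionally with attributes
  // ([\s\S]*?)      title text, non-greedy, may span several lines
  // <\/title\s*>    closing tag
  var match = /<title\b[^>]*>([\s\S]*?)<\/title\s*>/i.exec(html);
  // exec() returns null when nothing matches, so guard before indexing.
  return match ? match[1].trim() : null;
}

var response = "<!DOCTYPE html>\r\n<html>\n  <head>\n    <title>Data already exists</title>\n  </head>\n</html>";
console.log(extractTitle(response));               // Data already exists
console.log(extractTitle("<p>no title here</p>")); // null
```

Note that this still does not decode character entities such as `&amp;`, so it is only suitable when the error format is known and simple.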


5 Comments

You don't need to use getElementsByTagName, there is a document.title property that is more convenient. Also, the title element can have attributes, so the regular expression needs to be more sophisticated (parsing HTML with a regular expression is generally a bad idea).
@RobG: I absolutely agree that parsing HTML with a regex is generally a bad idea; however, the OP explicitly said that it was an error response that follows the above format. document.title will get the current document's title. Note that the OP is not trying to parse the current document but a specific response message (probably from an ajax call).
Hmm... One line of regex, or three lines of dummy element manipulation? One or three? I know which I'd choose. (I too agree that in a general sense parsing HTML with regex is not the way to go, but as you said João, for a specific case with a known format I think it is OK.)
Yes, all good. The OP could use the response text to create a new document, then just use document.title.

The very basics of parsing HTML tags with a regex look like this: http://jsbin.com/oqivup/1/edit

var text = /<(title)>(.+)<\/\1>/.exec(html).pop();

But for more complicated stuff I would consider using a proper parser.

2 Comments

Given the response is already a string can't you skip the jQuery line?

You could parse it using DOMParser():

var parser=new DOMParser(),
    doc=parser.parseFromString("<!DOCTYPE html><html><head><title>Data already exists</title></head></html>","text/html");

doc.title; /* "Data already exists" */
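
To read elements from the parsed response rather than from the current page, call the query methods on the returned document itself. The sketch below is one way to wrap that up; `titleFromResponse` is a made-up name, and because not every environment provides DOMParser with `text/html` support, it assumes a plain `<title>` tag and falls back to a simple regex when parsing is unavailable:

```javascript
// Hypothetical helper: get the <title> text of an HTML response string.
function titleFromResponse(html) {
  var doc = null;
  if (typeof DOMParser !== "undefined") {
    try {
      // parseFromString builds a separate document; the page is untouched.
      doc = new DOMParser().parseFromString(html, "text/html");
    } catch (e) {
      doc = null; // some older browsers throw for "text/html"
    }
  }
  if (doc) {
    // Query the parsed document, not the global `document`.
    var el = doc.querySelector("title");
    return el ? el.textContent : null;
  }
  // Fallback when DOMParser is missing or returned nothing.
  var match = /<title>([^<]*)<\/title>/i.exec(html);
  return match ? match[1] : null;
}

console.log(titleFromResponse("<!DOCTYPE html><html><head><title>Data already exists</title></head></html>"));
```

The same idea applies with jQuery: passing the parsed document as the context, for example `$(doc).find("div.cc")` or `$("div.cc", doc)`, searches the parsed response instead of the current page.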

10 Comments

You probably need to use an ActiveXObject for IE < 9.
And how can we use the doc variable with jQuery?
@DariushJafari Do you mean $(doc)?
Chrome 23 Canary doesn't parse HTML with DOMParser though. If the HTML string is XML-valid, you can always use the application/xml parsing for cross-browser parsing.
@Oriol how do you select some elements of doc? $('div.cc') selects the current document elements.