
I have the following text:

<p>Some text to extract</p>

Is there a way to get only the text between the tags in AS3, that is, just "Some text to extract"?

I have tried using a regular expression:

string.match(/<p>(.*?)<\/p>/g)

but it returns the match with the <p> tags included.

Similarly, I also need to extract the text from:

<caption><![CDATA[<p>Some text to extract.<span> -- Span text</span></p>]]></caption>

Thanks

2 Answers

1

This should do it :)

var reg:RegExp = /<p>(.*?)<\/p>/gi;

var str:String = "<p>Some text to extract</p>";

var raw:String = str.replace(reg, "$1");

trace("str", str);//str <p>Some text to extract</p>
trace("raw", raw);//raw Some text to extract
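
As a side note on why the original match() call returned the tags: with the g flag, String.match() returns the full matched substrings, not the capture groups. A minimal sketch of an alternative, assuming a single <p> element in the input, is to call RegExp.exec() without the g flag and read group 1 from the result (the variable names here are only illustrative):

```actionscript
// Illustrative sketch: read the capture group directly instead of replacing.
var pattern:RegExp = /<p>(.*?)<\/p>/i; // no g flag: exec() reports the first match
var source:String = "<p>Some text to extract</p>";

var result:Object = pattern.exec(source); // null when nothing matches
if (result != null) {
    // result[0] is the whole match, result[1] is the first parenthesised group
    trace(result[1]); // Some text to extract
}
```

Unlike replace(), this leaves any text outside the <p> element out of the result, which may or may not be what you want.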

4 Comments

OK, thanks, this works. But if I have CDATA or other tags, as in <caption><![CDATA[<p>Some text to extract.<span> -- Span text</span></p>]]></caption>, how can I remove all the tags and CDATA info using a regex?
In the same way: first fortify :) yourself with the RegExr tool, then with regex tutorials, and write the regex that fits your needs. You may also find one already written by someone :)
Yes, thanks, I got it working using regular expressions only, though I had to do it in more than one step.
My point is that you can write a regex that recognises any tag and fetches its content, so if you know which tags you want to target, you could write a single regex that matches those tags (so just one step).
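
The one-step idea from the last comment could look roughly like the sketch below. It is not the code the asker ended up with (that was never posted); it is an assumed pattern that removes the CDATA markers and any opening or closing tag in a single replace(), and it only suits simple, trusted markup like the example in the question:

```actionscript
// Illustrative sketch: strip CDATA markers and all tags in one pass.
// The alternation removes "<![CDATA[", "]]>", and anything shaped like <tag ...> or </tag>.
var stripPattern:RegExp = /<!\[CDATA\[|\]\]>|<\/?[a-z][^>]*>/gi;
var caption:String = "<caption><![CDATA[<p>Some text to extract.<span> -- Span text</span></p>]]></caption>";

var plain:String = caption.replace(stripPattern, "");
trace(plain); // Some text to extract. -- Span text
```

A pattern like this does not understand nesting, comments, or a ">" inside attribute values, so for anything less regular than the sample input a real parser is the safer choice.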
1

If your markup is well-formed, you could try parsing it as XML. This works on your given example:

var input:String = "<p>Some text to extract</p>";
var xml:XML = new XML(input);
trace(xml.text().toString()); // traces "Some text to extract"

Edit

The following is not really a clean answer...I couldn't get it until I spent some time messing with it. You may not want to accept this as an answer, but I'm posting it as I did manage to get the result...maybe someone else can make it cleaner.

I had never really encountered a case where the node I'm interested in (the <p> node in this case) had text content AND a child node (same with CDATA in my xml). The code below is the result of some random guessing and checking the API. Learn something new every day. =b

var inputString:String = "<caption><![CDATA[<p>Some text to extract.<span> -- Span text</span></p>]]></caption>";

var xml:XML = new XML(inputString);

// toString() on an element that holds only text/CDATA returns just that text, so the caption and CDATA wrappers drop out...but the result is still one text node, not parsed markup
trace(xml); // traces out: <p>Some text to extract.<span> -- Span text</span></p>

xml = new XML(xml.toString()); // turn this into xml again

trace(xml); // this looks better now...traces out the expected xml

trace("{"+ xml.p +"}"); // traces "{}": xml is already the <p> element, so xml.p looks for a child <p> and finds none
trace(xml.span); // traces out the expected span tag contents: "-- Span text"

trace(xml.descendants()[0]); // traces out "Some text to extract." -got it!
trace(xml.descendants()[1]); // traces out "-- Span text"

2 Comments

Thanks, this worked out. But if I have something like <caption><![CDATA[<p>Some text to extract.<span> -- Span text</span></p>]]></caption>, will XML parsing work here?
And of course this only works when inputString contains valid XML, and we all know that HTML is not always valid XML :) In that case new XML(inputString) throws an error.
