I have a string

String customHtml = "<html><body><iframe src=https://zarabol.rediff.com/widget/end-of-cold-war-salman-hugs-abhishek-bachchan?search=true&header=true id=rediff_zarabol_widget name=rediff_zarabol_widget scrolling=auto transparency= frameborder=0 height=500 width=100%></iframe></body></html>";

I need to replace the last path segment of the URL (the part just before the query string) with another string. In the above example, replace

end-of-cold-war-salman-hugs-abhishek-bachchan

with

srk-confesses-found-gauri-to-be-physically-attractive

I tried a lazy pattern of the form /begin.*?end/, but it fails. Any help will be highly appreciated. Thanks in advance.

  • You definitely need an HTML parser. Do not use regex for HTML content; it is slow. Commented Jul 22, 2014 at 7:27
  • I don't think regex is the right tool for parsing HTML. You might gain something from reading stackoverflow.com/q/1732348 Commented Jul 22, 2014 at 7:27
  • Try using jsoup to parse the HTML instead of a regex-based match. Commented Jul 22, 2014 at 7:29
  • I think regex is the right choice here, because an HTML parser can only extract the whole URL in this case. Commented Jul 22, 2014 at 7:30
  • @Faheem Are all these tags on a single line? Commented Jul 22, 2014 at 7:31

4 Answers

2

Regex:

(?<=\/)[^\/]*(?=\?)

Java regex:

(?<=/)[^/]*(?=\\?)

Replacement string:

srk-confesses-found-gauri-to-be-physically-attractive

The Java code would be:

String url= "<html><body><iframe src=https://zarabol.rediff.com/widget/end-of-cold-war-salman-hugs-abhishek-bachchan?search=true&header=true id=rediff_zarabol_widget name=rediff_zarabol_widget scrolling=auto transparency= frameborder=0 height=500 width=100%></iframe></body></html>";
String m1 = url.replaceAll("(?<=\\/)[^\\/]*(?=\\?)", "srk-confesses-found-gauri-to-be-physically-attractive");
System.out.println(m1);

1 Comment

No need to escape forward slashes.
1

This should do it:

url = url.replaceAll("(?<=/)[^/?]+(?=\\?)", "your new text");
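
For reference, the one-liner above can be wrapped in a small self-contained program. This is only a sketch: the class name `LastSegmentLookaround` and the method `replaceLastSegment` are made up for illustration, and the sample input is a shortened version of the string from the question. `Matcher.quoteReplacement` is added as a precaution in case the new text ever contains `$` or `\`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LastSegmentLookaround {
    // Matches a run of characters that are neither '/' nor '?',
    // preceded by '/' and followed by '?'. Only the run itself is consumed.
    private static final Pattern LAST_SEGMENT = Pattern.compile("(?<=/)[^/?]+(?=\\?)");

    static String replaceLastSegment(String html, String newSegment) {
        // quoteReplacement makes '$' and '\' in the new text literal.
        return LAST_SEGMENT.matcher(html).replaceAll(Matcher.quoteReplacement(newSegment));
    }

    public static void main(String[] args) {
        String html = "<iframe src=https://zarabol.rediff.com/widget/"
                + "end-of-cold-war-salman-hugs-abhishek-bachchan?search=true&header=true></iframe>";
        System.out.println(replaceLastSegment(html,
                "srk-confesses-found-gauri-to-be-physically-attractive"));
    }
}
```

Because the lookbehind and lookahead do not consume characters, the surrounding `/` and `?` stay in place. The pattern assumes the URL actually has a query string; without a `?` nothing is replaced.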

0

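Regex: [^/]*?(?:\?)

You must remove "/" from the regex. Note that the non-capturing group (?:\?) still consumes the "?", so the replacement string has to end with "?" to keep the query string intact.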
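
To see what this pattern actually matches before using it in a replacement, a quick check along these lines may help. It is a minimal sketch; the class name `ConsumedQuestionMark` and the method `firstMatch` are hypothetical, and the input is a shortened form of the URL from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class used only to inspect the match.
class ConsumedQuestionMark {
    static String firstMatch(String input) {
        // Same pattern as above, with the backslash doubled for a Java string literal.
        Matcher m = Pattern.compile("[^/]*?(?:\\?)").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String url = "https://zarabol.rediff.com/widget/"
                + "end-of-cold-war-salman-hugs-abhishek-bachchan?search=true";
        // The printed match includes the trailing '?'.
        System.out.println(firstMatch(url));
    }
}
```

A non-capturing group only suppresses the capture; the characters it matches are still part of the overall match and are therefore replaced.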

0

As others have said, a DOM parser would be better. For completeness, here is a regex solution that will work for your input:

String replaced = yourString.replaceAll("(https://\\S+/)[^?]+", 
                    "$1srk-confesses-found-gauri-to-be-physically-attractive");

Explanation

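  • (https://\\S+/) captures into Group 1 the literal https://, a run of non-whitespace characters (\S+), and a forward slash /. Because \S+ is greedy, it gives back only as much as needed, so this slash is the last / in the URL
  • [^?]+ matches any characters that are not ? (the text to replace)
  • We replace the match with $1 (Group 1, unchanged) followed by the text you specified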
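
The capture-group approach described above can be tried out with a short program like the following. Treat it as a sketch under assumptions: the class name `CaptureGroupReplace` and the method `replaceSlug` are invented for illustration, and `Matcher.quoteReplacement` is an added precaution so that a `$` or `\` in the new text is not read as a group reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class CaptureGroupReplace {
    // Group 1: "https://" plus non-whitespace characters up to the last '/'.
    // The trailing [^?]+ is the segment that gets replaced.
    private static final Pattern URL_PREFIX = Pattern.compile("(https://\\S+/)[^?]+");

    static String replaceSlug(String html, String newSlug) {
        // "$1" re-inserts the captured prefix; the new slug is quoted to stay literal.
        return URL_PREFIX.matcher(html).replaceAll("$1" + Matcher.quoteReplacement(newSlug));
    }

    public static void main(String[] args) {
        String customHtml = "<html><body><iframe src=https://zarabol.rediff.com/widget/"
                + "end-of-cold-war-salman-hugs-abhishek-bachchan?search=true&header=true"
                + " id=rediff_zarabol_widget name=rediff_zarabol_widget scrolling=auto"
                + " transparency= frameborder=0 height=500 width=100%></iframe></body></html>";
        System.out.println(replaceSlug(customHtml,
                "srk-confesses-found-gauri-to-be-physically-attractive"));
    }
}
```

One limitation worth keeping in mind: since `\S+` runs to the next whitespace, a `/` inside the query string would shift the match to the wrong place, so this fits inputs shaped like the one in the question rather than arbitrary URLs.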
