
I'm trying to pseudo-translate the text embedded within HTML in a string. I don't want to touch the actual HTML tags or their attributes, just the content.

So for example, if I have something like:

<td colspan='2'><a>This is a Text in <b>Bold</b></a></td>

I want this to be eventually modified into

<td colspan='2'><a>Thìs ís à Tèxt îñ <b>Bòlð</b></a></td>

1) I can't use any third-party libraries, so I'm using standard regex to parse the HTML.
2) I tried both pattern.match() and pattern.split(), but both seem to have limitations. pattern.split() splits the string on a regex pattern, but I lose the matched pattern (the tags) in the process. pattern.match() retains the pattern, but I can't guarantee the markup.

So ideally I would want something to take the string with HTML and break it into an array like

array[0]: HTML Tag
array[1]: Plain Text
array[2]: HTML Tag
array[3]: Plain Text
array[4]: HTML Tag
array[5]: Plain Text
array[6]: HTML Tag

Any ideas?

  • I'd look into an HTML parsing lib like Jsoup to do what you want Commented Nov 30, 2015 at 21:48
  • @RyanJ in his question he states: 1) I can't use any third party libraries, so I'm using standard regex to parse html Commented Nov 30, 2015 at 21:49
  • Show us the code that you are using at the moment. Commented Nov 30, 2015 at 21:50
  • Forcing you to use regex for parsing any html/xml style language is lunacy. The first rule of regex re:html is don't parse html with regex. stackoverflow.com/questions/590747/… Commented Nov 30, 2015 at 21:52
  • Possible duplicate of Question about parsing HTML using Regex and Java Commented Nov 30, 2015 at 21:54

1 Answer


As a regex, you could use this one:

(?<=>)[^>]+(?=<)

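I'm assuming here that you have a replace function that can take each matched text run and mangle it. As a placeholder, this replaces every text run with an empty string:

// Strings are immutable: replaceAll returns a new string, so keep the result.
String str = "<td colspan='2'><a>This is a Text in <b>Bold</b></a></td>";
String result = str.replaceAll("(?<=>)[^>]+(?=<)", "");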
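
To make the "replace function" idea concrete, here is a minimal sketch that reuses the regex above with Matcher.appendReplacement, so each text run is passed through a custom method instead of being replaced by a fixed string. The class name PseudoTranslator, the methods pseudo and translate, and the vowel mapping are all hypothetical placeholders, not something from the question; swap in whatever pseudo-translation you actually need.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PseudoTranslator {
    // Same regex as above: text preceded by '>' and followed by '<'.
    private static final Pattern TEXT = Pattern.compile("(?<=>)[^>]+(?=<)");

    // Hypothetical mapping (an assumption): lowercase ASCII vowels become
    // grave-accented look-alikes; every other character is left alone.
    static String pseudo(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            switch (c) {
                case 'a': out.append('\u00e0'); break;
                case 'e': out.append('\u00e8'); break;
                case 'i': out.append('\u00ec'); break;
                case 'o': out.append('\u00f2'); break;
                case 'u': out.append('\u00f9'); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Rebuilds the string, passing only the matched text runs through pseudo().
    static String translate(String html) {
        Matcher m = TEXT.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the text from being
            // interpreted as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(pseudo(m.group())));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "<td colspan='2'><a>This is a Text in <b>Bold</b></a></td>";
        System.out.println(translate(str));
    }
}
```

For the sample string, the tags and the colspan attribute come back unchanged and only the two text runs are altered. Text before the first tag or after the last one is not matched by this regex, so it would pass through untouched.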

However, without knowing how you intend to "pseudotranslate" a string, we can't really help you further. For custom replacement methods, this answer may be useful.
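
If you would rather have the tag/text array described in the question, one possible approach is to match tokens with Matcher.find() instead of splitting, so the tags are kept rather than consumed as delimiters. The sketch below is only an illustration: the class name HtmlTokenizer, the method tokenize, and the token regex are my own assumptions, and the regex will misbehave on input such as a '>' inside an attribute value, comments, or script blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlTokenizer {
    // A token is either one whole tag ("<...>") or a run of characters
    // that contains no '<' (plain text).
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<]+");

    // Returns tags and text runs in document order.
    static List<String> tokenize(String html) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String str = "<td colspan='2'><a>This is a Text in <b>Bold</b></a></td>";
        for (String token : tokenize(str)) {
            // A token starting with '<' is a tag; anything else is text.
            System.out.println((token.startsWith("<") ? "TAG : " : "TEXT: ") + token);
        }
    }
}
```

Adjacent tags produce consecutive tag tokens, so the result does not strictly alternate tag/text as in the question's layout. You can then transform only the tokens that do not start with '<' and concatenate everything back together.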
