
I am using the following version of JSoup (with Java 1.7):

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.11.3</version>
</dependency>

My code:

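import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HtmlTagUtils {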

    private static String mockHtml = "<asset-entity type=\"photo\" id=\"1236ad76-7433-fs34-50d1-b12bdbc308899af\">"
            + "</asset-entity>\r\nAngelie Jolie was seen at Wholefoods with ex-beau Brad Pitt.\r\n <asset-entity type=\"photo\" id=\"2346fe7d-c175-c380-4ab2-dda068b42b033dvf\">"
            + "</asset-entity>\r\n- The majority of their kids were with them.\r\n<asset-entity type=\"video\" id=\"45064086-5d85-1866-4afc-a523c04c2b3e43b6\"> </asset-entity>\r\n";

    public static List<String> extractIdsForPhotos(String html) {
        Document doc = Jsoup.parse(html);
        Elements elements = doc.select("asset-entity[type=photo]");
        List<String> photos = new ArrayList<>();
        for (Element element : elements) {
            // The selector already filters on type, so only the id is needed.
            String id = element.attr("id");
            photos.add(id);
        }
        return photos;
    } 

    public static List<String> extractIdsForVideos(String html) {
        Document doc = Jsoup.parse(html);
        Elements elements = doc.select("asset-entity[type=video]");
        List<String> videos = new ArrayList<>();
        for (Element element : elements) {
            // The selector already filters on type, so only the id is needed.
            String id = element.attr("id");
            videos.add(id);
        }
        return videos;
    }

    public static void main(String[] args) {
        List<String> photoIds = extractIdsForPhotos(mockHtml);
        for (String photoId : photoIds) {
            System.out.println("\n\tphotoId: " + photoId);
        }

        List<String> videoIds = extractIdsForVideos(mockHtml);
        for (String videoId : videoIds) {
            System.out.println("\n\tvideoId: " + videoId);
        }
    }       
}

This prints the following to stdout:

photoId: 1236ad76-7433-fs34-50d1-b12bdbc308899af
photoId: 2346fe7d-c175-c380-4ab2-dda068b42b033dvf
videoId: 45064086-5d85-1866-4afc-a523c04c2b3e43b6

I am able to find the necessary assets based on these ids, but my question is how to replace the entire tag (along with its contents, inline) using JSoup. For example, for photos, how do I replace:

<asset-entity type="photo" id="4806ad76-7433-fs34-50d1-b12bdbc308899ad"></asset-entity>

with:

<img src="AngelinaJolie.jpg"> 

So the converted HTML would look like this:

<img src="AngelinaJolie.jpg">\r\nAngelie Jolie was seen at Wholefoods with ex-beau Brad Pitt.\r\n <img src="BradPitt.jpg">\r\n- The majority of their kids were with them.\r\n<video><source src="Brangelina.mp4" type="video/mp4"></video>\r\n

Can anyone point me in the right direction?

  • It looks like you're trying to reinvent template engines. Unless there's an existing system that you're needing to replace exactly, use something like Thymeleaf. Commented Oct 16, 2018 at 8:46
  • @Chrylis I am not trying to reinvent template engines. Maybe my post was too detailed. I just want to know how to replace a custom HTML tag with content that I already have. Commented Oct 16, 2018 at 8:59

1 Answer

You can change the element's tag name with tagName() and then replace its attributes with your own.

        Document doc = Jsoup.parse(html);
        // Disable pretty-printing so the original whitespace is not re-indented.
        doc.outputSettings().prettyPrint(false);
        Elements elements = doc.select("asset-entity[type=photo]");
        for (Element element : elements) {
            // Rename <asset-entity> to <img> and swap its attributes.
            element.tagName("img");
            element.removeAttr("type");
            element.removeAttr("id");
            element.attr("src","AngelinaJolie.jpg");
        }
        // Jsoup.parse() wraps the input in <html><head></head><body>;
        // body().html() returns only the converted fragment.
        String formattedHtml = doc.body().html();

2 Comments

Thanks for responding. This handles only one tag; as I said, I have different tags with different content (e.g. photo and video). How can I do this if I have a list of different <img src> values based on the list of ids?
If you have the src for each element's id, for example in a HashMap mapping each id to its corresponding src, then you can look up each element's id in the map and set that element's src.
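
The map lookup suggested in the last comment could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name AssetEntityReplacer, the method replaceAssets, and the two maps (id to file name) are hypothetical names introduced here, and the video/mp4 type is taken from the desired output in the question. It builds new elements and swaps them in with replaceWith() instead of renaming the tag, so the video case can get its nested <source> child. Entities whose id is not in the map are left untouched.

```java
import java.util.HashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AssetEntityReplacer {

    // Hypothetical helper: replaces every <asset-entity> whose id is found
    // in the matching map; the maps (id -> src) are assumed to be built elsewhere.
    public static String replaceAssets(String html,
                                       Map<String, String> photoSrcById,
                                       Map<String, String> videoSrcById) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings().prettyPrint(false);

        for (Element entity : doc.select("asset-entity[type=photo]")) {
            String src = photoSrcById.get(entity.attr("id"));
            if (src == null) {
                continue; // unknown id: keep the original tag
            }
            Element img = doc.createElement("img").attr("src", src);
            entity.replaceWith(img);
        }

        for (Element entity : doc.select("asset-entity[type=video]")) {
            String src = videoSrcById.get(entity.attr("id"));
            if (src == null) {
                continue; // unknown id: keep the original tag
            }
            Element video = doc.createElement("video");
            // The MIME type is an assumption based on the question's example.
            video.appendElement("source").attr("src", src).attr("type", "video/mp4");
            entity.replaceWith(video);
        }

        // Return only the body content, without the <html>/<head> wrapper.
        return doc.body().html();
    }

    public static void main(String[] args) {
        Map<String, String> photos = new HashMap<String, String>();
        photos.put("1236ad76-7433-fs34-50d1-b12bdbc308899af", "AngelinaJolie.jpg");
        photos.put("2346fe7d-c175-c380-4ab2-dda068b42b033dvf", "BradPitt.jpg");

        Map<String, String> videos = new HashMap<String, String>();
        videos.put("45064086-5d85-1866-4afc-a523c04c2b3e43b6", "Brangelina.mp4");

        String html = "<asset-entity type=\"photo\" id=\"1236ad76-7433-fs34-50d1-b12bdbc308899af\"></asset-entity>"
                + "\r\nAngelie Jolie was seen at Wholefoods with ex-beau Brad Pitt.\r\n"
                + "<asset-entity type=\"video\" id=\"45064086-5d85-1866-4afc-a523c04c2b3e43b6\"> </asset-entity>";
        System.out.println(replaceAssets(html, photos, videos));
    }
}
```

Replacing the node rather than renaming it also avoids carrying over any children of the original tag (such as the single space inside the video entity in the question's sample), which would otherwise end up inside the <img>.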
