Detect all images with JavaScript in an HTML page

Question

I am writing a Chrome extension and, in its JS code, I am trying to detect all images on a webpage. By "all" I mean:

1. Images that are loaded once the webpage is loaded
2. Images that are used as backgrounds (either in the CSS or inline HTML)
3. Images that could be loaded after the webpage is done loading. For instance, when doing a Google Images search it is easy to find all the result images, but once you click on one image to make it bigger, that image is not detected. The same goes for browsing social media websites.

The code that I have right now makes it easy to find the initial images (1). But I struggle with the other two parts (2) and (3).

Here is my current code in contentScript.js:

var images = document.getElementsByTagName('img');
for (var i = 0, l = images.length; i < l; i++) {
    //Do something
}

How should I modify it so that it can actually detect all the other images (2 and 3)?

I have seen a couple of questions on (2) on SO like this one or this one, but none of the answers seem to completely satisfy my second requirement and none of them is about the third.

Regarding point 3: couldn't you just do a setInterval() and check if any new images are in the DOM?
– filip, Commented Sep 21, 2018 at 13:31
@filip seems quite computationally heavy (especially if you want new images to be detected right away, which is one requirement I have). I was thinking more of something like catching events. Isn't there any event that I could use to know that something has been added to the DOM and just check what that something contains to see if there is an image?
– LBes, Commented Sep 21, 2018 at 13:33
Found something called MutationObserver, which checks for changes in the DOM (for example adding an <img> tag): developer.mozilla.org/en-US/docs/Web/API/MutationObserver @LBes
– filip, Commented Sep 21, 2018 at 13:38
Interesting @filip, will give this a try when I'm back from work.
– LBes, Commented Sep 21, 2018 at 13:41

jla · Accepted Answer · 2018-10-11 13:34:57Z

5

+50

Live collection of imgs

To find all HTML images, as @vsync has said, is as simple as var images = document.images. This will be a live list so any images that are dynamically added or removed from the page will be automatically reflected in the list.

Extracting background images (inline and CSS)

There are a few ways to check for background images, but perhaps the most reliable way is to iterate over all the page's elements and use window.getComputedStyle to check if each element's backgroundImage does not equal none. This will get background images set both inline and in CSS.

var images = [];
var elements = document.body.getElementsByTagName("*");
Array.prototype.forEach.call( elements, function ( el ) {
    var style = window.getComputedStyle( el );
    if ( style.backgroundImage != "none" ) {
        // strip the leading url( and trailing ), then any surrounding quotes
        images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") );
    }
} );

Getting the background image from window.getComputedStyle will return the full CSS background-image property, in the form url(...) so you will need to remove the url( and ). You'll also need to remove any " or ' surrounding the URL. You might accomplish this using backgroundImage.slice( 4, -1 ).replace(/['"]/g, "")
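
Note that slice( 4, -1 ) assumes the computed value is a single url(...). A computed background-image can also list several layers or contain a gradient, so a more defensive parser may be worth having. The following is only a sketch of one way to do it; extractBackgroundUrls is a hypothetical helper name, not part of any browser API.

```javascript
// Hypothetical helper: pull every url(...) out of a computed background-image value.
// Gradients and other non-url layers are simply skipped.
function extractBackgroundUrls(backgroundImage) {
    var urls = [];
    if (!backgroundImage || backgroundImage === "none") {
        return urls;
    }
    // Matches url("..."), url('...') and unquoted url(...)
    var pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)]*?))\s*\)/g;
    var match;
    while ((match = pattern.exec(backgroundImage)) !== null) {
        // Exactly one of the three capture groups is defined per match
        if (match[1] !== undefined) {
            urls.push(match[1]);
        } else if (match[2] !== undefined) {
            urls.push(match[2]);
        } else {
            urls.push(match[3]);
        }
    }
    return urls;
}
```

In the scan above you could then push each entry of extractBackgroundUrls( style.backgroundImage ) instead of the sliced string. Escaped quotes inside a quoted URL are not handled by this sketch.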

Only start checking once the DOM is ready, otherwise your initial scan might miss elements.
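
One caveat for a Chrome extension, stated as an assumption about your setup: content scripts run at document_idle by default, by which time DOMContentLoaded has usually already fired, so a listener registered at that point may never run. A minimal sketch that covers both cases by checking document.readyState first (onDomReady is a hypothetical helper name):

```javascript
// Hypothetical helper: run fn once the DOM has been parsed,
// even if DOMContentLoaded fired before this script was injected.
function onDomReady(doc, fn) {
    if (doc.readyState === "loading") {
        // Still parsing: wait for the event
        doc.addEventListener("DOMContentLoaded", function () { fn(); }, { once: true });
    } else {
        // "interactive" or "complete": the DOM is already available
        fn();
    }
}

// Usage, guarded so the snippet also loads outside a browser
if (typeof document !== "undefined") {
    onDomReady(document, function () {
        console.log("DOM ready, <img> count so far:", document.images.length);
    });
}
```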

Dynamically added background images

The scan above is a one-off snapshot, not a live list, so you will need a MutationObserver to watch the document, and check any changed elements for the presence of backgroundImage.

When configuring your observer, make sure your MutationObserver config has childList and subtree set to true. childList reports nodes being added or removed, and subtree extends that to every descendant of the specified element (in your case the body), not just its direct children.

var body = document.body;
var callback = function( mutationsList, observer ){
    for( var mutation of mutationsList ) {
        if ( mutation.type == 'childList' ) {
            // newly inserted nodes are in mutation.addedNodes
            // so iterate over them as in the code sample above
        }
    }
};
var observer = new MutationObserver( callback );
// attributes: false means style/class changes on existing elements are not reported
var config = { characterData: true,
            attributes: false,
            childList: true,
            subtree: true };
observer.observe( body, config );
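
To make the childList handling concrete, here is a rough sketch of collecting <img> elements from the mutation records. It reads mutation.addedNodes, skips non-element nodes, and also looks inside inserted containers, since a page may insert a whole subtree at once. collectAddedImages is a hypothetical helper name.

```javascript
// Hypothetical helper: collect <img> elements from a list of MutationRecords,
// including images nested inside an inserted subtree.
function collectAddedImages(mutationsList) {
    var found = [];
    Array.prototype.forEach.call(mutationsList, function (mutation) {
        if (mutation.type !== "childList") {
            return;
        }
        Array.prototype.forEach.call(mutation.addedNodes, function (node) {
            if (node.nodeType !== 1) { // 1 is Node.ELEMENT_NODE; skip text and comments
                return;
            }
            if (node.tagName === "IMG") {
                found.push(node);
            }
            // An inserted container may already hold images further down
            Array.prototype.forEach.call(node.querySelectorAll("img"), function (img) {
                found.push(img);
            });
        });
    });
    return found;
}

// Only wire up the observer where a DOM exists (e.g. in a content script)
if (typeof MutationObserver !== "undefined" && typeof document !== "undefined" && document.body) {
    new MutationObserver(function (mutationsList) {
        collectAddedImages(mutationsList).forEach(function (img) {
            console.log("New image:", img.currentSrc || img.src);
        });
    }).observe(document.body, { childList: true, subtree: true });
}
```

The same loop could call window.getComputedStyle on each added element to pick up background images as well.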

Since searching for background images requires you to check every element in the DOM, you might as well check for <img>s at the same time, rather than using document.images.

Code

You would want to modify the code above so that, in addition to checking if it has a background image, you would check if its tag name is IMG. You should also put it in a function that runs when the DOM is ready.

UPDATE: To differentiate between images and background images, you could push them to different arrays, for example to images and bg_images. To also identify the parents of images, you would push the image.parentNode to a third array, e.g. image_parents.

var images = [],
    bg_images = [],
    image_parents = [];
document.addEventListener('DOMContentLoaded', function () {
    var body = document.body;
    var elements = document.body.getElementsByTagName("*");

    /* When the DOM is ready find all the images and background images
        initially loaded */
    Array.prototype.forEach.call( elements, function ( el ) {
        var style = window.getComputedStyle( el );
        if ( el.tagName === "IMG" ) {
            images.push( el.src ); // save image src
            image_parents.push( el.parentNode ); // save image parent

        } else if ( style.backgroundImage != "none" ) {
            bg_images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") ); // save background image url
        }
    } );

    /* MutationObserver callback to add images when the body changes */
    var callback = function( mutationsList, observer ){
        for( var mutation of mutationsList ) {
            if ( mutation.type == 'childList' ) {
                // only look at the nodes this mutation inserted, so that
                // existing children are not recorded again on every change
                Array.prototype.forEach.call( mutation.addedNodes, function ( child ) {
                    if ( child.nodeType !== 1 ) return; // skip text and comment nodes
                    var style = window.getComputedStyle( child );
                    if ( child.tagName === "IMG" ) {
                        images.push( child.src ); // save image src
                        image_parents.push( child.parentNode ); // save image parent
                    } else if ( style.backgroundImage != "none" ) {
                        bg_images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") ); // save background image url
                    }
                } );
            }
        }
    };
    var observer = new MutationObserver( callback );
    var config = { characterData: true,
                attributes: false,
                childList: true,
                subtree: true };

    observer.observe( body, config );
});
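
With attributes set to false, the observer only hears about inserted nodes. Some pages swap the picture on an existing element instead, by changing its src, srcset, style or class. Whether that is what happens in the Google Images case is an assumption, not something verified here. A minimal sketch of watching for such changes, where collectChangedElements and the IMAGE_ATTRIBUTES list are hypothetical names and choices:

```javascript
// Attribute names whose change can swap the picture an element shows
// (an assumed list; extend it as needed)
var IMAGE_ATTRIBUTES = ["src", "srcset", "style", "class"];

// Hypothetical helper: return each element whose image-related attributes changed, once.
function collectChangedElements(mutationsList) {
    var changed = [];
    Array.prototype.forEach.call(mutationsList, function (mutation) {
        if (mutation.type === "attributes" &&
            IMAGE_ATTRIBUTES.indexOf(mutation.attributeName) !== -1 &&
            changed.indexOf(mutation.target) === -1) {
            changed.push(mutation.target);
        }
    });
    return changed;
}

// Only wire up the observer where a DOM exists (e.g. in a content script)
if (typeof MutationObserver !== "undefined" && typeof document !== "undefined" && document.documentElement) {
    new MutationObserver(function (mutationsList) {
        collectChangedElements(mutationsList).forEach(function (el) {
            if (el.tagName === "IMG") {
                console.log("Image source changed:", el.currentSrc || el.src);
            } else {
                console.log("Background is now:", window.getComputedStyle(el).backgroundImage);
            }
        });
    }).observe(document.documentElement, {
        attributes: true,
        attributeFilter: IMAGE_ATTRIBUTES,
        subtree: true
    });
}
```

A class change on an ancestor can also alter the background of its descendants, which this sketch does not follow; re-scanning the subtree of the changed element would be one way to cover that.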

edited Oct 11, 2018 at 13:34

answered Oct 9, 2018 at 12:45

jla


10 Comments

LBes Over a year ago

The first part of your answer states that "var images = document.images. This will be a live list so any images that are dynamically added or removed from the page will be automatically reflected in the list." I have that already and it doesn't work at all on newly loaded content (cf. clicking an image on Google Images, or scrolling on social media)

jla Over a year ago

@LBes interesting, the spec indicates it should be a live collection. document.images is quite an old standard, the HTML spec suggests getElementsByTagName("img") as another option. However using the MutationObserver solution and checking changed elements for IMG tag names as well as background images is likely to be the most robust and complete solution.

LBes Over a year ago

which is what the second answer suggested, but see my comments there.

jla Over a year ago

@LBes The error "parameter 1 is not of type 'Node'" means that the element to which you're attaching the observer does not exist when your code runs. As suggested in my answer, use document.addEventListener('DOMContentLoaded', function () { to wait till the DOM loads; if the element loads after the DOM see stackoverflow.com/questions/40398054/… for a solution that polls until the element is ready.

LBes Over a year ago

thanks for the update. Yes it was correctly set to true... I also don't understand why it wouldn't work. Accepting the answer anyway as it is a very specific case, but I'd still like to figure it out. If you have ideas, feel free to send them my way :)

vsync · Answer · 2020-06-20 09:12:55Z

2

For HTML images (which already exist by the time you run this):

document.images

For CSS images:

You would probably need to run a regex over the page's CSS (both inline styles and external files), but this is tricky because you would need to build the full URL out of each relative path yourself, and that might not always work.

Getting all css used in html file
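
For the relative-path part, the standard URL constructor resolves a path against a base, so the full URL does not have to be assembled by hand. A small sketch, where resolveCssUrl is a hypothetical helper name and the example.com addresses are placeholders; the base should be the stylesheet's own URL, or the page URL for inline styles:

```javascript
// Hypothetical helper: resolve a url(...) value found in CSS against
// the URL of the stylesheet (or page) that contained it.
function resolveCssUrl(relativeUrl, baseHref) {
    return new URL(relativeUrl, baseHref).href;
}

// Usage with placeholder URLs
var full = resolveCssUrl("../img/bg.png", "https://example.com/css/site.css");
console.log(full); // https://example.com/img/bg.png
```

Absolute URLs pass through unchanged, so the same call can be applied to every match.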

For delayed-loaded images:

You can use a mutation observer, as @filip has suggested in his answer

edited Jun 20, 2020 at 9:12

CommunityBot

answered Sep 21, 2018 at 14:09

vsync

1 Comment

LBes Over a year ago

Ok thanks for the pointers. +1. I had this fear that it would not always work indeed.

filip · Answer · 2018-09-21 14:12:52Z

1

This should solve your third problem. I used a MutationObserver.

I check the targetNode for changes and add a callback, if a change happens.

For your case the targetNode should be the root element to check changes in the whole document.

In the callback I check whether the mutation has added a node with the "IMG" tag.

    // Assumes <html id="root">; document.documentElement returns the <html> element without needing an id
    const targetNode = document.getElementById("root");

    // Options for the observer (which mutations to observe)
    let config = { attributes: true, childList: true, subtree: true };

    // Callback function to execute when mutations are observed
    const callback = function(mutationsList, observer) {
        for(let mutation of mutationsList) {
            // addedNodes is empty for attribute mutations, so loop over it rather than indexing [0]
            for (let node of mutation.addedNodes) {
                if (node.tagName==="IMG") {
                    console.log("New Image added in DOM!");
                }
            }
        }
    };

    // Create an observer instance linked to the callback function
    const observer = new MutationObserver(callback);

    // Start observing the target node for configured mutations
    observer.observe(targetNode, config);

edited Sep 21, 2018 at 14:12

answered Sep 21, 2018 at 14:07

filip

10 Comments

LBes Over a year ago

Ok seems like the way to go. Upvote! but I get a "Failed to execute 'observe' on 'MutationObserver': parameter 1 is not of type 'Node'." on line observer.observe(targetNode, config);

filip Over a year ago

@LBes what did you set the targetNode var to? You can't just use document.body or something like that. I set an id on the <html> tag and then got it with document.getElementById("root");

LBes Over a year ago

I did use document.body. But then what do you suggest using, not sure I understand what you mean

filip Over a year ago

@LBes HTML: <html id="root">....</html> JS: targetNode = document.getElementById("root");

LBes Over a year ago

that cannot work for me. It's a chrome extension, it has to work on every page :s
