
Is there a way to detect and remove zero padding within an image array? My question is very similar to this one, except that the image has already been rotated and I do not know the angle.

I am basically cropping a box out of a larger image, which may have zero padding around the edges (due to translations or rotations), so it's possible that the crop contains some of this padding. In such cases, I want to clip the box where the padding edge starts. The images are in CHW format (this can easily be changed to HWC).

The padding in this case will be 0s in all channels. However, due to rotations, the 0s might not always form completely horizontal or vertical strips in the array. Is there a way to detect whether there are 0s going all the way to the edge of the array, and at what location the padding starts?

Example 1, where arr is an image with 3 channels and a width and height of 4 (shape (3, 4, 4)), and the crop contains vertical padding on the rightmost edge:

array([[[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]],

       [[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]],

       [[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]]])

In this example, I would slice the array as such to get rid of the zero padding: arr[:, :, :-1]

Example 2, where we have some padding in the top-right corner:

array([[[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In this example, I would clip the image to remove any padding by returning arr2[:, 1:, :-1].

I want to do this in Tensorflow so tensor operations would be great but I am trying to figure out any algorithm, for example using numpy, that can achieve this result.

  • Sum along the columns and then along the rows. Any column or row with a sum of zero should be removed. (See the sketch after these comments.) Commented Aug 4, 2020 at 4:27
  • @DavidHoffman Yes, I can do that for rows and columns starting at the edges, and it would work for Example 1. However, this would fail when the 0s are not strictly vertical or horizontal (Example 2). Commented Aug 4, 2020 at 5:46
  • Scan the image outline and seed-fill every time you meet a zero. In the end all zero pixels will be filled. How you remove them is your policy. Commented Aug 4, 2020 at 7:42
  • Is it possible for the image to contain the padding value (in this case 0)? If so, is it possible for the image to contain it on its border? Should this be preserved? Commented Aug 4, 2020 at 9:25
  • also, arr2[:, 1:, :-1] will save 9 pixels, whereas the crops you suggested will only save 8 pixels respectively. Commented Aug 4, 2020 at 9:33
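
A minimal numpy sketch of the row/column idea from the first comment (the function name is just for illustration; this only handles axis-aligned padding like Example 1, not the diagonal case in Example 2):

import numpy as np

def crop_axis_aligned(arr, pad_value=0.0):
    """Drop rows/columns that are entirely pad_value in every channel (CHW layout)."""
    nonpad = np.any(arr != pad_value, axis=0)   # (H, W) mask, True where a real pixel exists
    keep_rows = nonpad.any(axis=1)              # rows containing at least one real pixel
    keep_cols = nonpad.any(axis=0)              # columns containing at least one real pixel
    return arr[:, keep_rows][:, :, keep_cols]

# Example 1: the all-zero rightmost column is dropped, leaving shape (3, 4, 3)
arr = np.ones((3, 4, 4))
arr[:, :, -1] = 0
print(crop_axis_aligned(arr).shape)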

3 Answers


If you don't mind throwing away some of the image and are okay with a liberal crop as long as it doesn't contain padding, you can get a quite efficient solution:

pad_value = 0.0
arr = <test_image>

# mask of shape (H, W): True where no channel equals the padding value
arr_masked = np.all(arr != pad_value, axis=0)
y_low = np.max(np.argmax(arr_masked, axis=0))
x_low = np.max(np.argmax(arr_masked, axis=1))
y_high = np.min(arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0))
x_high = np.min(arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1))
arr[:, y_low:y_high, x_low:x_high]
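
For example, on Example 2 above this should work out to y_low = 2, x_low = 0, y_high = 4, x_high = 2, i.e. arr[:, 2:4, 0:2]: a padding-free 2x2 crop that throws away more of the image than the maximal crop arr[:, 1:, :-1] would.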

If it has to be the biggest possible crop then more work is needed. Essentially we have to check every contiguous sub-image if it contains padding and then compare them all for size.

Main Idea: Assume that the top-left corner of the padding-free sub-image is at (x1, y1) and the bottom-right corner is at (x2, y2). We can then store the number of pixels of every candidate sub-image in a rank-4 tensor indexed by [y1, x1, y2, x2]. We set the number of pixels to 0 if the combination is not a valid sub-image, i.e., if it has a negative width or height, or if it contains a padded pixel.

pad_value = 0.0
arr = <test_image>

# mask of shape (H, W): True where no channel equals the padding value
arr_masked = np.all(arr != pad_value, axis=0)

# indices for the sub-image tensor
y = np.arange(arr_masked.shape[0])
x = np.arange(arr_masked.shape[1])
y1 = y[:, None, None, None]
y2 = y[None, None, :, None]
x1 = x[None, :, None, None]
x2 = x[None, None, None, :]

# coordinates of padded pixels: first/last non-padded index per column and per row
pad_north = np.argmax(arr_masked, axis=0)
pad_west = np.argmax(arr_masked, axis=1)
pad_south = arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0)
pad_east = arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1)

is_padded = np.zeros_like(arr_masked)
is_padded[y[:, None] < pad_north[None, :]] = True
is_padded[y[:, None] >= pad_south[None, :]] = True
is_padded[x[None, :] < pad_west[:, None]] = True
is_padded[x[None, :] >= pad_east[:, None]] = True

y_padded, x_padded = np.where(is_padded)
y_padded = y_padded[None, None, None, None, :]
x_padded = x_padded[None, None, None, None, :]

# size of the sub-image
height = np.clip(y2 - y1 + 1, 0, None)
width = np.clip(x2 - x1 + 1, 0, None)
img_size = width * height

# sub-image contains at least one padded pixel
y_inside = np.logical_and(y1[..., None] <= y_padded, y_padded <= y2[..., None])
x_inside = np.logical_and(x1[..., None] <= x_padded, x_padded <= x2[..., None])
contains_border = np.any(np.logical_and(y_inside, x_inside), axis=-1)
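
# Shape bookkeeping for the containment test above, for an H x W image with N padded pixels:
#   y1[..., None] -> (H, 1, 1, 1, 1)    x1[..., None] -> (1, W, 1, 1, 1)
#   y2[..., None] -> (1, 1, H, 1, 1)    x2[..., None] -> (1, 1, 1, W, 1)
#   y_padded, x_padded                  -> (1, 1, 1, 1, N)
#   y_inside -> (H, 1, H, 1, N)         x_inside -> (1, W, 1, W, N)
#   contains_border                     -> (H, W, H, W), matching img_size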

# ignore sub-images containing padded pixels
img_size[contains_border] = 0

# find all largest sub-images
tmp = np.where(img_size == np.max(img_size))
rectangles = (tmp[0], tmp[1], tmp[2]+1, tmp[3]+1)

Now rectangles contains the corners of all sub-images that have the largest number of pixels while containing no padded (border) pixels. It is already quite vectorized, so you should be able to migrate it from numpy to tensorflow.
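
For example, one way to take the first of these maximal crops would be something like this (the variable names are just illustrative):

# pick any one of the maximal rectangles (here the first); y2/x2 are already exclusive
y1_best, x1_best, y2_best, x2_best = [r[0] for r in rectangles]
largest_crop = arr[:, y1_best:y2_best, x1_best:x2_best]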


2 Comments

So I have tried out your algorithm and it seems to be working for the most part. I need to implement this in Tensorflow since my input is going to be a tensor, but that shouldn't be too difficult. Could you, however, break it down and explain exactly how you are getting the sub-images, e.g. in y_inside = np.logical_and(y1[..., None] <= y_padded, y_padded <= y2[..., None]), and the shapes used in x1, x2, y1, y2? I understand the overall procedure, but I am having a hard time following the implementation and would like a better intuition for each step.
@skbrhmn The essential idea here is that we take each border/padding point and test whether it is contained within the sub-image rectangle. Given a rectangle defined by its corners (y1, x1, y2, x2), we can test whether it contains a point by checking that the point's x value lies within [x1, x2] and its y value lies within [y1, y2]. The line you quoted tests the latter (vectorized). We then set contains_border to True if any border/padding pixel is inside the sub-image.

Please try this solution:

import numpy as np

def remove_zero_pad(image):
    dummy = np.argwhere(image != 0)  # assume the background/padding is zero
    max_y = dummy[:, 0].max()
    min_y = dummy[:, 0].min()
    min_x = dummy[:, 1].min()
    max_x = dummy[:, 1].max()
    crop_image = image[min_y:max_y + 1, min_x:max_x + 1]

    return crop_image
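
A quick usage sketch (assuming a channels-last HWC image, since the function indexes rows and columns as the first two dimensions; the example array is made up):

# hypothetical 4x4 HWC image with a zero-padded rightmost column
img = np.ones((4, 4, 3))
img[:, -1, :] = 0
print(remove_zero_pad(img).shape)  # (4, 3, 3): the padded column is cropped away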

1 Comment

I believe this solution assumes that zeros appear only at the edge. While unlikely, it is still possible to have a zero spot somewhere within the image. The question is specific to cases of zero padding, that is, zeros that are contiguous all the way to the edge and usually form straight lines (not necessarily vertical/horizontal ones). This solution would fail if a zero spot appears in the middle of the image or is part of the actual image content rather than padding.

My solution takes a different approach, assuming the padded part is black or white:

  1. Go through each row and column in the image and check whether its average is exactly 0 or 255.
  2. If it is, delete that row or column completely from the image.
  3. Do this for the entire picture.

I use Google Colab so it might be a little different:

import numpy as np
import cv2
from google.colab.patches import cv2_imshow

pic = cv2.imread(your_image_path)  # path to your image
cv2_imshow(pic)
print(pic.shape)

x = pic.shape[0]
y = pic.shape[1]
dpadim = pic

# get rid of y padding (all-zero rows are removed from the top)
for i in range(x):
  row = pic[i, :, :]
  if np.average(row) == 0:   # also check == 255 if the padding is white
    dpadim = np.delete(dpadim, 0, 0)

# get rid of x padding (all-zero columns are removed from the right)
for j in range(y):
  col = pic[:, j, :]
  if np.average(col) == 0:
    dpadim = np.delete(dpadim, -1, 1)

# delete a row that the loop above missed
x2 = dpadim.shape[0]
if (x - x2) > 1:
  dpadim = np.delete(dpadim, 0, 0)
  print(x, x2)
cv2_imshow(dpadim)
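
For completeness, a roughly equivalent mask-based version without the explicit loops (a sketch under the same black-padding assumption; it removes the all-zero rows and columns directly instead of deleting from the ends):

# keep only rows/columns whose average is not exactly 0
row_keep = pic.mean(axis=(1, 2)) != 0
col_keep = pic.mean(axis=(0, 2)) != 0
dpadim = pic[row_keep][:, col_keep]
cv2_imshow(dpadim)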

This is what the before and after images look like.

