
I am working with a numpy.ndarray containing 286 images, with shape (286, 16, 16, 3). Each image has 3 bands of varying pixel values with a float32 data type. The maximum pixel value in each band can be greater than 255. Is it possible to normalize this numpy.ndarray to the range [0, 1]?

Code for reading the images:

import os
import cv2

inputPath = 'E:/Notebooks/data'

images = []

# Load in the images
for filepath in os.listdir(inputPath):
    images.append(cv2.imread(inputPath + '/{0}'.format(filepath),
                             flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)))
  • normalize each of the 286 images individually, or over all images? Commented Jun 8, 2021 at 10:57
  • what do you mean by normalize? Taking the minimum and maximum float32 values, mapping them to 0 and 255, and distributing the in-between values over the 0 to 255 range? Or something else? Commented Jun 8, 2021 at 11:04
  • Normalize each of the 286 images within this data set. Commented Jun 8, 2021 at 11:07
  • Ouch, I believe that's more difficult. Here: stackoverflow.com/questions/1735025/… I found # Normalised [0,255] as integer: don't forget the parentheses before astype(int): c = (255*(a - np.min(a))/np.ptp(a)).astype(int), but I believe it normalizes over all the images. I can only think of splitting your array into 286 images and applying it per image. Maybe there is another way. Commented Jun 8, 2021 at 11:11
  • ii = (255*(i - np.min(i))/np.ptp(i)).astype(int) gives RuntimeWarning: invalid value encountered in true_divide with a numpy array of float32 type!!! Need to figure out why (see the sketch after these comments). Commented Jun 8, 2021 at 17:08
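
As an aside, here is a minimal per-image sketch of that approach, assuming the stacked (286, 16, 16, 3) float32 array from the question; the invalid value warning typically appears when an image has zero range (max equals min) or contains NaN/inf, so guarding the division avoids it:

import numpy as np

images = np.random.rand(286, 16, 16, 3).astype(np.float32)  # placeholder data

normalized = np.empty_like(images)
for idx, img in enumerate(images):
    rng = np.ptp(img)               # peak-to-peak: max - min of this image
    if rng == 0:                    # constant image: avoid dividing by zero
        normalized[idx] = 0.0
    else:
        normalized[idx] = (img - np.min(img)) / rng   # values now in [0, 1]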

3 Answers


If you want the range of values of every image to be between 0 and 255, you could loop over the images, calculate the min and max of each original image, and rescale it so that its minimum becomes 0 and its maximum becomes 255.

import numpy as np

# Example data with the same shape and dtype as in the question
images = np.random.rand(286, 16, 16, 3).astype(np.float32)

for nr, img in enumerate(images):
    img_min = np.min(img)
    img_max = np.max(img)
    images[nr] = (img - img_min) / (img_max - img_min) * 255
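
Since the question asks for values in [0, 1], a variant of the loop above simply drops the factor of 255 (a minimal sketch under the same assumptions):

for nr, img in enumerate(images):
    img_min = np.min(img)
    img_max = np.max(img)
    images[nr] = (img - img_min) / (img_max - img_min)   # values now in [0, 1]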

12 Comments

"Each image contains 3 bands with varying pixel values with float32 data types"
works the same, doesn't it? Just use images = np.random.rand(286,16,16,3).astype(np.float32)
haven't checked, I'll try if I have time
tried your code (the original one), I get: <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.00000000000003 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 254.99999999999997; not sure floats are permitted
images = np.random.rand(286,16,16,3).astype(np.float32) works with the same glitch as float64

Vectorized is much faster than iterative

If you want to scale the pixel values of all your images using numpy arrays only, you may want to keep the vectorized nature of the operation (by avoiding loops).

Here is a way to scale your images:

# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))
# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The transposes .T are needed here so that the subtraction broadcasts correctly.
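
As a side note, the transposes can be avoided by keeping the reduced axes with keepdims=True, so the per-image minima and maxima broadcast directly against the (286, 16, 16, 3) array; a minimal sketch under the same assumptions:

maxis = images.max(axis=(1, 2, 3), keepdims=True)   # shape (286, 1, 1, 1)
minis = images.min(axis=(1, 2, 3), keepdims=True)
scaled_images = (images - minis) / (maxis - minis) * 255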

We can check if this is correct:

print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True

Scaling into the [0, 1] range

If you want pixel values between 0 and 1, simply remove the * 255 multiplication:

scaled_images = ((images.T - minis) / (maxis - minis)).T

Only with numpy arrays

You must also make sure you are handling a numpy array in the first place, not a list:

import numpy as np
images = np.array(images)

OpenCV

On-the-go scaling

Since you are using OpenCV to read your images one by one, you can normalize them on the go with it:

inputPath='E:/Notebooks/data'

max_scale = 1   # or 255 if needed
# Load in the images 
images = [cv2.normalize(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
    None, 0, max_scale, cv2.NORM_MINMAX)
    for filepath in os.listdir(inputPath)]

Make sure you have images in the folder

inputPath='E:/Notebooks/data'
images = []

max_scale = 1   # or 255 if needed

# Load in the images 
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append the list if it is an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))
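
As a quick sanity check, assuming all loaded images share the same (16, 16, 3) shape so the list can be stacked into an array:

import numpy as np

# `images` is the list built above, with max_scale = 1
stacked = np.stack(images)              # shape (n_images, 16, 16, 3)
print(stacked.min(axis=(1, 2, 3)))      # expected: all 0
print(stacked.max(axis=(1, 2, 3)))      # expected: all max_scale (barring the pre-3.4 bug below)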

Bug in versions of OpenCV prior to 3.4

As reported here, there is a bug in OpenCV's normalize method that produces values below the alpha parameter. It was corrected in version 3.4.

Here is a way to scale images on the go with older versions of OpenCV:

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

max_scale = 1   # or 255 if needed

images = [custom_scale(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)), max_scale)
    for filepath in os.listdir(inputPath)]

15 Comments

I added the code for reading the images to the question. Is it possible to normalize the images between [0-1]? When I apply your code it returns the error: AttributeError: 'list' object has no attribute 'max'
I updated the answer. You said you had a numpy ndarray, but it seems that images was a list in your case.
Thanks for the reply. I use the list to read in the 286 images. When I check print(type(images[1])), the result is <class 'numpy.ndarray'>. If I use images = np.array(images), can I read in all the images?
Yes, you had numpy arrays inside a list called "images". I mentioned in my last edit that you should use OpenCV to normalize your images on the go, since you are already using it and adding your images iteratively. You don't need to use numpy or cast your list into an array for that.
Thanks for your help and time. I could run it in Google Colab and the negative values are gone (the minimum is 0). I am not sure why I cannot upgrade or even uninstall OpenCV from Anaconda.

I've figured out this piece of code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jun  8 13:19:17 2021

@author: Pietro


https://stackoverflow.com/questions/67885596/how-numpy-ndarray-can-be-normalized

"""


import numpy as np

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

print(arrayz.shape)

print((arrayz.size))

print(arrayz[0,0,0,:],'            ',type(arrayz[0,0,0,:]))
print(arrayz[0,0,0,0],'            ',type(arrayz[0,0,0,0]))

print(np.min(arrayz),'     ',np.max(arrayz))

arrayz_split = np.split(arrayz,286,0)

print(type(arrayz_split))

for i in arrayz_split:
    print(i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

arrayz_split_flat = []

for i in arrayz_split:
    ii = i[0]
    arrayz_split_flat.append(ii)
    
for i in arrayz_split_flat:
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))
    
arrayz_split_flat_norm = []

for i in arrayz_split_flat:
      minz = np.min(i)
      manz = np.max(i)
      ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
      
      arrayz_split_flat_norm.append(ii)

for i in arrayz_split_flat_norm:
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

out_arr1 = np.stack((arrayz_split_flat_norm), axis = 0) 

print(type(out_arr1), out_arr1.size, '  ', out_arr1.shape, ' ',np.min(out_arr1),np.max(out_arr1), out_arr1[0,0,0,:],out_arr1[0,0,0,0])

I don't understand why:

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

seems to work while using:

arrayz1 = np.ndarray((286,16,16,3), dtype="float32")
arrayz = np.nan_to_num(arrayz1)

works but throws:

 RuntimeWarning: overflow encountered in float_scalars
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
RuntimeWarning: invalid value encountered in true_divide
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)

and I end up with a series of 16x16x3 arrays full of zeroes.
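
A likely explanation, as an aside: np.ndarray((286, 16, 16, 3), dtype="float32") allocates uninitialized memory, so the array may contain arbitrary values, including inf and NaN. np.nan_to_num then replaces NaN with 0 and ±inf with the largest finite float32 values, so manz - minz can overflow to inf (the first warning) and inf divided by inf yields NaN (the second warning); finite values divided by inf give 0, which is why the result is arrays full of zeroes. A small sketch of the difference, using only numpy:

import numpy as np

a = np.ndarray((2, 2), dtype=np.float32)        # uninitialized: contents are arbitrary garbage
b = np.zeros((2, 2), dtype=np.float32)          # explicitly initialized to zeros
c = np.random.randn(2, 2).astype(np.float32)    # well-defined random values

print(a)   # may print garbage, possibly inf or nan
print(b)
print(c)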

9 Comments

As far as I understood, he wants to normalize every image by itself, not over the whole data set. So I think you need to move the min/max calculation into the loop.
Yes, sure, thanks for pointing that out; that's why I should always have test cases to check my scripts!!
Yes, I think so.
@pippo1980 When answering a question, please try to write the most efficient, simple, and well-formatted piece of code. All the prints are useless; all the extra line spacing is useless; and they make your answer very hard to 1. understand; 2. appreciate for someone who does not need to run the code to understand what it does.
@pippo1980 Also, all the information like type, shape, etc. that you are printing is displayed directly in the variable explorer of your IDE. You do not need to print it.
