
I am working with a numpy.ndarray containing 286 images, with shape (286, 16, 16, 3). Each image has 3 bands of varying pixel values with a float32 data type. The maximum pixel value in each band can be greater than 255. Is it possible to normalize this numpy.ndarray to the range [0, 1]?

Code for reading the images:

import os
import cv2

inputPath = 'E:/Notebooks/data'

images = []

# Load in the images
for filepath in os.listdir(inputPath):
    images.append(cv2.imread(inputPath + '/{0}'.format(filepath),
                             flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)))
  • normalize each of the 286 images individually, or over all images? Commented Jun 8, 2021 at 10:57
  • what do you mean by normalize? Taking the minimum and maximum float32 values, mapping them to 0 and 255, and distributing the in-between values over the 0 to 255 range? Or something else? Commented Jun 8, 2021 at 11:04
  • Normalize each of the 286 images within this data set. Commented Jun 8, 2021 at 11:07
  • Ouch, I believe that's more difficult. Here: stackoverflow.com/questions/1735025/… I found # Normalised [0,255] as integer: don't forget the parentheses before astype(int): c = (255*(a - np.min(a))/np.ptp(a)).astype(int), but I believe it normalizes over all the images. I can only think of splitting your array into 286 images and applying it per image. Maybe there is another way. Commented Jun 8, 2021 at 11:11
  • ii = (255*(i - np.min(i))/np.ptp(i)).astype(int) gives RuntimeWarning: invalid value encountered in true_divide with a numpy array of float32 type!!! Need to figure out why (see the sketch after these comments). Commented Jun 8, 2021 at 17:08
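
As an aside, here is a minimal per-image sketch of that approach, assuming the stacked (286, 16, 16, 3) float32 array from the question; the invalid value warning typically appears when an image has zero range (max equals min) or contains NaN/inf, so guarding the division avoids it:

import numpy as np

images = np.random.rand(286, 16, 16, 3).astype(np.float32)  # placeholder data

normalized = np.empty_like(images)
for idx, img in enumerate(images):
    rng = np.ptp(img)               # peak-to-peak: max - min of this image
    if rng == 0:                    # constant image: avoid dividing by zero
        normalized[idx] = 0.0
    else:
        normalized[idx] = (img - np.min(img)) / rng   # values now in [0, 1]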

3 Answers


If you want the range of values of every image to be between 0 and 255, you could loop over the images, calculate the min and max of each original image, and rescale it so that its minimum becomes 0 and its maximum becomes 255.

import numpy as np

# Example data with the same shape and dtype as in the question
images = np.random.rand(286, 16, 16, 3).astype(np.float32)

for nr, img in enumerate(images):
    img_min = np.min(img)
    img_max = np.max(img)
    images[nr] = (img - img_min) / (img_max - img_min) * 255
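
Since the question asks for values in [0, 1], a variant of the loop above simply drops the factor of 255 (a minimal sketch under the same assumptions):

for nr, img in enumerate(images):
    img_min = np.min(img)
    img_max = np.max(img)
    images[nr] = (img - img_min) / (img_max - img_min)   # values now in [0, 1]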

12 Comments

"Each image contains 3 bands with varying pixel values with float32 data types"
works the same, doesn't it? Just use images = np.random.rand(286,16,16,3).astype(np.float32)
haven't checked, I'll try if I have time
tried your code (the original one), I get: <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.00000000000003 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 255.0 <class 'numpy.ndarray'> 768 (16, 16, 3) 0.0 254.99999999999997; not sure floats are permitted
images = np.random.rand(286,16,16,3).astype(np.float32) works with the same glitch as float64

Vectorized is much faster than iterative

If you want to scale the pixel values of all your images using numpy arrays only, you may want to keep the vectorized nature of the operation (by avoiding loops).

Here is a way to scale your images:

# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))
# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The transposes .T are needed here so that the subtraction broadcasts correctly.
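
As a side note, the transposes can be avoided by keeping the reduced axes with keepdims=True, so the per-image minima and maxima broadcast directly against the (286, 16, 16, 3) array; a minimal sketch under the same assumptions:

maxis = images.max(axis=(1, 2, 3), keepdims=True)   # shape (286, 1, 1, 1)
minis = images.min(axis=(1, 2, 3), keepdims=True)
scaled_images = (images - minis) / (maxis - minis) * 255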

We can check if this is correct:

print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True

Scaling into the [0, 1] range

If you want pixel values between 0 and 1, simply remove the * 255 multiplication:

scaled_images = ((images.T - minis) / (maxis - minis)).T

Only with numpy arrays

You must also make sure you are handling a numpy array in the first place, not a list:

import numpy as np
images = np.array(images)

OpenCV

On-the-go scaling

Since you are using OpenCV to read your images one by one, you can normalize them on the go with it:

inputPath='E:/Notebooks/data'

max_scale = 1   # or 255 if needed
# Load in the images 
images = [cv2.normalize(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
    None, 0, max_scale, cv2.NORM_MINMAX)
    for filepath in os.listdir(inputPath)]

Make sure you have images in the folder

inputPath='E:/Notebooks/data'
images = []

max_scale = 1   # or 255 if needed

# Load in the images 
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append the list if it is an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))
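
As a quick sanity check, assuming all loaded images share the same (16, 16, 3) shape so the list can be stacked into an array:

import numpy as np

# `images` is the list built above, with max_scale = 1
stacked = np.stack(images)              # shape (n_images, 16, 16, 3)
print(stacked.min(axis=(1, 2, 3)))      # expected: all 0
print(stacked.max(axis=(1, 2, 3)))      # expected: all max_scale (barring the pre-3.4 bug below)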

Bug in versions of OpenCV prior to 3.4

As reported here, there is a bug in OpenCV's normalize method that produces values below the alpha parameter. It was corrected in version 3.4.

Here is a way to scale images on the go with older versions of OpenCV:

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

max_scale = 1   # or 255 if needed

images = [custom_scale(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)), max_scale)
    for filepath in os.listdir(inputPath)]

15 Comments

I added the code for reading the images to the question. Is it possible to normalize the images between [0-1]? When I apply your code it returns the error: AttributeError: 'list' object has no attribute 'max'
I updated the answer. You said you had a numpy ndarray, but it seems that images was a list in your case.
Thanks for the reply. I use the list to read in the 286 images. When I check print(type(images[1])), the result is <class 'numpy.ndarray'>. If I use images = np.array(images), can I read in all the images?
Yes, you had numpy arrays inside a list called "images". I mentioned in my last edit that you should use OpenCV to normalize your images on the go, since you are already using it and adding your images iteratively. You don't need to use numpy or cast your list into an array for that.
Thanks for your help and time. I could run it in Google Colab and the negative values are gone (the minimum is 0). I am not sure why I cannot upgrade or even uninstall OpenCV from Anaconda.

I've figured out this piece of code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jun  8 13:19:17 2021

@author: Pietro


https://stackoverflow.com/questions/67885596/how-numpy-ndarray-can-be-normalized

"""


import numpy as np

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

print(arrayz.shape)

print((arrayz.size))

print(arrayz[0,0,0,:],'            ',type(arrayz[0,0,0,:]))
print(arrayz[0,0,0,0],'            ',type(arrayz[0,0,0,0]))

print(np.min(arrayz),'     ',np.max(arrayz))

arrayz_split = np.split(arrayz,286,0)

print(type(arrayz_split))

for i in arrayz_split:
    print(i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

arrayz_split_flat = []

for i in arrayz_split:
    ii = i[0]
    arrayz_split_flat.append(ii)
    
for i in arrayz_split_flat:
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))
    
arrayz_split_flat_norm = []

for i in arrayz_split_flat:
      minz = np.min(i)
      manz = np.max(i)
      ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
      
      arrayz_split_flat_norm.append(ii)

for i in arrayz_split_flat_norm:
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

out_arr1 = np.stack((arrayz_split_flat_norm), axis = 0) 

print(type(out_arr1), out_arr1.size, '  ', out_arr1.shape, ' ',np.min(out_arr1),np.max(out_arr1), out_arr1[0,0,0,:],out_arr1[0,0,0,0])

I don't understand why:

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

seems to work while using:

arrayz1 = np.ndarray((286,16,16,3), dtype="float32")
arrayz = np.nan_to_num(arrayz1)

works but throws:

 RuntimeWarning: overflow encountered in float_scalars
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
RuntimeWarning: invalid value encountered in true_divide
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)

and I end up with a series of 16x16x3 arrays full of zeroes.
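
A likely explanation, as an aside: np.ndarray((286, 16, 16, 3), dtype="float32") allocates uninitialized memory, so the array may contain arbitrary values, including inf and NaN. np.nan_to_num then replaces NaN with 0 and ±inf with the largest finite float32 values, so manz - minz can overflow to inf (the first warning) and inf divided by inf yields NaN (the second warning); finite values divided by inf give 0, which is why the result is arrays full of zeroes. A small sketch of the difference, using only numpy:

import numpy as np

a = np.ndarray((2, 2), dtype=np.float32)        # uninitialized: contents are arbitrary garbage
b = np.zeros((2, 2), dtype=np.float32)          # explicitly initialized to zeros
c = np.random.randn(2, 2).astype(np.float32)    # well-defined random values

print(a)   # may print garbage, possibly inf or nan
print(b)
print(c)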

9 Comments

As far as I understood, he wants to normalize every image by itself, not over the whole data set. So I think you need to move the min/max calculation into the loop.
Yes, sure, thanks for pointing that out; that's why I should always have test cases to check my scripts!!
Yes, I think so.
@pippo1980 When answering a question, please try to write the most efficient, simple, and well-formatted piece of code. All the prints are useless; all the extra line spacing is useless; and they make your answer very hard to 1. understand; 2. appreciate for someone who does not need to run the code to understand what it does.
@pippo1980 Also, all the information like type, shape, etc. that you are printing is displayed directly in the variable explorer of your IDE. You do not need to print it.
