I am new to Python and am trying to parallelize a program that I somehow pieced together from the internet. The program reads all image files (usually multiple series of images such as abc001,abc002...abc015 and xyz001,xyz002....xyz015) in a specific folder and then combines images in a specified range. Most times, the number of files exceeds 10000, and my latest case requires me to combine 24000 images. Could someone help me with:
- Taking 2 sets of images from different directories. Currently I have to move these images into 1 directory and then work in said directory.
- Reading only specified files. Currently my program reads all files, saves names in an array (I think it's an array. Could be a directory also) and then uses only the images required to combine. If I specify a range of files, it still checks against all files in the directory and takes a lot of time.
- Parallel Processing - I work with usually 10k files or sometimes more. These are images saved from the fluid simulations that I run at specific times. Currently, I save about 2k files at a time in separate folders and run the program to combine these 2000 files at one time. And then I copy all the output files to a separate folder to keep them together. It would be great if I could use all 16 cores on the processor to combine all files in 1 go.
Image series 1 is like so. Consider it to be a series of photos of the cat walking towards the camera. Each frame is is suffixed with 001,002,...,n.
Image series 1 is like so. Consider it to be a series of photos of the cat's expression changing with each frame. Each frame is is suffixed with 001,002,...,n.
The code currently combines each frame from set1 and set2 to provide output.png as shown in the link here.
import sys
import os
from PIL import Image
keywords=input('Enter initial characters of image series 1 [Ex:Scalar_ , VoF_Scene_]:\n')
keywords2=input('Enter initial characters of image series 2 [Ex:Scalar_ , VoF_Scene_]:\n')
directory = input('Enter correct folder name where images are present :\n') # FOLDER WHERE IMAGES ARE LOCATED
result1 = {}
result2={}
name_count1=0
name_count2=0
for filename in os.listdir(directory):
if keywords in filename:
name_count1 +=1
result1[name_count1] = os.path.join(directory, filename)
if keywords2 in filename:
name_count2 +=1
result2[name_count2] = os.path.join(directory, filename)
num1=input('Enter initial number of series:\n')
num2=input('Enter final number of series:\n')
num1=int(num1)
num2=int(num2)
if name_count1==(num2-num1+1):
a1=1
a2=name_count1
elif name_count2==(num2-num1+1):
a1=1
a2=name_count2
else:
a1=num1
a2=num2+1
for x in range(a1,a2):
y=format(x,'05') # '05' signifies number of digits in the series of file name Ex: [Scalar_scene_1_00345.png --> 5 digits], [Temperature_section_2_951.jpg --> 3 digits]. Change accordingly
y=str(y)
for comparison_name1 in result1:
for comparison_name2 in result2:
test1=result1[comparison_name1]
test2=result2[comparison_name2]
if y in test1 and y in test2:
a=test1
b=test2
test=[a,b]
images = [Image.open(x) for x in test]
widths, heights = zip(*(i.size for i in images))
total_width = sum(widths)
max_height = max(heights)
new_im = Image.new('RGB', (total_width, max_height))
x_offset = 0
for im in images:
new_im.paste(im, (x_offset,0))
x_offset += im.size[0]
output_name='output'+y+'.png'
new_im.save(os.path.join(directory, output_name))


