Python custom list building optimization

Question

I'm programming a genetic algorithm in Python, however, my operator (MMX) takes too long (10 seconds) to execute for individuals with 3 million weights (each individual is a list of 3.000.000 elements).

This is the code for the operator:

def calc_gen(maxel, minel, rec1, rec2, phiC):
    g = maxel - minel
    phi = 0
    if g > phiC:
        # Recta 2
        phi = rec2[0] * g + rec2[1]
    elif g < phiC:
        # Recta 1
        phi = rec1[0] * g + rec1[1]
    #Hay que asegurarse que no nos salimos del rango:
    maxv = min(1, maxel - phi)
    minv = max(0, minel + phi)
    gen1 = random.uniform(minv, maxv)  # Guardar el gen del primer hijo
    # Si C es el centro y A el elemento que ya tenemos y B el simétrico de A: C - A + C = B -> 2C - A = B
    # C = (maxv + minv) / 2; 2C - A = B -> maxv + minv - A = B
    # center = (maxv + minv) / 2
    gen2 = maxv + minv - gen1
    return gen1, gen2
    #return gen1, maxv + minv - gen1

def cxMMX(poblacion, rec1, rec2, phiC):
    start = timer()
    # Calcular el maximo y el minimo de cada gen en toda la población
    max_genes = numpy.amax(poblacion, axis=0).tolist()
    min_genes = numpy.amin(poblacion, axis=0).tolist()
    gis = timer()
    hijo1 = Individual()
    hijo2 = Individual()
    # Iterar dos listas a la vez (zip) con su indice (enumerate). Así crearemos los hijos simultáneamente en un loop
    for i, (maxel, minel) in enumerate(zip(max_genes, min_genes)):
        gen1, gen2 = calc_gen(maxel, minel, rec1, rec2, phiC)
        hijo1.append(gen1)
        hijo2.append(gen2)
    end = timer()
    #print("Tiempo Gi: %f Tiempo init: %f Tiempo calc gen: %f Tiempo mate total: %f" % (gis-start, init-gis, end-init, end-start))
    return [hijo1, hijo2]

rec1, rec2 and phiC are parameters that determine how the crossover is done, you shouldn't bother about them. They have the same value all across the algorithm.

poblacion is a list of lists, lets say it's shape is [7,3000000]. Individual() is a custom class. It is basically inheriting "list" and adding some attributes to store the fitness value.

Doing numpy.amax and numpy.amin separately seems like doing extra work. Also, there's probably a more pythonic way to do the "calc_gen()" loop.

PD: "gen1" depends on "gen2": gen1 obtained randomly within a range, and then gen2 is obtained looking for the symmetrical point.

PD2: A more detailed explanation on MMX operator can be found on the original paper, however, you can assume the code is okay and does what it has to do. The doi is https://doi.org/10.1007/3-540-44522-6_73

PD: the enumerate() and the i are there from the old code, forgot to remove them!

EDIT: reduced 20% time with Dillon Davis's solution. It's a pretty clean solution which will work with any custom list building function, provided you obtain each value of the list by executing one function:

def calc_gen_v2(maxel,minel, rec1m, rec1b, rec2m, rec2b, phiC):
    g = maxel - minel
    phi = 0
    if g > phiC:
        # Recta 2
        phi = rec2m * g + rec2b
    elif g < phiC:
        # Recta 1
        phi = rec1m * g + rec1b
    #Hay que asegurarse que no nos salimos del rango:
    maxv = min(1, maxel - phi)
    minv = max(0, minel + phi)
    gen1 = random.uniform(minv, maxv)  # Guardar el gen del primer hijo
    # Si C es el centro y A el elemento que ya tenemos y B el simétrico de A: C - A + C = B -> 2C - A = B
    # C = (maxv + minv) / 2; 2C - A = B -> maxv + minv - A = B
    # center = (maxv + minv) / 2
    gen2 = maxv + minv - gen1
    return gen1, gen2

def cxMMX_v3(poblacion, rec1, rec2, phiC):
    start = timer()
    # Calcular el maximo y el minimo de cada gen en toda la población
    max_genes = numpy.amax(poblacion, axis=0)
    min_genes = numpy.amin(poblacion, axis=0)
    gis = timer()
    hijo1, hijo2 = map(Individual, numpy.vectorize(calc_gen_v2)(max_genes, min_genes, rec1[0], rec1[1], rec2[0], rec2[1], phiC))
    end = timer()
    #print("Tiempo Gi: %f Tiempo init: %f Tiempo calc gen: %f Tiempo mate total: %f" % (gis-start, init-gis, end-init, end-start))
    return [hijo1, hijo2]

EDIT 2: as Dillon Davis suggested I implemented it in pure numpy, reducing the time to 3,5 seconds! (65% time save)

def cxMMX_numpy(poblacion, rec1, rec2, phiC):
    # Calculate max and min for every gen in the population
    max_genes = numpy.amax(poblacion, axis=0)
    min_genes = numpy.amin(poblacion, axis=0)
    g_pop = numpy.subtract(max_genes, min_genes)
    phi_pop = numpy.where(g_pop < phiC, numpy.multiply(g_pop, rec1[0]) + rec1[1], numpy.where(g_pop > phiC, numpy.multiply(g_pop, rec2[0]) + rec2[1], 0))
    maxv = numpy.minimum(numpy.subtract(max_genes, phi_pop), 1)
    minv = numpy.maximum(numpy.sum([min_genes, phi_pop], axis=0), 0)
    hijo1 = numpy.random.uniform(low=minv, high=maxv, size=minv.size)
    hijo2 = numpy.subtract(numpy.sum([maxv, minv], axis=0), hijo1)
    return [Individual(hijo1), Individual(hijo2)]

NOTE: In case you want to reuse, Individual inherits from list

NOTE: if g=phiC then rec1[0] * g_pop + rec1[1]=0, always, rec1[0] and rec1[1] guarantee that! so maybe it is better to do the math instead of a triple option?

Did you try to run a Python profiler? I can recommend docs.python.org/3.6/library/profile.html and github.com/rkern/line_profiler. — Kirill Bulygin
– Kirill Bulygin, Commented Jul 18, 2018 at 19:10
@user2357112 I use those values to initialize a neural network's weights later. With numpy arrays it takes ages to move the individuals through memory, so I use tolist() so only the references to the lists move trough functions. Maybe I'm doing it wrong anyway, I'm new to Python. — DGoiko
– DGoiko, Commented Jul 19, 2018 at 0:05
@KirillBulygin Hello. I did, but I'm new in Python so by now i'm measuring time in the wrong way. I've got 32 threads and the PC is a linux that is just doing that, not even x server loaded, so time() should be a "reliable" measure. I run it 20 times and measure mean and min — DGoiko
– DGoiko, Commented Jul 19, 2018 at 0:19
@DGoiko Profiling is used not to measure time reliably (it doesn't because of its overhead) but rather to find out the slowest functions/lines (that are worth to change). — Kirill Bulygin
– Kirill Bulygin, Commented Jul 19, 2018 at 8:18

Dillon Davis · Accepted Answer · 2018-07-18 20:55:57Z

3

Try replacing your for loop in cxMMX() with something like:

hijo1, hijo2 = map(Individual, numpy.vectorize(calc_gen)(max_genes, min_genes, rec1, rec2, phiC))

and drop the .tolist() from your numpy.amin() and numpy.amax().

This will vectorize your your calc_gen function, avoid a zip and the function overhead from several .append() calls, and should overall be quite a bit faster.

Edit:

Also consider converting calc_gen() to work directly on the numpy arrays. Replace calls to random.uniform() with numpy.random.uniform(), min() or max() with numpy.minimum() or numpy.maximum(), and then eliminate the for loop / map + vectorize completely. This would ultimately be the fastest option.

edited Jul 18, 2018 at 20:55

answered Jul 18, 2018 at 20:40

Dillon Davis

7,7682 gold badges17 silver badges42 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

DGoiko Over a year ago

I used tolist because later I pass the list to different functions, and passing numpy arrays was slower. using my code, removing tolist makes it slower. I'm going to try your solution. Thanks.

DGoiko Over a year ago

Just implemented it (had to make some modifications in order to pass rec1 and rec2). It reduced 20% of the time so far. Not bad. I'm thinking about how to convert it to numpy, and it looks easy, except for the part if g<phiC do something, elif do another thing. I'll try to convert it tomorrow and post both modifications on the OP. Would you mind to check again if I comment to see if I did it in the proper way? I'm pretty new to Python xD Thanks. I'll give you a point now.

DGoiko Over a year ago

Just implemented the pure numpy version, however, I couldn't implement the if/else statement. Could you give it a look? It is now saving 65% of the execution time. Any efficiency tip that comes to your mind for python and numpy would be appreciated aswell.

Dillon Davis Over a year ago

Try something like

phi_pop = numpy.where(g_pop < phiC, numpy.multiply(g_pop, rec1[0]) + rec1[1], numpy.where(g_pop > phiC, numpy.multiply(g_pop, rec2[0]) + rec2[1], 0))

DGoiko Over a year ago

Works perfectly. I'll have to check if I made any mistakes in the rest of the code, but it looks like everything works as expected. Thank you so much. Marked as accepted answer

|

mVChr · Accepted Answer · 2018-07-18 19:17:32Z

1

Have you tried using a multiprocessing.Pool?

You'd need to make a wrapper for calc_gen first:

# after calc_gen def
def get_calc_gen(rec1, rec2, phiC):
    return lambda maxel, minel: calc_gen(maxel, minel, rec1, rec2, phiC)

Then instead of the for loop you'd do something like:

# replacing for loop section
cgen = get_calc_gen(rec1, rec2, phiC)
minmax_genes = zip(max_genes, min_genes)
pool = multiprocessing.Pool()
mapped_genes = pool.map(cgen, minmax_genes)
for gen1, gen2 in mapped_genes:
    hijo1.append(gen1)
    hijo2.append(gen2)

P.S. You don't need enumerate in your original code since you don't seem to be using i

answered Jul 18, 2018 at 19:17

mVChr

50.3k11 gold badges111 silver badges105 bronze badges

2 Comments

DGoiko Over a year ago

Hello. I tried it at some point, but it added more overhead than the time it saved on a 2 CPU x8 cores x 2 threads PC

DGoiko Over a year ago

BTW, I gave you a point. Even if your answer doesnt save me time for this, I was looking forward the way to do something like that, so you helped me anyway.

Collectives™ on Stack Overflow

Python custom list building optimization

2 Answers 2

6 Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

6 Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related