GPU Coder cannot parallelize loop
Jeffrey
on 22 Feb 2025
Moved: Walter Roberson
on 25 Feb 2025
I have a for-loop that I am trying to parallelize with GPU Coder. It looks like this:
% n_out is of type uint64
% input_array is of type single array
function out = my_func(n_out, input_array) %#codegen
    coder.gpu.kernelfun;
    out = zeros(1, n_out, 'single');
    for i = 1:n_out % loop I want to parallelize
        temp = 0.0;
        %%
        % code that changes temp depending on input_array(i). There are no reads from or writes to
        % variable 'out' here
        %%
        out(i) = temp; % GPU Coder says this is a loop carried dependency?
    end
end
When I run GPU Coder, it does not create a kernel and the build report states:
"Unable to parallelize loop because of loop carried dependencies. Check the use of variable 'out' in function 'my_func'".
1) Why is the assignment
out(i) = temp;
a "loop carried dependency"?
2) How do I remove such a "loop carried dependency"?
EDIT: removed syntax error in for loop index declaration
2 comments
Walter Roberson
on 22 Feb 2025
I would be curious about what would happen if you wrote into a temporary array, and eventually copied the temporary array to the output variable?
I also wonder whether there are cases where out(i) is not assigned, leading to a dependency on the initialization by zeros().
Chao Luo
on 24 Feb 2025
Hi Jeffrey,
Thanks for posting the question. There is a syntax error at line 4,
for i:n_out
I guess you mean
for i = 1:n_out
After fixing it, I can see the loop get parallelized when n_out is a double scalar.
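To compare the two cases, the type of n_out can be pinned at the codegen call. The following is only a sketch: it assumes the function is saved as my_func.m, that GPU Coder and a supported CUDA toolchain are installed, and that input_array is a variable-length single row vector (the post does not state its size).

```matlab
% Build a GPU MEX target and open the report to check whether a kernel was created.
cfg = coder.gpuConfig('mex');

% Assumption: input_array is a 1-by-N single row vector of unbounded length.
in_type = coder.typeof(single(0), [1 Inf]);

% n_out as a double scalar (the case reported to parallelize).
codegen -config cfg my_func -args {double(0), in_type} -report

% n_out as a uint64 scalar (the case from the question); uncomment to compare.
% codegen -config cfg my_func -args {uint64(0), in_type} -report
```

The build report for each run shows whether the loop was mapped to a kernel.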
Accepted Answer
Jeffrey
on 24 Feb 2025
Moved: Walter Roberson
on 25 Feb 2025
1 comment
Chao Luo
on 25 Feb 2025
Moved: Walter Roberson
on 25 Feb 2025
It is a limitation of the analysis. When a uint64 is used as an array index, it is cast to int32, and that cast prevents the analysis from parallelizing the loop. double is a special case: because it is the default type commonly used for array indices, it is automatically replaced with int32, so no cast is needed. So, if you have to use an integer type, use int32 as the index type.
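A minimal sketch of the int32 suggestion, applied to the function from the question. The name my_func_int32 and the loop body (doubling each element) are hypothetical placeholders, since the original computation was not posted. Converting n_out with int32() saturates at intmax('int32'), so this assumes n_out fits in that range and does not exceed numel(input_array). Whether a kernel is actually generated should still be confirmed in the build report.

```matlab
function out = my_func_int32(n_out, input_array) %#codegen
    % Same structure as my_func, but the loop bound and index are int32.
    coder.gpu.kernelfun;
    n = int32(n_out);                 % assumption: n_out <= intmax('int32')
    out = zeros(1, n, 'single');
    for i = int32(1):n                % i is int32, so indexing needs no cast
        % Hypothetical stand-in for the elided per-element computation.
        temp = single(2) * input_array(i);
        out(i) = temp;
    end
end
```

Alternatively, the caller can pass n_out as int32 (or double) directly, which avoids the conversion inside the function.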