I'm trying to do some addition with SSE, using C with inline assembly. Why doesn't something like this work?

struct vector {
    float x1, x2, x3, x4;
};

struct vector *dodawanie(const struct vector v1[], const struct vector v2[], int size) {

struct vector vec[size];
int i;
for(i = 0; i < size; i++) {
        asm(
            "MOV %1, %%rax \n"
            "MOV %2, %%rdx \n"

            "MOVUPS (%%rax), %%xmm0 \n"
            "MOVUPS (%%rdx), %%xmm1 \n"
            "ADDPS %%xmm0, %%xmm1 \n"

            "MOVUPS %%xmm1, %0 \n"

            :"=g"(vec[i])       // output
            :"g"(v1[i]), "g"(v2[i]) // input
            :"%rax", "%rdx"
        );
}
return vec;
}

I get this error: Thread 1: EXC_BAD_ACCESS (CODE = EXC_I386_GPFLT)

But when I pass v1 and v2 instead of v1[i] and v2[i], it works correctly, though of course only for the first element of each array.

What's wrong in my code?

3 Comments

  • 1. Take a look at the generated asm; it may happen that %1 and %2 are themselves being passed in rax/rdx. 2. Try passing %1 and %2 directly in rax and rdx. See gcc.gnu.org/onlinedocs/gcc/… for the x86-specific constraints. Commented May 27, 2016 at 20:59
  • First, I would be inclined to use compiler intrinsics for addps. If you are intent on using inline assembler, I'd follow Michal's advice and look at getting the assembler template to do most of the work. Assuming v1, v2, and vec are all arrays of vectors (__m128), something like this may work: asm( "MOVAPS %[v1], %[out]\n\t" "ADDPS %[v2], %[out]\n\t" :[out]"=&x,m"(vec[i]) :[v1]"mx,x"(v1[i]), [v2]"mx,x"(v2[i]) ); Commented May 27, 2016 at 23:50
  • Your code appears to return a pointer to a local variable, which could cause unexpected behavior: you are relying on the stack not being overwritten between the function returning and the data being used. Commented May 28, 2016 at 18:30
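
To make the intrinsics suggestion and the local-variable warning from the comments concrete, here is a minimal sketch. The function name dodawanie_intrinsics and the caller-supplied out parameter are hypothetical and not part of the original code; the sketch also assumes struct vector has no padding, so its four floats are contiguous. On targets without SSE it falls back to plain scalar addition.

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

struct vector {
    float x1, x2, x3, x4;
};

/* Hypothetical variant of dodawanie: the caller owns the result buffer,
 * so nothing that lives on this function's stack is returned. */
void dodawanie_intrinsics(const struct vector v1[], const struct vector v2[],
                          struct vector out[], int size) {
    for (int i = 0; i < size; i++) {
#if defined(__SSE__)
        /* Unaligned loads, because struct vector is not guaranteed
         * to be 16-byte aligned. */
        __m128 a = _mm_loadu_ps(&v1[i].x1);
        __m128 b = _mm_loadu_ps(&v2[i].x1);
        _mm_storeu_ps(&out[i].x1, _mm_add_ps(a, b));
#else
        /* Scalar fallback for targets without SSE. */
        out[i].x1 = v1[i].x1 + v2[i].x1;
        out[i].x2 = v1[i].x2 + v2[i].x2;
        out[i].x3 = v1[i].x3 + v2[i].x3;
        out[i].x4 = v1[i].x4 + v2[i].x4;
#endif
    }
}
```

A caller would declare the result array itself, for example struct vector out[2]; and pass it as the third argument, so the data stays valid after the call returns.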

1 Answer

You are passing values from the arrays (v1[i], v2[i]) but treating them as addresses ("MOVUPS (%%rax), %%xmm0 \n"). Use &v1[i] and &v2[i] instead.

This is also why passing v1 and v2 works: in that case an address is being passed.
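
A minimal sketch of that fix, assuming GCC or Clang on x86-64. The name dodawanie_fixed and the caller-supplied out parameter are hypothetical additions; the asm takes the three addresses as register operands, so the explicit copies into rax/rdx are no longer needed. A scalar fallback keeps the snippet compilable on other targets.

```c
struct vector {
    float x1, x2, x3, x4;
};

/* Hypothetical corrected version: addresses, not values, go into the asm. */
void dodawanie_fixed(const struct vector v1[], const struct vector v2[],
                     struct vector out[], int size) {
    for (int i = 0; i < size; i++) {
#if defined(__x86_64__) && defined(__GNUC__)
        __asm__ volatile(
            "movups (%1), %%xmm0 \n\t"      /* load v1[i] through its address */
            "movups (%2), %%xmm1 \n\t"      /* load v2[i] through its address */
            "addps  %%xmm0, %%xmm1 \n\t"    /* xmm1 = xmm1 + xmm0 */
            "movups %%xmm1, (%0) \n\t"      /* store the sum into out[i] */
            : /* no register outputs; the result is written through %0 */
            : "r"(&out[i]), "r"(&v1[i]), "r"(&v2[i])
            : "xmm0", "xmm1", "memory");    /* memory: asm reads and writes via pointers */
#else
        /* Scalar fallback for compilers or targets without this asm syntax. */
        out[i].x1 = v1[i].x1 + v2[i].x1;
        out[i].x2 = v1[i].x2 + v2[i].x2;
        out[i].x3 = v1[i].x3 + v2[i].x3;
        out[i].x4 = v1[i].x4 + v2[i].x4;
#endif
    }
}
```

The "r" constraint forces each address into a general-purpose register, which is what the (%N) addressing form requires; the "g" constraint in the question also allows memory or immediate operands.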

1 Comment

Thanks a lot, it's obvious and I didn't see it. Thanks to you I saved a couple of minutes.
