I am fairly new to assembly.

After compiling, when I run the code below, I get a segmentation fault.

I am trying inline assembly with recursion. I am compiling this code with cxxdroid.

int sum_with_rec(int k)
{
int res =0;
__asm ("sum_with_rec_:\n"
    "CMP %[input_i], #0\n"
    "BEQ endrec\n"

    "rec:\n"
    "ADD %[result], %[input_i]\n"
    "PUSH {%[result]}\n"
    "SUB %[input_i], #1\n"
    "BL sum_with_rec_\n"
    
    "endrec:\n"
    :[result] "+r" (res)
    :[input_i] "r" (k)
    );
 return res;
}

int main(void)
{
  int d = 10;
  int e = 0;

  e = sum_with_rec(d);

  printf("Result of %d = %d\n", d, e);

}

Edit: The target system is armv7. The compiler I use is cxxdroid.

Normal inline assembly compiles fine. When trying recursion in inline assembly it faults.

12
  • 1
    Please edit and be more specific about the processor you target as well as the specific C compiler you use. If you have a debugger you might be able to step through your code line by line and observe the variable/register contents. BTW you forgot #include <stdio.h>. Commented Jun 18 at 18:33
  • 5
    I do not know which architecture you are targeting, but it looks like you are doing a bunch of pushes, and no pops. You are basically corrupting the stack frame of the function. Commented Jun 18 at 18:38
  • 2
    Or wait, you're actually calling into your own label inside the inline asm, not the C function. But you let execution fall out of the asm statement into the compiler-generated epilogue instead of doing your own bx lr, with lr pointing at endrec: so that epilogue runs twice, probably corrupting the stack. As far as the compiler knows, this is a leaf function. Anyway, this would be much easier to understand and debug if you just wrote the whole asm function yourself, where you push a return address. It's not supported to jump back into an asm statement from C. Commented Jun 18 at 19:57
  • 2
    I strongly recommend you do not use inline assembly for learning assembly programming. It's just a terrible environment to program in. Commented Jun 19 at 9:44
  • 2
    As a rule of thumb, inline assembly is good for small one-liners like NOP and WFI instructions that don't always have a nice C equivalent. If you want to write a full assembly function, write it entirely in an assembly (.s or .asm) file and compile it with your C entry point, not inline assembly. Commented Jun 19 at 13:38

2 Answers

3

You have an unbalanced push. (Which is doing what? Is that supposed to be passing an arg? But your "callee", sum_with_rec_:, is looking for input in the registers picked by the compiler for the "r" and "+r" constraints. Look at disassembly or gcc -S asm output, e.g. on https://godbolt.org/, to see how GCC filled in your template with the operands it picked, and to see the compiler-generated code around your block.)

But more importantly, you let execution fall out of the inline asm statement into the compiler-generated epilogue with a modified lr (return address), or worse into main if this inlines!! (Which it will with normal optimization levels.)

If sum_with_rec doesn't inline into main (which it won't with the default -O0), the recursive callees will return back into the asm statement (after the bl, to endrec:), then fall out again, running the compiler-generated epilogue multiple times. (As far as the compiler knows, this is a leaf function: you didn't declare a clobber on "lr", but your asm statement's bl instruction sets lr to its own return address.) The epilogue may include some stack-pointer manipulation, depending on optimization options, so running it more than once can corrupt the stack. That is only one of many problems, which is part of why this is not a supported way to use GNU C inline assembly. (See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html / https://stackoverflow.com/tags/inline-assembly/info)

The details of how it breaks depend on build options and details of what the compiler did, but this entire approach is a non-starter. You can call a C function from inside an asm statement, although it's a pain. (You need to declare clobbers on every call-clobbered register in the ABI, including "cc" which your cmp also sets, and r0-r3 and others, and make sure your own asm is expecting that. And make sure the stack pointer is sufficiently aligned in ABIs where SP on call is aligned by more than 1 stack slot. e.g. x86-64 requires 16-byte stack alignment, but pushes are in units of 8.)

You also can call a chunk of hand-written asm code which returns (bx lr) without ever leaving hand-written asm. (e.g. in a global-scope asm statement outside any function, or a separate .S file, or in a part of your asm statement inside a function that you jump over so it's only ever reached by bl, not by falling into it.)

Then you still have to write clobbers for your asm statement which correctly describe all the registers you modify. For example, you don't declare a clobber on "lr", the link register, so the compiler's prologue/epilogue won't have saved/restored it. (If they did, then your very first recursive return would return to main, not recursively back up the call-tree. If not, then you'll have an infinite loop.)


Don't use inline asm this way.

Don't use it at all when you're trying to learn assembly; you have to already know assembly and how compilers "think" in order to write correct constraints to describe your asm statement to the compiler, and to understand everything that's going on and what's supported vs. what isn't. It's a delicate dance between your asm and the compiler-generated asm, and you need to not step on each other's toes.

This would be much easier to understand and debug if you just wrote the whole asm function yourself, where you push a return address. It's not supported to jump back into the middle of an asm statement from C.

You can use a separate .S file, or put a Basic asm statement (no constraints) inside an __attribute__((naked)) function. In both cases you still have to write everything yourself, including push {r0, lr} / ... / pop {r1, pc} or whatever. (Pushing LR and popping it back into the program counter is a traditional way to return on ARM, when you don't need to interoperate between ARM and Thumb mode. bx lr is the normal way to just return.)
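As a rough illustration of the naked-function option, here is a minimal sketch, assuming an ARMv7 target (ARM or Thumb-2 mode), the AAPCS calling convention (argument and return value in r0), and a GCC or Clang compiler that supports __attribute__((naked)). The names sum_rec_asm and sum_rec_checked are hypothetical, and the #else branch is only a plain-C stand-in so the file still builds on other targets.

```c
#include <assert.h>

#if defined(__arm__)
/* Naked function: the compiler emits no prologue or epilogue, so the asm
   below is the whole function. Assumes AAPCS: k arrives in r0 and the
   result is returned in r0. */
__attribute__((naked)) int sum_rec_asm(int k)
{
    __asm__(
        "cmp   r0, #0\n\t"        /* base case: k == 0 */
        "beq   1f\n\t"            /* r0 already holds the result, 0 */
        "push  {r4, lr}\n\t"      /* save r4 and our own return address */
        "mov   r4, r0\n\t"        /* keep k in a call-preserved register */
        "sub   r0, r0, #1\n\t"    /* argument for the callee: k - 1 */
        "bl    sum_rec_asm\n\t"   /* r0 = sum(k - 1); overwrites lr */
        "add   r0, r0, r4\n\t"    /* r0 = sum(k - 1) + k */
        "pop   {r4, pc}\n"        /* restore r4, return via saved lr */
        "1:\n\t"
        "bx    lr\n"              /* plain return for the base case */
    );
}
#else
/* Stand-in for non-ARM builds: same contract, written in C. */
int sum_rec_asm(int k)
{
    return k == 0 ? 0 : k + sum_rec_asm(k - 1);
}
#endif

/* The asm routine never terminates cleanly for negative k, so guard it. */
int sum_rec_checked(int k)
{
    assert(k >= 0);
    return sum_rec_asm(k);
}
```

Every push here has a matching pop, and execution never falls out of the asm into compiler-generated code. Pushing two registers also keeps the stack pointer 8-byte aligned at the bl.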


Thanks for the help Eugene and Peter. Changing BL to B solved the problem. I could have looked hours for thos

With b it's not recursion, just a while() loop. Your code is missing several pieces to actually be a recursive function call.

Look at compiler-generated code for a pure-C recursive function, compiled with -Og or something, on https://godbolt.org/. (At higher optimization levels it will inline into itself and convert recursion to iteration; -Og -fno-inline-functions should avoid that.)
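A pure-C function to paste into the compiler explorer could look something like this. It is only a sketch; the name sum_rec_c is made up here, and it assumes k is non-negative, since a negative k would never reach the base case.

```c
/* Recursive sum k + (k-1) + ... + 1, assuming k >= 0.
   Each call receives its own k as an argument and hands its partial
   sum back to the caller through the return value. */
int sum_rec_c(int k)
{
    if (k == 0)
        return 0;                 /* base case */
    return k + sum_rec_c(k - 1);  /* not a tail call: the add happens after the call returns */
}
```

In the compiler output for ARM, the things to look for are where the return address and k get saved before the bl, and where they are restored afterwards.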

For example, in your current code, res isn't really passed / returned, it's just a global register variable across calls. So you're using recursion to loop, but not to get data between callers and callees.
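In C terms, the data flow of the posted asm is roughly the following. This is a hypothetical model, not what the compiler generates: shared_res stands in for whatever register the compiler picked for %[result], and the function names are invented for illustration. (In the real asm the counter lives in a shared register as well; it is kept as a parameter here for readability.)

```c
/* Models the register behind %[result]: one location shared by every "call". */
static int shared_res;

/* The callee returns nothing; it only updates the shared accumulator. */
static void sum_shared(int k)
{
    if (k == 0)
        return;
    shared_res += k;
    sum_shared(k - 1);   /* recursion used purely as a loop */
}

/* Wrapper with the same interface as the question's sum_with_rec. */
int sum_via_shared_state(int k)
{
    shared_res = 0;
    sum_shared(k);
    return shared_res;
}
```

Nothing travels between caller and callee through arguments or return values, which is why replacing bl with b gives the same answer.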


0

You could get it to work if you got your pushes and pops JUST right. However, if you need the kind of performance that would justify inline assembly, don't use recursion. Convert the recursive solution to an iterative one, which gives you more options for optimizing performance. This may be difficult, but it will be safer, simpler, and faster.
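For this particular sum the iterative version is short. A minimal sketch in C, assuming k is non-negative (the name sum_iterative is just for illustration):

```c
#include <assert.h>

/* Iterative sum k + (k-1) + ... + 1: no calls, no stack traffic. */
int sum_iterative(int k)
{
    assert(k >= 0);      /* a negative k would count down for a very long time */
    int res = 0;
    while (k != 0) {
        res += k;
        k--;
    }
    return res;
}
```

Written like this, the compiler is free to keep everything in registers, and there is no return address to save or restore.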

3 Comments

Just adding a pop wouldn't be sufficient. The asm lets the recursive callees (including the base-case) fall out of the inline asm statement, then return back into it. So the compiler-generated epilogue runs n times after the prologue runs once. And it doesn't declare a clobber on "cc", r0-r3, or crucially on "lr", so at best it'll be an infinite loop. And yes obviously this performs like total garbage vs. an iterative loop.
Understood, and I could probably add other reasons not to pursue the OP's solution - point to reinforce is, there are safer, simpler, and more performant alternatives, rather than to try to redeem doing it this way. While making this way work would be a nice puzzle and maybe self gratifying, I can't see how it would add value vs just adding costs.
Agreed; the only reason to write anything like this would be as a learning exercise. But as I wrote in my answer, and as fuz said in comments, GNU C inline asm inside a non-naked function is a bad way to learn assembly. Just write whole functions in asm, look at compiler output for pure C functions, and stuff like that. Only then try to learn inline asm, if you're interested in using it to wrap a single instruction or a loop.
