So I recently asked this question

I had to create an environment variable MYENV and store something in it such that I can successfully run this code.

#include <stdio.h>
#include <stdlib.h>

int main(){
            int (*func)();
            func = getenv("MYENV");
            func();
}

Earlier I was doing something like export MYENV=ls.

A user pointed out that this is incorrect: when func() is called, the CPU is told to execute whatever bytes func points to, which here is the string ls, and that is not valid machine code. So I should store some shellcode in the variable instead.

Now I want to know if this is how it works for functions in general. That is, when I declare a function, say myFunction(), which multiplies 100 and 99 and returns the value, does the name myFunction point to a set of machine instructions stored somewhere that multiply 100 and 99 and return the result?

And if I were to figure out those machine instructions, store them in a string, make myFunction point to that string, and then call myFunction(), would 9900 be returned?

This is what I mean :

int (*myFunc)();
char *var = <machine_instructions_in_string_format>;
myFunc = (int (*)())var;   /* make myFunc point at the bytes in var */
int returnVar = myFunc();

Will the returnVar have 9900?

And if yes, how do I figure out what that string is?

I am having a hard time wrapping my head around this.

  • Not in general. It might work on some platforms, but it is undefined behavior according to the C standard. On common platforms you at least have to make the page that contains the code executable (e.g. with mprotect() on Unix-like systems). Commented Jun 8, 2020 at 16:35
  • You "figure out what the string is" by compiling a program that does what you want, then looking at the machine code that was generated. Commented Jun 8, 2020 at 16:37
  • @Barmar: correction: compiling a function, not a whole program. See e.g. "How to remove "noise" from GCC/clang assembly output?" and "How to disassemble one single function using objdump?" Commented Jun 8, 2020 at 16:47
  • ls is not a function but a command. qsort is a standard C function. Commented Jun 8, 2020 at 16:50
  • As a more general remark, compiled languages typically make a strong distinction between code (in C: functions) and data (in C: variables, including arrays). On modern systems a program normally cannot modify itself, even though that's pretty cool, and it cannot execute data (unless you jump through the hoops in the links above). Both were easier in, say, the 1970s and were on occasion put to good use. But in general you would use interpreted languages for that, some of which (Lisp) do not make that distinction at all. Commented Jun 8, 2020 at 17:03

1 Answer

You have to fill the environment variable with machine code (opcodes) for your target machine. I ran a little experiment:

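#include <stdio.h>
#include <stdlib.h>

int main(void) {
        int (*f)();
        /* Treat the bytes of the environment variable as code.
           The cast silences the char * to function-pointer warning. */
        f = (int (*)())getenv("VIRUS");
        (*f)();   /* crashes if VIRUS is unset or not executable */
        printf("Haha, it returned\n");
        return 0;
}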

I compiled it, then used execstack to mark the binary as requiring an executable stack:

$ cc ge.c
$ execstack -s ./a.out

Then I wrote a bit of assembler:

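mov %rbp, %rsp    # drop the current frame: rsp = rbp
pop %rbp          # restore the saved frame pointer
ret               # pop the return address into rip

This mimics a function epilogue. I assembled it: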

$ cc -c t.s

Looked at the opcodes:

$ objdump -D t.o
...
   0:   48 89 ec                mov    %rbp,%rsp
   3:   5d                      pop    %rbp
   4:   c3                      retq   

Set the environment variable:

$ export VIRUS=$(printf "\\x48\\x89\\xec\\x5d\\xc3")

Then ran the program:

$ ./a.out

It printed nothing, which is a clear indication that the printf line was skipped: the injected code tore down main's stack frame and returned straight to main's caller. But, just to check, I tried a lone ret:

$ export VIRUS=$(printf "\\xc3")
$ ./a.out
Haha, it returned

This was run on Ubuntu 18.04 on amd64 (x86-64). If this happens to be a school assignment, you should aim for bonus points and figure out how you could get it to execute machine code that contains a null (0) byte, given that an environment variable's value ends at the first null byte.

6 Comments

That asm definitely deserves comments. You're tearing down the caller's stack frame because that function didn't push %rbp / mov %rsp, %rbp to make its own stack frame. So the ret is popping main's return address into RIP. It's not exactly "stepping over" printf, it's more like a longjmp. But of course it depends on debug-mode code-gen by the compiler!
It will happen to return zero because you declared the function pointer's arg type as () not (void), so the compiler will zero AL via zeroing EAX. (And the process exit status only captures the low byte of the retval anyway.)
Also, an easier way to build this is gcc -zexecstack ge.c, to pass the execstack option to the linker instead of modifying the binary afterward. But yes, either way it sets a read-implies-exec flag in the ELF metadata, making all pages executable including but not limited to the region above the initial stack pointer where env vars live.
This is probably a homework assignment. Giving too much away....
I sort of meld 'arbitrary constraint' == assignment