I am creating an emulator for an instruction set architecture, and I needed to implement a stack structure. I decided that my %eip, %ebp and %esp would be int pointers. However, there are situations where I need to store memory addresses on the stack, in which case this memory would be encoded as an integer value. But when I return this value, I need to put it back into my instruction pointer, which is implemented as an int pointer. C will not let me assign my integer to my int pointer, so I have no way of recovering these memory addresses from the "stack". Any suggestions?
3 Answers
To assign an int value to an int * object, use an explicit cast, as in:
destination = (int *) source;
Your question says “C will not let me assign my integer to my int pointer” but fails to state exactly what the problem is. Presumably you are getting some diagnostic message from the compiler. This would be because assigning an int value to an int * object violates the C standard’s constraints for assignments. The code above shows how to work around that.
That solves the immediate problem of the compiler diagnostic. However, there can be various issues with using int values as containers for pointers, including the possibility of trap representations and discrepancies between the sizes of pointers and integers. Provided that int and int * are the same size, using an int to hold an int * is likely to work, but you should be sure of the properties of your C implementation.
I decided that my %eip, %ebp and %esp would be int pointers.
This is not a sound architectural decision. You need to reconsider it.
The size of a pointer is architecture-dependent; in particular, an int * is typically 64 bits wide on a 64-bit system. By contrast, all of these registers are 32 bits wide by definition. Using a 64-bit pointer to store their values will result in unexpected behavior.
These registers are not required to be aligned to an integer boundary. In particular, EIP is (at best) aligned to an instruction, and will be incremented by one byte when running 1-byte instructions. Dereferencing an int * which is not properly aligned will cause an unaligned access fault on many systems.
There is no hard architectural distinction between any of the integer registers (EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI). All of them can be referenced in a ModRM encoding, and can be treated as either a numeric value or an address, depending on the context. Singling ESP and EBP out will unnecessarily complicate your emulator, and is likely to create a lot of obnoxious special cases in your code.
Note that, as you are emulating a 32-bit system on what might not be a 32-bit platform, you will need some way of translating addresses within the emulated system to "real" addresses in the host process. There are a number of different ways of doing this; which one is most appropriate for you will depend on your specific goals.
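As one possible illustration of such a translation (a sketch under assumptions, not the only reasonable design): keep the emulated registers as uint32_t values and back guest memory with a byte array in the host process. The names Cpu, guest_to_host, push32, pop32, emu_call and emu_ret, and the 64 KiB memory size, are all invented for this example. Using memcpy avoids unaligned int * dereferences; note that it stores values in host byte order, which matches x86 only on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GUEST_MEM_SIZE (64u * 1024u)   /* assumed guest memory size */

typedef struct {
    uint32_t eip, esp, ebp;            /* registers are plain 32-bit values */
    uint8_t  mem[GUEST_MEM_SIZE];      /* guest address space inside the host */
} Cpu;

/* Translate a guest address to a host pointer, checking that len bytes fit. */
static uint8_t *guest_to_host(Cpu *cpu, uint32_t addr, uint32_t len) {
    assert(len <= GUEST_MEM_SIZE && addr <= GUEST_MEM_SIZE - len);
    return &cpu->mem[addr];
}

/* Push a 32-bit value: the stack grows toward lower guest addresses. */
static void push32(Cpu *cpu, uint32_t value) {
    cpu->esp -= 4;
    memcpy(guest_to_host(cpu, cpu->esp, 4), &value, sizeof value);
}

/* Pop a 32-bit value from the guest stack. */
static uint32_t pop32(Cpu *cpu) {
    uint32_t value;
    memcpy(&value, guest_to_host(cpu, cpu->esp, 4), sizeof value);
    cpu->esp += 4;
    return value;
}

/* CALL-like step: save the return address on the guest stack, then jump. */
static void emu_call(Cpu *cpu, uint32_t target, uint32_t return_addr) {
    push32(cpu, return_addr);
    cpu->eip = target;
}

/* RET-like step: the popped integer goes straight back into EIP, no cast needed. */
static void emu_ret(Cpu *cpu) {
    cpu->eip = pop32(cpu);
}
```

With this layout a return address is just a uint32_t, so restoring the instruction pointer from the stack never involves converting between integers and host pointers.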
5 Comments
… int objects, not arbitrary pointers in int * objects. They mention int * because they have int * for their %eip, %ebp, and %esp, but the pointers they want to store will be in the int pointed to by one of those.
… int * as a storage container for an arbitrary value. They are restoring to an int * some value intended for the int * that is %eip. The fact that their “%eip” is an int * suggests the ISA they are emulating has a fixed instruction size so that all valid values for it are also valid int * values. In any case, they are not using an int * as a vehicle for arbitrary values, just for values intended for it.
It is implementation-defined, but if the integer type is not narrower than the pointer, you can use it this way.
Some people say that using ptrdiff_t with a NULL pointer as the reference is more portable and safer, although pointer arithmetic on a null pointer is undefined behavior in standard C, so this also depends on your implementation:
ptrdiff_t myptrdiff = myptr - (type_of_myptr *)NULL;
myptr = myptrdiff + (type_of_myptr *)NULL;
int *iptr = (int *)your_int;
… uintptr_t, a type defined in <stdint.h> (and also made available by <inttypes.h>) when it is available on your machine (and it usually is available, though it is theoretically an optional type). Or you may simply need to add an explicit cast. If you get warnings about different sizes of integer and pointer, then you need to worry a lot more.
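To make the uintptr_t idea concrete, here is a minimal sketch of a host-side "stack" whose slots are uintptr_t, so each slot can hold either a small unsigned value or an int * such as the questioner's instruction pointer. HostStack, STACK_SLOTS and the push/pop helpers are hypothetical names chosen for this example, and it assumes uintptr_t exists on your platform.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16                 /* assumed capacity, for illustration */

typedef struct {
    uintptr_t slots[STACK_SLOTS];      /* wide enough for a pointer or a value */
    int top;                           /* index of the next free slot */
} HostStack;

/* Save a pointer (for example the instruction pointer) on the stack. */
static void push_ptr(HostStack *s, int *p) {
    assert(s->top < STACK_SLOTS);
    s->slots[s->top++] = (uintptr_t)(void *)p;
}

/* Recover a pointer that was saved with push_ptr. */
static int *pop_ptr(HostStack *s) {
    assert(s->top > 0);
    return (int *)(void *)s->slots[--s->top];
}

/* Save an ordinary unsigned value in the same kind of slot. */
static void push_value(HostStack *s, unsigned v) {
    assert(s->top < STACK_SLOTS);
    s->slots[s->top++] = (uintptr_t)v;
}

/* Recover a value that was saved with push_value. */
static unsigned pop_value(HostStack *s) {
    assert(s->top > 0);
    return (unsigned)s->slots[--s->top];
}
```

The caller still has to remember which kind of item each slot holds; the slot type only guarantees that a pointer survives the round trip.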