My question is based on this reddit post. The example there shows how to change an integer in memory using cast function from the ctypes module:
>>> import ctypes
>>> ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3] = 100
>>> 29
100
I'm interested in the low level internals here and I've checked this in GDB session by setting a breakpoint on the cast function in CPython:
(gdb) break cast
Function "cast" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (cast) pending.
(gdb) run test.py
Starting program: /root/.pyenv/versions/3.8.0-debug/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7ffff00e7b40
Breakpoint 1, cast (ptr=0x9e6e40 <small_ints+1088>, src=10382912, ctype=<_ctypes.PyCPointerType at remote 0xa812a0>) at /root/.pyenv/sources/3.8.0-debug/Python-3.8.0/Modules/_ctypes/_ctypes.c:5540
5540 if (0 == cast_check_pointertype(ctype))
(gdb) p *(PyLongObject *) ptr
$38 = {
ob_base = {
ob_base = {
ob_refcnt = 12,
ob_type = 0x9b8060 <PyLong_Type>
},
ob_size = 1
},
ob_digit = {100}
}
(gdb) p *((long *) ptr + 3)
$39 = 100
(gdb) p ((long *) ptr + 3)
$40 = (long *) 0x9e6e58 <small_ints+1112>
(gdb) p *((char *) ptr + 3 * 8)
$41 = 100 'd'
(gdb) p ((char *) ptr + 3 * 8)
$42 = 0x9e6e58 <small_ints+1112> "d"
(gdb) set *((long *) ptr + 3) = 29
(gdb) p *((long *) ptr + 3)
$46 = 29
(gdb) p *((char *) ptr + 3 * 8)
$47 = 29 '\035'
I would like to know if it's possible to get the memory address using Python in the GDB session because I couldn't access the returned addresses:
(gdb) python print("{:#x}".format(ctypes.addressof(ctypes.c_int(29))))
0x7f1053c947f0
(gdb) python print("{:#x}".format(id(29)))
0x22699d8
(gdb) p *0x7f1053c947f0
Cannot access memory at address 0x7f1053c947f0
(gdb) p *0x22699d8
Cannot access memory at address 0x22699d8
The indexing is also different compeering to Python REPL, I guess this is related to endianness?
(gdb) python print(ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3])
9
(gdb) python print (ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[2])
29
Questions:
- Why memory addresses from Python in GDB session are not accessible, values are not in the the process memory range (
info proc mappings)? - Why the indexing is different comparing to Python REPL?
- (bonus question) I would expect that the
srcparameter in theCPythoncastfunction holds the address of the object but it seems to beptrinstead and after memcpyresult->b_ptrpoints to a different value than&ptr? Is this were the actual casting happens?