3

My question is based on this reddit post. The example there shows how to change an integer in memory using the cast function from the ctypes module:

>>> import ctypes
>>> ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3] = 100
>>> 29
100

I'm interested in the low-level internals here, so I checked this in a GDB session by setting a breakpoint on the cast function in CPython:

(gdb) break cast
Function "cast" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (cast) pending.
(gdb) run test.py 
Starting program: /root/.pyenv/versions/3.8.0-debug/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7ffff00e7b40

Breakpoint 1, cast (ptr=0x9e6e40 <small_ints+1088>, src=10382912, ctype=<_ctypes.PyCPointerType at remote 0xa812a0>) at /root/.pyenv/sources/3.8.0-debug/Python-3.8.0/Modules/_ctypes/_ctypes.c:5540
5540        if (0 == cast_check_pointertype(ctype))
(gdb) p *(PyLongObject *) ptr
$38 = {
  ob_base = {
    ob_base = {
      ob_refcnt = 12, 
      ob_type = 0x9b8060 <PyLong_Type>
    }, 
    ob_size = 1
  }, 
  ob_digit = {100}
}
(gdb) p *((long *) ptr + 3)
$39 = 100
(gdb) p ((long *) ptr + 3)
$40 = (long *) 0x9e6e58 <small_ints+1112>
(gdb) p *((char *) ptr + 3 * 8)
$41 = 100 'd'
(gdb) p ((char *) ptr + 3 * 8)
$42 = 0x9e6e58 <small_ints+1112> "d"
(gdb) set *((long *) ptr + 3) = 29
(gdb) p *((long *) ptr + 3)
$46 = 29
(gdb) p *((char *) ptr + 3 * 8)
$47 = 29 '\035'
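The same fields can be inspected without GDB. The following is a minimal sketch that reads the type pointer and the first digit of an int object from within Python. It assumes CPython, where id() returns the object's address, and it derives the offsets from object.__basicsize__ and int.__basicsize__ instead of hard-coding index 3. read_int_fields is a hypothetical helper name, not part of ctypes.

```python
import ctypes
import sys


def read_int_fields(n):
    """Read the type pointer and first digit of the int object `n` (CPython only)."""
    addr = id(n)  # CPython-specific: id() is the object's memory address
    pointer_size = ctypes.sizeof(ctypes.c_void_p)
    # Assumption: ob_type is the last field of the generic object header.
    type_offset = object.__basicsize__ - pointer_size
    # The digit array starts right after the fixed-size part of the int object.
    digit_offset = int.__basicsize__
    # Digits are 16-bit or 32-bit wide depending on how CPython was built.
    digit_type = {2: ctypes.c_uint16, 4: ctypes.c_uint32}[sys.int_info.sizeof_digit]
    return {
        "type_addr": ctypes.c_void_p.from_address(addr + type_offset).value,
        "digit_offset": digit_offset,
        "first_digit": digit_type.from_address(addr + digit_offset).value,
    }


value = 12345  # small enough to fit in a single digit
fields = read_int_fields(value)
print(fields["digit_offset"])           # byte offset of ob_digit[0]
print(fields["type_addr"] == id(int))   # the type pointer refers to PyLong_Type
print(fields["first_digit"])            # the stored digit
```

On 64-bit Linux the digit offset is typically 24 bytes, which corresponds to index 3 of a c_long pointer in the session above. Reading a single digit-sized value avoids picking up the bytes that follow the digit.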

I would like to know whether it's possible to get the memory address using Python inside the GDB session, because I couldn't access the addresses it returned:

(gdb) python print("{:#x}".format(ctypes.addressof(ctypes.c_int(29))))
0x7f1053c947f0
(gdb) python print("{:#x}".format(id(29)))
0x22699d8
(gdb) p *0x7f1053c947f0
Cannot access memory at address 0x7f1053c947f0
(gdb) p *0x22699d8
Cannot access memory at address 0x22699d8

The indexing is also different compared to the Python REPL. I guess this is related to endianness?

(gdb) python print(ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3])
9
(gdb) python print (ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[2])
29

Questions:

  1. Why are the memory addresses returned by Python in the GDB session not accessible? The values are not in the process memory range (info proc mappings).
  2. Why is the indexing different compared to the Python REPL?
  3. (bonus question) I would expect the src parameter of the CPython cast function to hold the address of the object, but it seems to be ptr instead, and after the memcpy, result->b_ptr points to a different value than &ptr. Is this where the actual casting happens?

2 Answers

1
+50
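  1. The Python you run with GDB's python command is not the Python process being debugged; it is an interpreter embedded in GDB itself. Think of it as another thread inside GDB. That means id() and ctypes.addressof() return addresses in GDB's own memory, not in the debugged process, which is why p *0x... cannot access them. This is a simplification; see the GDB docs on its Python support for details.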
  2. I was unable to reproduce this behaviour:
    (gdb) python
    >import ctypes
    >print(ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3])
    >end
    29
    
    I can't think of any reason this behaviour would happen (least of all endianness, which is the same across your entire system*)
  3. The src parameter appears to be used as the origin type, rather than the origin object. For reference, see ctypes.h and ctypes/__init__.py (_SimpleCData is just CDataObject with some helpers like indexing and repr). And yes, the memcpy is what does the actual casting in this case, although if you are casting between two data types, there is additional work beforehand.

* Except on ARM, where you can change endianness with an instruction
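To reach the debugged process's memory from GDB's Python, which was the original goal, a minimal sketch along these lines may help. It assumes GDB's Python API (gdb.parse_and_eval, gdb.selected_inferior and Inferior.read_memory), that execution is stopped at the cast breakpoint so that ptr is in scope, and that GDB and the debugged process agree on the size of a C long. read_c_long is a hypothetical helper name.

```python
import struct


def read_c_long(inferior, address):
    """Read one native C long from the debugged process at `address`."""
    size = struct.calcsize("l")
    raw = inferior.read_memory(address, size)  # bytes come from the inferior, not from GDB
    return struct.unpack("l", bytes(raw))[0]


try:
    import gdb  # only importable inside GDB's embedded interpreter
except ImportError:
    gdb = None

if gdb is not None:
    # Assumption: stopped in cast(), where `ptr` is the address of the int object.
    ptr = gdb.parse_and_eval("ptr")
    long_size = struct.calcsize("l")
    print(read_c_long(gdb.selected_inferior(), int(ptr) + 3 * long_size))
```

Inside GDB this could be loaded with the source command or pasted after the python command. The address comes from the debugged process's own symbols rather than from id(), which only describes objects in GDB's embedded interpreter.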

-1

import ctypes

# Byte offset of the first digit inside a CPython int object. This relies on
# CPython implementation details and assumes 32-bit digits (the usual 64-bit
# build); on 64-bit Linux it matches index 3 of a c_long pointer, as in the question.
DIGIT_OFFSET = int.__basicsize__

def modify_value_in_memory(address, new_value):
    # Overwrite the first digit of the int object located at `address`.
    # This is for demonstration only: new_value must be in range(2**30), and
    # writing to a wrong address corrupts memory or crashes the interpreter.
    try:
        # Create a pointer to the digit storage rather than to the start of
        # the object, where the reference count lives.
        ptr = ctypes.cast(address + DIGIT_OFFSET, ctypes.POINTER(ctypes.c_int))
        # Modify the value at that memory address
        ptr.contents.value = new_value
        print(f"Value at address {address} changed to {new_value}")
    except Exception as e:
        print(f"Error while modifying memory: {e}")

if __name__ == "__main__":
    # Define a variable. Small integers (typically -5 to 256) are cached and
    # shared by the whole interpreter, so overwriting one changes it everywhere
    # (as with 29 in the question). A larger integer keeps the effect local.
    my_variable = 1_000_000
    print(f"Original value of the variable: {my_variable}")

    # Get the memory address of the variable.
    # In CPython, id() returns the memory address of the object, and CPython
    # does not move objects, so the address stays valid while the object lives.
    # This is purely for educational demonstration of memory concepts.
    address_of_my_variable = id(my_variable)
    print(f"Memory address of the variable: {address_of_my_variable}")

    # Attempt to modify the value using the memory address
    new_value = 99
    modify_value_in_memory(address_of_my_variable, new_value)

    # Print the variable again to see if the value changed
    print(f"Value after the modification attempt: {my_variable}")

    # Demonstrate with a mutable object (list)
    my_list = [2_000_000, 3_000_000, 4_000_000]
    print(f"Original list: {my_list}")

    # Address of the first element (an int object), not of a slot in the list
    address_of_list_element = id(my_list[0])
    print(f"Memory address of the first list element: {address_of_list_element}")

    # Attempt to modify the first element of the list via its memory address.
    # This changes the int object itself, so every reference to it is affected.
    modify_value_in_memory(address_of_list_element, 100)
    print(f"List after the attempt to modify the first element: {my_list}")

    # A more direct way to modify a list element (without memory address manipulation)
    my_list[0] = 20
    print(f"List after the direct modification: {my_list}")
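The code comments above mention that small integers share a single object. A minimal sketch for checking that behaviour is shown here; it relies on a CPython implementation detail, and is_cached_int is a hypothetical helper name.

```python
def is_cached_int(value):
    """Return True if CPython hands back one shared object for `value`."""
    # Build the int twice at run time so the compiler cannot merge constants.
    first = int(str(value))
    second = int(str(value))
    return first is second


print(is_cached_int(29))       # small value, served from the shared cache
print(is_cached_int(10**6))    # larger value, a fresh object each time
```

This is why overwriting the storage of 29 in the question changes every later use of 29 in that interpreter.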

1 Comment

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
