It’s well known pysha3 isn’t compatible with pypy, and because it’s unmaintained for 3 years, I have to modify it myself.
Of course, a proper way would be to perform a complete rewrite in pure python code (which would also results in a faster implementation over the current one), but I lack the required knowledge both in cryptograhy and background math to do this, and the program using it is very very list intensive (which requires a python3 without a gil for multithreading or python3 with a jit).
The single point of failure boils down to this function which has to be called by C code:
static PyObject*
_Py_strhex(const char* argbuf, const Py_ssize_t arglen)
{
static const char *hexdigits = "0123456789abcdef";
PyObject *retval;
#if PY_MAJOR_VERSION >= 3
Py_UCS1 *retbuf;
#else
char *retbuf;
#endif
Py_ssize_t i, j;
assert(arglen >= 0);
if (arglen > PY_SSIZE_T_MAX / 2)
return PyErr_NoMemory();
#if PY_MAJOR_VERSION >= 3
retval = PyUnicode_New(arglen * 2, 127);
if (!retval)
return NULL;
retbuf = PyUnicode_1BYTE_DATA(retval);
#else
retval = PyString_FromStringAndSize(NULL, arglen * 2);
if (!retval)
return NULL;
retbuf = PyString_AsString(retval);
if (!retbuf) {
Py_DECREF(retval);
return NULL;
}
#endif
/* make hex version of string, taken from shamodule.c */
for (i=j=0; i < arglen; i++) {
unsigned char c;
c = (argbuf[i] >> 4) & 0xf;
retbuf[j++] = hexdigits[c];
c = argbuf[i] & 0xf;
retbuf[j++] = hexdigits[c];
}
return retval;
}
cython compatibility level is at 3.2 for pypy and PyUnicode_New was introduced in python3.3.
I tried the hammer way to fix it with replacing the whole file with the following cython code:
cdef Py_strhex(const char* argbuf, const Py_ssize_t arglen):
return (argbuf[:arglen]).hex()
but it seems it triggers a segmentation fault including compiling and using the official Python implementation. And using the official PyPy binary, I don’t have the debugging symbols for gdb so I don’t know why.
(gdb) bt
#0 0x00007ffff564cd00 in pypy_g_text_w__pypy_interpreter_baseobjspace_W_Root () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#1 0x00007ffff5d721a8 in pypy_g_getattr () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#2 0x00007ffff543a8bd in pypy_g_dispatcher_15 () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#3 0x00007ffff5ab909b in pypy_g_wrapper_second_level.star_2_14 () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#4 0x00007fffd7212372 in _Py_strhex.2738 () from /usr/lib64/pypy3.6-v7.2.0-linux64/site-packages/pysha3-1.0.3.dev1-py3.6-linux-x86_64.egg/_pysha3.pypy3-72-x86_64-linux-gnu.so
#5 0x00007fffd7217990 in _sha3_sha3_224_hexdigest_impl.2958 () from /usr/lib64/pypy3.6-v7.2.0-linux64/site-packages/pysha3-1.0.3.dev1-py3.6-linux-x86_64.egg/_pysha3.pypy3-72-x86_64-linux-gnu.so
#6 0x00007ffff5be2170 in pypy_g_generic_cpy_call__StdObjSpaceConst_funcPtr_SomeI_5 () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#7 0x00007ffff54b25cd in pypy_g.call_1 () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#8 0x00007ffff56715b9 in pypy_g_BuiltinCodePassThroughArguments1_funcrun_obj () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#9 0x00007ffff56ffc06 in pypy_g_call_valuestack__AccessDirect_None () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
#10 0x00007ffff5edb29b in pypy_g_CALL_METHOD__AccessDirect_star_1 () from /usr/lib64/pypy3.6-v7.2.0-linux64/bin/libpypy3-c.so
Increasing the default Linux stack depth to 65Mb doesn’t change the depth of recursion where the segfault happens so even if the stack depth is larger than 200, this doesn’t seems to be related to a stack overflow.
hashlib.sha3that comes with Python 3.6.