Forum Archive

ctypes pythonapi version

ywangd

@omz
ctypes.pythonapi always points to the C API of Python 3, regardless of the default interpreter setting. Is there any way to access the Python 2 version of the pythonapi object? It would be even more fantastic if both of them could be accessed without switching the interpreter setting.

I also tried to manually load the library with

ctypes.CDLL(os.path.join(os.path.dirname(sys.executable), 'Frameworks/PythonistaKit.framework/PythonistaKit'))

This seems to load the Python 2 API, and Py_GetVersion does report the version as 2.7, but it is somehow not really usable: many API calls that work with the Python 3 API do not work, or even crash the app.

Any help is appreciated.

dgelessus

@ywangd Perhaps the Python 3 interpreter is the "main" DLL that takes priority. The names of the Python 2 and 3 C functions conflict in most places. Because ctypes.pythonapi is really just a ctypes.PyDLL(None) (i. e. it looks up global symbols rather than those of a specific DLL), you can only really access one version with it, and that does not change with the interpreter version of the console.

If you want to call the active version's C API, you need to wrap its DLL in a ctypes.PyDLL instead of a ctypes.CDLL so the GIL stays held while you call its functions. If you want to call the other version's C API, you can use a normal ctypes.CDLL, but you need to worry about managing the GIL yourself (the PyGILState functions are probably the easiest way).
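As a rough illustration of the first case, the sketch below checks that ctypes.pythonapi is a PyDLL and calls Py_GetVersion through it. It assumes a CPython build where ctypes.pythonapi resolves the running interpreter's symbols; the variable names are only for illustration.

```python
import ctypes
import sys

# ctypes.pythonapi is a ready-made PyDLL for the running interpreter.
# Functions called through a PyDLL keep the GIL held during the call.
api = ctypes.pythonapi
is_pydll = isinstance(api, ctypes.PyDLL)

# Py_GetVersion returns a const char *, so declare that before calling it.
api.Py_GetVersion.argtypes = []
api.Py_GetVersion.restype = ctypes.c_char_p
version = api.Py_GetVersion().decode("utf-8")

# The first word of the version string is the version number itself.
print(is_pydll, version.split()[0])
```

To wrap a specific library instead, ctypes.PyDLL(path) takes the same path argument as ctypes.CDLL(path).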

ywangd

Thanks @dgelessus
The use of PyDLL worked for some initial tests!

ywangd

@omz @dgelessus
The following simple code using pythonapi works well in Python 2 but errors out in Python 3.

import ctypes

p3 = ctypes.pythonapi
state = p3.PyGILState_Ensure()
p3.PyRun_SimpleString('print(42)')
p3.PyGILState_Release(state)

The error is "name 'p' is not defined", which is very weird because it suggests that the API does not even parse the given string correctly. It somehow tries to look up a variable named p, which is in fact the first character of print.

dgelessus

@ywangd That's a unicode/str/bytes issue. Short answer: string arguments to Python's C API need to be byte strings (b"...") unless stated otherwise in the docs. Long explanation below. :)

In C, the type char represents a byte (which is generally agreed to be 8 bits nowadays). Most code uses char * (a pointer to a char, which is effectively used as an array of unknown size) as the data type for "strings". Because a char is only 8 bits wide, it can't hold a full Unicode code point. There is also the wchar_t data type, whose size is not really standardized either, but it's wider than char and can usually hold a Unicode code point, so APIs that support Unicode properly use wchar_t * instead of char * for strings.

In Python 2, the situation is similar. str is like C's char * - it's made of 8-bit bytes and can't hold Unicode text properly, and unicode is like C's wchar_t * and supports full Unicode. That's why ctypes converts str to char * and unicode to wchar_t * and vice versa.

Now Python 3 comes along and cleans up a lot of Python 2's Unicode issues. In Python 3, you have the two data types bytes and str. Python 3's bytes is an 8-bit string like Python 2's str, and Python 3's str is a Unicode string like Python 2's unicode. And most importantly, in both Python versions the string "hello" is of type str, which means that under Python 2 it's 8-bit (i. e. char *) and under Python 3 it's Unicode (i. e. wchar_t *).
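The mapping described above can be checked directly under Python 3. This is only a small sketch using the standard ctypes types; the variable names are made up for the example.

```python
import ctypes

# bytes maps to char *, str maps to wchar_t *.
narrow = ctypes.c_char_p(b"hello")
wide = ctypes.c_wchar_p("hello")

# Under Python 3, a str is not accepted where a char * is expected.
try:
    ctypes.c_char_p("hello")
    str_accepted = True
except TypeError:
    str_accepted = False

# Size of one wchar_t in bytes on this platform (commonly 4, or 2 on Windows).
wchar_size = ctypes.sizeof(ctypes.c_wchar)

print(narrow.value, wide.value, str_accepted, wchar_size)
```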

Python's C API functions, such as PyRun_SimpleString, use normal char * for source code. So under Python 2, your code works fine - "print(42)" is an 8-bit string, gets converted to char *, which is what PyRun_SimpleString wants. Perfect. Under Python 3, "print(42)" is a Unicode string, which gets converted to wchar_t *, and then things go wrong. Because wchar_t is 32 bits wide under iOS, the text print(42) represented as a wchar_t * has three null bytes after each character (which would be used if the character had a higher code point in Unicode). Null bytes are also the "end of string" marker in C. Python reads the start of the wchar_t * string, but expects a char * - it sees a p, then a null byte, and thinks "great, I'm done" and so it just runs p instead of print(42).
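Here is a minimal sketch of both halves of that explanation, assuming Python 3 on a little-endian platform (which is why the first byte of the wide string is the p). It first reads a wchar_t buffer the way a char * consumer would, then runs the same source as bytes. Setting argtypes is an optional extra that makes ctypes reject a str up front instead of silently passing a wchar_t *.

```python
import ctypes
import sys

source = "print(42)"

# Lay the text out in memory as a wchar_t array, like a str argument to ctypes.
wide = ctypes.create_unicode_buffer(source)

# Read that memory as a char * would be read: stop at the first null byte.
seen_as_char_p = ctypes.string_at(ctypes.addressof(wide))
print(sys.byteorder, seen_as_char_p)

# Declare the real parameter type so that passing a str raises an error.
run = ctypes.pythonapi.PyRun_SimpleString
run.argtypes = [ctypes.c_char_p]
run.restype = ctypes.c_int

try:
    run(source)
    rejected = False
except ctypes.ArgumentError:
    rejected = True

# The fix: pass a byte string. ctypes.pythonapi is a PyDLL, so the GIL is
# already held during the call. PyRun_SimpleString returns 0 on success.
status = run(source.encode("utf-8"))
```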

ywangd

Thanks a lot @dgelessus ! One cannot ask for a better answer!