
I'm embedding Python in a C++ plug-in. The plug-in calls a Python algorithm dozens of times during each session, each time sending the algorithm different data. So far, so good.

But now I have a problem: the algorithm sometimes takes minutes to return a solution, and during that time the conditions often change, making that solution irrelevant. So, what I want is to be able to stop the algorithm at any moment and immediately run it again with a different set of data.

Here's the C++ code for embedding Python that I have so far:

void py_embed (void *data){

    counter_thread = false;

    PyObject *pName, *pModule, *pDict, *pFunc;

    // Inform the interpreter about paths to Python run-time libraries
    Py_SetProgramName(arg->argv[0]);

    if(!gil_init){
        gil_init = 1;
        PyEval_InitThreads();
        PyEval_SaveThread();
    }
    PyGILState_STATE gstate = PyGILState_Ensure();

    // Build the name object
    pName = PyString_FromString(arg->argv[1]);
    if( !pName ){
        textfile3<<"Can't build the object "<<endl;
    }

    // Load the module object
    pModule = PyImport_Import(pName);
    if( !pModule ){
        textfile3<<"Can't import the module "<<endl;
    }

    // pDict is a borrowed reference
    pDict = PyModule_GetDict(pModule);
    if( !pDict ){
        textfile3<<"Can't get the dict"<<endl;
    }

    // pFunc is also a borrowed reference
    pFunc = PyDict_GetItemString(pDict, arg->argv[2]);
    if( !pFunc || !PyCallable_Check(pFunc) ){
        textfile3<<"Can't get the function"<<endl;
    }

    /* Call the algorithm and treat the data that is returned from it
     ...
     ...
     */

    // Clean up
    Py_XDECREF(pArgs2);
    Py_XDECREF(pValue2);
    Py_DECREF(pModule);
    Py_DECREF(pName);

    PyGILState_Release(gstate);

    counter_thread = true;
    _endthread();

};

Edit: The Python algorithm is not my work, and I shouldn't change it.

2 Comments
  • Can the algorithm be decomposed into small steps (ideally ones that run in bounded time)? Your C++ code could then be: while(stillNeeded) performNextStep(); Commented Jun 19, 2014 at 20:39
  • No, the algorithm is not my work and I shouldn't change it. Commented Jun 19, 2014 at 20:57

4 Answers

6

This is based on a cursory knowledge of Python and a quick read of the Python docs.

PyThreadState_SetAsyncExc lets you inject an exception into a running Python thread.

Run your Python interpreter in some thread. From another thread, acquire the GIL (PyGILState_Ensure) and then call PyThreadState_SetAsyncExc on the thread that is running the script. (This may require some precursor work to teach the Python interpreter about the second thread.)

Unless the Python code you are running is full of "catch-alls", this should cause it to terminate execution.
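Roughly, the injecting side might look like this (a sketch only; script_thread_id, and how the worker records it, are things you'd have to wire up yourself):

#include <Python.h>

// Sketch: the worker thread is assumed to have stored its own thread id
// (e.g. via PyThread_get_thread_ident()) in script_thread_id while it held the GIL.
static volatile long script_thread_id = 0;

void abort_running_script()
{
    PyGILState_STATE gstate = PyGILState_Ensure();

    // Inject SystemExit into the thread running the algorithm; most
    // "except Exception:" handlers will let it propagate.
    int n = PyThreadState_SetAsyncExc(script_thread_id, PyExc_SystemExit);
    if (n == 0) {
        // no thread with that id was found
    } else if (n > 1) {
        // something went wrong, revoke the pending exception
        PyThreadState_SetAsyncExc(script_thread_id, NULL);
    }

    PyGILState_Release(gstate);
}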

You can also look into creating Python sub-interpreters, which would let you start up a new script while the old one shuts down.

Py_AddPendingCall is also tempting to use, but there are enough warnings around it that it's probably best avoided.


1 Comment

Most "catch-alls" only catch subclasses Exception, so you could use an exception with another baseclass (i.e. SystemExit) to avoid them: docs.python.org/2/library/exceptions.html
5

Sorry, but your choices are limited. You can either change the Python code (OK, it's a plug-in, so not an option) or run it in another PROCESS (with some nice IPC in between). Then you can use the system API to wipe it out.

3 Comments

So, if I'm understanding correctly, your suggestion is to use processes instead of threads? Or better, to put the py_embed thread in a process and kill it when I want? I'm amazed that there is no option in the Python/C API to terminate a worker thread from the main thread or another thread...
@JoãoPereira terminating a thread while it is accessing some unknown portion of memory will indeed stop it from running, but it will leave your application in an incredibly hard-to-predict state. You could leak file handles or memory, corrupt the heap (suppose you are in the middle of returning memory to the heap and on some instruction you simply stop...), or break other similar operations. The OS provides a capsule with known properties at termination (a process), with all the performance penalties attached to such wrapping.
@JoãoPereira, that kind of control should be done with the agreement (and participation, e.g. registering a signal handler) of the Python code running inside the thread, but if you can't get your hands on the Python side, you have to use tricks to handle it.
3

I've been thinking about this problem, and I agree that sub-interpreters may provide you with one possible solution: https://docs.python.org/2/c-api/init.html#sub-interpreter-support. The API supports calls for creating new interpreters and ending existing ones. The bugs-and-caveats section describes some issues that, depending on your architecture, may or may not present a problem.

Another possible solution is to use the Python multiprocessing module, and within your worker test a global variable (something like time_to_die). Then from the parent you grab the GIL, set the variable, release the GIL, and wait for the child to finish.
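The C++ side of that flag idea could be as small as this (a sketch; it assumes the module actually defines and periodically checks a global named time_to_die, which in your case would mean touching the Python code):

#include <Python.h>

// Sketch: pModule is the already-imported module object from py_embed;
// time_to_die is a hypothetical global that the algorithm polls now and then.
void request_stop(PyObject *pModule)
{
    PyGILState_STATE gstate = PyGILState_Ensure();
    PyObject_SetAttrString(pModule, "time_to_die", Py_True);
    PyGILState_Release(gstate);
}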

But then another idea occurred to me. Why not just use fork(), init your Python interpreter in the child, and when the parent decides it's time for the Python run to end, just kill it? Something like this:

void process() {

    int pid = fork();
    if (pid) {
        // in parent
        sleep(60);
        kill(pid, 9);
    }
    else {
        // in child
        Py_Initialize();
        PyRun_SimpleString("# insert long running python calculation");
    }
}

(This example assumes *nix; if you're on Windows, substitute CreateProcess()/TerminateProcess().)
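On Windows the same shape might look roughly like this (a sketch; run_algorithm.exe is a placeholder for a small helper that embeds Python and runs the calculation):

#include <windows.h>

void process()
{
    STARTUPINFOA si = { sizeof(si) };
    PROCESS_INFORMATION pi = { 0 };
    char cmd[] = "run_algorithm.exe";     // placeholder child that runs the Python calculation

    if (CreateProcessA(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi)) {
        Sleep(60000);                      // parent decides the result is stale
        TerminateProcess(pi.hProcess, 9);  // analogous to kill(pid, 9)
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
    }
}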

3 Comments

Suggestion 1 is bad because I think any new sub-interpreter has to import the modules again, and that takes some time. As for the other two suggestions, I like the fork() one better (even though I have no idea what it is yet), but I'm definitely going to look into both. Best answer so far, and for that I'm awarding the bounty.
Just one thing: I'm on Windows and I'm not finding this CreateProcess() stuff easy to understand. Can you update your example from Linux to Windows?
CreateProcess is the Win32 call to fork a child process, disconnected from the parent. Here's a quick example: msdn.microsoft.com/en-us/library/windows/desktop/…. The concept is identical to fork(), although the Microsoft call has quite a few extra options (most of which can just be NULL). The first action of your new child would be to initialize Python and do its thing. You can then kill it whenever you want -- completely safely. It will have no impact on the parent when you choose to kill it.
2

So, I finally thought of a solution (more of a workaround, really).

Instead of terminating the thread that is running the algorithm (let's call it T1), I create another one (T2) with the set of data that is relevant at that time.

In every thread I do this:

thread_counter += 1;              // global variable
int thisthread = thread_counter;

and after Python returns a solution, I just check which thread is the most "recent" one, T1 or T2:

if(thisthread == thread_counter){
    /* save the solution and treat it */
}
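For completeness, the launching side of this workaround is nothing more than the following (a sketch; the _beginthread call mirrors the _endthread already used in py_embed, and fresh_data stands for whatever struct you pass in):

#include <process.h>   // _beginthread / _endthread

// Sketch: whenever the conditions change, just start a newer run; older runs
// finish in the background, fail the thisthread == thread_counter test, and
// their solutions are discarded.
void start_new_run(void *fresh_data)
{
    _beginthread(py_embed, 0, fresh_data);
}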

In terms of computational effort this is obviously not the best solution, but it serves my purposes.

Thank you for the help, guys.

2 Comments

Are you really sure about this solution? You could end up in a situation where you get more and more threads, and because each thread gets less CPU time you could end up waiting much longer.
Actually, after some time trying this solution I found out that the problem is exactly what you say... Time to try some of the suggested solutions.
