I am trying to build an application that combines Python code and C++, and I need some kind of protection against infinite loops or executions that take too long. I followed several threads to see how other people solved this, but couldn't find a working solution. I have this example code:

#include <fstream>
#include <iostream>
#include <string>
#include <thread>
#include <atomic>
#include <chrono>
#include <cstdlib>

#include "pybind11/pybind11.h"
#include "pybind11/embed.h"

namespace py = pybind11;

std::atomic<bool> stopped(false);
std::atomic<bool> executed(false);

void python_executor(const std::string &code)
{
  py::gil_scoped_acquire acquire;
  try
  {
      std::cout << "+ Executing python script..." << std::endl;

      py::exec(code);

      std::cout << "+ Finished python script" << std::endl;
  }
  catch (const std::exception &e)
  {
      std::cout << "@ " << e.what() << std::endl;
  }

  executed = true;

  std::cout << "+ Terminated normal" << std::endl;
}

int main()
{
    std::cout << "+ Starting..." << std::endl;

    std::string code = R"(
# infinite_writer.py
import time

file_path = r"C:\Temp\loop.txt"
counter = 1

while True:
    with open(file_path, "a") as f:
        f.write(f"Line {counter}\n")
    counter += 1
    time.sleep(1)  # optional: wait 1 second between writes

)";
    py::scoped_interpreter interpreterGuard{};
    py::gil_scoped_release release;
    std::thread th(python_executor, code);
    auto threadId = th.get_id();
    std::cout << "+ Thread: " << threadId << std::endl;

    // stopped = true;
    int maxExecutionTime = 10;
    auto start = std::chrono::steady_clock::now();
    while (!executed)
    {
      auto elapsed = std::chrono::steady_clock::now() - start;
      if (elapsed > std::chrono::seconds(maxExecutionTime)) {
        std::cout << "Interrupting...";
        PyErr_SetInterrupt();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        if (th.joinable()) {
          th.join();
          executed = true;
          break;
        }
      }
      std::this_thread::sleep_for(std::chrono::seconds(1));
      std::cout << "+ Waiting..." << std::endl;

    }

    // Make sure to join the thread if it's still running
    if (th.joinable()) {
        th.join();
    }

    std::cout << "+ Finished" << std::endl;

    return EXIT_SUCCESS;
}

The Python code is an infinite loop on purpose, so that I can try to stop it when the timeout hits. But it never finishes. I get the following console output:

+ Starting...
+ Thread: 28596
+ Executing python script...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
+ Waiting...
Interrupting...

It stays at this point indefinitely, which shows that the Python code never stopped executing. What can I do to correctly end the execution?

I tried it exactly as in the example code, with PyErr_SetInterrupt, and I also tried PyEval_SetTrace as suggested in this GitHub thread: https://github.com/pybind/pybind11/issues/2677. The result was the same: the code keeps running after I try to stop it.

  • staging-ground-comment from anonymous user: "There isn't much you can do with a thread other than wait it out. If you can, write the thread function with regular checks of a termination flag. If the flag is set, the thread politely exits. If you can't write a thread that can politely exit on demand, consider running the potentially too slow code in another process that you can safely kill. Avoid killing a thread because too many things can go wrong and leave you with an unstable program." Commented Oct 14 at 20:16
  • Side note: The asker says in a comment on Staging Ground that the end goal is to allow clients to run their own code. Isolate this code in another process. There is NOTHING you can do to protect clients from themselves, but you can protect your system from the client by isolating their code. This will make things a bit slower and be trickier for you to implement, but it's in your interest to minimize the amount of debugging of your own code that results from bugs in the clients' plug-ins. Commented Oct 14 at 20:32
  • Have you tried intentionally throwing an exception in the Python code after hitting your timeout? Commented Oct 14 at 21:53
  • I haven't tried that yet, because I won't have control over the code: it is going to be provided by the user in the application. But I am trying to figure out a way to wrap the user code in some Python code to get the behavior I'm aiming for. Commented Oct 14 at 22:22
  • You should call something like PyThreadState_SetAsyncExc(python_executor_thread_id, PyExc_KeyboardInterrupt); while holding the GIL, instead of PyErr_SetInterrupt. The python_executor thread should set python_executor_thread_id after acquiring the GIL. Commented Oct 14 at 23:42
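
The suggestion in the last comment can be tried out from plain Python, since ctypes exposes the same C API function. The following is a minimal sketch, assuming CPython; `user_code` and `run_with_timeout` are hypothetical names made up for this illustration, and the thread id passed in is the Python thread identifier (`Thread.ident`). It is not the asker's C++ setup, only a way to see the mechanism in isolation.

```python
import ctypes
import threading
import time


def user_code():
    # Stand-in for untrusted code: an endless loop of short sleeps.
    while True:
        time.sleep(0.05)


def run_with_timeout(target, timeout_s):
    """Run target in a thread; inject KeyboardInterrupt if it overruns.

    Returns True if the thread has stopped by the time this function returns.
    """
    def guarded():
        try:
            target()
        except KeyboardInterrupt:
            pass  # the injected exception ends the user code here

    worker = threading.Thread(target=guarded, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # Ask the interpreter to raise KeyboardInterrupt inside the worker.
        # ctypes.pythonapi keeps the GIL held during the call.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(worker.ident), ctypes.py_object(KeyboardInterrupt)
        )
        worker.join(2.0)  # give the interpreter time to deliver it
    return not worker.is_alive()
```

Two limits are worth keeping in mind. The exception is only delivered once the thread is back to executing Python bytecode, so a long blocking call delays it (with the `time.sleep(1)` from the question, by up to a second). And user code that catches `KeyboardInterrupt` or uses a bare `except:` can swallow it, which is one reason the other comments recommend a separate process.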

1 Answer

You can isolate the execution of the user's Python code in a child process and send it a signal once the timeout expires.

  1. Fork a child process.

  2. Start executing the user's Python code in it.

  3. In the main process, start a thread that sends a signal to the child process when the time is up.

  4. In the main process, wait for the child to finish (normally or because of the signal).

Example for Unix-like OS:

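#include <iostream>
#include <string>
#include <thread>
#include <chrono>
#include <atomic>
#include <memory>
#include <cstdio>
#include <cstdlib>
#include <csignal>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>  // fork, execvp, _exit

// Build a NULL-terminated argv array that points into the given strings.
std::vector<char*> vecstr2vecc(const std::vector<std::string>& str_vec)
{
    std::vector<char*> res(str_vec.size() + 1, nullptr);
    for (size_t i = 0; i < str_vec.size(); i++)
        res[i] = const_cast<char*>(str_vec[i].c_str());
    return res;
}

void task(const std::string& code, size_t maxExecutionTime)
{
    pid_t pid = fork();
    if (pid < 0)  // fork failed
    {
        std::perror("fork");
        return;
    }
    if (pid > 0)  // parent
    {
        std::cout << "Started task with pid=" << pid << std::endl;
        // Shared flag: keeps the detached timer from signalling a pid that
        // has already been reaped (and could have been reused).
        auto finished = std::make_shared<std::atomic<bool>>(false);
        std::thread task_timeout_stopper([pid, maxExecutionTime, finished]() {
            std::this_thread::sleep_for(std::chrono::seconds(maxExecutionTime));
            if (!finished->load() && kill(pid, SIGINT) == 0)
            {
                std::cout << "Sent signal to pid=" << pid << std::endl;
            }
        });
        task_timeout_stopper.detach();
        waitpid(pid, NULL, 0);
        finished->store(true);
        std::cout << "pid=" << pid << " finished." << std::endl;
    }
    else  // child
    {
        std::vector<std::string> args = { "python", "-c", code };
        execvp(args.front().c_str(), vecstr2vecc(args).data());
        // execvp only returns on failure; exit so the child never runs the parent's code
        std::perror("execvp");
        _exit(127);
    }
}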


int main()
{
    std::string code = R"(
import time

file_path = r"loop.txt"
counter = 1

while counter < 10:  # 9 iterations, so about 9 seconds
    with open(file_path, "a") as f:
        f.write(f"Line {counter}\n")
    counter += 1
    time.sleep(1)  # optional: wait 1 second between writes

)";

    task(code, 15); // this task ends on its own after about 9 seconds, before the 15-second timeout
    task(code, 5);  // this task will be interrupted after 5 seconds
    return 0;
}

This gives me the following output:

Started task with pid=120700
pid=120700 finished.
Started task with pid=120737
Sent signal to pid=120737
Traceback (most recent call last):
  File "<string>", line 12, in <module>
KeyboardInterrupt
pid=120737 finished.

The contents of the loop.txt file are as expected:

Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 1
Line 2
Line 3
Line 4
Line 5

4 Comments

Really sorry for not answering sooner. This is a really good alternative. Just one question: I would need to get the final context from pybind after the execution. Is that possible with this implementation? I'm very much a noob when it comes to threads and forks.
@Luck_Silva What kind of context from pybind11 do you mean? According to github.com/pybind/pybind11/blob/master/include/pybind11/eval.h, the pybind11::exec() function just returns None after executing multi-statement Python code.
When you pass a context dict to the exec call, you can access that dict after the call. Any new variables created during the execution are then available in C++. Consider this: I have a script that receives 2 parameters, "a" and "b", both numbers. Inside the script, it creates a third variable, "c", which is calculated from "a" and "b". I want to access this result variable "c" after the call to exec().
Please remember that in the implementation above, the Python code is executed in a separate process. So after the execution the locals and globals dicts exist only in the child process, and you have to transfer them to the main process yourself. The easiest way to exchange data between processes is a file. In the child process, you can serialize the locals and globals dicts to JSON text or binary pickle, for example into a file named task123_context.txt. In the main process, you can read this data into your C++ pybind code, as long as you know the task id. This works for simple types, but it can be challenging for arbitrary types.
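
The file-based exchange described in the last comment could look roughly like this. It is a minimal sketch under stated assumptions, not part of the answer's C++ code: `WRAPPER` and `run_isolated` are hypothetical names, the inputs are assumed to be JSON-serializable, and anything in the final context that cannot be serialized (modules, functions, custom objects) is silently dropped. The parent side is written in Python here only to keep the example self-contained; the C++ parent from the answer would launch the same wrapper with `python -c` and read the JSON file afterwards.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Child-side wrapper (hypothetical): run the user code in a dict, then dump
# whatever is JSON-serializable to the file named on the command line.
WRAPPER = r"""
import json, sys
context = json.loads(sys.argv[2])        # input variables, e.g. a and b
exec(sys.argv[3], context)               # user code fills the same dict
result = {}
for name, value in context.items():
    if name.startswith("__"):
        continue                          # skip __builtins__ and friends
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        continue                          # modules, functions, ... are dropped
    result[name] = value
with open(sys.argv[1], "w") as f:
    json.dump(result, f)
"""


def run_isolated(code, inputs, timeout_s):
    """Run code in a child interpreter and return its final variables."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = Path(tmp) / "context.json"
        subprocess.run(
            [sys.executable, "-c", WRAPPER, str(out_path), json.dumps(inputs), code],
            timeout=timeout_s,  # the child is killed and TimeoutExpired is raised on overrun
            check=True,
        )
        return json.loads(out_path.read_text())
```

For the example from the comments, `run_isolated("c = a + b", {"a": 2, "b": 3}, 10)` gives back a dict that contains `a`, `b` and the computed `c`. JSON is used instead of pickle here because reading a pickle written by untrusted code is unsafe.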
