I am writing a set of Python wrappers via pybind11 for an optimization library whose heavy-lifting code was written in C++.
The abstract class hierarchy of my C++ code to be wrapped currently looks something like this (multivariate.h):
typedef std::function<double(const double*)> multivariate;
struct multivariate_problem {
// objective
multivariate _f;
int _n;
// bound constraints
double *_lower;
double *_upper;
multivariate_problem(const multivariate f, const int n, double *lower,
double *upper):
_f(f), _n(n), _lower(lower), _upper(upper)) {
}
};
struct multivariate_solution {
... // some code here for return a solution (seems to work ok)
};
class MultivariateOptimizer {
public:
virtual ~MultivariateOptimizer() {
}
// initialize the optimizer
virtual void init(const multivariate_problem &f, const double *guess) = 0;
// perform one step of iteration
virtual void iterate() = 0;
// retrieve current solution
virtual multivariate_solution solution() = 0;
// this essentially calls init(), performs a number of iterate() calls, returns solution()
virtual multivariate_solution optimize(const multivariate_problem &f,
const double *guess) = 0;
};
Now, my pybind code looks something like this currently (multivariate_py.cpp):
// wrap the function expressions
typedef std::function<double(const py::array_t<double>&)> py_multivariate;
// wrap the multivariable optimizer
void build_multivariate(py::module_ &m) {
// wrap the solution object
py::class_<multivariate_solution> solution(m, "MultivariateSolution");
...
// wrap the solver
py::class_<MultivariateOptimizer> solver(m, "MultivariateSearch");
// corresponds to MultivariateSearch::optimize()
solver.def("optimize",
[](MultivariateOptimizer &self, py_multivariate py_f,
py::array_t<double> &py_lower,
py::array_t<double> &py_upper,
py::array_t<double> &py_guess) {
const int n = py_lower.size();
double *lower = static_cast<double*>(py_lower.request().ptr);
double *upper = static_cast<double*>(py_upper.request().ptr);
const multivariate &f = [&py_f, &n](const double *x) -> double {
const auto &py_x = py::array_t<double>(n, x);
return py_f(py_x);
};
const multivariate_problem &prob { f, n, lower, upper };
double *guess = static_cast<double*>(py_guess.request().ptr);
return self.optimize(prob, guess);
}, "f"_a, "lower"_a, "upper"_a, "guess"_a,
py::call_guard<py::scoped_ostream_redirect,
py::scoped_estream_redirect>());
// corresponds to MultivariateSearch::init()
solver.def("initialize",
[](MultivariateOptimizer &self, py_multivariate py_f,
py::array_t<double> &py_lower,
py::array_t<double> &py_upper,
py::array_t<double> &py_guess) {
const int n = py_lower.size();
double *lower = static_cast<double*>(py_lower.request().ptr);
double *upper = static_cast<double*>(py_upper.request().ptr);
const multivariate &f = [&py_f, &n](const double *x) -> double {
const auto &py_x = py::array_t<double>(n, x);
return py_f(py_x);
};
const multivariate_problem &prob { f, n, lower, upper };
double *guess = static_cast<double*>(py_guess.request().ptr);
return self.init(prob, guess);
}, "f"_a, "lower"_a, "upper"_a, "guess"_a,
py::call_guard<py::scoped_ostream_redirect,
py::scoped_estream_redirect>());
// corresponds to MultivariateSearch::iterate()
solver.def("iterate", &MultivariateOptimizer::iterate);
// corresponds to MultivariateSearch::solution()
solver.def("solution", &MultivariateOptimizer::solution);
// put algorithm-specific bindings here
build_acd(m);
build_amalgam(m);
build_basin_hopping(m);
...
As you can see, I internally wrap the python function to a C++ function, then wrap the function inside a multivariate_problem object to pass to my C++ backend.
Which would then be called by pybind with the following entry point (mylibrary.cpp):
namespace py = pybind11;
void build_multivariate(py::module_&);
PYBIND11_MODULE(mypackagename, m) {
build_multivariate(m);
...
}
pybind11 will compile this without errors through setup.py via typical pip install . commands. In fact, most of the code is working correctly for both initialize() and optimize() calls. For example, the following Python code will run fine (assuming mylibrary is replaced with the name of my package in setup.py, and MySolverName is a particular instance of MultivariateSearch:
import numpy as np
from mylibrary import MySolverName
# function to optimize
def fx(x):
total = 0.0
for x1, x2 in zip(x[:-1], x[1:]):
total += 100 * (x2 - x1 ** 2) ** 2 + (1 - x1) ** 2
return total
n = 10 # dimension of problem
alg = MySolverName(some_args_to_pass...)
sol = alg.optimize(fx,
lower=-10 * np.ones(n),
upper=10 * np.ones(n),
guess=np.random.uniform(low=-10., high=10., size=n))
print(sol)
However, here is where I am currently stuck. I would also like the user to have an option to run the solvers interactively, which is where the initialize and iterate functions come into play. However, the binding provided above, which works for optimize, does not work for iterate.
To illustrate a use case, the following code does not run:
import numpy as np
from mylibrary import MySolverName
# function to optimize
def fx(x):
... return result here for np.ndarray x
n = 10 # dimension of problem
alg = MySolverName(some_args_to_pass...)
alg.initialize(f=fx,
lower=-10 * np.ones(n),
upper=10 * np.ones(n),
guess=np.random.uniform(low=-10., high=10., size=n))
alg.iterate()
print(alg.solution())
When the iterate command is executed, the script hangs and then terminates after a while, typically with no error, and it does not print the last line. In rare cases, it produces a crash on the Python side that the "dimension cannot be negative", but there is no guidance on which memory this pertains to. Without the iterate() line, the code will successfully print the last line as expected.
When I std::cout inside the C++ iterate() function, it always hangs at the point right before any calls to the function fx and does not print anything after the call, so I have narrowed down the problem to the Python function not persisting correctly with iterate() as the entry point. However, optimize() calls iterate() internally within the C++ code, and the previous example is working successfully, so the error is rather odd.
I've tried to go through the documentation for pybind, but I could not find an example that achieves precisely what I am trying to do above. I have tried other solutions like keep_alive<>, to no avail.
Can you help me modify the pybind wrappers, such that the internally stored Python function will persist inside the C++ object, and execute correctly even after the control is again restored to the Python interpreter between iterate() calls? Thanks for your help!
solver.def("initialize"definition, you probably have dangling references and/or pointers that cause undefined behavior. It might not manifest inoptimize, because it's executed as a single statement. To be sure you would need to include some simple example of concrete Solver class. But even better, change your interfaces to takestd::vector<double>instead ofdouble*, and then all your problems will magically disappear. If you don' want to changeMultivariateOptimizer, then provide a wrapper class.