I'm trying to formulate an LP that is in essence a variant of the sudoku problem, and I've repurposed the code from https://coin-or.github.io/pulp/CaseStudies/a_sudoku_problem.html.
The differences are now that instead of a Boxes constraint in the original sudoku formulation, I now want each possible pair-wise sequences across a row to appear at least once (ie, the sequence 1-2 should appear once, 2-1 should appear once, etc).
In this specific scenario it is equivalent to having no duplicated pairwise sequences, but I intend to generalize this code for values/rows/columns that aren't equal.
To that end I've attempted two ways of formulating it:
- Making a new binary variable for the link between two cells, denoted [v1][v2][r][l], where l can take values from 1-8 (8 links in total between 9 cells).
# All rows, columns and values within a Sudoku take values from 1 to 9
VALS = ROWS = COLS = range(1, 10)
LINKS = range(1, 9) #8 links between cells in a given row
links = LpVariable.dicts("Links", (VALS, VALS, ROWS, LINKS), cat = "Binary")
Just like choices[v][r][c] is denoted as:
- 1 if the cell at (r,c) has value v
- 0 otherwise
links[v1][v2][r][l] is denoted as:
- 1 if cell at [r][l] has value v1 and cell at [r][l + 1] has value v2
- 0 otherwise
After setting the typical constraints on links (each individual link sums to 1 across all [v1][v2]), I then set the constraint as such:
for v1 in VALS:
for v2 in VALS:
if v1 != v2:
prob += lpSum([links[v1][v2][r][l] for r in ROWS for l in LINKS]) >= 1
#Each link of v1 - v2 should appear at least once across all rows/links
The problem is now that I'm dealing with significantly more binary variables with this formulation - the original sudoku LP deals with 729 binary variables and here I deal with 5000-ish at the minimum and I'm not sure if this is efficient (or even feasible if I expand the range of rows/columns/values).
- Having a multiplicative boolean variable
Since each individual link is a binary variable that's 1 if both cell choices are 1, link[v1][v2][r][c] can be expressed as choices[v1][r][c] * choices[v2][r][c+1], and as such it should be possible to formulate a constraint as:
for v1 in VALS:
for v2 in VALS:
if v1 != v2:
prob += lpSum(choices[v1][r][l] * choices[v2][r][l+1] for r in ROWS for c in LINKS) >= 1
but from what I understand of PuLP, constraints cannot involve multiplication since that makes them not linear. Linearizing it like in 1) would resolve it but run into the issues of way too many variables and constraints for a feasible run time if the problem is expanded in any way.
Are there better ways to formulate this constraint such that it doesn't require the creation of a 4d array to specify/solve for each individual link, and doesn't require multiplication in its constraints?