Subset Sum Problem¶
This page discuss the subset sum problem as a simple example of formulation and solving with the Amplify SDK.
A subset sum problem is the following problem.
Let \(N\) and \(K\) be positive integers. There is a sequence \(A_0, A_1, \ldots, A_{N - 1}\) consisting of \(N\) positive integers. Find a subsequence of this sequence whose sum of elements is closest to \(K\).
Alternatively, the following problem is a subset sum problem.
There are \(N\) videos of varying lengths. Which of these videos should you select to make the total time closest to \(K\) minutes?
Creating the Problem¶
Before solving a subset sum problem with the Amplify SDK, we first create a subset sum problem and express it in program code. In this example, we will work with a simple problem with \(N=10\) elements.
import numpy as np
N = 10
K = 27
A = np.array([2, 10, 3, 8, 5, 7, 9, 5, 3, 2])
Formulation with the Amplify SDK¶
To perform the formulation with Amplify SDK, we need to reformulate the subset sum problem as a polynomial minimization problem.
First, “the sum of the elements of the subsequence closest to \(K\)” can be reformulated as “the difference between the sum of the elements of the subsequence and \(K\) squared is the least”. Also, “the sum of the elements of the subsequence” can be written as \(q_0 A_0 + q_1 A_1 + \cdots + q_{N-1} A_{N-1}\), where the variable \(q_i\) takes \(1\) if the \(i\)-th element of the sequence is chosen, \(0\) otherwise.
Thus, we can formulate the subset sum problem as follows.
If the variables \(q_0, q_1, \cdots, q_{N-1}\) take the values \(0\) or \(1\), minimize the function value below.
The above \(\quad (q_0 A_0 + q_1 A_1 + \cdots + q_{N-1} A_{N-1} - K)^2\) is called the objective function.
Let’s express the above in the Amplify SDK.
First, create the decision variables (the variables we will optimize in combinatorial optimization). First, instantiate a class VariableGenerator
to issue variables. \(N\) variables are required, all of which take \(0\) or \(1\) values. Variables that take \(0\) or \(1\) values are called binary variables, and we can create them easily using the Amplify SDK.
from amplify import VariableGenerator
gen = VariableGenerator()
# The variable type as the 1st and the number of variables as the 2nd arguments.
q = gen.array("Binary", N)
q
Now we create the objective function \((q_0 A_0 + q_1 A_1 + \cdots + q_{N-1} A_{N-1} - K)^2\). Using NumPy-like notation, we can implement as follows.
objective = ((q * A).sum() - K) ** 2
print(objective)
40 q_0 q_1 + 12 q_0 q_2 + 32 q_0 q_3 + 20 q_0 q_4 + 28 q_0 q_5 + 36 q_0 q_6 + 20 q_0 q_7 + 12 q_0 q_8 + 8 q_0 q_9 + 60 q_1 q_2 + 160 q_1 q_3 + 100 q_1 q_4 + 140 q_1 q_5 + 180 q_1 q_6 + 100 q_1 q_7 + 60 q_1 q_8 + 40 q_1 q_9 + 48 q_2 q_3 + 30 q_2 q_4 + 42 q_2 q_5 + 54 q_2 q_6 + 30 q_2 q_7 + 18 q_2 q_8 + 12 q_2 q_9 + 80 q_3 q_4 + 112 q_3 q_5 + 144 q_3 q_6 + 80 q_3 q_7 + 48 q_3 q_8 + 32 q_3 q_9 + 70 q_4 q_5 + 90 q_4 q_6 + 50 q_4 q_7 + 30 q_4 q_8 + 20 q_4 q_9 + 126 q_5 q_6 + 70 q_5 q_7 + 42 q_5 q_8 + 28 q_5 q_9 + 90 q_6 q_7 + 54 q_6 q_8 + 36 q_6 q_9 + 30 q_7 q_8 + 20 q_7 q_9 + 12 q_8 q_9 - 104 q_0 - 440 q_1 - 153 q_2 - 368 q_3 - 245 q_4 - 329 q_5 - 405 q_6 - 245 q_7 - 153 q_8 - 104 q_9 + 729
Note
Of course, in classic Python style, you can implement it as follows.
objective = sum(x * a for x, a in zip(q, A)).
However, depending on the problem and the data size, such writing can be time-consuming. We recommend a NumPy-like writing like objective = (q * A) ** 2
. See Speedup Formulation for details.
Now, we use the objective function to construct a combinatorial optimization model.
from amplify import Model
model = Model(objective)
print(model)
minimize:
40 q_0 q_1 + 12 q_0 q_2 + 32 q_0 q_3 + 20 q_0 q_4 + 28 q_0 q_5 + 36 q_0 q_6 + 20 q_0 q_7 + 12 q_0 q_8 + 8 q_0 q_9 + 60 q_1 q_2 + 160 q_1 q_3 + 100 q_1 q_4 + 140 q_1 q_5 + 180 q_1 q_6 + 100 q_1 q_7 + 60 q_1 q_8 + 40 q_1 q_9 + 48 q_2 q_3 + 30 q_2 q_4 + 42 q_2 q_5 + 54 q_2 q_6 + 30 q_2 q_7 + 18 q_2 q_8 + 12 q_2 q_9 + 80 q_3 q_4 + 112 q_3 q_5 + 144 q_3 q_6 + 80 q_3 q_7 + 48 q_3 q_8 + 32 q_3 q_9 + 70 q_4 q_5 + 90 q_4 q_6 + 50 q_4 q_7 + 30 q_4 q_8 + 20 q_4 q_9 + 126 q_5 q_6 + 70 q_5 q_7 + 42 q_5 q_8 + 28 q_5 q_9 + 90 q_6 q_7 + 54 q_6 q_8 + 36 q_6 q_9 + 30 q_7 q_8 + 20 q_7 q_9 + 12 q_8 q_9 - 104 q_0 - 440 q_1 - 153 q_2 - 368 q_3 - 245 q_4 - 329 q_5 - 405 q_6 - 245 q_7 - 153 q_8 - 104 q_9 + 729
We have completed the model construction.
Solver setup¶
You can configure various combinatorial optimization solvers with the Amplify SDK. On this page, we will use Fixstars Amplify AE, which is provided as a cloud service.
First, create a solver client FixstarsClient
for the Amplify SDK corresponding to Amplify AE to specify which solver to use.
from amplify import FixstarsClient
client = FixstarsClient()
Then, we set up the token and solver’s parameters.
from datetime import timedelta
client.token = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
client.parameters.timeout = timedelta(milliseconds=1000)
This completes the solver setup.
Solver execution with the Amplify SDK¶
Let the Amplify SDK obtain the solution of the combinatorial optimization model created above by executing the solver corresponding to the solver client. The following code executes the Amplify AE.
from amplify import solve
result = solve(model, client)
The value of the variables \(q = (q_0, q_1, \ldots, q_{N-1})\) can be obtained as follows.
q_values = q.evaluate(result.best.values)
print(q_values)
[0. 0. 0. 1. 0. 0. 1. 1. 1. 1.]
The binary variables \(q_i\) take \(1\) if the \(i\)-th element of the number sequence \(A\) is chosen and \(0\) otherwise. The optimal solution to the subset sum problem is as follows.
print(f"selected numbers: {[a for x, a in zip(q_values, A) if x == 1]}")
selected numbers: [8, 9, 5, 3, 2]
You can see that the sum is close to \(K=27\).
print(
f"{K=}, sum of selected numbers = {sum(a for x, a in zip(q_values, A) if x == 1)}"
)
K=27, sum of selected numbers = 27