# Speedup Formulation Formulating a large combinatorial optimization problem using Python's for statements can be very time-consuming. The Amplify SDK provides a fast way to formulate large optimization problems in practical time. For example, significant speedups can be achieved using Numpy-like methods in the {py:class}~amplify.PolyArray class. This page describes several methods for fast formulations, using the binary variable formulation of the traveling salesperson problem as an example. ## Formulating the traveling salesperson problem The traveling salesperson problem with $n$ cities can be formulated using $(n+1)n$ binary variables as follows: Let $d$ be a distance matrix, where $d_{i,j}$ denotes the distance between city $i$ and city $j$. {seealso} We will not explain what the variables, objective function, and constraints refer to since the meaning of the formulation is unimportant here. See [](tsp.ipynb) for more details on formulating the traveling salesperson problem.  \begin{align} \text{minimize} \quad & \sum_{0 \leq i, j, k < n} d_{i, j} q_{k, i} q_{k+1, j} & \\ \text{subject to} \quad & \sum_{0 \leq i < n} q_{k, i} = 1 \quad & \text{for} \quad k \in \{0, 1, \ldots, n - 1\}, \\ & \sum_{0 \leq k < n} q_{k, i} = 1 \quad & \text{for} \quad i \in \{0, 1, \ldots, n - 1\}, \\ & q_{0, i} = q_{n, i} \quad & \text{for} \quad i \in \{0, 1, \ldots, n - 1\}, \\ & q_{k, i} \in \{0, 1\} & \end{align} Before starting the formulation, let's create the distance matrix distance (= $d$). This time distance will be generated as a random symmetric matrix. {testcode} import numpy as np NUM_CITIES = 100 # Generate a random symmetric matrix distance = np.zeros((NUM_CITIES, NUM_CITIES)) for i in range(NUM_CITIES): for j in range(i+1, NUM_CITIES): distance[i, j] = distance[j, i] = np.random.rand()  {note} For practical purposes, it is better to generate the coordinates of each city and find the distance matrix. For example, using [scipy](https://scipy.org/) the following method can be used for fast computation. python from scipy.spatial import distance locations = np.random.random((NUM_CITIES, 2)) # randomly generate coordinates distance = distance.cdist(locations, locations, metric='euclidean')   ## Naive formulations The for statement and the list comprehension, you can formulate the problem as follows. {testcode} from amplify import Poly, VariableGenerator, one_hot # Create decision variables gen = VariableGenerator() q = gen.array("Binary", NUM_CITIES + 1, NUM_CITIES) # Set q[NUM_CITIES, i] = q[0, i] for i in range(NUM_CITIES): q[NUM_CITIES, i] = q[0, i] # Construct the objective function objective = 0 for i in range(NUM_CITIES): for j in range(NUM_CITIES): for k in range(NUM_CITIES): objective += distance[i, j] * q[k, i] * q[k + 1, j] # Construct constraints row_one_hot_constraints = sum( one_hot(sum(q[i, k] for k in range(NUM_CITIES))) for i in range(NUM_CITIES) ) col_one_hot_constraints = sum( one_hot(sum(q[i, k] for i in range(NUM_CITIES))) for k in range(NUM_CITIES) ) # Create a model model = objective + (row_one_hot_constraints + col_one_hot_constraints)  The resulting formulation time is as follows (depending on the execution environment). {list-table} :header-rows: 1 :width: 90% :widths: 1 1 1 * - Objective function - Constraints - Total * - 1.55 s - 13.3 ms - 1.56 s  ## Improvement #1: Not using the built-in sum function In the first line of the above code, add {testcode} from amplify import sum  and use the {py:func}amplify.sum function instead of the built-in {py:func}sum function. The {py:func}amplify.sum function is a class sum-specific function provided by the Amplify SDK. Then the formulation time is improved as follows (depending on the execution environment). {list-table} :header-rows: 1 :width: 90% :widths: 1 1 1 * - Objective function - Constraints - Total * - 1.55 s - **3.95 ms** - 1.55 s  Simply importing the Amplify SDK's {py:func}amplify.sum function speeds up the constraint formulation by about a factor of 3. The difference from the overall time is small, but in the case of the traveling salesman problem, this is because the number of constraints ($2n$) is small compared to the number of terms in the objective function ($O\left(n^3\right)$). This optimization can be very effective if the {py:func}sum function is used in areas where the formulation is time-consuming. {note} The Amplify SDK's {py:func}amplify.sum function overrides the {py:func}sum function. The Amplify SDK's {py:func}sum function automatically falls back to the built-in {py:func}sum function for classes not provided by the Amplify SDK, allowing for replacement.  ## Improvement #2: Not using for statements and list comprehension The objective function used to be formulated using the for statement, but Python's for statement is slow. As much as possible, we will try to formulate them using the methods of {py:class}~amplify.PolyArray, a numpy-like polynomial array of the Amplify SDK, or numpy. First, let $A$ be the $n \times n$ matrix consisting of the upper $n rows of$q$and$B$be the$n\times n$matrix consisting of the lower$n$rows. Then the objective function is written as follows. $$\sum_{0 \leq i, j, k < n} d_{i, j} q_{k, i} q_{k+1, j} = \sum_{0 \leq i, j, k < n} d_{i, j} A_{k, i} B_{k, j}$$ Looking at$A$and$d$, $$(Ad)_{k, j} = \sum_{0 \leq k, j < n} A_{k, i} d_{i, j}.$$ Therefore, the objective function$\displaystyle \sum_{0 \leq i, j, k < n} d_{i, j} A_{k, i} B_{k, j}$can be the sum of "the matrix product of ($A$and$d$) and the element product of$B$". Thus, the objective function can be written with the Amplify SDK as follows. {testcode} q1 = q[:-1] q2 = q[1:] objective = ((q1 @ distance) * q2).sum()  Constraints can also be passed to the {py:func}~amplify.one_hot function with the axis keyword argument and written as follows.  row_one_hot_constraints = one_hot(q1, axis=1) col_one_hot_constraints = one_hot(q1, axis=0)  Overall the code will looks like this. {testcode} from amplify import VariableGenerator, one_hot # Generate decision variables gen = VariableGenerator() q = gen.array("Binary", NUM_CITIES + 1, NUM_CITIES) # Set q[NUM_CITIES, i] = q[0, i] q[-1, :] = q[0, :] # Create slices of q q1 = q[:-1] q2 = q[1:] # Constructe the objective function objective = ((q1 @ distance) * q2).sum() # Construct the constraints row_one_hot_constraints = one_hot(q1, axis=1) col_one_hot_constraints = one_hot(q1, axis=0) model = objective + (row_one_hot_constraints + col_one_hot_constraints)  With this modification, the time taken is improved as follows (depending on the execution environment). {list-table} :header-rows: 1 :width: 90% :widths: 1 1 1 * - Objective function - Constraints - Total * - **152.1 ms** - **0.870 ms** - **154.4 ms**  ## Improvement #3: taking it straightforwardly In the "Improvement #2", we tried to transform the objective function into a form that would work with numpy's methods, but it takes a lot of work to think about the transformation. The Amplify SDK supports the {py:func}~amplify.einsum function, which can mechanically formulate such a function using Einstein's contraction notation. First, as in "Improvement #2", by denoting the$n \times n$matrix consisting of the upper$n$rows of$q$as$A$and the$n\times n$matrix consisting of the lower n rows as$B$, the objective function is written as follows. $$\sum_{0 \leq i, j, k < n} d_{i, j} q_{k, i} q_{k+1, j} = \sum_{0 \leq i, j, k < n} d_{i, j} A_{k, i} B_{k, j}$$ Note that the range over which the subscripts$i$,$j$, and$k$of the matrices$d$,$A$, and$B$appearing in this formula move coincides with the length of the corresponding row or column, respectively. In such a case, using the {py:func}~amplify.einsum function, the computation of the objective function can be written as follows.  # Construct the objective function objective = einsum("ij,ki,kj->", distance, q1, q2)  Here, the first argument of the {py:func}~amplify.einsum function represents the sequence of subscripts of the product in Einstein's contraction notation, where "ij,ki,kj->"{l=python}" is the product$d_{i, j} A_{k, i} B_{k, j}\$. The second and subsequent arguments then give the arrays corresponding to the respective subscripts. The complete code is as follows. {testcode} from amplify import VariableGenerator, one_hot, einsum # Create decision variables gen = VariableGenerator() q = gen.array("Binary", NUM_CITIES + 1, NUM_CITIES) # Set q[NUM_CITIES, i] = q[0, i] q[-1, :] = q[0, :] # Create slices of q q1 = q[:-1] q2 = q[1:] # Construct the objective function objective = einsum("ij,ki,kj->", distance, q1, q2) # Constract the constraints row_one_hot_constraints = one_hot(q1, axis=1) col_one_hot_constraints = one_hot(q1, axis=0) # Create a model model = objective + (row_one_hot_constraints + col_one_hot_constraints)  With this formulation, we observed the following times (depending on the execution environment). {list-table} :header-rows: 1 :width: 90% :widths: 1 1 1 * - Objective function - Constraints - Total * - **58.11 ms** - 0.870 ms - **60.4 ms**  As a result of the above, the objective function and overall computation time became **25 times** faster, and the constraint construction became **15 times** faster.