BlackBox Optimization with Quantum Annealing and Ising Machines¶
This example program introduces a blackbox optimization method called Factorization Machines with Quantum Annealing (FMQA). Example programs applying FMQA to various fields, such as material search, fluid engineering, chemical plants, and urban transportation, are also available.
Introduction to FMQA¶
Blackbox optimization¶
FMQA is one of the blackbox optimization methods. Usually, in mathematical optimization, the objective is to find a set of decision variables $\boldsymbol{x}$ that minimizes (or maximizes) some objective function $y = f(\boldsymbol{x})$. Here, $\boldsymbol{x}$ is assumed to be a binary variable vector of size $d$, with each element taking the value 0 or 1.
$$ \begin{aligned} \mathrm{Minimize}&\,\,f(\boldsymbol{x}) \\ \mathrm{subject\,\,to\,\,}&\boldsymbol{x} \in \{0,1\}^d \end{aligned} $$
Here, if information about the objective function $f(\boldsymbol{x})$ (functional form, gradient, submodularity, convexity, etc.) is available, efficient optimization can be performed. For example, suppose the function $f(\boldsymbol{x})$ is known and is quadratic in $\boldsymbol{x}$, as in some optimization problems shown in the Amplify demo tutorial. In such a case, $f(\boldsymbol{x})$ can be used directly as the objective function of a quadratic unconstrained binary optimization (QUBO) problem.
On the other hand, in the case of optimization to minimize (or maximize) values obtained by simulation or experiment for physical or social phenomena, the objective function $f(\boldsymbol{x})$ corresponds to simulation or experiment, and the function cannot be described explicitly. Mathematical optimization for such an unknown objective function $f(\boldsymbol{x})$ is called blackbox optimization.
In addition, evaluating such an objective function (running simulations or experiments) is usually relatively expensive in terms of time, money, and other resources. Therefore, even if the set of decision variables is finite, optimization by exhaustive search is generally impractical, and an optimization method that evaluates the objective function as few times as possible is required.
FMQA introduction¶
FMQA is a blackbox optimization method that combines machine learning and quantum annealing. The method iterates through the cycle shown in the figure below, simultaneously searching for a good second-order polynomial approximation of the blackbox function and the input that minimizes that polynomial.
First, we compute a second-order polynomial that approximates the blackbox function using machine learning. Next, the optimization solver finds the input $\boldsymbol{x}$ that minimizes this quadratic polynomial, and the $\boldsymbol{x}$ obtained by the Ising machine is input to the blackbox function. If the second-order polynomial obtained by machine learning approximates the blackbox function well enough, we can expect $\boldsymbol{x}$ to yield a small value when input to the blackbox function. If not, we can still expect a better polynomial approximation in the next cycle, by adding the new data point to the training data set and performing machine learning again.
The second-order polynomial approximation of the blackbox function uses a machine learning model called a Factorization Machine (FM), given by the following polynomial, where $d$ is a constant representing the length of the input to the blackbox function, $\boldsymbol{v}$, $\boldsymbol{w}$, and $w_0$ are the parameters of the model, and $k$ is a hyperparameter controlling the number of parameters.
$$ \begin{aligned} f(\boldsymbol{x} \mid \boldsymbol{w}, \boldsymbol{v}) &= w_0 + \langle \boldsymbol{w}, \boldsymbol{x}\rangle + \sum_{i=1}^d \sum_{j=i+1}^d \langle \boldsymbol{v}_i, \boldsymbol{v}_j \rangle x_i x_j \\ &= w_0 + \sum_{i=1}^d w_i x_i + \sum_{i=1}^d \sum_{j=i+1}^d \sum_{f=1}^k v_{if} v_{jf} x_i x_j \\ &= w_0 + \sum_{i=1}^d w_i x_i + \frac{1}{2}\sum_{f=1}^k\left(\left(\sum_{i=1}^d v_{i f} x_i\right)^2 - \sum_{i=1}^d v_{i f}^2 x_i^2\right) \end{aligned} $$
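The third line of this derivation relies on an algebraic identity that rewrites the pairwise interaction sum into a form that is linear in $d$ per factor $f$. As a quick numerical sanity check (a sketch with arbitrary random values, not part of the original example), the two forms can be compared directly:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 3
v = rng.normal(size=(d, k))  # FM factor matrix
x = rng.integers(0, 2, size=d).astype(float)  # binary input vector

# Pairwise form: sum over i < j of <v_i, v_j> x_i x_j
pairwise = sum(
    (v[i] @ v[j]) * x[i] * x[j] for i in range(d) for j in range(i + 1, d)
)

# Rewritten form from the third line of the equation above
rewritten = 0.5 * (((x @ v) ** 2).sum() - ((x**2) @ (v**2)).sum())

assert np.isclose(pairwise, rewritten)
```

The rewritten form is what both the PyTorch model and the QUBO objective use later in this example, since it avoids materializing all $d(d-1)/2$ pairwise terms.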
Using the FM as a machine learning model has the following advantages.
- Minimization by an optimization (QUBO) solver is possible because the model is a quadratic polynomial
- The computational complexity of the model's inference can be parameterized
The hyperparameter $k$ is a positive integer less than or equal to the length $d$ of the input to the blackbox function, and it adjusts the number of parameters in the FM model. When $k=d$, the model has the same degrees of freedom as the QUBO interaction term, while a smaller $k$ reduces the number of parameters and suppresses overfitting.
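To make the effect of $k$ concrete, the parameter counts can be compared directly (a back-of-the-envelope calculation using the $d = 100$ problem size that appears later in this example):

```python
d = 100

# A full QUBO has one coefficient per variable pair, plus linear terms and a constant
qubo_coeffs = d * (d - 1) // 2 + d + 1


# The FM model has v (a d x k array), w (length d), and the scalar w0
def fm_params(k: int) -> int:
    return d * k + d + 1


print(qubo_coeffs)     # 5051
print(fm_params(100))  # 10101 (k = d)
print(fm_params(10))   # 1101  (k = 10, the value used later in this example)
```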
FMQA procedure¶
In FMQA, we prepare the initial training data as follows, and then repeat the inference and optimization with the machine learning model.

Preparation of initial training data ($N_0$ samples)

- Prepare $N_0$ input samples $\{\boldsymbol{x}_1, \boldsymbol{x}_2, \cdots, \boldsymbol{x}_{N_0}\}$ and the corresponding $N_0$ outputs $\{f(\boldsymbol{x}_1), f(\boldsymbol{x}_2), \cdots, f(\boldsymbol{x}_{N_0})\}$ as initial training data.

FMQA optimization cycle ($N$ times)

1. Train the FM model using the (most recent) training data and obtain the FM parameters $(\boldsymbol{v}, \boldsymbol{w})$.
2. Estimate the input $\hat{\boldsymbol{x}}$ that minimizes the acquisition function $g(\boldsymbol{x})$ (the trained FM model) by using Amplify.
3. Evaluate the objective function $f(\boldsymbol{x})$ at $\hat{\boldsymbol{x}}$ to obtain $\hat{y} = f(\hat{\boldsymbol{x}})$.
4. Add $(\hat{\boldsymbol{x}}, \hat{y})$ to the training data.

Repeat steps 1–4 above $N$ times.
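The whole procedure can be sketched as a short loop. The function names `fit` and `minimize` below are placeholders for illustration only; the concrete `train` and `anneal` implementations appear later in this example.

```python
def fmqa_loop(blackbox, x_init, y_init, n_cycles, fit, minimize):
    """Schematic FMQA loop: surrogate training, minimization, evaluation."""
    x, y = list(x_init), list(y_init)
    for _ in range(n_cycles):
        model = fit(x, y)        # 1. train the FM model on the data so far
        x_hat = minimize(model)  # 2. minimize the model with a QUBO solver
        y_hat = blackbox(x_hat)  # 3. evaluate the true blackbox function
        x.append(x_hat)          # 4. add the new sample to the training data
        y.append(y_hat)
    return x, y


# Toy run with dummy surrogate and solver, just to show the data flow
xs, ys = fmqa_loop(
    blackbox=lambda x: float(sum(x)),
    x_init=[[1, 1], [0, 1]],
    y_init=[2.0, 1.0],
    n_cycles=3,
    fit=lambda x, y: None,
    minimize=lambda model: [0, 0],
)
assert len(xs) == 2 + 3 and len(ys) == 2 + 3
```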
This example program shows how to run FMQA using PyTorch and the Amplify SDK. The part of the code that minimizes the learned model uses the Fixstars Amplify Annealing Engine (Amplify AE), a GPU-based annealing machine, instead of quantum annealing (QA).
The following is an example program that performs blackbox optimization using FMQA.
Solver client configuration¶
First, we configure the solver client for the solver we will use during the optimization cycles. In this example program, we use Amplify AE.
```python
from amplify import FixstarsClient
from datetime import timedelta

# Set a solver client
client = FixstarsClient()
# If you use Amplify in a local environment, enter the Amplify API token.
# client.token = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# Set timeout to be 2 seconds
client.parameters.timeout = timedelta(milliseconds=2000)
```
Definition of blackbox function¶
Now we define the blackbox function to be used for blackbox optimization. A blackbox function takes a one-dimensional array of binary variables with values of 0 or 1 as input and outputs a real number. Input arrays consisting of integers or real numbers can also be handled; examples of such use are presented here.

In practical use, a blackbox function may simulate a physical or social phenomenon or return a value obtained from an experiment. Such functions are suitable targets for blackbox optimization because they cannot be expressed as mathematical formulas and their properties are not obvious.

In this tutorial, however, instead of a simulation or experiment, we prepare a simple algebraic function as the blackbox function. Below, we define the `make_blackbox_func` function to create a "blackbox" function that takes a NumPy one-dimensional array with $d$ elements as input and returns a real number.
```python
import numpy as np
from typing import Callable

# Set random seed
seed = 1234
rng = np.random.default_rng(seed)


def make_blackbox_func(d: int) -> Callable[[np.ndarray], float]:
    """Returns a function that takes a binary vector with size d as input and returns a float value"""
    rng = np.random.default_rng(seed)
    Q = rng.random((d, d))
    Q = (Q + Q.T) / 2
    Q = Q - np.mean(Q)

    def blackbox(x: np.ndarray) -> float:
        assert x.shape == (d,)  # x is a 1D array with size d
        return x @ Q @ x  # type: ignore

    return blackbox
```
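As a quick sanity check of this construction (re-defining the same function so the snippet runs standalone; the seed 1234 matches the cell above): since $f(\boldsymbol{x}) = \boldsymbol{x}^\top Q \boldsymbol{x}$ is a quadratic form, the all-zero input always evaluates to 0.

```python
import numpy as np


def make_blackbox_func(d: int):
    """Same construction as above: a random symmetric Q with zero mean."""
    rng = np.random.default_rng(1234)
    Q = rng.random((d, d))
    Q = (Q + Q.T) / 2
    Q = Q - np.mean(Q)

    def blackbox(x: np.ndarray) -> float:
        assert x.shape == (d,)
        return float(x @ Q @ x)

    return blackbox


f = make_blackbox_func(5)
y0 = f(np.zeros(5, dtype=int))
assert y0 == 0.0  # the quadratic form vanishes at x = 0
```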
The function created here happens to be quadratic, but from now on we will estimate and minimize it while assuming that we do not know that the function created by `make_blackbox_func` is quadratic, or any of its other properties.
Model training by machine learning¶
This section implements the part of FMQA that learns the optimal parameters (weights and bias) of the model by machine learning. It corresponds to the lower right part of the figure below, where the input is the training data and the output is the model.
First, we define the `TorchFM` class, which represents the Factorization Machine model, using PyTorch. The Factorization Machine is given by the following equation, annotated with the names of the corresponding terms in the implementation.
$$ \begin{aligned} f(\boldsymbol{x} \mid \boldsymbol{w}, \boldsymbol{v}) &= \underset{\color{red}{\mathtt{out\_linear}}}{\underline{ w_0 + \sum_{i=1}^d w_i x_i} } + \underset{\color{red}{\mathtt{out\_quadratic}}}{\underline{\frac{1}{2} \left[\underset{\color{red}{\mathtt{out\_1}}}{\underline{ \sum_{f=1}^k\left(\sum_{i=1}^d v_{i f} x_i\right)^2 }} - \underset{\color{red}{\mathtt{out\_2}}}{\underline{ \sum_{f=1}^k\sum_{i=1}^d v_{i f}^2 x_i^2 }} \right] }} \end{aligned} $$
The input $\boldsymbol{x}$ of this model is a vector of length $d$, the same as the input to the blackbox function, and the model has the following three parameters.

- $v$: a 2D array of shape $d \times k$
- $w$: a 1D vector of length $d$
- $w_0$: a scalar

The only hyperparameter is $k$, which is given as a positive integer less than or equal to $d$.
The `TorchFM` class defined below inherits from `torch.nn.Module` and is constructed from the input vector size $d$ and the hyperparameter $k$. The hyperparameter $k$ controls the number of parameters in the model; the larger it is, the more parameters there are and the more expressive the model becomes, but also the more prone it is to overfitting.

The `TorchFM` class holds the model parameters $v$, $w$, and $w_0$ as attributes and updates them as the training proceeds. Following the formula above, the `forward` method outputs an estimate of $y$ from the input $x$. Since the parameters $v$, $w$, and $w_0$ are needed later to construct the QUBO model, we also define a `get_parameters` method that returns these parameters.
```python
import torch
import torch.nn as nn

# Set a random seed
torch.manual_seed(seed)


class TorchFM(nn.Module):
    def __init__(self, d: int, k: int):
        """Define a model

        Args:
            d (int): Size of input vector
            k (int): Hyperparameter k
        """
        super().__init__()
        self.d = d
        self.v = torch.randn((d, k), requires_grad=True)
        self.w = torch.randn((d,), requires_grad=True)
        self.w0 = torch.randn((), requires_grad=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Take x as input and returns a predicted value y

        Args:
            x (torch.Tensor): 2D tensor of (number of samples × d)

        Returns:
            torch.Tensor: 1D tensor of predicted y (size is the number of samples)
        """
        out_linear = torch.matmul(x, self.w) + self.w0

        out_1 = torch.matmul(x, self.v).pow(2).sum(1)
        out_2 = torch.matmul(x.pow(2), self.v.pow(2)).sum(1)
        out_quadratic = 0.5 * (out_1 - out_2)

        out = out_linear + out_quadratic
        return out

    def get_parameters(self) -> tuple[np.ndarray, np.ndarray, float]:
        """Returns the parameters (weights and bias) v, w, w0"""
        np_v = self.v.detach().numpy().copy()
        np_w = self.w.detach().numpy().copy()
        np_w0 = self.w0.detach().numpy().copy()
        return np_v, np_w, float(np_w0)
```
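A minimal smoke test of this forward pass (an illustrative check, not part of the original example; the abbreviated re-definition below mirrors `TorchFM.forward` so the snippet runs standalone):

```python
import torch


class MiniFM(torch.nn.Module):
    """Abbreviated re-definition of TorchFM's forward pass for illustration."""

    def __init__(self, d: int, k: int):
        super().__init__()
        self.v = torch.randn((d, k), requires_grad=True)
        self.w = torch.randn((d,), requires_grad=True)
        self.w0 = torch.randn((), requires_grad=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_linear = torch.matmul(x, self.w) + self.w0
        out_1 = torch.matmul(x, self.v).pow(2).sum(1)
        out_2 = torch.matmul(x.pow(2), self.v.pow(2)).sum(1)
        return out_linear + 0.5 * (out_1 - out_2)


torch.manual_seed(0)
model = MiniFM(d=8, k=4)
x = torch.randint(0, 2, (5, 8)).float()  # a batch of 5 binary inputs
y = model(x)
assert y.shape == (5,)  # one prediction per sample
```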
Next, we define a `train()` function to train a model of the `TorchFM` class. Its inputs are the training data $x$, $y$ and an instance of the `TorchFM` model. Calling the `train()` function optimizes the parameters of the `TorchFM` model.
In general machine learning, we split the data into training and validation sets, optimize the parameters using the training data, and validate the model during training using the validation data. The model is validated at each epoch, and the parameters from the epoch with the highest prediction accuracy on the validation data are saved and used as the trained model.
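The "keep the best epoch" pattern described above can be illustrated in isolation (a toy sketch with a made-up loss history, not the actual training code):

```python
import copy

# Made-up validation losses for five epochs
val_losses = [3.2, 2.1, 2.5, 1.7, 1.9]
# Stand-ins for the model state_dict snapshot at each epoch
states = [{"epoch": i} for i in range(len(val_losses))]

min_loss = float("inf")
best_state = None
for state, loss in zip(states, val_losses):
    if loss < min_loss:
        # Deep-copy so later epochs cannot mutate the saved snapshot
        best_state = copy.deepcopy(state)
        min_loss = loss

assert best_state == {"epoch": 3} and min_loss == 1.7
```

The deep copy is the important detail: PyTorch's `state_dict()` returns references to live tensors, so saving it without copying would silently track the latest parameters instead of the best ones.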
```python
from torch.utils.data import TensorDataset, DataLoader, random_split
from tqdm.auto import tqdm, trange
import copy


def train(
    x: np.ndarray,
    y: np.ndarray,
    model: TorchFM,
) -> None:
    """Perform training of a FM model

    Args:
        x (np.ndarray): Training data (input vectors)
        y (np.ndarray): Training data (output values)
        model (TorchFM): a TorchFM model
    """
    # Number of iterations
    epochs = 2000
    # Model optimizer
    optimizer = torch.optim.AdamW([model.v, model.w, model.w0], lr=0.1)
    # Loss function
    loss_func = nn.MSELoss()

    # Prepare the dataset
    x_tensor, y_tensor = (
        torch.from_numpy(x).float(),
        torch.from_numpy(y).float(),
    )
    dataset = TensorDataset(x_tensor, y_tensor)
    train_set, valid_set = random_split(dataset, [0.8, 0.2])
    train_loader = DataLoader(train_set, batch_size=8, shuffle=True)
    valid_loader = DataLoader(valid_set, batch_size=8, shuffle=True)

    # Execute the training
    min_loss = 1e18  # Save the minimum loss value
    best_state = model.state_dict()  # Save the best model parameters

    # Visualize progress using `tqdm` instead of `range`
    for _ in trange(epochs, leave=False):
        # Training phase
        for x_train, y_train in train_loader:
            optimizer.zero_grad()
            pred_y = model(x_train)
            loss = loss_func(pred_y, y_train)
            loss.backward()
            optimizer.step()

        # Validation phase
        with torch.no_grad():
            loss = 0
            for x_valid, y_valid in valid_loader:
                out_valid = model(x_valid)
                loss += loss_func(out_valid, y_valid)
            if loss < min_loss:
                # Save the model parameters if the loss value has improved
                best_state = copy.deepcopy(model.state_dict())
                min_loss = loss

    # Update the model with the best parameters found during training
    model.load_state_dict(best_state)
```
Solving the model with Amplify¶
Next, we implement the anneal
function to minimize the inferred machine learning model,
corresponding to the lower left part of the FMQA cycle diagram (reproduced below), where the input is
the
model TorchFM
class after training, and the output is a vector $x$ that minimizes the
model.
Solve the following optimization problem for the class TorchFM
model learned earlier to
find
the input $x$ such that the inferred model is minimized.
$$ \underset{x}{\mathrm{argmin}} \quad \underset{\color{red}{\mathtt{out\_linear}}}{\underline{ w_0 + \sum_{i=1}^d w_i x_i} } + \underset{\color{red}{\mathtt{out\_quadratic}}}{\underline{\frac{1}{2} \left[\underset{\color{red}{\mathtt{out\_1}}}{\underline{ \sum_{f=1}^k\left(\sum_{i=1}^d v_{i f} x_i\right)^2 }} - \underset{\color{red}{\mathtt{out\_2}}}{\underline{ \sum_{f=1}^k\sum_{i=1}^d v_{i f}^2 x_i^2 }} \right] }} $$
The decision variable in this optimization problem is $x$. It is a one-dimensional binary variable vector of length $d$, just like the input vector to the blackbox function. Also, $v$, $w$, and $w_0$, which were parameters in the learning phase, are constants here.
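For very small $d$, this argmin can be cross-checked by brute force over all $2^d$ binary vectors (a sketch with random FM parameters; a solver such as Amplify AE is needed at realistic sizes, where enumeration is impossible):

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
d, k = 6, 3
v = rng.normal(size=(d, k))
w = rng.normal(size=d)
w0 = rng.normal()


def fm(x: np.ndarray) -> float:
    """FM value for a binary vector x, using the out_1 - out_2 form."""
    return float(
        w0 + w @ x + 0.5 * (((x @ v) ** 2).sum() - ((x**2) @ (v**2)).sum())
    )


# Enumerate all 2^d binary inputs and take the minimizer
candidates = [np.array(b, dtype=float) for b in itertools.product([0, 1], repeat=d)]
best_x = min(candidates, key=fm)

# The brute-force minimum is no worse than any particular input
assert fm(best_x) <= fm(np.zeros(d))
assert fm(best_x) <= fm(np.ones(d))
```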
We define the `anneal` function, which performs the optimization using Amplify AE for a given model, as follows. The `anneal` function uses `VariableGenerator` to create a 1D binary variable array `x` of length $d$, and then builds the objective function to be optimized from this variable array and the parameters $v$, $w$, and $w_0$ obtained from the `TorchFM` class, following the Factorization Machine formula.

After building the optimization model, we minimize the objective function using the constructed QUBO model and the solver client `FixstarsClient` defined earlier.
```python
from amplify import VariableGenerator, Model, solve, Poly


def anneal(torch_model: TorchFM) -> np.ndarray:
    """Take the parameters of the FM model and find x that gives the minimum value of the FM model as the QUBO model described by those parameters."""
    # Generate a binary variable array of size d
    gen = VariableGenerator()
    x = gen.array("Binary", torch_model.d)

    # Obtain FM parameters v, w, w0 from TorchFM
    v, w, w0 = torch_model.get_parameters()

    # Create the objective function
    out_linear = w0 + (x * w).sum()
    out_1 = ((x[:, np.newaxis] * v).sum(axis=0) ** 2).sum()  # type: ignore
    out_2 = ((x[:, np.newaxis] * v) ** 2).sum()
    objective: Poly = out_linear + (out_1 - out_2) / 2

    # Construct the combinatorial optimization model
    amplify_model = Model(objective)

    # Execute the optimization (pass the constructed QUBO model and the solver client created in the beginning)
    result = solve(amplify_model, client)
    if len(result.solutions) == 0:
        raise RuntimeError("No solution was found.")

    # Return the input that minimizes the model
    return x.evaluate(result.best.values).astype(int)
```
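The NumPy broadcasting expressions used in `anneal` can be sanity-checked against the pairwise FM definition by substituting an ordinary NumPy 0/1 vector for the Amplify variable array (an illustrative check, not part of the original example):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 7, 3
v = rng.normal(size=(d, k))
w = rng.normal(size=d)
w0 = rng.normal()
x = rng.integers(0, 2, size=d).astype(float)  # stands in for the Binary array

# Same expressions as in anneal()
out_linear = w0 + (x * w).sum()
out_1 = ((x[:, np.newaxis] * v).sum(axis=0) ** 2).sum()
out_2 = ((x[:, np.newaxis] * v) ** 2).sum()
objective = out_linear + (out_1 - out_2) / 2

# Direct pairwise FM definition (equivalent since x is binary, so x_i^2 = x_i)
direct = w0 + w @ x + sum(
    (v[i] @ v[j]) * x[i] * x[j] for i in range(d) for j in range(i + 1, d)
)
assert np.isclose(objective, direct)
```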
Performing FMQA¶
Now that we have defined the `train` function for machine learning and the `anneal` function for optimization, which are the core of FMQA, we can execute FMQA using these functions.

First, we create a `blackbox` function for blackbox optimization as follows. This function takes a NumPy one-dimensional vector of length $d = 100$ consisting of $0$s and $1$s and returns a float.

```python
d = 100
blackbox = make_blackbox_func(d)
```
Creating initial training data¶
Next, we evaluate the `blackbox` function $y = f(\boldsymbol{x})$ on input vectors $\boldsymbol{x}$ to create $N_0$ samples of initial training data. Since the `blackbox` function corresponds to the results of a simulation or experiment, in practice you can create these data based on previous experiments or simulations.

Since we have prepared an algebraic function as a stand-in `blackbox` function, we define the `init_training_data` function to create the initial training data from $N_0$ random input vectors $\boldsymbol{x}$, as follows. This function takes the `blackbox` function and the number of initial training samples $N_0$, and returns $N_0$ input vectors $\boldsymbol{x}$ and the corresponding outputs $y$ as initial training data.
```python
def init_training_data(d: int, n0: int):
    """Create initial training data of n0 samples"""
    assert n0 < 2**d

    # Generate n0 random inputs of size d
    x = rng.choice(np.array([0, 1]), size=(n0, d))

    # Check if these random inputs are unique. If not, regenerate the duplicates
    x = np.unique(x, axis=0)
    while x.shape[0] != n0:
        x = np.vstack((x, np.random.randint(0, 2, size=(n0 - x.shape[0], d))))
        x = np.unique(x, axis=0)

    # Evaluate the blackbox function to obtain n0 output values
    y = np.zeros(n0)
    for i in range(n0):
        y[i] = blackbox(x[i])

    return x, y


N0 = 60  # Number of samples in the initial training data
x_init, y_init = init_training_data(d, N0)
```
Now we have $N_0$ pairs of initial training data.

```python
print(x_init.shape, y_init.shape)
```
Running an FMQA cycle¶
Using the $N_0$ pairs of initial training data created above, we will run the FMQA cycle according to the following diagram.
We perform the following operations in each FMQA cycle.

1. Train the model
   - Construct a model of class `TorchFM` and train it by calling the `train` function on the model with the training data `x = x_init`, `y = y_init`.
2. Minimize the model
   - Pass the trained `TorchFM` model to the `anneal` function to obtain the $\hat{x}$ that minimizes the model.
   - If $\hat{x}$ is already included in the training data `x`, flip part of $\hat{x}$ to avoid duplicate training data.
3. Evaluate the blackbox function
   - Input $\hat{x}$ to the blackbox function and obtain the output $\hat{y}$.
   - Add $\hat{x}$ and $\hat{y}$ to the training data `x` and `y`, respectively.
An example implementation of the above is as follows. Since the execution takes a few minutes of computation time, an example of the output is shown in the Execution example section below.
```python
# Number of FMQA cycles
N = 10

# Initialize training data
x, y = x_init, y_init

# Perform N FMQA cycles
# Use `tqdm` instead of `range` to visualize the progress
for i in trange(N):
    # Create a machine learning model
    model = TorchFM(d, k=10)

    # Train the machine learning model
    train(x, y, model)

    # Obtain the input that yields the minimum value of the trained model
    x_hat = anneal(model)

    # Ensure the uniqueness of x_hat
    while (x_hat == x).all(axis=1).any():
        flip_idx = rng.choice(np.arange(d))
        x_hat[flip_idx] = 1 - x_hat[flip_idx]

    # Evaluate the blackbox function with the estimated x_hat
    y_hat = blackbox(x_hat)

    # Add the evaluation results to the training data set
    x = np.vstack((x, x_hat))
    y = np.append(y, y_hat)

    tqdm.write(f"FMQA cycle {i}: found y = {y_hat}; current best = {np.min(y)}")
```
After running the above cell, `x` and `y` contain the input-output pairs of all blackbox function evaluations. The number of pairs is $N_0 + N = 70$.

```python
print(x.shape, y.shape)
```
The input that yields the minimum evaluated value of the blackbox function, and that value, are obtained as follows.

```python
min_idx = np.argmin(y)
print(f"best x = {x[min_idx]}")
print(f"best y = {y[min_idx]}")
```
Plotting the evolution of the evaluated values¶
Below we plot the $N_0$ initial training data points and the evolution of the blackbox function evaluations over the $N$ FMQA cycles. The initial training data are shown in blue, and the evaluation values obtained by the FMQA cycles are shown in red.
```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))
ax = fig.add_subplot()

# Evaluation results obtained during initial training data generation
ax.plot(
    range(N0),
    y[:N0],
    marker="o",
    linestyle="",
    color="b",
)

# Evaluation results during the FMQA cycles
ax.plot(
    range(N0, N0 + N),
    y[N0:],
    marker="o",
    linestyle="",
    color="r",
)

ax.set_xlabel("number of iterations", fontsize=18)
ax.set_ylabel("f(x)", fontsize=18)
ax.tick_params(labelsize=18)
plt.show()
```
Execution example¶
A typical plot obtained by running this sample code is shown below. Since the optimization uses a heuristic QUBO solver, the results vary from run to run, but you can see that the evaluated value of the blackbox function decreases as the FMQA cycles proceed.
References¶
This blackbox optimization method, which combines quantum annealing and Ising machines with machine learning, was originally proposed as FMQA in the following research.
- K. Kitai, J. Guo, S. Ju, S. Tanaka, K. Tsuda, J. Shiomi, and R. Tamura, "Designing metamaterials with quantum annealing and factorization machines," Physical Review Research 2, 013319 (2020).
In this study, a search for metamaterials is carried out using FMQA, which also showed superior performance compared to Bayesian optimization, a conventional blackbox optimization method.
In the following study, the same blackbox optimization method is also applied to the design of photonic crystals.
- T. Inoue, Y. Seki, S. Tanaka, N. Togawa, K. Ishizaki, and S. Noda, "Towards optimization of photonic-crystal surface-emitting lasers via quantum annealing," Opt. Express 30, 43503-43512 (2022).
These studies suggest that this optimization method (FMQA), based on FM and combinatorial optimization, may be generally applicable to blackbox optimization problems in various fields. Fixstars Amplify provides several examples of such blackbox optimization, in areas such as chemical reactions, fluid dynamics, and material search, as follows: