5.1. FMQA Optimizer¶
FMQAOptimizer
is a class that performs black-box optimization using a factorization machine with quadratic-optimization annealing (FMQA).
FMQA¶
FMQA is a black-box optimization method that combines machine learning and quadratic-optimization annealing. The method was first proposed in the following study:
K. Kitai, J. Guo, S. Ju, S. Tanaka, K. Tsuda, J. Shiomi, and R. Tamura, “Designing metamaterials with quantum annealing and factorization machines”, Physical Review Research 2, 013319 (2020).
The method iterates through the cycle shown in the figure below, simultaneously searching for a good second-order polynomial approximation of the black-box function and for the input that yields the minimum value of that polynomial.
First, we construct a second-order polynomial model that approximates the black-box function using machine learning. The constructed polynomial model is regarded as an optimization model.
Next, the optimization solver finds the input \(\boldsymbol{x}\) that minimizes the optimization model, typically by quantum annealing or simulated annealing in the context of Quadratic Unconstrained Binary Optimization (QUBO). The solution \(\boldsymbol{x}\) obtained is a potentially optimal input to the black-box function.
If the second-order polynomial model approximates the black-box function well enough, we can expect \(\boldsymbol{x}\) to yield a relatively small value when passed to the black-box function. If not, we can still expect a better polynomial approximation of the black-box function in the next cycle, because the new data point is added to the training dataset before the polynomial model is constructed again in step 1 above.
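The cycle can be illustrated with a self-contained toy sketch. Note that this is not how Amplify-BBOpt works internally: for brevity, the surrogate below is fit by plain least squares on quadratic features instead of a factorization machine, and it is minimized by brute force instead of a QUBO solver, but the four steps of the loop are the same.
import itertools
import numpy as np
rng = np.random.default_rng(0)
d = 8  # length of the binary input vector
def toy_blackbox(x: np.ndarray) -> float:
    # Stand-in for an expensive black-box function (unknown to the optimizer).
    target = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    return float(((x - target) ** 2).sum())
def quadratic_features(x: np.ndarray) -> np.ndarray:
    # All products x_i * x_j for i <= j, plus a constant term.
    return np.append(np.outer(x, x)[np.triu_indices(len(x))], 1.0)
# Initial training data: random binary inputs and their black-box values.
X = rng.integers(0, 2, size=(5, d)).astype(float)
y = np.array([toy_blackbox(x) for x in X])
candidates = np.array(list(itertools.product([0.0, 1.0], repeat=d)))
for cycle in range(10):
    # 1. Train a quadratic surrogate model on all data collected so far.
    A = np.array([quadratic_features(x) for x in X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # 2. Minimize the surrogate (brute force here; a QUBO solver in FMQA).
    preds = np.array([quadratic_features(x) for x in candidates]) @ coef
    x_new = candidates[np.argmin(preds)]
    # 3. Evaluate the black-box function at the proposed input, and
    # 4. add the new sample to the training data for the next cycle.
    X = np.vstack([X, x_new])
    y = np.append(y, toy_blackbox(x_new))
print(f"best objective found: {y.min():.1f}")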
Note
Although the optimization at each FMQA cycle is based on QUBO, which is for binary decision variables, Amplify-BBOpt implements several ways to accommodate various integer and real variables. For details, see here.
Note
In Amplify-BBOpt, the polynomial model and the optimization model have exactly the same formulation, but the meaning of the variables is different. The variables are input variables in the polynomial model and decision variables in the optimization model.
Internally, the constructed polynomial model is converted to an optimization model (amplify.Model) by the Amplify SDK (see the red text in the above figure). Then, the Amplify SDK solves the optimization model using the solver specified by a solver client.
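For reference, the following is a minimal stand-alone illustration of this step: building a small quadratic optimization model with the Amplify SDK and solving it with a solver client. The toy objective here is arbitrary and unrelated to any FM surrogate, and you need your own API token to run it.
from datetime import timedelta
from amplify import FixstarsClient, Model, VariableGenerator, solve
# Build a toy quadratic optimization model over three binary variables.
gen = VariableGenerator()
q = gen.array("Binary", 3)
objective = q[0] * q[1] - 2 * q[2] + 1
model = Model(objective)
# Solve the model with a solver client (Amplify AE in this example).
client = FixstarsClient()
client.parameters.timeout = timedelta(milliseconds=1000)
# client.token = "xxxxxxxxxxx"  # Enter your Amplify AE API token & remove comment.
result = solve(model, client)
print(result.best.objective)           # minimum objective value found
print(q.evaluate(result.best.values))  # corresponding variable values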
The second-order polynomial approximation of the black-box function is based on a machine-learning model called the Factorization Machine (FM), which consists of the following polynomial:
\[
g(\boldsymbol{x}) = w_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle \boldsymbol{v}_i, \boldsymbol{v}_j \rangle x_i x_j.
\]
Here, \(d\) is a constant representing the length of the input to the black-box function, \(\boldsymbol{v} \in \mathbb{R}^{d \times k}\), \(\boldsymbol{w} \in \mathbb{R}^{d}\), and \(w_0 \in \mathbb{R}\) are the parameters of the model, and \(k\) is a hyperparameter that determines the size of the parameter \(\boldsymbol{v}\).
Using the FM as a surrogate model for the black-box objective function has the following advantages.
Minimization by an optimization (QUBO) solver based on quantum/simulated annealing is possible because the model is a quadratic polynomial
The computational complexity of the model’s inference can be parameterized by the model hyperparameter \(k\)
The hyperparameter \(k\) is a positive integer less than or equal to the length \(d\) of the input to the black-box function, and it controls the number of parameters in the FM model. When \(k=d\), the model has the same degrees of freedom as the QUBO interaction terms, while a smaller \(k\) reduces the number of parameters and mitigates over-fitting.
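For illustration, the FM polynomial and its \(O(dk)\) inference cost can be written as a small PyTorch module. This is only a hedged sketch, not the TorchFM class that Amplify-BBOpt uses internally.
import torch
class SimpleFM(torch.nn.Module):
    # g(x) = w0 + w.x + sum_{i<j} <v_i, v_j> x_i x_j, with v of shape (d, k)
    def __init__(self, d: int, k: int):
        super().__init__()
        self.w0 = torch.nn.Parameter(torch.zeros(1))
        self.w = torch.nn.Parameter(torch.zeros(d))
        self.v = torch.nn.Parameter(0.01 * torch.randn(d, k))
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The pairwise term is computed in O(d*k) using the FM identity:
        # sum_{i<j} <v_i, v_j> x_i x_j
        #   = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2]
        linear = self.w0 + x @ self.w
        xv = x @ self.v                  # shape (batch, k)
        x2v2 = (x**2) @ (self.v**2)      # shape (batch, k)
        return linear + 0.5 * (xv**2 - x2v2).sum(dim=-1)
# Evaluate the surrogate on a batch of binary inputs (d=8, k=4).
model = SimpleFM(d=8, k=4)
x = torch.randint(0, 2, (3, 8)).float()
print(model(x))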
Note
Generally, the purpose of machine learning is to construct a model with a small loss function value, typically a mean squared error. However, as long as the surrogate model produces output values that correlate positively with the true values of the black-box objective function, such an error is less critical in the black-box optimization context. For this reason, although a larger \(k\) results in a smaller error on the training dataset, a smaller \(k\) is recommended to avoid over-fitting, given the relatively small training data sizes typical in FMQA.
It is also good practice to check the correlation between the true values and the values predicted by the trained model in each optimization cycle (see model corrcoef in the output from the optimizer as shown below). Ensure the coefficient is mostly positive, preferably close to or higher than 0.8.
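As a rough illustration of such a check, the correlation coefficient can be computed with NumPy from the true and model-predicted objective values; the arrays below are hypothetical placeholders.
import numpy as np
# Hypothetical true objective values and surrogate-model predictions
# for the same set of inputs.
y_true = np.array([3.89, 2.34, 2.16, 3.46, 2.43])
y_pred = np.array([3.70, 2.50, 2.00, 3.60, 2.30])
corrcoef = np.corrcoef(y_true, y_pred)[0, 1]
print(f"model corrcoef: {corrcoef:.3f}")  # mostly positive, ideally >= 0.8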
Using FMQA-Optimizer¶
To use FMQAOptimizer
, you must set an optimization solver client before instantiating the class. Here, the optimization solver client is the client that performs the mathematical optimization (step ② in the above figure) at each FMQA cycle.
The code below shows the preparation of a black-box function and initial training data, which are already mentioned in “2. Black-Box Function.”
import numpy as np
from amplify_bbopt import (
DatasetGenerator,
FMQAOptimizer,
RealVariable,
blackbox,
)
from utils.pseudo_simulators import (
pseudo_wing_simulator as wing_simulator,
)
np.set_printoptions(legacy="1.25")
@blackbox
def objective_lift_drag(
wing_width: float = RealVariable(bounds=(1, 20), nbins=100),
wing_height: float = RealVariable(bounds=(1, 5), nbins=20),
wing_angle: float = RealVariable(bounds=(0, 45), nbins=20),
) -> float:
"""This black-box function executes wing_simulator(), and returns
the negative lift-drag ratio for a given wing's width, height, and angle.
"""
lift, drag = wing_simulator(wing_width, wing_height, wing_angle)
lift_drag = lift / drag
print(f"{lift=:.2e}, {drag=:.2e}, {lift_drag=:.2e}")
return -lift_drag # value to minimize
# Generate initial training data set
data = DatasetGenerator(objective=objective_lift_drag).generate(num_samples=10)
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #0/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=6.22e+02, drag=1.60e+02, lift_drag=3.89e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #1/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=1.36e+02, drag=5.80e+01, lift_drag=2.34e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #2/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=2.41e+01, drag=1.12e+01, lift_drag=2.16e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #3/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=5.86e+02, drag=1.70e+02, lift_drag=3.46e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #4/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=3.75e+02, drag=1.55e+02, lift_drag=2.43e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #5/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=5.32e+02, drag=1.55e+02, lift_drag=3.44e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #6/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=5.71e+02, drag=2.58e+02, lift_drag=2.21e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #7/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=6.18e+02, drag=1.71e+02, lift_drag=3.63e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #8/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=3.95e+02, drag=2.33e+02, lift_drag=1.69e+00
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #9/10 initial data for objective_lift_drag
amplify-bbopt | 2024/10/04 05:45:41 | INFO | - [obj]: lift=6.62e+01, drag=1.77e+02, lift_drag=3.73e-01
Now, we will set a solver client that optimizes the surrogate model \(g(\boldsymbol{x})\) (step ② in the above figure) at each FMQA cycle.
You can choose a solver client from the Amplify SDK. Here, we use the Amplify Annealing Engine (Amplify AE) as an example solver. All available solvers are listed here. The timeout set via client.parameters.timeout
corresponds to the search timeout of the optimization step (② in the above figure) at each FMQA cycle.
You also need to provide your API token for the solver (for Amplify AE, you can get one for free after user registration).
from datetime import timedelta
from amplify import FixstarsClient
# Set up solver client
client = FixstarsClient()
client.parameters.timeout = timedelta(milliseconds=2000) # 2 seconds
# client.token = "xxxxxxxxxxx" # Enter your Amplify AE API token & remove comment.
Finally, using the data, client, and objective_lift_drag prepared above, you can instantiate the FMQAOptimizer class as follows. Printing the instance shows an overview of the black-box optimization settings.
# Instantiate the FMQA optimizer
optimizer = FMQAOptimizer(
data=data, objective=objective_lift_drag, client=client
)
# Display the overall black-box optimization setting
print(optimizer)
num variables: 3
num elemental variables: 3
num amplify variables: 137
optimizer client: FixstarsClient
objective weight: 1.0
--------------------
trainer class: TorchFMTrainer
model class: TorchFM
model params: {d: 137, k: 10}
batch size: 8
epochs: 2000
loss class: MSELoss
optimizer class: AdamW
optimizer params: {'lr': 0.5}
lr_sche class: StepLR
lr_sche params: {'step_size': 100, 'gamma': 0.8}
data split ratio (train): 0.8
To run the optimization, call the class's optimize() method with the number of optimization cycles. In the output, you can see the evolution of best objective at each optimization cycle. With appropriate settings, you can expect best objective to decrease on average over the cycles.
# Perform FMQA optimization for [num_cycles] cycles
optimizer.optimize(num_cycles=5)
amplify-bbopt | 2024/10/04 05:45:41 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:41 | INFO | #1/5 optimization cycle, constraint wt: 7.77e+00
amplify-bbopt | 2024/10/04 05:45:46 | INFO | model corrcoef: 0.929
amplify-bbopt | 2024/10/04 05:45:49 | INFO | num_iterations: 20
amplify-bbopt | 2024/10/04 05:45:49 | INFO | - [obj]: lift=1.66e+02, drag=2.34e+01, lift_drag=7.11e+00
amplify-bbopt | 2024/10/04 05:45:49 | INFO | y_hat=-7.106e+00, best objective=-7.106e+00
amplify-bbopt | 2024/10/04 05:45:49 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:49 | INFO | #2/5 optimization cycle, constraint wt: 1.42e+01
amplify-bbopt | 2024/10/04 05:45:53 | INFO | model corrcoef: 0.961
amplify-bbopt | 2024/10/04 05:45:55 | INFO | num_iterations: 21
amplify-bbopt | 2024/10/04 05:45:55 | INFO | modifying solution (11, is_frequent=False), {'wing_width': 16.545454545454547, 'wing_height': 1.0, 'wing_angle': 45.0} --> {'wing_width': 17.12121212121212, 'wing_height': 1.631578947368421, 'wing_angle': 2.3684210526315788}.
amplify-bbopt | 2024/10/04 05:45:55 | INFO | - [obj]: lift=2.84e+02, drag=4.06e+01, lift_drag=6.99e+00
amplify-bbopt | 2024/10/04 05:45:55 | INFO | y_hat=-6.991e+00, best objective=-7.106e+00
amplify-bbopt | 2024/10/04 05:45:55 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:45:55 | INFO | #3/5 optimization cycle, constraint wt: 1.42e+01
amplify-bbopt | 2024/10/04 05:45:59 | INFO | model corrcoef: 0.950
amplify-bbopt | 2024/10/04 05:46:02 | INFO | num_iterations: 21
amplify-bbopt | 2024/10/04 05:46:02 | INFO | - [obj]: lift=2.44e+02, drag=3.99e+01, lift_drag=6.13e+00
amplify-bbopt | 2024/10/04 05:46:02 | INFO | y_hat=-6.127e+00, best objective=-7.106e+00
amplify-bbopt | 2024/10/04 05:46:02 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:46:02 | INFO | #4/5 optimization cycle, constraint wt: 1.42e+01
amplify-bbopt | 2024/10/04 05:46:05 | INFO | model corrcoef: 0.938
amplify-bbopt | 2024/10/04 05:46:08 | INFO | num_iterations: 21
amplify-bbopt | 2024/10/04 05:46:08 | INFO | - [obj]: lift=1.72e+02, drag=2.38e+01, lift_drag=7.21e+00
amplify-bbopt | 2024/10/04 05:46:08 | INFO | y_hat=-7.210e+00, best objective=-7.210e+00
amplify-bbopt | 2024/10/04 05:46:08 | INFO | ----------------------------------------
amplify-bbopt | 2024/10/04 05:46:08 | INFO | #5/5 optimization cycle, constraint wt: 1.44e+01
amplify-bbopt | 2024/10/04 05:46:11 | INFO | model corrcoef: 0.932
amplify-bbopt | 2024/10/04 05:46:14 | INFO | num_iterations: 19
amplify-bbopt | 2024/10/04 05:46:14 | INFO | - [obj]: lift=2.08e+02, drag=3.14e+01, lift_drag=6.62e+00
amplify-bbopt | 2024/10/04 05:46:14 | INFO | y_hat=-6.625e+00, best objective=-7.210e+00
Once all optimization cycles are completed, you can retrieve the final solution as follows. You can also plot the optimization history; see 6. Visualization for more details.
# Print results
print(f"{optimizer.best_solution=}") # Solution (optimal input)
print(f"{optimizer.best_objective=:.3e}") # Objective function value
optimizer.best_solution={'wing_width': 17.12121212121212, 'wing_height': 1.0, 'wing_angle': 45.0}
optimizer.best_objective=-7.210e+00
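As a quick manual alternative to the dedicated visualization utilities, you could plot the best objective value per cycle yourself. The sketch below copies the values by hand from the log output above and assumes matplotlib is available.
import matplotlib.pyplot as plt
# Best objective value after each FMQA cycle, copied from the log output above.
best_objective = [-7.106, -7.106, -7.106, -7.210, -7.210]
plt.plot(range(1, len(best_objective) + 1), best_objective, marker="o")
plt.xlabel("FMQA cycle")
plt.ylabel("best objective")
plt.show()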
Advanced setting¶
By default, an FMQAOptimizer instance uses the TorchFM class as the surrogate model class and the TorchFMTrainer class as the model trainer class. The model parameters of the TorchFM class default to the following.
\(d\): the size of the input vector (always equals the input length, e.g. num_amplify_variables in the above example)
\(k\): \(\min(10, d)\)
The default training parameters are described in set_train_params. You can adjust these trainer and model classes and their parameters as follows.
Model parameters¶
In the above optimization problem, let’s adjust the hyperparameter \(k\) of the FM model (TorchFM) from the default value of 10 to 20.
trainer = optimizer.trainer
trainer.set_model_params(k=20)
Training parameters¶
You can also adjust the training parameters for the model. For the default trainer class TorchFMTrainer, use the set_train_params method. The following code modifies the batch size and the optimizer parameters, and turns off the learning rate scheduler. With print(trainer)
, you can confirm that the changes to the model and training parameters above have been applied successfully.
trainer.set_train_params(
batch_size=32, optimizer_params={"lr": 0.1}, lr_sche_class=None
)
print(trainer)
--------------------
trainer class: TorchFMTrainer
model class: TorchFM
model params: {d: 137, k: 20}
batch size: 32
epochs: 2000
loss class: MSELoss
optimizer class: AdamW
optimizer params: {'lr': 0.1}
data split ratio (train): 0.8
You can also use your own custom model and trainer classes. For details, see Custom Trainer and Model.