Improving Surrogate Model Performance¶
In QA-BBO, the performance of the surrogate model is the key to successful optimization. In particular, for minimization problems targeted by this library, the performance of the surrogate model for samples with relatively small black-box function values (objective values) is especially important.
Attention
In general, “model performance” often refers to predictive accuracy metrics such as mean squared error. However, in black-box optimization, the (rank) correlation between the true black-box function values and the surrogate model’s predicted values is crucial. For each cycle, it is important not only that the correlation coefficient is positive but also that, on average, it well exceeds 0.5 and approaches 1 as the optimization progresses. This serves as a practical indicator of whether the surrogate model is being constructed appropriately.
Tip
Amplify-BBOpt calculates and outputs correlation coefficients every time the surrogate model is constructed. The correlation coefficient is computed as follows:
Extract the lower samples from the training data based on the threshold defined by percentile_cutoffs for the objective values.
For these lower samples, compute the predicted values using the constructed surrogate model and calculate the correlation coefficient with the corresponding true values.
The correlation coefficients for these lower samples are printed at each cycle as follows:
<=10%: 0.654, <=25%: 0.693, <=50%: 0.721, all: 0.872
These values represent the correlation coefficients for the bottom 10%, 25%, 50%, and all samples, respectively. You can specify the lower percentile thresholds as shown below (the default is [10, 25, 50, 100]):
from amplify_bbopt import KMTrainer
trainer = KMTrainer()
trainer.percentile_cutoffs = [20, 50, 100] # Specify bottom 20%, 50%, and 100%
By regularly checking the correlation coefficient between the surrogate model predictions and the objective values, you can more easily detect and avoid potential optimization failures early in the process.
Transforming training data¶
In black-box optimization for minimization problems, as the optimization cycles progress, the black-box function values are expected to gradually approach the true optimum. Consequently, the overall dynamic range of the training data tends to become relatively large. One effective approach to improving the predictive performance of the surrogate model, especially for samples with small objective function values, is to apply training data transformation.
Exponential transformation of objective function values¶
It is generally difficult to construct a model for datasets with a large dynamic range. Applying an exponential transformation such as the one below can sometimes improve model performance [1]:

\[\hat{y} = -\exp\left(-\frac{y}{c_m}\right)\]
Here, \(y\) is the original objective function value (assumed to approach zero near the optimum), \(\hat{y}\) is the transformed value, and \(c_m\) is a hyperparameter of this transformation, typically set to the mean of the objective function values in the initial training data.
Hint
If the black-box function value at the optimal solution is not expected to be near zero, you may apply an appropriate offset before performing the transformation.
For example, in a black-box optimization problem that aims to maximize an accuracy score (%), the black-box function value can be defined as the negative accuracy (i.e., \(-100 \le y \le 0\)), so that minimizing \(y\) corresponds to maximizing accuracy. In such a case, \(y\) may approach -100 during optimization. An offset can be applied as follows:

\[y' = y - y_{offset}\]

Here, for this example, \(y_{offset} = -100\). If it is difficult to estimate \(y_{offset}\) from the characteristics of the black-box function, you may use the minimum value in the initial training data as \(y_{offset}\). After applying the offset, perform the exponential transformation:

\[\hat{y} = -\exp\left(-\frac{y'}{c_m}\right) = -\exp\left(-\frac{y - y_{offset}}{c_m}\right)\]
The inverse transformation to recover the original function value from the transformed one can be expressed as:

\[y = -c_m \ln\left(-\hat{y}\right) + y_{offset}\]
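A minimal sketch of the transformation and its inverse, assuming the form \(\hat{y} = -\exp(-(y - y_{offset})/c_m)\) with inverse \(y = -c_m \ln(-\hat{y}) + y_{offset}\); the function names are illustrative, not part of the library API:

```python
import math

def exp_transform(y, c_m, y_offset=0.0):
    """Map an objective value y to its transformed value hat_y."""
    return -math.exp(-(y - y_offset) / c_m)

def inv_exp_transform(y_hat, c_m, y_offset=0.0):
    """Recover the original objective value y from hat_y."""
    return -c_m * math.log(-y_hat) + y_offset
```

Note that the transformation is monotonically increasing in \(y\), so minimizing \(\hat{y}\) still corresponds to minimizing \(y\).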
This type of scaling offers several advantages:
Emphasizes differences among small objective values
Due to the exponential function’s steep slope for small \(y\), even slight differences in small values lead to large variations in \(\hat{y}\). This helps the model focus more strongly on regions near the minimum, which is particularly beneficial in minimization tasks.
Suppresses the influence of large objective values
For large \(y\), \(\exp(-y/c_m)\) approaches zero, making \(\hat{y}\) nearly constant. Thus, samples with large (less relevant) objective values have reduced impact on model construction.
Shifts regression model focus
While standard regression minimizes global error (e.g., MSE or MAE) uniformly, this transformation biases the loss toward improving model performance for samples with smaller \(y\) values, those more relevant to optimization.
By having the black-box function output the transformed value \(\hat{y}\) instead of \(y\), it may be possible to construct a more effective surrogate model.
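One way to realize this is to wrap the black-box function so that it returns \(\hat{y}\) instead of \(y\). The sketch below assumes the transformation \(\hat{y} = -\exp(-(y - y_{offset})/c_m)\); `make_transformed_blackbox` and the toy objective are hypothetical names, not part of Amplify-BBOpt:

```python
import math

def make_transformed_blackbox(blackbox, c_m, y_offset=0.0):
    """Wrap a black-box function so it returns the transformed value
    hat_y = -exp(-(y - y_offset) / c_m) instead of y.
    `blackbox`, `c_m`, and `y_offset` are placeholders for your own
    objective function and transformation parameters."""
    def transformed(*args, **kwargs):
        y = blackbox(*args, **kwargs)
        return -math.exp(-(y - y_offset) / c_m)
    return transformed

# Example with a toy objective: minimizing x**2
toy = make_transformed_blackbox(lambda x: x * x, c_m=1.0)
```

Since the wrapper preserves the ordering of objective values, the minimizer of the transformed function coincides with that of the original.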
Training Data Transformation with ExpScaler¶
Amplify-BBOpt provides the ExpScaler class to easily apply the exponential transformation described above. This class transforms the training data internally without modifying the black-box function. The transformed data is only used when building the surrogate model, so it does not affect the black-box objective function values in the optimization process.
Basic Usage¶
By passing an instance of ExpScaler to the surrogate_data_transformer argument of Optimizer, the exponential transformation is automatically applied during surrogate model training.
from amplify_bbopt import ExpScaler, Optimizer
# Create an ExpScaler instance
scaler = ExpScaler()
# Pass it to Optimizer as surrogate_data_transformer
optimizer = Optimizer(
blackbox=my_blackbox_func,
trainer=my_trainer,
client=my_client,
surrogate_data_transformer=scaler,
)
Note
In ExpScaler, the offset value \(y_{offset}\) is automatically set to the minimum value of the target dataset \(y\).
Customizing Parameters¶
ExpScaler supports the following customizable parameters:
cm_method: Specifies how to calculate \(c_m\).
"mean" (default): Uses the mean of \(y - y_{offset}\)
"median": Uses the median of \(y - y_{offset}\)
from amplify_bbopt import ExpScaler
# Calculate c_m using median
scaler = ExpScaler(cm_method="median")
use_data: Specifies which data to use for parameter calculation.
"initial_data" (default): Fixes parameters based on the first transformation
"dynamic": Recalculates parameters on each transformation
from amplify_bbopt import ExpScaler
# Recalculate parameters from all training data each cycle
scaler = ExpScaler(use_data="dynamic")
Note
Using use_data="initial_data" applies consistent scaling throughout the entire optimization process. On the other hand, use_data="dynamic" updates scaling parameters as training data grows.
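The two modes can be illustrated conceptually with a hypothetical `fit_c_m` helper that mimics cm_method="mean" with \(y_{offset}\) set to the dataset minimum. This is a sketch of the idea, not ExpScaler's actual code:

```python
from statistics import mean

def fit_c_m(y_values):
    """Hypothetical helper: y_offset = min(y), c_m = mean of y - y_offset,
    mirroring cm_method="mean" described above."""
    y_offset = min(y_values)
    return mean(v - y_offset for v in y_values), y_offset

# use_data="initial_data": fit once on the initial training set, then reuse
c_m_fixed, off_fixed = fit_c_m([5.0, 8.0, 11.0])

# use_data="dynamic": refit every cycle as new samples accumulate
data = [5.0, 8.0, 11.0]
data += [2.0, 3.0]  # new samples from a later cycle
c_m_dyn, off_dyn = fit_c_m(data)
```

In the dynamic case, both the offset and \(c_m\) shift as better (smaller) objective values arrive, so the same \(y\) can map to different \(\hat{y}\) values in different cycles.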