Model training by machine learning
This section implements the machine learning part of FMQA, which learns the optimal parameters (weights and bias) of the model. First, the `TorchFM` class, representing the Factorization Machine model, is defined using PyTorch. The Factorization Machine is given by the following equation:
$$
\begin{aligned}
f(\boldsymbol{x} | \boldsymbol{w}, \boldsymbol{v}) &=
\underset{\color{red}{\mathtt{out\_linear}}}{\underline{ w_0 + \sum_{i=1}^d w_i x_i }} +
\underset{\color{red}{\mathtt{out\_quadratic}}}{\underline{\frac{1}{2}
\left[
\underset{\color{red}{\mathtt{out\_1}}}{\underline{ \sum_{f=1}^k \left( \sum_{i=1}^d v_{if} x_i \right)^2 }}
- \underset{\color{red}{\mathtt{out\_2}}}{\underline{ \sum_{f=1}^k \sum_{i=1}^d v_{if}^2 x_i^2 }}
\right] }}
\end{aligned}
$$
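The bracketed difference is a compact way of computing the pairwise interaction terms $\sum_{i<j} (\boldsymbol{v}_i \cdot \boldsymbol{v}_j)\, x_i x_j$ in $O(dk)$ time rather than $O(d^2 k)$. As a quick numerical sanity check of this identity (a standalone sketch, not part of the tutorial code; all variable names here are illustrative):

```python
import torch

torch.manual_seed(0)
d, k = 5, 3
x = torch.randn(d)
v = torch.randn(d, k)

# Direct pairwise-interaction sum: sum_{i<j} <v_i, v_j> x_i x_j
direct = sum(
    (v[i] @ v[j]) * x[i] * x[j]
    for i in range(d) for j in range(i + 1, d)
)

# Factorization Machine form: (out_1 - out_2) / 2
out_1 = ((x @ v) ** 2).sum()          # sum_f (sum_i v_if x_i)^2
out_2 = ((x ** 2) @ (v ** 2)).sum()   # sum_f sum_i v_if^2 x_i^2
fm_form = 0.5 * (out_1 - out_2)

print(torch.isclose(direct, fm_form))  # tensor(True)
```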
The input $x$ of this model is a vector of length $d$, the same as the input to the black-box function, and the model has the following three parameters:
- $v$: a two-dimensional array of size $d \times k$.
- $w$: a one-dimensional vector of length $d$.
- $w_0$: a scalar.
The only hyperparameter is $k$, which is given as a positive integer less than or equal to $d$.
The `TorchFM` class defined below inherits from `torch.nn.Module` and is constructed by specifying the input vector size $d$ and the hyperparameter $k$. The hyperparameter $k$ controls the number of parameters in the model ($dk + d + 1$ in total): the larger $k$ is, the more parameters the model has, which can improve accuracy but also makes the model more prone to overfitting.
The `TorchFM` class holds the model parameters $v$, $w$, and $w_0$ as attributes and updates them as training proceeds. Its `forward` method outputs an estimate of $y$ from the input $x$ according to the formula above. Since the parameters $v$, $w$, and $w_0$ are needed later to construct a QUBO model, we also define a `get_parameters` method that returns these parameters.
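As a reference, here is a minimal sketch of what such a `TorchFM` class could look like, assuming batched float inputs of shape `(batch, d)`; the parameter initialization scale and the return types of `get_parameters` are illustrative choices, not prescribed by the description above:

```python
import torch
import torch.nn as nn


class TorchFM(nn.Module):
    def __init__(self, d: int, k: int):
        super().__init__()
        # Trainable parameters: v (d x k), w (length d), w0 (scalar)
        self.v = nn.Parameter(torch.randn(d, k) * 0.01)
        self.w = nn.Parameter(torch.zeros(d))
        self.w0 = nn.Parameter(torch.zeros(()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, d); each term below mirrors the labeled
        # parts of the Factorization Machine equation.
        out_linear = self.w0 + x @ self.w                 # w0 + sum_i w_i x_i
        out_1 = ((x @ self.v) ** 2).sum(dim=1)            # sum_f (sum_i v_if x_i)^2
        out_2 = ((x ** 2) @ (self.v ** 2)).sum(dim=1)     # sum_f sum_i v_if^2 x_i^2
        out_quadratic = 0.5 * (out_1 - out_2)
        return out_linear + out_quadratic

    def get_parameters(self):
        # Return detached copies of v, w, w0 for the later QUBO construction
        np_v = self.v.detach().numpy().copy()
        np_w = self.w.detach().numpy().copy()
        np_w0 = float(self.w0.detach())
        return np_v, np_w, np_w0
```

An instance created as, for example, `TorchFM(d=20, k=5)` can then be trained with a standard PyTorch loop, e.g. minimizing `nn.MSELoss()` between the model output and the observed black-box values with an optimizer such as `torch.optim.Adam`.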