Model training by machine learning
This section implements the part of FMQA that learns the optimal model parameters (weights and bias) by machine learning. First, the TorchFM class, which represents the Factorization Machine model, is defined using PyTorch.
The following equation represents the Factorization Machine.
$$
\begin{aligned}
f(\boldsymbol{x} | \boldsymbol{w}, \boldsymbol{v}) &=
\underset{\color{red}{\mathtt{out\_linear}}}{\underline{w_0 + \sum_{i=1}^d w_i x_i}} +
\underset{\color{red}{\mathtt{out\_quadratic}}}{\underline{\frac{1}{2}
\left[
\underset{\color{red}{\mathtt{out\_1}}}{\underline{\sum_{f=1}^k \left(\sum_{i=1}^d v_{if} x_i\right)^2}}
- \underset{\color{red}{\mathtt{out\_2}}}{\underline{\sum_{f=1}^k \sum_{i=1}^d v_{if}^2 x_i^2}}
\right]}}
\end{aligned}
$$
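The bracketed term is the standard Factorization Machine trick for computing all pairwise interactions in $O(dk)$ time: it equals $\sum_{i<j} \langle v_i, v_j \rangle x_i x_j$. As a quick sanity check of this identity, the fast form and the naive pairwise sum can be compared numerically (a small NumPy sketch with arbitrary random values, not part of the tutorial code):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 3
v = rng.normal(size=(d, k))  # factor matrix, one length-k row per input dimension
x = rng.normal(size=d)

# O(d*k) form from the equation above: 0.5 * (out_1 - out_2)
fast = 0.5 * (((x @ v) ** 2).sum() - ((x**2) @ (v**2)).sum())

# Naive O(d^2 * k) pairwise interaction sum
naive = sum(v[i] @ v[j] * x[i] * x[j] for i, j in itertools.combinations(range(d), 2))

print(np.isclose(fast, naive))  # → True
```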
The input $x$ of this model is a vector of length $d$, the same length as the input to the black-box function. The model has the following three parameters:
- $v$: a 2D array of size $d \times k$.
- $w$: a 1D vector of length $d$.
- $w_0$: a scalar.
The only hyperparameter is $k$, which is given as a positive integer less than or equal to $d$.
The TorchFM class defined below inherits from torch.nn.Module and is constructed from the input vector size $d$ and the hyperparameter $k$. The hyperparameter $k$ controls the number of parameters in the model: the larger $k$ is, the more parameters the model has, so the more accurate the model can become, but also the more prone it is to over-fitting.
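Concretely, the parameter count grows linearly in $k$: the model has $1$ bias, $d$ linear weights, and $d \times k$ factor entries. A small illustrative helper (not part of the tutorial code) makes the arithmetic explicit:

```python
def fm_param_count(d: int, k: int) -> int:
    """Total trainable parameters of the FM model: w0 (1) + w (d) + v (d*k)."""
    return 1 + d + d * k

print(fm_param_count(8, 4))  # → 41
```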
The TorchFM class holds the model parameters $v$, $w$, and $w_0$ as attributes and updates them as training proceeds. Its forward method outputs an estimate of $y$ from the input $x$ according to the above formula. Since the parameters $v$, $w$, and $w_0$ are needed later to construct a QUBO model, we also define a get_parameters method that outputs these parameters.
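The class could be sketched as follows. The forward computation follows the equation above exactly; details such as the parameter initialization scheme are assumptions and may differ from the actual tutorial code:

```python
import torch
from torch import nn


class TorchFM(nn.Module):
    """Factorization Machine f(x | w, v) as a PyTorch module (illustrative sketch)."""

    def __init__(self, d: int, k: int):
        super().__init__()
        # v: d x k interaction factors, w: linear weights, w0: bias
        # (initialization here is an assumption: small random v, zero w and w0)
        self.v = nn.Parameter(torch.randn(d, k) * 0.01)
        self.w = nn.Parameter(torch.zeros(d))
        self.w0 = nn.Parameter(torch.zeros(()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, d)
        out_linear = self.w0 + x @ self.w                    # w0 + sum_i w_i x_i
        out_1 = ((x @ self.v) ** 2).sum(dim=1)               # sum_f (sum_i v_if x_i)^2
        out_2 = ((x**2) @ (self.v**2)).sum(dim=1)            # sum_f sum_i v_if^2 x_i^2
        out_quadratic = 0.5 * (out_1 - out_2)
        return out_linear + out_quadratic

    def get_parameters(self):
        # Detach to plain NumPy arrays for the later QUBO construction
        v = self.v.detach().numpy()
        w = self.w.detach().numpy()
        w0 = self.w0.detach().numpy()
        return v, w, w0
```

For example, `TorchFM(d, k)(x)` with `x` of shape `(batch, d)` returns a length-`batch` tensor of estimates, and `get_parameters()` returns the `(v, w, w0)` triple used to build the QUBO model in the next step.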