Abstract: |
The hidden Markov model (HMM) with a Gaussian process (GP) emission model has been widely used to model complex sequential data. This study introduces a hybrid Bayesian HMM with GP emissions using a spectral mixture (SM) kernel (HMM-GPSM) to estimate the hidden state of each time-series observation, where the observations are recorded sequentially from a single channel. We then propose a scalable inference method for training the HMM-GPSM on large-scale sequential time-series datasets that have (1) a large number of sequences for state transitions and (2) a large number of data points in the time-series observation for each hidden state. For (1), we employ stochastic variational inference (SVI) to update the parameters of the HMM-GPSM efficiently. For (2), we propose an approximate GP emission based on random Fourier features (RFF), constructed from spectral points sampled from the spectral density of the SM kernel. We also propose an efficient inference method for the kernel hyperparameters of the approximate GP emission and the corresponding HMM-GPSM. Specifically, we derive the training loss, namely the evidence lower bound of the HMM-GPSM, which can be computed scalably for a large number of time-series observations by employing a KL-regularized lower bound of the GP emission likelihood. The proposed methods can be used together to train the HMM-GPSM on sequential time-series datasets that exhibit both (1) and (2). We validate the proposed methods on synthetic and real datasets, using clustering accuracy, marginal likelihood, and training time as performance metrics. [ABSTRACT FROM AUTHOR] |
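
The RFF construction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional input, the standard SM kernel form k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q), and hypothetical helper names (`sm_kernel`, `sample_spectral_points`, `rff_features`) and parameter values chosen only for illustration. Spectral points are drawn from the Gaussian mixture that forms the SM spectral density, and the resulting cosine and sine features give a low-rank approximation of the kernel matrix.

```python
import numpy as np

def sm_kernel(x1, x2, weights, means, variances):
    """Exact 1-D spectral mixture kernel matrix (assumed standard form)."""
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau, dtype=float)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return k

def sample_spectral_points(weights, means, variances, num_points, rng):
    """Draw frequencies from the Gaussian mixture behind the SM spectral density."""
    weights = np.asarray(weights, dtype=float)
    # Pick a mixture component for each spectral point, proportional to its weight.
    comp = rng.choice(len(weights), size=num_points, p=weights / weights.sum())
    return rng.normal(np.asarray(means)[comp], np.sqrt(np.asarray(variances)[comp]))

def rff_features(x, spectral_points, total_weight):
    """Map inputs to 2M random Fourier features (cosine and sine parts)."""
    phase = 2.0 * np.pi * x[:, None] * spectral_points[None, :]
    scale = np.sqrt(total_weight / len(spectral_points))
    return scale * np.hstack([np.cos(phase), np.sin(phase)])

# Illustrative (hypothetical) SM hyperparameters: two mixture components.
rng = np.random.default_rng(0)
weights, means, variances = [1.0, 0.5], [0.5, 2.0], [0.01, 0.05]

x = np.linspace(0.0, 5.0, 50)
s = sample_spectral_points(weights, means, variances, 5000, rng)
phi = rff_features(x, s, sum(weights))

k_exact = sm_kernel(x, x, weights, means, variances)
k_approx = phi @ phi.T  # rank-2M approximation of the kernel matrix
max_err = np.max(np.abs(k_exact - k_approx))
print(f"max abs error with {len(s)} spectral points: {max_err:.4f}")
```

With N data points and M spectral points, working with the N x 2M feature matrix instead of the N x N kernel matrix reduces the dominant GP cost from O(N^3) to roughly O(N M^2), which is the usual motivation for this kind of approximation when each observation contains many data points.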