Applications and Statistical Modeling of Electroencephalograms using Identity Vectors

Author: Ward, Christian Radcliffe
Language: English
Publication year: 2019
Subject:
Document type: Text
DOI: 10.34944/dspace/3996
Description: In recent years, electroencephalograms (EEGs) have been the subject of intense signal processing research. The ability of software to group, cluster, or identify trends in EEG data has applications that range from clinical support tools for neurologists to brain-computer interfaces. However, a persistent limitation in the development of EEG classification algorithms has been a lack of clinician-labeled data, which is necessary to train supervised neural networks and deep learning systems. This work addresses this issue by presenting an unsupervised technique for classifying EEGs and elucidating common data modes that does not depend on labeled data. Specifically, this work introduces the application of Identity Vectors (I-Vectors) to EEG signals. I-Vectors were originally developed in the speech processing community to parse multiple facets of speaker data (speaker, language, accent, age, etc.). The similarities between EEG and speech data suggest that I-Vectors are a strong candidate for developing data models that can differentiate between subjects, channels, and medical conditions. I-Vectors work by building a Universal Background Model (UBM) of signal features that is based on weighted Gaussian clusters. This UBM is then projected into a lower-dimensional space through a Total Variability Matrix, which seeks to maximize the differences between the UBM and a group of “enrollment” signals. Optionally, further dimensionality reduction can be achieved through linear discriminant analysis (LDA) before generating the final I-Vectors. This work develops the application of I-Vectors to EEGs by addressing three key research aims. First, can the I-Vector technique be used to classify EEG data with performance equivalent to that of other machine learning classifiers? Second, how should I-Vector parameters be tuned to optimize performance on EEG data? Third, what properties of EEG data do I-Vectors take advantage of, and can this knowledge be used to inform the EEG classification process? I-Vector performance was rigorously evaluated using larger and more diverse data sets than have been used in comparable published literature, specifically various blends of the PhysioNet Motor Movement Database and the Temple University Hospital EEG Corpus. Benchmark comparisons were made against well-known classifiers in the EEG domain, namely the Mahalanobis Distance (MD) and Gaussian Mixture Model-Universal Background Model (GMMUBM) classifiers. Performance was also evaluated using three different EEG feature sets as system inputs, namely Power Spectral Density, Spectral Coherence, and Cepstral Coefficients. Ultimately, the I-Vectors exceeded the performance of the MD classifier and reported an equal error rate 5% higher than the GMMUBMs. This was achieved using I-Vectors that were one to two orders of magnitude smaller than those in the GMMUBM classifier and half the size of the MD classifier. These results indicate that the technique is robust and has the potential to scale for use on large datasets such as the Temple University Hospital EEG Corpus.
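The abstract compares classifiers by equal error rate (EER) but does not say how that figure was computed. As a minimal sketch of the general idea, assuming each trial produces a similarity score where higher means a more likely match, the EER is the operating point at which the false acceptance rate and the false rejection rate coincide. The function name `equal_error_rate` and the example scores are illustrative assumptions, not taken from the dissertation.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER for similarity scores (higher = more likely a match).

    genuine:  scores from trials where the claimed identity is correct.
    impostor: scores from trials where the claimed identity is wrong.
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance and false rejection rates are closest to each other.
    """
    best = None
    # Only observed scores need to be tried as candidate thresholds.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)
```

With only a handful of trials the two error rates may never be exactly equal, which is why the sketch averages them at the closest point; evaluation toolkits typically interpolate along the error curve instead.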
Electrical and Computer Engineering
Ph.D.
Database: Networked Digital Library of Theses & Dissertations