Description: |
Pattern recognition is an analytical process that plays an increasingly important role in the interpretation and evaluation of data across many scientific disciplines. This thesis examines the application of pattern recognition techniques to both pharmaceutical process data and metabolomic data, and attempts to identify trends and provide solutions to problems in these fields. Pattern recognition can be divided into univariate and multivariate techniques. Although this thesis examines both types of method, the focus is primarily on multivariate techniques, because the pharmaceutical process data and the metabolomic data are both highly multivariate. Previous studies have provided data from these areas using a variety of analytical techniques such as acoustic emission (AE), nuclear magnetic resonance (NMR) and denaturing gradient gel electrophoresis (DGGE), and in this work multivariate pattern recognition succeeds in extracting valuable information from these data. The first approach is the monitoring of a high-shear wet granulation process using AE spectroscopy. Granulation is a highly complex multi-step process with many factors and variables, such as the duration of different phases, the quantities of added materials, and the speed of the granulator equipment. These factors can all have a considerable effect on the quality of the final granules, although it can be very difficult to pinpoint exactly to what degree each influences the process. The combination of AE with pattern recognition techniques is studied here as an approach to identify the trends that are due to each variable. This study uses multiway analysis methods such as parallel factor analysis (PARAFAC) and multiway partial least squares (MPLS) to examine the data, as well as slope analysis techniques to analyse the trends between the acoustic profiles of batches that are characterized by different process variables.
The results suggest that AE combined with pattern recognition techniques can be particularly useful to the pharmaceutical industry for the monitoring and control of granulation processes. The second approach looks at pattern recognition applied to metabolomic data from two sources. An NMR dataset was obtained from human saliva samples, half of which had been treated with a mouthwash. Multiple factors contributed to variance in the dataset. The total variance was split into parts characterized by these underlying experimental factors. Multilevel techniques were applied, based on a hierarchical relationship between the factors. In addition, because the experimental factors in the study varied according to the programmed experimental design, ANOVA-based techniques were used to interpret this variance. The results suggest that in metabolomic studies on humans there is high variation between subjects. This variation was dominant; once it had been extracted, trends due to other experimental factors could be analysed. The second metabolomic dataset was a DGGE dataset. Pattern recognition techniques, in particular principal coordinates analysis (PCO), were used to analyse the trends in microbial profiles of human sweat. It was shown that PCO components can be used effectively as input for various classification techniques in order to observe individual or group separation. |