Popis: |
One by-product of the large-scale manufacture of biological products is the generation of significant quantities of process data. Typically this data is catalogued and stored in accordance with regulatory requirements, but it is rarely used to enhance subsequent production. A large amount of useful information is inherent in this data; the problem lies in the lack of appropriate methods for extracting it. The identification and/or development of tools capable of providing access to this valuable, untapped resource is therefore an important area for research. The main objective of this research is to investigate whether it is possible to attain knowledge from the information inherent within process data. The approach adopted in this thesis is to utilise the tools and techniques prevalent in the areas of data mining and pattern recognition. Through the application of these techniques, it is hypothesised that useful information can be acquired. Specifically, the industrial sponsors of the research, Avecia Biologics, are interested in methods for comparing new proteins with those they have previously worked on, with the intention of inferring information pertaining to the large-scale manufacturing route for different processes. It is hypothesised that by comparing proteins and looking for similarities at the molecular level, it could be possible to identify potential pitfalls and bottlenecks in the recovery process before they occur. This would then allow Avecia to highlight areas of process development that may require specific attention. Two techniques are the primary focus of the study: the Self-Organising Map (SOM) and the Support Vector Machine (SVM). Through a detailed investigation of these techniques, from benchmarking studies to applications to real-world problems, it is shown that these methods have the potential to become useful tools for extracting information from biological process data. |