Recent advances in high-throughput laboratory equipment and the emergence of novel monitoring devices for biotechnological process development allow experiments to be executed faster, more easily and more cheaply, which has led to the accumulation of vast amounts of high-dimensional data. Extracting information from and handling these data with low-throughput methodologies is challenging in some cases and outright impossible in others. Statistical and machine learning algorithms capable of handling such high-dimensional data sets offer an opportunity to gain profound insight into the underlying mechanisms of complex biological and technical systems.

In this thesis, machine learning models capable of processing high-dimensional data, for instance entire Fourier-transform infrared spectra or images, will be developed to describe both upstream and downstream processes. To this end, modern machine learning frameworks such as deep learning will be leveraged and customized to overcome the challenges of modelling high-dimensional biotechnological data. Not only will the models be capable of evaluating the state of biotechnological processes online, but specially developed information extraction techniques will also be applied to gain a deeper understanding of the models’ inner workings, revealing underlying process mechanisms.