Genome Information Databases for CHO

To date, 8 out of 10 blockbuster biotherapeutics are produced in Chinese Hamster Ovary (CHO) cells, making them the most prominent industrial mammalian cell line. Despite this importance, publicly available genomic sequence information long lagged behind that of other biotechnologically relevant organisms such as Escherichia coli or Saccharomyces cerevisiae. The first complete genome sequences for CHO cells and for the Chinese Hamster, which serves as the reference sequence, were published between 2011 and 2013. Building on this knowledge, the long-term goal is to shorten development timelines and thus reduce production costs, so that valuable therapeutic compounds become affordable for most health care systems. To reach this goal, one of our exploratory tasks is to expand and improve existing data sets and in-silico models to a level of detail that allows more reliable prediction and control of cellular behavior and bioprocesses.

In cooperation with leading academic institutions, we are currently working on the reference genome of the Chinese Hamster to enable reliable comparison between different CHO cell lines. Applying a newer sequencing method (PacBio) that reads longer regions of the genome has led to an assembly consisting of ~1800 scaffolds, a great improvement over the first version published in 2013, which consisted of >80000 scaffolds. This new assembly will make it possible to study genes and their function with higher precision and to better analyze the differences between CHO cell lines, which can vary widely in both genotype and phenotype.
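Scaffold counts such as those above are often reported alongside a contiguity metric like N50: the scaffold length at which scaffolds of that length or longer contain at least half of the assembled bases. The following is a minimal sketch of that calculation. The function name and the toy scaffold lengths are hypothetical and are not taken from the Chinese Hamster assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assemblies with the same total length (100 units).
fragmented = [10] * 10      # many short scaffolds
contiguous = [50, 30, 20]   # few long scaffolds

print(n50(fragmented), n50(contiguous))
```

For the same total length, the assembly with fewer and longer scaffolds yields the larger N50, which is why a drop in scaffold number usually goes hand in hand with better contiguity.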

Besides the Chinese Hamster, we have sequenced 6 related CHO cell lines and investigated the relative contribution of genomic and epigenetic modifications to phenotype evolution. These data represent a unique, comprehensive resource on the chromatin state and have been made available in the online database http://cho-epigenome.boku.ac.at.

The availability of genome sequences has opened the door to genome-scale science and engineering of CHO cells. However, this also requires infrastructure that gives the public easy access to this information. To provide such access, the website www.CHOgenome.org was founded in 2010 as a joint scientific community effort, co-chaired by Nicole Borth. Because many more genomes are expected to be published over the next years, along with a concomitant rise in -omics datasets on CHO, there is a need for improved website infrastructure that supports working with multiple genomes and provides adequate visualization and analysis tools. Our group is currently working on CHOmine, which offers a quick, powerful way to search collated data on CHO cell lines and allows users to query across multiple data types in different combinations, facilitating analysis.

In another international cooperation, we are working on the first genome-scale metabolic model, which describes all biochemical reactions of CHO cell metabolism as mathematical equations. This model requires the detailed compilation of all possible metabolites and metabolic pathways of CHO, and the identification of all enzymes involved in them. The metabolic model was generated and verified manually and has thus reached a state that is useful for academic purposes. We and others are currently working on its applicability for industrial purposes, such as predicting the impact of bioprocess changes on product quality.
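To illustrate what "reactions as mathematical equations" can mean, genome-scale metabolic models are commonly written as a stoichiometric matrix S (metabolites by reactions) and a flux vector v, with the steady-state mass balance S·v = 0. The sketch below uses a hypothetical three-reaction toy network; it is not the CHO model itself, and all names and values are assumptions made for illustration.

```python
# Toy network (hypothetical):
#   R1: (uptake)  -> A
#   R2: A         -> B
#   R3: B         -> (secretion)
# Rows are the internal metabolites A and B; columns are R1, R2, R3.
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]


def mass_balance(stoich, fluxes):
    """Return S*v: the net production rate of each metabolite."""
    return [sum(coef * flux for coef, flux in zip(row, fluxes))
            for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in mass_balance(stoich, fluxes))


balanced = [2.0, 2.0, 2.0]     # every reaction carries the same flux
unbalanced = [2.0, 1.0, 1.0]   # A is made faster than it is consumed

print(mass_balance(S, balanced), is_steady_state(S, balanced))
print(mass_balance(S, unbalanced), is_steady_state(S, unbalanced))
```

In a real genome-scale model the same balance is applied to thousands of reactions, typically together with flux bounds and an optimization objective; those parts are left out of this sketch.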

 

Cell line engineering of CHO

Despite the importance and advantages of CHO cells, such as the ability to produce therapeutic proteins in the correct, non-immunogenic and biologically active form, we still face certain challenges when working with CHO, including high genomic and phenotypic variation, instability of production, and secretory bottlenecks. A major focus of our working group is therefore the improvement of CHO cells as a production platform, with emphasis on increasing growth rate, productivity and product quality.

Current goals include:

·         new approaches for controlling the epigenome, which determines the transcriptome and thus the phenotype of a cell line. Our previous results have shown that while there are numerous genomic variations present in each cell line, their direct impact on cell behavior is less than the impact of changes in epigenetic regulation. Thus “stability” has to be redefined as “consistency in behavior” which may be controlled by alternative approaches to engineering.

·         development of methods to stabilize a given phenotype as determined by the epigenome

·         establishment of methods that analyse and characterize the stability of the genome of a given cell line. These tools include chromosome painting technology to visualize large-scale genomic rearrangements, karyotype analyses, and a newly established AFLP protocol to assess genome-scale rearrangements in a medium-throughput manner.

·         establishment of full gene knockout libraries using CRISPR/Cas9 and other genome editing tools. In the future, this approach might lead to a minimal cell line in which all genomic elements not required for recombinant protein production have been deleted, enabling faster growth and, owing to the smaller genome size, enhanced genomic stability.

·         new tools for controlling recombinant gene expression based on stringent, RNA-based selection systems or autologous CHO promoter and enhancer elements

We are a member of the Marie Skłodowska-Curie Innovative Training Network eCHO-systems (enhancing CHO by systems biotechnology) (http://www.echo-systems.eu), which aims to train young researchers for a future at the intersection of biology, bioprocessing and bioinformatics. The 15 participating students, hosted at 4 European universities and a company partner, receive both academic and industrial training in cell culture, bioprocessing, large-scale production, bioinformatics, modeling and cell engineering.