Topic 01

Data analysis for single-cell long-read sequencing

Single-cell long-read sequencing captures full-length transcripts from individual cells, which makes it possible to study transcriptome complexity at isoform resolution. This approach provides insights into isoform-level gene expression, alternative splicing, and allele-specific expression that short-read methods can only infer indirectly. However, the analysis of single-cell long-read sequencing data presents unique challenges due to the high dimensionality, sparsity, and complex structure of the data.

This project aims to develop a comprehensive computational pipeline for analysing and interpreting single-cell long-read sequencing data. Our primary objectives are to:

Design novel algorithms for isoform-level quantification and normalisation, taking into account the unique characteristics of long-read data and the inherent noise in single-cell measurements.
Develop dimensionality reduction and clustering techniques specifically adapted for isoform-level expression data, enabling the identification of cell subpopulations and rare cell types based on their isoform expression profiles.
Investigate the dynamics of isoform expression across different cell states, developmental stages, and disease conditions, uncovering the role of alternative splicing in driving cellular diversity and function.
Integrate multi-omics data, such as single-cell epigenomics and proteomics, to gain a systems-level understanding of the regulatory mechanisms governing isoform expression and their impact on cellular phenotypes.
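
As a rough illustration of what isoform-level quantification works with, the sketch below computes per-cell isoform usage fractions, that is, each isoform's share of the reads assigned to its gene within one cell. This is a common baseline quantity and not the novel algorithm the project sets out to design. The function name, the input layout, and the toy isoform and cell identifiers are all assumptions made for this example.

```python
from collections import defaultdict


def isoform_usage(counts, isoform_to_gene):
    """Return per-cell isoform usage fractions.

    counts: {cell_id: {isoform_id: read_count}}  (hypothetical input layout)
    isoform_to_gene: {isoform_id: gene_id}
    """
    usage = {}
    for cell, iso_counts in counts.items():
        # Total reads per gene in this cell, summed over its isoforms.
        gene_totals = defaultdict(int)
        for iso, n in iso_counts.items():
            gene_totals[isoform_to_gene[iso]] += n
        # Fraction of each gene's reads carried by each isoform.
        # Genes with zero reads in a cell are skipped (usage is undefined).
        usage[cell] = {
            iso: n / gene_totals[isoform_to_gene[iso]]
            for iso, n in iso_counts.items()
            if gene_totals[isoform_to_gene[iso]] > 0
        }
    return usage


# Toy data: two cells, two genes, three isoforms (all identifiers invented).
isoform_to_gene = {"GENE_A.1": "GENE_A", "GENE_A.2": "GENE_A", "GENE_B.1": "GENE_B"}
counts = {
    "cell_1": {"GENE_A.1": 3, "GENE_A.2": 1, "GENE_B.1": 2},
    "cell_2": {"GENE_A.1": 0, "GENE_A.2": 4},
}
usage = isoform_usage(counts, isoform_to_gene)
```

In this toy input, cell_1 splits GENE_A reads 3:1 between its two isoforms while cell_2 uses only GENE_A.2. With so few reads per gene per cell, such fractions are very noisy, which is the sparsity problem the objectives above refer to.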

To achieve these objectives, we will draw on computational methods from machine learning, graph theory, and probabilistic modelling. We will also develop user-friendly software packages and interactive visualisations so that the broader scientific community can explore and interpret single-cell long-read sequencing data.

Through close collaborations with experimental biologists and clinicians, we will apply our computational framework to a wide range of biological systems, including stem cell differentiation, brain development, and cancer progression. By unravelling the complexity of isoform expression at the single-cell level, we aim to gain novel insights into the molecular mechanisms underlying cellular identity, plasticity, and dysfunction. The outcomes of this project have the potential to revolutionise our understanding of transcriptome complexity and its role in shaping cellular behaviour.
