Brown University

Analyzing RNA-seq data using prior knowledge of gene and cell relationships

Description

Abstract:
Ten years ago, the first RNA-seq study was published. Since then, more than 200,000 RNA-seq studies have been published, spanning many different organisms, tissue types, and experimental conditions. However, until recently RNA-seq could only be used to investigate differences in gene expression between samples, because the expression of a sample is measured from the pooled mRNA of hundreds of thousands to millions of cells. Recently, new RNA-seq technologies have begun to emerge. One is single-cell RNA-seq (scRNA-seq), which profiles individual cells from a sample and so enables the study of cellular heterogeneity within a tissue. Another, Spatial Transcriptomics RNA-seq (STRNA-seq), profiles the mRNA transcripts from a tissue slice while retaining the spatial location of the transcripts in the tissue. Both methods produce high-dimensional transcript count matrices but are limited by extremely low coverage, with a large proportion of zero entries. In this dissertation, we introduce two methods that use known gene and cell dependencies to recover signal from scRNA-seq and STRNA-seq data. The first method, netNMF-sc, is a matrix factorization method that uses gene co-expression networks obtained from prior RNA-seq studies to perform dimensionality reduction and imputation of sparse scRNA-seq data, improving clustering performance and recovery of co-expressed genes over existing methods. The second method, STCNA, uses hidden Markov models to infer genomic copy number aberrations (CNAs) from STRNA-seq data of tumor tissues. Copy number aberrations, a subset of genomic rearrangements, are acquired as a tumor evolves and are a driving force of cancer development. STCNA uses spatial information to uncover subclonal CNAs, which are present in only a subset of cells in the tissue.
Finally, we present a third method, NAIBR, which identifies genomic rearrangements, including those which do not result in copy number changes, such as inversions and translocations, from barcoded DNA sequencing data.
Notes:
Thesis (Ph. D.)--Brown University, 2020

Citation

Elyanow, Rebecca, "Analyzing RNA-seq data using prior knowledge of gene and cell relationships" (2020). Center for Computational Molecular Biology Theses and Dissertations. Brown Digital Repository. Brown University Library. https://repository.library.brown.edu/studio/item/bdr:ngscgmmd/

Relations

Collection: