Peak shape clustering: an application to GATA-1

Abstract

In recent years, many techniques have been developed to study genetic and epigenetic processes. Among Next Generation Sequencing methods, we focus on ChIP-Seq (Chromatin Immunoprecipitation Sequencing), which makes it possible to investigate protein-DNA interactions, e.g. the direct interaction of transcription factors or histones with DNA. At present, the analysis of ChIP-Seq data in the relevant literature is mainly restricted to detecting enriched regions (peaks) in the genome, considering only signal intensity. Motivated by the fact that these peaks can show very different shapes, we propose an innovative approach that also takes the shape of the peaks into account. We introduce indices that summarize peak shape, and we use multivariate clustering techniques to detect statistically significant differences in shape. We show that applying this method to ChIP-Seq data for the transcription factor GATA-1 reveals novel biological insights. Moreover, we suggest that a functional data analysis approach, which treats peaks directly as curves, can lead to even more interesting results.

Joint work with I. Dellino, A. Parodi, P.G. Pelicci, L. Riva, L.M. Sangalli, P. Secchi and S. Vantini.
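
The general idea in the abstract, summarizing each peak by a few shape indices and then clustering the resulting vectors, can be illustrated with a minimal sketch. The abstract does not say which indices or which clustering method the authors use, so everything here is an assumption made for illustration: the three indices (height, area, width at half maximum), the synthetic Gaussian-shaped peaks, and the plain k-means routine with a farthest-point start (`shape_indices`, `gaussian_peak` and `kmeans` are hypothetical helper names).

```python
import numpy as np

def shape_indices(coverage):
    """Summarize one peak's coverage profile with three illustrative indices."""
    coverage = np.asarray(coverage, dtype=float)
    height = coverage.max()                         # maximum signal
    area = coverage.sum()                           # total signal, unit-width bins assumed
    width = float((coverage >= height / 2).sum())   # bins at or above half maximum
    return np.array([height, area, width])

def kmeans(X, k, n_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # distance of every point to its nearest already-chosen center
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic peaks (not real ChIP-Seq data): two tall narrow, two low broad.
x = np.arange(100)

def gaussian_peak(center, sd, height):
    return height * np.exp(-0.5 * ((x - center) / sd) ** 2)

peaks = [
    gaussian_peak(50, 3, 40),
    gaussian_peak(50, 4, 35),
    gaussian_peak(50, 15, 10),
    gaussian_peak(50, 16, 11),
]

features = np.array([shape_indices(p) for p in peaks])
# Standardize each index so that no single one dominates the distance.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)
labels, centers = kmeans(standardized, k=2)
print(labels)
```

On this toy input the two narrow peaks fall in one cluster and the two broad peaks in the other. Real data would also call for a principled choice of indices and of the number of clusters, which this sketch does not address.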

Date
Location
W-257 MSC at University Park; with video to room CG628 at Hershey
Event
Seminar