A Statistical Framework for Measuring Reproducibility and Replicability of High-Throughput Experiments From Multiple Sources

Abstract

Replication is essential to reliable and consistent scientific discovery in high-throughput experiments. Quantifying the replicability of scientific discoveries and identifying sources of irreproducibility have become important tasks for quality control and data integration. In this work, we introduce a novel statistical model to measure the reproducibility and replicability of findings from replicate experiments in multi-source studies. Using a nested copula mixture model that characterizes the interdependence between replicate experiments both across and within sources, our method quantifies the reproducibility and replicability of each candidate simultaneously in a coherent framework. Through simulation studies and applications to an ENCODE ChIP-seq dataset and a SEQC RNA-seq dataset, we demonstrate the effectiveness of our method in diagnosing the source of discordance and improving the reliability of scientific discoveries.

Publication
Statistics in Medicine