The polymerase chain reaction (PCR) is one of the most widely used techniques in molecular biology. In combination with High Throughput Sequencing (HTS), PCR is widely used to quantify transcript abundance for RNA-seq, and in the context of analysis of T and B cell receptor repertoires. In this study, we combine DNA barcoding with HTS to quantify PCR output from individual target molecules. We develop computational tools that simulate both the PCR branching process itself, and the subsequent subsampling which typically occurs during HTS sequencing. We explore the influence of different types of heterogeneity on sequencing output, and compare them to experimental results where the efficiency of amplification is measured by barcodes uniquely identifying each molecule of starting template. Our results demonstrate that the PCR process introduces substantial amplification heterogeneity, independent of primer sequence and bulk experimental conditions. This heterogeneity can be attributed both to inherited differences between different template DNA molecules, and the inherent stochasticity of the PCR process. The results demonstrate that PCR heterogeneity arises even when reaction and substrate conditions are kept as constant as possible, and therefore single molecule barcoding is essential in order to derive reproducible quantitative results from any protocol combining PCR with HTS.

Link to Paper »


Katharine Best, Theres Oakes, James M Heather, John Shawe-Taylor, Benny Chain




Scientific reports, (5).


Complex issues, Complex networks, and Complex Systems