Erik Garrison

Erik Peter Garrison

+44 7460 919767

23 Fitzwilliam Road

Cambridge, Cambridgeshire



Bachelor of Arts in Social Studies, Harvard University. Spring 2006

Spanish language citation. GPA: 3.45 overall. Varsity Crew 2002 to 2005. Undergraduate Fellow, Harvard Institute for Quantitative Social Science. Senior thesis focused on the relationship between social structure and communication technologies. Electives focused on computer science and statistics, including classes in functional programming, theoretical computer science, peer-to-peer networks, and linear algebra.


PhD Student, Wellcome Trust Sanger Institute. Cambridge, UK. October 2014 to present

Student. Focus on computational genomics.

Gabor Marth Lab, Boston College. Boston, MA. February 2010 to September 2014

Research associate. Implemented a general framework for small variant detection from short-read sequencing data (freebayes). Developed and maintained tools to manipulate short-read data and descriptions of genetic variation. Wrote the first haplotype- and graph-based variant detection methods for short-read sequencing data. Major contributor to the 1000 Genomes Project in the areas of variant detection, data integration, and functional interpretation.

The Echo Nest. Somerville, MA. January 2009 to May 2009

Contractor. Designed and implemented control and monitoring systems to manage a compute cluster deployed in the Amazon EC2 cloud.

One Laptop Per Child. Cambridge, MA. May 2008 to January 2009

Software engineer. Focused on operating system build processes, customer support, maintenance, software design planning, and communication among a globally dispersed group of volunteers and educators.

George Church Lab, Harvard Medical School. Boston, MA. August 2006 to April 2008

Contractor. Designed, wrote, and tested data acquisition and system control software for the "Polonator" open-source DNA sequencing device.

National Bureau of Economic Research. Cambridge, MA. May 2006 to May 2007

Research assistant. Wrote software to efficiently process Wikipedia's XML-based data dumps (wikiq), and evaluated metrics of user contribution. Analyzed data related to the internationalization of clinical trials.

Harvard Kennedy School of Government. Cambridge, MA. January 2005 to September 2005

Research assistant. Obtained and processed data for country-level quantitative studies of terrorism and violent extremism.



Haplotype-based variant detection from short-read sequencing. arXiv:1207.3907 (2012).


A map of human genome variation from population-scale sequencing. Nature (2010).

An integrated map of genetic variation from 1,092 human genomes. Nature (2012).

Demographic history and rare allele sharing among human populations. PNAS (2011).

A comprehensive map of mobile element insertion polymorphisms in humans. PLoS Genetics (2011).

BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics (2011).

Integrative annotation of variants from 1092 humans: Application to cancer genomics. Science (2013).

MOSAIK: A hash-based algorithm for accurate next-generation sequencing read mapping. arXiv:1309.1149 (2013).


Haplotype-based variant detection from short-read sequencing.  Biology of Genomes; Cold Spring Harbor, 2012.

Haplotype-based variant detection and interpretation enables the population-scale analysis of multi-nucleotide sequence variants.  American Society of Human Genetics; San Francisco, 2012.

Simultaneous assembly of thousands of human genomes.  Biology of Genomes; Cold Spring Harbor, 2013.

A generalized human reference as a graph of genomic variation.  American Society of Human Genetics; Boston, 2013.

From short reads to genotypes, haplotypes, and frequencies.  Invited talk; Penn State, 2014.

Variant detection using a graph of genomic variation.  Advances in Genome Biology and Technology; Marco Island, Florida, 2014.


Programming Languages

C++, Python, Perl, JavaScript, C, Lisp, R, *nix shell scripting, Ruby.


MySQL, PostgreSQL, key-value storage systems

Web Design


Operating Systems

5 years of experience with Linux (Ubuntu, Debian, Fedora, and Gentoo).


R, Bayesian methods, neural networks, unsupervised classification


Native English. Fluent Spanish and Italian.