Human Evolution and Archaic Hominin Gene Flow
Approximately 2% of the genome of each contemporary non-African human traces its ancestry to ancient gene flow from Neanderthals. Preliminary evidence suggests that functional differences between modern human and Neanderthal-introgressed alleles influence variation in human traits and contribute to disease risk. Yet the mechanistic basis of these functional differences and the fitness consequences of Neanderthal introgression remain poorly understood. My postdoctoral research in the Akey lab aims to reveal the phenotypic legacy of archaic hominin introgression and its potential impacts on the survival and reproductive success of modern humans. See a video about our latest publication on this topic here: https://www.youtube.com/watch?v=xX2iv4SyNHg
During my PhD, I worked with the prenatal genetic testing company Natera to investigate numerical chromosome abnormalities in preimplantation embryonic development. Analyzing genomic data from more than 6,000 in vitro fertilization cycles and more than 40,000 embryos, I documented rates of various forms of aneuploidy and relationships with variables such as parents’ ages and genotypes. This work provided insight into the molecular basis of aneuploidy formation and factors influencing aneuploidy risk. I am currently working with colleagues at Stanford and other researchers worldwide to follow up on these findings. Read more about our research on this topic here: https://www.ashg.org/press/201510-aneuploidy-fertility.html
Through a collaboration with Illumina, I led a project to test the utility of Illumina TruSeq synthetic long-read technology (formerly Moleculo) for de novo genome assembly. To examine the advantages and limitations of the technology for this application, we assembled the genome of Drosophila melanogaster, which allowed us to compare our assembly to the existing high-quality reference. We demonstrated that synthetic long reads enable the accurate reconstruction and placement of highly repetitive transposable elements, which are difficult to assemble with short-read data alone. We also documented key limitations of the technology when attempting to assemble long or nearly identical repeats. Together, these findings can help set expectations for prospective users and identify areas for improvement in future versions of the technology.
The first chapter of my PhD thesis sought to evaluate genomic methods of demographic inference in a tractable natural system with a known demographic history. The checkerspot butterfly Euphydryas gillettii offered an ideal system for this study, as a population was intentionally introduced to an isolated study site at the Rocky Mountain Biological Laboratory in Gothic, Colorado, in 1977. We used RNA-seq to assemble the transcriptome and discover single nucleotide polymorphisms within and between the Colorado population and a proxy ancestral population in the butterfly’s native range. Using these data, we demonstrated that demographic inference methods were sensitive enough to accurately time the establishment of the Colorado population (and the associated population bottleneck). In summary, this work introduced a low-cost method of marker discovery with transcriptomic data and provided an important positive control for demographic inference methods. Photo credit: Carol Boggs.