An Investigation of the Use of Linear Mixed Models Under an Extreme Phenotype Sampling (EPS) Design

dc.contributor.author: Onifade, Maryam Yetunde
dc.contributor.supervisor: Burkett, Kelly
dc.contributor.supervisor: Sankoff, David
dc.date.accessioned: 2024-09-05T19:24:52Z
dc.date.available: 2024-09-05T19:24:52Z
dc.date.issued: 2024-09-05
dc.description.abstract: Mixed models have been used in genome-wide association studies to correct for confounding by population stratification and other forms of hidden relatedness. This class of models includes linear mixed models (LMMs) and generalized linear mixed models (GLMMs). This thesis investigates the use and application of LMMs within the context of extreme phenotype sampling (EPS) designs, where genetic covariates are missing for some participants because genotypes are collected only on samples with extreme response variable values. We begin by exploring whether existing mixed model approaches correct for population stratification under an EPS design. These methods have previously been investigated with both continuous and case/control response variables, but not in the context of EPS designs. We assess the performance of three mixed model approaches suitable for binary traits (GMMAT, LEAP and CARAT) and one linear mixed model approach for continuous traits (GEMMA). Our investigation includes an overview of mixed model methodology applicable to binary response variables. We assess type 1 error rates and power using simulation studies with both common and rare variant scenarios. As a practical application of these mixed model techniques, we also compare the methods on a prostate cancer dataset, collected as part of the PROtEUs study conducted in Québec, Canada, that is known to have population substructure. Our simulation results show that, for a common candidate variant, both LEAP and GMMAT had type 1 error rates close to the nominal value and similar power. Similar type 1 error control was observed in the analysis of the PROtEUs dataset. However, for rare variants the false positive rate remained inflated even after correction with mixed model approaches.
Next, we present an Expectation Maximization (EM) algorithm for fitting linear mixed models with missing genetic covariates, motivated by EPS designs. We use the method of weights, adapted for linear mixed models, to handle the missing genotypes. We derive two hypothesis tests for genetic association: a likelihood ratio test using importance sampling and a Monte Carlo based Wald test. Simulation studies were used to estimate the type 1 error and power of the algorithm. We observed type 1 error rates below the nominal value of 0.05, indicating a conservative test, and low power for all missing data scenarios considered. Moreover, some point estimates appear biased. We applied our algorithm to the PROtEUs dataset; although it was able to correctly estimate most of the model parameters, the genetic effect estimated using the EM approach was larger than the values obtained by other approaches. The false positive rate also seemed inflated based on the p-value distribution across 5000 genetic markers. More investigation is needed to ensure the EM-based procedure is a valid approach to handling missing genotype data, particularly from an EPS study.
dc.identifier.uri: http://hdl.handle.net/10393/46536
dc.identifier.uri: https://doi.org/10.20381/ruor-30537
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Extreme Phenotype Sampling (EPS)
dc.subject: Linear Mixed Model (LMM)
dc.subject: Expectation maximization (EM) algorithm
dc.subject: Population Stratification
dc.subject: False positive rate
dc.title: An Investigation of the Use of Linear Mixed Models Under an Extreme Phenotype Sampling (EPS) Design
dc.type: Thesis (en)
thesis.degree.discipline: Sciences / Science
thesis.degree.level: Doctoral
thesis.degree.name: PhD
uottawa.department: Mathématiques et statistique / Mathematics and Statistics

Files

Original bundle

Name: Onifade_Maryam_Yetunde_2024_thesis.pdf
Size: 1.51 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission