Model-based Analysis Methods in Statistical Genomics
Qiuling He
Under the supervision of Professor Michael A. Newton
At the University of Wisconsin-Madison

This thesis aims to solve two problems in statistical genomics: (1) how to model agreement among genome-wide RNA interference (RNAi) studies; and (2) how to integrate experimentally derived genomic data with functional annotations. The problems are distinct in their specific elements but share two important features: (1) solutions could have significant implications for the practice of statistical genomics, and (2) our approaches to solving them use common model-based tools and techniques.

The RNAi analysis concerns four recent genome-wide studies of influenza virus replication. All four studies identified genes whose inactivation alters a cell's ability to produce virus, and all had a similar experimental design. In total, 614 human genes were confirmed to have an effect on viral replication; however, there was very limited agreement among the studies. For instance, only one gene was confirmed by all four studies. The apparent lack of agreement raises questions about the rates of false positives and false negatives in genome-wide RNAi. We develop a generative sampling model to describe the RNAi data, and we use this model with likelihood methods to assess the relative magnitude of false positive and false negative effects. The model accommodates many aspects of RNAi, but it is sufficiently simple that closed-form inference summaries are available. Evidence points to a relatively high false negative rate.

In the second part of the thesis, we investigate the problem of genomic data integration, specifically, the problem of integrating experimentally derived data with data on the known functional profiles of the annotated genes.
Such functional category analysis is important for data reduction and for weak-signal identification, though state-of-the-art methodology does not adequately handle the complexity of growing systems of functional categories. We show that a leading model-based empirical Bayesian approach suffers from inconsistency and inefficiency, and we propose a new approach to correct these problems.