Dissertation

Identifying Feature, Parameter, and Sample Subsets in Machine Learning and Image Analysis

Author: Mehta, Ronak
Date: 2023
Publisher: Madison, Wis.: University of Wisconsin--Madison
Summary: Modern machine learning has proven to be extremely effective in aiding and automating a large number of tasks, beginning with simple image recognition and now ranging widely from full language translation and understanding to computer-aided medical diagnosis and drug discovery. These advances have largely been enabled by significant developments in computation schemes and algorithms that have come alongside exponential increases in the scale of available training data. Success in these cases is measured directly by performance, evaluating some measure of error or accuracy that proves to be competitive with even domain-level experts. Recent research within the field has thus moved to orthogonal but equally important questions involving robustness to data distribution shifts, model fairness, interpretability of model outputs, and explainability of model predictions. The analysis of these varied questions typically looks at the learning formulation with a finer-toothed comb, identifying individual elements or groups of elements of interest. These ideas all fall under a similar fundamental problem of finding subsets, whether they be groups within the population, features of the input, or sections of the model. Because these problems are uniquely shaped by their particular machine learning instantiations, we need methods for identifying (1) subsets of training samples or subpopulations that have disparate outcomes for a given model, (2) feature subsets that are sufficient for or uniquely explain a particular model prediction, and (3) important parameter subsets or paths through neural network models that explicitly describe how a model output was generated. In this dissertation, we will describe a number of methods for addressing these subset selection problems.
Developing mathematical tools based primarily on aspects of differential geometry and conditional independence, we will demonstrate theoretical and empirical effectiveness of these methods on a wide breadth of problems that can be distilled in the manner above, including hypothesis testing with medical imaging, predicting disease progression, machine unlearning, and increasing model fairness.

Content Type

UW-Madison Dissertations and Theses
