Books

Interactive machine learning: from theory to scale

Author / Creator
Zhu, Yinglun, 1994- author
Available as
Online
Physical
Summary

While machine learning has achieved unprecedented success in many real-world scenarios, most learning approaches require huge amounts of training data. This requirement poses real challenges to practitioners: data annotation, for example, can be expensive and time-consuming. To overcome these challenges, this dissertation studies interactive machine learning, where learning is conducted in a closed-loop manner: the learner uses previously collected information to guide future decisions (e.g., which data points to label next), which in turn improve subsequent predictions. The dissertation focuses on developing novel algorithmic principles and uncovering fundamental limits when scaling interactive machine learning to large, real-world settings. More specifically, we study interactive machine learning with (i) noisy data and rich model classes, (ii) large action spaces, and (iii) model selection requirements; the dissertation is accordingly organized into three parts. To bring the promise of interactive learning into the real world, we develop novel human-in-the-loop learning algorithms and systems that achieve both statistical and computational efficiency.

In the first part, we study active machine learning with noisy data and rich model classes. Although active learning has seen great success, technical difficulties mean that most guarantees are developed (i) under low-noise assumptions and (ii) for simple model classes. We develop efficient algorithms that bypass these two fundamental barriers and thus take an essential step toward real-world applications of active learning. More specifically, by leveraging the power of abstention, we develop the first efficient, general-purpose active learning algorithm that achieves exponential label savings without any low-noise assumptions (the abstention mechanism is sketched below). We also develop the first deep active learning algorithms (i.e., active learning with neural networks) that achieve exponential label savings when equipped with an abstention option.

In the second part, we study decision making with large action spaces. While researchers have explored decision making when the number of alternatives (e.g., actions) is small, guarantees for decision making in large, continuous action spaces have remained elusive, leaving a significant gap between theory and practice. We bridge this gap by developing the first efficient, general-purpose contextual bandit algorithms for large action spaces, in both structured and unstructured cases. Our algorithms rely only on standard computational oracles, achieve nearly optimal guarantees, and have runtime and memory requirements independent of the size of the action space (a finite-action version of this oracle-based pattern is sketched below). They are also highly practical, achieving state-of-the-art performance on an Amazon dataset with nearly 3 million categories.
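To make the first part's abstention option concrete, here is a minimal sketch of pool-based active learning with a Chow-style abstention rule. Everything in it is an illustrative stand-in: the synthetic pool, the uncertainty-sampling query rule, and the ABSTAIN_COST threshold are generic textbook devices, not the dissertation's algorithms.

    # Pool-based active learning with an abstention option (illustrative sketch).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Synthetic pool: two overlapping (noisy) Gaussian classes; labels are
    # treated as hidden until queried.
    X_pool = np.vstack([rng.normal(-1.0, 1.0, (500, 2)),
                        rng.normal(+1.0, 1.0, (500, 2))])
    y_pool = np.array([0] * 500 + [1] * 500)

    labeled = [0, 1, 2, 3, 4, 500, 501, 502, 503, 504]  # small seed set
    ABSTAIN_COST = 0.3  # assumed price of answering "I don't know"

    model = LogisticRegression()
    for _ in range(20):  # closed loop: fit, pick a point, query its label
        model.fit(X_pool[labeled], y_pool[labeled])
        proba = model.predict_proba(X_pool)[:, 1]
        margin = np.abs(proba - 0.5)       # small margin = high uncertainty
        queried = set(labeled)
        for i in np.argsort(margin):       # most uncertain first
            if int(i) not in queried:
                labeled.append(int(i))
                break

    # Chow's rule: abstain whenever the conditional error probability of
    # committing to a prediction exceeds the abstention cost.
    proba = model.predict_proba(X_pool)[:, 1]
    risk = np.minimum(proba, 1.0 - proba)
    preds = np.where(risk > ABSTAIN_COST, -1, (proba > 0.5).astype(int))
    print(f"queried {len(labeled)} labels; abstained on "
          f"{np.mean(preds == -1):.1%} of the pool")

The abstention rule predicts only where the model is confident and defers everywhere else; the dissertation's message is that this option is what lets active learners achieve exponential label savings without low-noise assumptions.

For the second part, here is a similarly minimal finite-action contextual bandit built on an online regression oracle with inverse-gap weighting. The linear reward model, per-action least-squares oracle, and gamma schedule are assumptions made for the sketch; the dissertation's contribution is extending this style of oracle-based algorithm to large and continuous action spaces.

    # Oracle-based contextual bandit with inverse-gap weighting (illustrative).
    import numpy as np

    rng = np.random.default_rng(1)
    K, d, T = 20, 5, 2000                  # actions, context dimension, rounds
    theta = rng.normal(size=(K, d))        # hidden per-action reward vectors

    # Online regression oracle: one ridge regression per action.
    A = np.stack([np.eye(d) for _ in range(K)])  # regularized Gram matrices
    b = np.zeros((K, d))

    gamma = 10.0 * np.sqrt(K * T)          # exploration rate (assumed tuning)
    total_reward = 0.0
    for t in range(T):
        x = rng.normal(size=d)
        theta_hat = np.linalg.solve(A, b[:, :, None])[:, :, 0]
        y_hat = theta_hat @ x              # oracle's predicted rewards
        best = int(np.argmax(y_hat))
        # Inverse-gap weighting: the further an action's prediction lags
        # behind the leader, the less it is explored.
        p = 1.0 / (K + gamma * (y_hat[best] - y_hat))
        p[best] = 0.0
        p[best] = 1.0 - p.sum()
        a = int(rng.choice(K, p=p))
        reward = theta[a] @ x + 0.1 * rng.normal()
        A[a] += np.outer(x, x)             # incremental oracle update
        b[a] += reward * x
        total_reward += reward

    print(f"average reward over {T} rounds: {total_reward / T:.3f}")

Note that the loop touches the action set only through predictions, an argmax, and per-action updates; making the runtime genuinely independent of the action-space size, as claimed in the summary, requires the structured machinery the dissertation develops rather than this explicit enumeration over K.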
In the third part, we study model selection in decision making. Model selection is a fundamental task in supervised learning, but it faces unique challenges when deployed in decision making: decisions are made online and only partial feedback is observed. Focusing on the regret minimization setting, we establish fundamental lower bounds showing that model selection in decision making is strictly harder than model selection in standard supervised learning: whereas supervised learning suffers only an additional logarithmic cost, decision making requires an additional polynomial cost. Nevertheless, we develop Pareto optimal algorithms that achieve matching guarantees (up to logarithmic factors). Focusing on the best action identification setting, we develop novel algorithms and show that model selection in best action identification can be achieved without much additional cost.
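The logarithmic-versus-polynomial contrast above can be written schematically. The supervised side is the classical aggregation bound over M candidate model classes; the decision-making side paraphrases the lower-bound message of this part. Exact exponents and constants depend on the setting, so this is an illustrative rendering rather than the dissertation's precise statements.

    % Schematic comparison (illustrative; exact rates depend on the setting).
    % Supervised learning: selecting among M classes with n samples costs
    % only an additive logarithmic term over the best class,
    \mathcal{E}(\hat f) \;\le\; \min_{1 \le i \le M} \mathcal{E}_i
        + O\!\left(\sqrt{\frac{\log M}{n}}\right).
    % Decision making over horizon T: the lower bounds rule out such a
    % purely logarithmic surcharge; on some instances any algorithm pays
    % polynomially more than the best model class i*,
    \mathrm{Reg}_T \;\ge\; \mathrm{Reg}_T^{(i^\star)} + \Omega(T^{c})
        \quad \text{for some constant } c > 0.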
