Videos, Slides, Films

Building and managing training datasets for ML with Snorkel

Available as
Online
Summary

"Alex Ratner outlines work on Snorkel, an open source framework for building and managing training datasets, and details three key operators for letting users build and manipulate training datasets: labeling functions for labeling unlabeled data, transformation functions for expressing data augmentation strategies, and slicing functions for partitioning and structuring training datasets. These operators allow domain expert users to specify ML models via noisy operators over training data, leading to applications that can be built in hours or days rather than months or years. Alex explores recent work on modeling the noise and imprecision inherent in these operators and using these approaches to train ML models that solve real-world problems, including a recent state-of-the-art result on the SuperGLUE natural language processing benchmark task. This session is from the 2019 O'Reilly Artificial Intelligence Conference in San Jose, CA."--Resource description page.
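
As an illustration of the labeling-function idea named in the summary, the following is a minimal sketch in plain Python. It does not use the Snorkel library or its API: the function names, the spam/ham task, and the heuristics are all hypothetical, and the simple majority vote is a stand-in for the noise modeling the talk describes, not the method presented.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam-detection task.
ABSTAIN, HAM, SPAM = -1, 0, 1

# Each labeling function is a noisy heuristic: it votes for a label or abstains.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN

def lf_short_message(text):
    return HAM if len(text.split()) < 5 else ABSTAIN

def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_mentions_prize]

def majority_vote(text):
    """Combine the noisy votes; abstain when there are no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]
```

Calling majority_vote on each unlabeled example yields a noisy training label, or ABSTAIN when the heuristics are silent or disagree evenly.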
