By Peter Bailis - January 28, 2019
Achieving high accuracy in machine learning often requires extremely large amounts of training data – but just how much training data is required? Recent research highlights a surprising trend: training data provides diminishing marginal returns. That is, the final few percentage points of accuracy gains require orders of magnitude more training data than the first 95%.
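To make the trade-off concrete, here is a minimal sketch in Python. It assumes test error falls off as a power law in dataset size, error(n) = c · n^(−α), a shape reported in empirical scaling studies; the constants c and α below are illustrative, not fit to any real benchmark.

```python
# Hypothetical illustration of diminishing returns from training data.
# Assume test error follows a power law in dataset size n:
#   error(n) = c * n**(-alpha)
# The values c = 1.0 and alpha = 0.5 are made up for illustration.

def samples_needed(target_error, c=1.0, alpha=0.5):
    """Invert error(n) = c * n**(-alpha) to find the required dataset size n."""
    return (c / target_error) ** (1.0 / alpha)

# Going from 95% to 99% accuracy means cutting error from 5% to 1%:
n_95 = samples_needed(0.05)   # dataset size for 95% accuracy
n_99 = samples_needed(0.01)   # dataset size for 99% accuracy
print(f"{n_95:.0f} samples for 95%, {n_99:.0f} for 99% "
      f"-> {n_99 / n_95:.0f}x more data")
```

Under these toy constants, shaving error from 5% to 1% requires 25x more data; steeper exponents make the gap even larger, which is the sense in which the last few accuracy points dominate the data bill.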
Just as Moore’s Law helped define the economics of microprocessor R&D, this exponential trade-off between dataset size and accuracy has major implications for bringing ML to scale. Specifically, under this trade-off, we can coarsely classify applications of ML into two categories:

- Bespoke ML: applications where the last few percentage points of accuracy matter enough to justify massive training sets and custom-built models.
- Off-the-shelf ML: applications that can tolerate the good-but-imperfect accuracy of commodity, pretrained models.
It’s clear from recent history that bespoke ML efforts can work well, especially when staffed by large, specialized teams of ML PhDs, data engineers, and human annotators. In contrast, off-the-shelf ML is far more accessible, but it’s unclear how to leverage these models to make useful business decisions. Attempting to fully automate existing workflows is unlikely to work well, except for the lowest-value use cases. Instead, using off-the-shelf ML in processes that leverage and augment human abilities has the most potential for widespread adoption. At Sisu, we believe delivering value with off-the-shelf ML represents an opportunity at the scale of Microsoft Office, and an entirely new set of tools is needed to fill the gap.