By Grant Shirk - May 28, 2019
At TiEcon 2019, the world’s largest entrepreneurial conference, Sisu CEO and founder Peter Bailis gave a keynote address titled “Whatever happened to democratizing ML?” In that talk, Peter challenged the conventional wisdom about enterprise ML and described three inconvenient truths.
There’s no doubt that we’re in a golden age of AI. Over the last several months, we’ve seen incredible advances in applying artificial intelligence techniques to image recognition, language processing, planning, and information retrieval. We’re seeing practical applications of machine learning improving everyday activities. There are more amusing applications too, including one team teaching AI how to craft puns.
However, particularly in the world of business, it feels like we’re “not quite there yet” when it comes to finding meaningful enterprise ML and AI applications. There’s a growing sentiment that the solutions on the market today are too bespoke, require extensive consulting investment, and risk never showing a positive ROI.
At Sisu, we believe there are three inconvenient truths about enterprise ML that are at the root of this challenge. The good news is that each of these challenges is surmountable with the right focus.
One of the most valuable investments ever made in training data is the ImageNet project, a publicly available set of more than 14 million categorized and labeled images. Thanks to this investment from Fei-Fei Li and the ImageNet team, researchers and deep learning enthusiasts have been able to dramatically improve image classification accuracy.
However, gathering this kind of labeled data at scale can be very demanding. Particularly for tasks involving sensitive data or limited domain expertise, data is difficult or even impossible to come by. For example, collecting and labeling DICOM medical image scans is challenging for privacy reasons, and it’s even harder to find experts who can credibly identify and label tumors, tears, and abnormalities. These are valuable tasks, but it’s an open question whether it’s feasible to gather enough data to train on effectively.
What’s more, deep networks don’t do much to improve model accuracy on structured data. This is particularly relevant for businesses, as most enterprise information is structured, tabular data. A great example of this in practice is a recent paper from Google on “Scalable and accurate deep learning with electronic health records.” The paper shows some dramatic results for prediction accuracy in healthcare outcomes, but it also shows that simpler approaches like logistic regression perform almost as well.
In other words, we’re not quite at the point where the investment required to train a deep net on structured data delivers a significant ROI above and beyond other techniques.
AutoML has recently been touted as a major advance for enterprise ML. While automating key steps of the data science process can increase the pace of model creation, this automation is not a panacea for enterprise ML. There’s still a long way to go before AutoML models reach the level of accuracy needed for real-world success.
So what can we do in response? By taking each of these truths in turn, it is possible to identify a few key principles that can accelerate the adoption and effectiveness of machine learning in the enterprise.
We’re at the point where the luster of this golden age of AI is fading, but with the right investments, it’s possible to avoid widespread disillusionment with the technology. There are valuable, practical, and feasible applications for enterprise ML. It’s why we started Sisu, and we’re excited about the possibility of “off the shelf machine learning” and about making new tools accessible to more people in the organization.