Composable Machine Learning

Even as machine learning (ML) algorithms become more sophisticated and powerful, the way ML teams build ML systems hasn’t changed much. In this article, we’ll explain the need for composable machine learning systems.
First, take a look at the old, inefficient way. Once the team figures out the task (e.g. classify data, detect regions of interest, forecast the future, or find anomalies) and the type of data involved (tabular, time-series, graph, image, natural text, or other), the usual next step is either to design and program a mathematical model and computational algorithm from scratch, or to search for and download code for a matching pre-built model or algorithm. The team then prepares the data pipeline, trains or fine-tunes the model using the algorithm, validates, iterates, deploys, serves the final solution, and upgrades to a better model when one becomes available.
But what happens when the task is highly complex, such as controlling an industrial manufacturing process (for example cement production, steel production, or oil and gas extraction) in real time to improve its yield and reduce its power and raw material consumption? Or writing an x-ray medical report on behalf of a doctor? Or building a cognitive chatbot that answers questions on behalf of customer support? Or when the task needs two or three data sources, such as radiological images plus doctors’ notes? Chances are there won’t be a ready-to-download model that fits these requirements, so the team needs to build its own unique and substantially different model. Realistically, the great majority of ML teams find themselves limited to using downloadable models, and thus unable to innovate. Some teams might have the skills and research training to go further and produce a hand-crafted, one-off solution, only to find that it cannot be scaled into production.
To break this status quo, ML systems need to become composable, so that ML teams (not necessarily made up of advanced experts) can build applications for a richer spread of AI tasks and take them into scalable production. Composable ML is not the same as ML research, which can be thought of as creating new model and algorithm building blocks. Instead, composable ML makes it simpler to design ML models by allowing non-researchers to put existing building blocks together, or to recombine them to solve new tasks, much as electricians build or install circuits at many different houses from standard components.
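As a rough illustration of what "putting existing building blocks together" can look like, here is a minimal sketch in plain Python. It is not Texar's API: every name in it (image_encoder, text_encoder, concat_fusion, threshold_head, summary_head, ComposedModel) is hypothetical, and the "models" are toy functions standing in for real pre-trained components. The point is only that a two-source task can be assembled from interchangeable parts, and that one part can be swapped without touching the others.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Sequence

Vector = List[float]


# --- Hypothetical building blocks (toy stand-ins for real models) ---

def image_encoder(pixels: Sequence[float]) -> Vector:
    """Toy image encoder: summarizes pixels by their mean and max."""
    return [sum(pixels) / len(pixels), max(pixels)]


def text_encoder(text: str) -> Vector:
    """Toy text encoder: word count and average word length."""
    words = text.split()
    return [float(len(words)), sum(len(w) for w in words) / len(words)]


def concat_fusion(features: Sequence[Vector]) -> Vector:
    """Fuses several feature vectors by concatenating them."""
    return [x for vector in features for x in vector]


def threshold_head(features: Vector) -> str:
    """Toy decision head: flags the input if the feature sum exceeds 10."""
    return "flag" if sum(features) > 10.0 else "ok"


def summary_head(features: Vector) -> str:
    """Alternative head: describes the fused features instead of classifying."""
    return f"{len(features)} features, total {sum(features):.1f}"


# --- The composition itself ---

@dataclass
class ComposedModel:
    """Wires named encoders, a fusion step, and an output head together."""
    encoders: Dict[str, Callable[..., Vector]]
    fusion: Callable[[Sequence[Vector]], Vector]
    head: Callable[[Vector], str]

    def __call__(self, inputs: Dict[str, object]) -> str:
        # Encode each data source with the block registered under its name.
        features = [enc(inputs[name]) for name, enc in self.encoders.items()]
        return self.head(self.fusion(features))


# Compose a two-source model (image plus notes) from existing blocks.
classifier = ComposedModel(
    encoders={"image": image_encoder, "notes": text_encoder},
    fusion=concat_fusion,
    head=threshold_head,
)

# Recombine: same encoders and fusion, different head, new task.
describer = replace(classifier, head=summary_head)

example = {"image": [0.0, 0.5, 1.0], "notes": "no acute findings"}
print(classifier(example))  # ok
print(describer(example))   # 4 features, total 9.5
```

In a real system each block would be a trained neural module rather than a hand-written function, but the design choice is the same: blocks agree on a small interface (here, "return a feature vector"), so a team can recombine them for a new task instead of writing a new model from scratch.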

At Petuum, we use Texar to create sophisticated and original ML systems for medical report writing from chest x-ray images and multi-lingual (English, Chinese and Japanese) cognitive chatbots for retail in-store assistance and call center support, as well as to reproduce and extend recent models from the research community such as BERT. We provide the Texar-created ML compositions with scalable ML infrastructure support that handles raw data ingestion and pre-processing, distributed model parameter training, elastic resource management, model versioning, training and inference serving, and containerized program management. We believe that all these engineering elements, beyond just ML models and algorithms, are needed to create non-trivial ML systems that generate enormous customer value without burdening customers with complex, unscalable data and ML maintenance requirements. Petuum offers Texar as open source under a friendly license, and we hope that it can benefit other ML teams searching for a sustainable way to produce the next generation of AI applications.
Eric P. Xing




