Simplifying and automating machine learning processes and techniques, which depend on large-scale, distributed datasets to achieve high statistical performance, is critical for the future of applied data science.
TUPAQ is a new architecture for automating machine learning. It consists of a cost-based cluster resource allocation estimator, advanced hyperparameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation.
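To make the idea of bandit resource allocation more concrete, the following is a minimal sketch of an elimination-style search over hyperparameters. It is not TUPAQ's implementation: the toy objective, the names `train_step`, `loss`, and `bandit_search`, and the `slack` elimination rule are all assumptions chosen for illustration. The sketch trains every candidate for a few steps, inspects its current loss, and stops spending compute on candidates that fall too far behind the best one.

```python
def train_step(w, lr):
    """One gradient-descent step on the toy objective f(w) = (w - 3)^2."""
    grad = 2.0 * (w - 3.0)
    return w - lr * grad


def loss(w):
    """Loss of the toy model; lower is better."""
    return (w - 3.0) ** 2


def bandit_search(learning_rates, rounds=5, steps_per_round=4, slack=0.5):
    """Train candidates in rounds and drop those lagging behind the best.

    `slack` is a hypothetical tolerance: a candidate survives a round only
    if its loss is within (1 + slack) times the best loss seen that round.
    Returns (best learning rate, its loss, total training steps spent).
    """
    # Every candidate starts from the same initial weight.
    active = {lr: 0.0 for lr in learning_rates}
    total_steps = 0
    for _ in range(rounds):
        # Give each surviving candidate a small, equal training budget.
        for lr in list(active):
            w = active[lr]
            for _ in range(steps_per_round):
                w = train_step(w, lr)
                total_steps += 1
            active[lr] = w
        # Inspect intermediate quality and eliminate weak candidates.
        best = min(loss(w) for w in active.values())
        threshold = best * (1.0 + slack) + 1e-12
        active = {lr: w for lr, w in active.items() if loss(w) <= threshold}
    best_lr = min(active, key=lambda lr: loss(active[lr]))
    return best_lr, loss(active[best_lr]), total_steps


if __name__ == "__main__":
    candidates = [0.001, 0.01, 0.1, 0.5, 1.2]
    best_lr, best_loss, spent = bandit_search(candidates)
    exhaustive = len(candidates) * 5 * 4  # cost of training every candidate fully
    print(f"best lr={best_lr}, loss={best_loss}, steps={spent} of {exhaustive}")
```

The point of the design is that the search spends fewer training steps than training every candidate to completion, because poor configurations are abandoned after an early look at their progress. A real system would replace the toy objective with actual model training and would need a more careful elimination rule.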
TUPAQ finds and trains models for a user’s predictive application and scales to models trained on terabytes of data across hundreds of machines.
Looking ahead, innovative tools will be needed to simplify and automate machine learning and other data science processes and techniques, so that data scientists can spend less time on administration and more time on high-value solutions to complex problems.