DSC Weekly Digest 06 July 2021

  • Kurt Cagle 
Machine Learning Pipelines


The Future of Machine Learning

Before launching into my editorial this week, I wanted to make the announcement that starting with this issue, the DSC Newsletter will be sent out on Tuesdays rather than Mondays. To subscribe to the DSC Newsletter, go to Data Science Central and become a member today. It’s free! 

There has long been a pattern with computer technology. At a certain point in the evolution of a technology, there is a realization that things that had been done repeatedly as one-offs occur often enough to justify building libraries, or even extensions to languages. For a while, mastery of these libraries defines a certain subset of programmers or analysts, and most of the innovation takes place as improvements to these libraries, articles on technical sites or journals, and so forth.

Eventually, however, the capabilities are abstracted into more sophisticated stand-alone applications, frequently with user interfaces that provide ways to handle the most frequent use cases while relegating the edge cases to specialized screens. The results of these, in turn, are wrapped within some kind of container (such as Kubernetes) that can then be incorporated into a pipeline with other similar containers.

This is the direction that machine learning is going. MLOps now complements DevOps, managing changes to machine learning models across the whole lifecycle: from the data engineering necessary to ensure that the source data is ready for production, to feature engineering that can be altered on the fly to try out different scenarios, through to presentation and productization that not only makes the results understandable to a business audience but also feeds them into other operational channels. This capability is here now, and will likely become commonplace across the industry within the next couple of years.

This transformation is critical for several reasons. First, it makes it far easier to create ensemble models: models that are developed and work in parallel, and that can handle different starting scenarios. This is key because the more generalized a model has to be, the more expensive, time-consuming, and complex it turns out to be, and the less likely it is to handle edge cases accurately. This is especially important when dealing with sparse datasets, where the danger is that a single, comprehensive model can badly overfit the input, making it very brittle to initial conditions.
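As a rough illustration of the ensemble idea, the following is a minimal sketch in Python, not a production design. It assumes each starting scenario gets its own small model (here a least-squares line), and that an unfamiliar scenario falls back to the average of all the models. The scenario names, the toy data, and the helper names `fit_line` and `build_ensemble` are hypothetical and chosen only for the example.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]        # one (x, y) training pair
Model = Callable[[float], float]   # maps x to a predicted y


def fit_line(points: List[Point]) -> Model:
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    # With no spread in x there is no slope to estimate; predict the mean of y.
    slope = 0.0 if var_x == 0 else cov_xy / var_x
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def build_ensemble(
    data_by_scenario: Dict[str, List[Point]]
) -> Callable[[str, float], float]:
    """Train one small model per scenario and return a combined predictor."""
    models = {name: fit_line(pts) for name, pts in data_by_scenario.items()}

    def predict(scenario: str, x: float) -> float:
        # Known scenario: use the model that was trained for it.
        if scenario in models:
            return models[scenario](x)
        # Unknown scenario: average the predictions of all the models.
        return mean(m(x) for m in models.values())

    return predict


# Hypothetical toy data: two scenarios with different underlying behavior.
data = {
    "weekday": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "weekend": [(1.0, 5.0), (2.0, 6.0), (3.0, 7.0)],
}
predict = build_ensemble(data)
print(predict("weekday", 5.0), predict("weekend", 5.0), predict("holiday", 5.0))
```

Each scenario-specific model stays simple and cheap to retrain, which is the property that a pipeline of small, independently deployed components is meant to preserve.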

In addition, by reducing the time needed to implement models from months to weeks, or even days, organizations are able to productize their data analytics in ways that would have been unheard of even a couple of years ago. As not all problems can (or should) be solved with machine learning in the first place, the ability to take advantage of more generalized DevOps pipelines within your organization puts machine learning right where it belongs: as a powerful tool among many, rather than a single, potentially shaky foundation on its own.

For machine learning and data science specialists, this has other implications as well. Domain proficiency in a given sector will mean more, and the ability to write Python or R will mean less, save for those who focus more specifically on tool-building within integrated frameworks. However, demand for people with a good understanding of data operations in general and machine learning operations in particular, all of which are engineering tasks, will likely increase dramatically over the next few years. Additionally, those who are better at productizing data, integrating ML streams with other streams towards the creation of digital assets that can then be published as physical assets, will do quite well.

Machine learning is maturing. There’s nothing wrong with that.

In medias res,

Kurt Cagle
Community Editor,
Data Science Central


Data Science Central Editorial Calendar

DSC is looking for editorial content specifically in these areas for July, with these topics having higher priority than other incoming articles.

  • MLOps and DataOps
  • Machine Learning and IoT
  • Data Modeling and Graphs
  • AI-Enabled Hardware (GPUs and similar tools)
  • JavaScript and AI
  • GANs and Simulations
  • ML in Weather Forecasting
  • UI, UX and AI
  • Jupyter Notebooks
  • No-Code Development

This email, and all related content, is published by Data Science Central, a division of TechTarget, Inc.