As part of the research underpinning Developer Economics we actively monitor industry trends and opportunities, looking for new areas of significant developer interest. In our Developer Economics survey, we investigated trends in Data Science and Machine Learning, among other areas of emerging tech, the latter probably being the least hyped emerging tech space with the most developer activity.
Making sense of data
A side effect of there now being a 1990s-level supercomputer in 2-3 billion pockets worldwide is that we’re drowning in data. All of the data collected in human history, up to the turn of the millennium, is certainly less than we now generate every day. The Internet of Things is adding sensors to anything and everything, which will compound this problem. Some parts of the industry like to talk about Big Data. The term has been abused close to meaninglessness, but at a minimum we’re talking about too much data to process easily on a single PC, and definitely too much to load up in a spreadsheet and make a few charts. The thing is, data doesn’t have to get this big before it’s too complex for a human to “eyeball” it and spot patterns. This is where data scientists come in. A Data Scientist’s role is to extract useful information from a sea of apparently incomprehensible data. There is a wide array of statistical tools for doing this in a guided way, but when the dataset is very large, complex, or analysed in real time, a popular approach is to have the computer analyse the data and figure out what’s important. To achieve this the machine needs to learn about the structure of the data.
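To make "learning the structure of the data" a little more concrete, here is a minimal sketch of one classic unsupervised technique, k-means clustering, written in plain Python. It is only an illustration and not something taken from the survey: the function name `kmeans_1d`, the seeding strategy and the sample `readings` are all assumptions made up for this example.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters and return the cluster centres.

    Hypothetical helper for illustration only; real projects would
    normally use an established library instead.
    """
    ordered = sorted(values)
    # Assumption: seed the centres with evenly spaced samples of the sorted data.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centres = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: attach every value to its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)

        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break  # nothing moved, so the grouping is stable
        centres = new_centres

    return centres


# Made-up sensor readings that happen to fall into three groups.
readings = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.1, 8.9, 9.0]
centres = kmeans_1d(readings, k=3)
```

Nobody tells the program where the groups are; it finds three centres close to 1, 5 and 9 purely from the numbers themselves. That is the sense in which the machine "learns" structure, although real datasets have many more dimensions and far messier boundaries.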
The idea of Machine Learning is an old one. Some of the early pioneers of computing believed that self-modifying systems that learn from real-world data are the best possibility for producing artificial intelligence (AI). They didn’t have the hardware or software tools to build such systems but their intuition was good. The first artificial neural networks were implemented in the 1950s. However, by the end of the 1960s it became clear that the computing power available at the time was insufficient to train more complex networks capable of solving non-trivial problems. Focus shifted to a range of other techniques, many of which have become mature and useful tools. Not everyone gave up on neural network approaches, and eventual improvements in training algorithms and ongoing increases in computing power created breakthroughs. A set of techniques for structuring and training many-layered neural networks, collectively known as Deep Learning, started reaching record levels of performance in a wide range of problems. Machine Learning tools are now at the forefront of AI research again, as well as being indispensable to data scientists. The trick is that both the data scientists and the AI systems are using Machine Learning tools to find structure and patterns in complex data. The output from the Machine Learning systems used by Data Scientists is being used to support human decision making at many of the largest organisations in the world. The output from Machine Learning subsystems within an AI system is used to drive autonomous behaviour.
The really headline-grabbing breakthroughs in AI have come in the last year, so it’s perhaps not surprising that Machine Learning has a lot of developers newly interested and exploring via a side project. Data Science and Machine Learning in general have been helping companies do more with their data for several years now, so the fact that 41% of the developers in our latest survey are involved in some way makes sense too. 33% of them are professionally involved, and they show more of a bias towards enterprise and internal audiences. If we look at the hobby and side project activity, it’s clear that a lot of developers are just learning the ropes. 41% are not sure what audience they’re targeting, typically suggesting a pure technology exploration. Across the whole sector, open source toolkits in Python and R are extremely popular, whilst there are smaller but clear early signs of interest in internet giant-backed offerings like Google’s TensorFlow.
Skills for the future
Although there are already important and highly profitable applications for Data Science and Machine Learning, it’s clear that they have much bigger roles to play in the future. The global developer population is busy levelling up their skills to co-create that future, or at least be ready to benefit when it arrives. The huge interest in Machine Learning, when compared to the relatively tiny activity levels in messaging bots, is perhaps in recognition of its significance. Machine Learning is likely to be a central part of most of the really important systems created in the future. It’ll be the magic behind many amazing new products and the automating force that comes to replace many white-collar workers.