This article was written by Bob Hayes.
Data science requires the effective application of skills across a variety of machine learning areas and techniques. A recent survey by Kaggle, however, revealed that relatively few data professionals consider themselves competent in advanced machine learning skills. About half of data professionals said they were competent in supervised machine learning (49%) and logistic regression (53%). Deep learning techniques had some of the lowest competency rates: Neural Networks – GANs (7%), Neural Networks – RNNs (15%) and Neural Networks – CNNs (26%).
A majority of enterprises (80%) have some form of artificial intelligence (machine learning, deep learning) in production today. Additionally, about a third of enterprises plan to expand their AI efforts over the next 36 months. But who will lead these data science projects? Who will do the work? Some researchers suggest there is a shortage of the AI talent needed to fill those roles. Tencent estimates there are only 300,000 AI researchers and practitioners worldwide. ElementAI estimates there are 22,000 PhD-level researchers working in AI.
Kaggle conducted a survey of over 16,000 data professionals in August 2017 (2017 State of Data Science and Machine Learning). The survey asked respondents about their competence across a variety of AI-related approaches and techniques. Breaking the results down by skill gives a more detailed picture of which specific AI skills are driving this talent gap.
Competency in Machine Learning Areas
All respondents (employed or not) were given a list of 13 machine learning areas and asked to indicate the areas in which they consider themselves competent. The top 10 machine learning areas in which data professionals reported competence were:
- Supervised Machine Learning (49%)
- Unsupervised Learning (26%)
- Time Series (25%)
- Natural Language Processing (19%)
- Outlier Detection (16%)
- Computer Vision (15%)
- Recommendation Engines (14%)
- Survival Analysis (8%)
- Reinforcement Learning (6%)
- Adversarial Learning (4%)
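Because respondents could select every area that applied, each percentage above is the share of respondents who chose that area, so the figures do not need to sum to 100%. The following is a minimal sketch of that tally, not Kaggle's actual analysis code. The function name `competency_rates`, the toy responses, and the choice to keep respondents who selected nothing in the denominator are all assumptions made for illustration.

```python
from collections import Counter


def competency_rates(responses):
    """Return {area: percent of respondents who selected it}, highest first.

    `responses` is a list with one entry per respondent; each entry is the
    list of areas that respondent selected (possibly empty).
    """
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter()
    for selected in responses:
        # Use a set so an area is counted at most once per respondent.
        counts.update(set(selected))
    # Assumption: every respondent counts in the denominator,
    # including those who selected no area at all.
    return {area: round(100 * n / total) for area, n in counts.most_common()}


# Hypothetical toy data, not taken from the Kaggle survey.
responses = [
    ["Supervised Machine Learning", "Time Series"],
    ["Supervised Machine Learning"],
    ["Unsupervised Learning", "Supervised Machine Learning"],
    [],
]
rates = competency_rates(responses)
for area, pct in rates.items():
    print(f"- {area} ({pct}%)")
```

Sorting by count and keeping the first ten entries of the result would give a ranked list in the same form as the one above.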
Competency in Machine Learning Techniques
The survey also asked all data professionals, employed or not, about their competency in 13 machine learning techniques ("In which areas of machine learning do you consider yourself competent? (Select all that apply)"). The top 10 machine learning techniques in which data pros reported competence were:
- Logistic Regression (54%)
- Decision Trees – Random Forests (43%)
- Support Vector Machines (32%)
- Decision Trees – Gradient Boosted Machines (31%)
- Bayesian Techniques (27%)
- Neural Networks – CNNs (26%)
- Ensemble Methods (22%)
- Gradient Boosting (17%)
- Neural Networks – RNNs (15%)
- Hidden Markov Models (HMMs) (9%)