
Are data scientists obsolete in the agentic era?


How AI agents are reshaping one of the most prestigious tech roles of the past decade

In a corner office of a Fortune 500 company, a team of data scientists once spent weeks crafting algorithms to predict customer churn. Today, a business user without data science support can build an AI agent that performs the same task in hours. As we enter 2025, we face a pivotal question: Are we witnessing the twilight of the data scientist profession?

According to 365 Data Science, data scientists currently rank among the world’s top-paid and most sought-after professionals, consistently appearing in the top three career choices in terms of compensation. Over the last two decades, the role has evolved significantly from basic analytics to strategic decision-making positions.

The golden age of data science

Between 2014 and 2021, data scientists experienced unprecedented career and salary growth, fueled by the explosion of big data and artificial intelligence. Companies scrambled to hire talent that could transform mountains of information into business intelligence, paying premium salaries for these specialized skills.

This period coincided with the third “golden age” of AI, marked by breakthroughs in neural networks such as AlexNet (2012), GANs (2014), Transformers (2017), GPT-3 (2020), and ChatGPT (2022). Python became the de facto language for data science, supported by libraries like scikit-learn, NumPy, pandas, and PyTorch. Infrastructure evolved with GitHub, Docker, and cloud tools like Azure AI Foundry, Amazon SageMaker, and Databricks.

Yet despite these technological advances, the fundamental five-step data science workflow remained unchanged: data collection, cleaning, model building, post-processing, and deployment.

Data Collection involves gathering relevant data to train AI models. For example, an AI designed to recognize cats requires thousands of cat images.

Data Cleaning ensures data quality by removing duplicates, fixing errors, handling missing values, and standardizing formats. Blurry or mislabelled images, for instance, would be removed from a dataset.

Model Building uses the cleaned data to train AI models to recognize patterns and make predictions. Different models are tested to determine the best fit for the task.

Post-processing evaluates model performance by testing on new data, fine-tuning parameters, and interpreting results to ensure accurate decision-making.

Deployment is the final step that integrates the trained model into real-world applications, enabling continuous data processing and monitoring to maintain performance.

This workflow is iterative. Just as you might adjust a recipe for your favorite beef stew after getting feedback, data scientists refine their models based on how they perform in the real world, ensuring continuous improvement in AI systems. By following this workflow, data scientists have created AI systems that can do remarkable things, from recognizing speech to predicting weather patterns.
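To make these steps concrete, here is a minimal sketch of the workflow in Python with pandas and scikit-learn, in the spirit of the churn example above. The file name, the "churned" target column, and the choice of a random forest are illustrative assumptions (and the features are assumed to be numeric), not a reference implementation.

```python
# Minimal sketch of the five-step workflow for a churn model.
# Assumes a hypothetical "churn.csv" with numeric feature columns
# and a binary "churned" target; names are illustrative only.
import pandas as pd
import joblib
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# 1. Data collection: load the raw records.
df = pd.read_csv("churn.csv")

# 2. Data cleaning: drop duplicates and fill missing numeric values.
df = df.drop_duplicates()
df = df.fillna(df.median(numeric_only=True))

# 3. Model building: train a candidate model on the cleaned data.
X = df.drop(columns=["churned"])
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# 4. Post-processing: evaluate on held-out data and interpret the results.
print(classification_report(y_test, model.predict(X_test)))

# 5. Deployment: persist the model so an application can serve predictions.
joblib.dump(model, "churn_model.joblib")
```

Every one of these steps, from sourcing the CSV to deciding how the saved model is monitored in production, traditionally sits with the data scientist.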

AutoML: The first attempt at disruption, or was it?

The first significant challenge to traditional data science came in the form of Automated Machine Learning (AutoML). This technology aimed to automate the end-to-end machine learning process, with key companies like Google, Amazon, Microsoft, DataRobot, Databricks, and H2O.ai leading development.

As it turned out, AutoML largely failed to revolutionize the field. Why? Because the core five-step data science workflow remained intact. More specifically:

  1. Structured Data Dependency: AutoML worked primarily with structured data, limiting its application.
  2. Data Scientist Control: AutoML was designed by data scientists for data scientists, limiting business user adoption.
  3. Ongoing Maintenance Requirements: AutoML tools still needed continuous monitoring and deployment support.
  4. Augmentation Rather Than Automation: Instead of fully automating processes, AutoML merely assisted data scientists.
  5. Technical Expertise Required: In-depth knowledge of data science remained essential for effective use.
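A short sketch illustrates several of these points. The example below uses H2O's AutoML; the file name and "churned" target column are hypothetical assumptions, and it is a sketch of typical usage rather than a recommended setup. Even with the model search automated, the user must supply structured tabular data, frame the problem, read a leaderboard, and handle deployment, all of which presuppose data science literacy.

```python
# Sketch of a typical AutoML run with H2O (dataset and column names are hypothetical).
import h2o
from h2o.automl import H2OAutoML

h2o.init()

# The user still supplies structured, tabular data and frames the problem.
df = h2o.import_file("churn.csv")
df["churned"] = df["churned"].asfactor()  # mark the target as categorical
train, test = df.split_frame(ratios=[0.8], seed=42)

# AutoML searches candidate models, but the search budget and target are on the user.
aml = H2OAutoML(max_models=10, seed=42)
aml.train(y="churned", training_frame=train)

# Interpreting the leaderboard and deploying the leading model remain manual steps.
print(aml.leaderboard.head())
predictions = aml.leader.predict(test)
```

In other words, AutoML compressed step three while leaving the surrounding workflow, and the expertise it demands, largely untouched.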

Perhaps it’s telling that data scientists themselves controlled AutoML’s development and implementation, creating tools that augmented rather than replaced their roles—an understandable act of self-preservation.

The rise of agentic AI

Today, we stand at the threshold of a new era. Agentic AI represents a fundamental shift from previous automation attempts, potentially succeeding where AutoML failed.

An AI agent combines multiple technologies to think, learn, and act in both real and virtual worlds, representing the natural evolution from Large Language Models (LLMs) to LLM+RAG (Retrieval-Augmented Generation) to AI agents.

What makes AI agents different? They incorporate:

  • Multimodality Augmentation: Processing images alongside text, enabling more complex analysis.
  • Tool Use: Interacting with backend systems and automating actions without human intervention.
  • Memory: Recalling previous interactions and recognizing patterns over time.
  • Reflection: Assessing and improving responses and decision making through iterative feedback.
  • Community Interaction: Collaborating with specialized agents and escalating to humans when necessary.

These capabilities make AI agents the most powerful automation force yet: one that does not merely assist data scientists with the five-step workflow but operates each step autonomously, delivering dramatic savings in time and human resources.
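To ground these capabilities, here is a minimal, framework-free sketch of an agent loop in Python. The llm function is a stub standing in for a call to a real model API, and the single sales_db tool, the prompts, and the canned answers are illustrative assumptions rather than any particular vendor's design.

```python
# Minimal sketch of an agent loop with tool use, memory, and a human handoff.
# The "llm" function is a stub for a real model API; tools and prompts are hypothetical.
from typing import Callable

def query_sales_database(question: str) -> str:
    """Hypothetical backend tool the agent can call."""
    return "Churn last quarter was 4.2%."  # canned answer for the sketch

TOOLS: dict[str, Callable[[str], str]] = {"sales_db": query_sales_database}
memory: list[str] = []  # running record of past steps the agent can recall

def llm(prompt: str) -> str:
    """Stub model: decides whether to call a tool or answer directly."""
    if "sales_db result" not in prompt:
        return "CALL sales_db: What was churn last quarter?"
    return "ANSWER: Churn was 4.2% last quarter; recommend a retention campaign."

def run_agent(task: str, max_steps: int = 5) -> str:
    prompt = f"Task: {task}\nMemory: {memory}"
    for _ in range(max_steps):
        decision = llm(prompt)
        if decision.startswith("CALL "):
            # Tool use: route the request to a backend system without human intervention.
            tool_name, tool_input = decision[5:].split(": ", 1)
            result = TOOLS[tool_name](tool_input)
            memory.append(f"{tool_name} -> {result}")   # memory of the interaction
            prompt += f"\nsales_db result: {result}"     # reflection: revise with new evidence
        else:
            return decision.removeprefix("ANSWER: ")
    return "Escalating to a human reviewer."  # hand off when the agent cannot finish

print(run_agent("Report last quarter's churn and suggest an action."))
```

The point of the sketch is the shape of the loop: decide, act through a tool, remember, reconsider, and escalate when stuck, with no step reserved for a data scientist.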

The new technology stack

The emerging agent ecosystem is built on platforms like Hugging Face, with its 900,000+ models and 200,000+ datasets; frameworks like LangChain, which connects language models with external data sources; and orchestration tools like AutoGen and CrewAI, which facilitate cooperation among multiple agents. These are powered by APIs from OpenAI, Deepseek, Mistral, and Anthropic.

The differences between the old and new stacks are striking:

Aspect | Old Stack | New Stack
Data | Structured | Structured or unstructured
Tooling | Data science-friendly only | Data science-friendly and end-user-friendly
Costs | High upfront costs | Pay-as-you-go pricing models
Models | Bring your own models (BYO) | BYO or third-party models

The new tech stack disrupts the five-step process and represents a fundamental shift in AI development. What were once custom-built, high-maintenance systems requiring expert knowledge have become a flexible, accessible, and automated ecosystem. This means AI agents can be leveraged by a much broader audience, including line-of-business leaders, reducing reliance on specialized data scientists while enabling faster, more scalable AI solutions to business challenges.
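As a small illustration of that accessibility, a few lines of Python against the Hugging Face transformers library apply a pre-trained, third-party model to unstructured text. The review strings below are invented for the example, and in practice model choice, cost, and governance still deserve scrutiny.

```python
# Sketch: unstructured text handled with a pre-trained, third-party model
# via the Hugging Face transformers pipeline (example reviews are made up).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pre-trained model

reviews = [
    "The onboarding was painless and support answered within minutes.",
    "Billing errors two months in a row; I am close to cancelling.",
]

for review, result in zip(reviews, classifier(reviews)):
    print(f"{result['label']:>8}  {result['score']:.2f}  {review}")
```

No training data, feature engineering, or model tuning is required to get a usable first result, which is precisely the shift the table above describes.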

The cautionary tales of technological inertia

History is littered with the remains of once-dominant companies that failed to disrupt themselves when faced with technological shifts. Kodak, Blockbuster, Nokia, BlackBerry, and Borders all resisted transformation, clinging to legacy business models despite clear signals of impending disruption.

These cautionary tales share a common thread: companies that resist change, protect outdated models, or underestimate emerging technologies often face decline or obsolescence. The same fate may await organizations that fail to embrace the potential of AI agents and their impact on data science.

In contrast, companies that successfully navigated technological disruption did so through bold reinvention. Netflix abandoned DVDs for streaming, Amazon offered competitive products alongside its own on its web marketplace, Microsoft sidelined client/server solutions in favor of the cloud, and AWS introduced serverless computing (Lambda) alongside its profitable, server-based EC2 pay-per-use cloud service. All were brave moves that ensured the longevity of these brands.

The next generation of data scientists

Looking into the crystal ball, it is not difficult to see that data scientists will have a renewed and broader mandate beyond the traditional five-step process. They should look to take on responsibilities that were previously outside their remit. The most successful data scientists will embrace this evolution, seeing agents as opening new frontiers for human creativity and strategic thinking. Possible roles include:

AI Agent Architect & Orchestrator: Designing, configuring, and managing multi-agent systems.

Business-AI Translator: Mapping business challenges into AI agent frameworks.

AI Ethics & Governance Specialist: Ensuring responsible AI deployment and compliance.

Data & Knowledge Engineer: Building high-quality data infrastructure for AI agents.

Human-AI Collaboration Expert: Creating workflows that integrate AI with human decision-making.

Reform or revolution?

Data scientists, however, may have felt protected since they were central to AI from the beginning. Perhaps they were too conservative in safeguarding their roles despite clear signs of disruption. Instead of fully redefining themselves for the agentic era, many took incremental steps to adapt, delaying the inevitable transformation.

The same choice now faces data scientists and organizations: reform incrementally and risk obsolescence, or embrace the revolution and redefine their roles for an AI-first world.

For CIOs and business leaders, the message is clear: The agentic era is upon us. Organizations that recognize this shift, reimagine roles, and integrate AI agents strategically will gain a decisive competitive advantage. The future may not belong to those with the best data scientists but to those who most effectively deploy AI agents to augment human capabilities across the enterprise. History has shown that resisting transformation is a losing strategy—the only real question is who will adapt in time.
