One concept in the “Thinking Like a Data Scientist” methodology that always seems to befuddle folks is the difference between the roles of Questions versus Decisions. The distinction is important because they serve very different but very important roles in understanding how organizations can leverage data and analytics to create new sources of customer, product, service, and operational value.
A question is a request for information or clarification, while a decision is a choice made between multiple options after considering various factors. In other words, a question seeks information, while a decision is an action based on the information received.
Questions have the following characteristics:
- Questions are easily identifiable when engaging with the subject matter experts because they are already asking these questions today.
- Asking Questions can be used to identify, explore, and validate assumptions and features to be tested for predictive relevance.
- Business Intelligence reports and dashboards are built to answer questions such as what sales were last month, how many new customers did we acquire last week, or how many customers visited our theme park last year.
However, one cannot attribute quantifiable value to asking better questions. Improving the questions one asks by 10% (if that’s even measurable) does not lead to quantifiable, measurable value. That is, you can’t monetize asking better questions.
Decisions have the following characteristics:
- Decisions are easily identifiable by the subject matter experts as they are trying to make these decisions as part of their normal operations.
- Making Decisions implies an action to be taken; that is, decisions are actionable.
- Data Scientists build advanced AI / ML analytics to improve Decisions such as what at-risk customers to target with what offers, or what compressors to replace by what technicians on what date, or what prospects to offer what scholarships for what amounts.
And most importantly, one can attribute quantifiable value to making better decisions. Improving one’s decision making by 10% (clearly measurable) does lead to quantifiable, realized value. That is, you can monetize making better decisions (Figure 1).
Figure 1: Difference Between Questions and Decisions
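To make the 10% claim concrete, here is a minimal sketch of the arithmetic. Every name and number in it (the `decision_value` function, 10,000 retention offers per year, a 20% acceptance rate, $150 of value per accepted offer) is a made-up assumption for illustration, not data from a real engagement.

```python
def decision_value(decisions_per_year, success_rate, value_per_success):
    """Expected annual value of a repeated decision (hypothetical, simplified model)."""
    return decisions_per_year * success_rate * value_per_success


# Hypothetical retention-offer decision: which at-risk customers get which offer.
OFFERS_PER_YEAR = 10_000      # assumed volume of decisions
BASELINE_SUCCESS = 0.20       # assumed share of offers that retain the customer
VALUE_PER_SUCCESS = 150.0     # assumed dollars of value per retained customer

baseline = decision_value(OFFERS_PER_YEAR, BASELINE_SUCCESS, VALUE_PER_SUCCESS)

# A 10% relative improvement in decision quality lifts the success rate to 22%.
improved = decision_value(OFFERS_PER_YEAR, BASELINE_SUCCESS * 1.10, VALUE_PER_SUCCESS)

uplift = improved - baseline
print(f"Baseline value: ${baseline:,.0f}")
print(f"Improved value: ${improved:,.0f}")
print(f"Annual uplift:  ${uplift:,.0f}")
```

No comparable calculation exists for a question: there is no volume, success rate, or value per success to plug in, which is why the value shows up at the decision.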
Important Role of Questions
While I may seem a bit dismissive of questions, questions play a very important role in their ability to uncover potential predictive features; that is, uncovering variables and metrics (features) that might be better predictors of performance.
Features are the attributes, properties, or data variables that Machine Learning models use during training and inference to make predictions.
As I discussed in my 3-part series on the business and operational importance of features, features are reusable, composable, economic assets that provide the building blocks for an organization’s AI / ML business strategy (Figure 2).
Figure 2: Data-to-Features-to-Use Case Value Topology
There are 3 key aspects of features:
- Features are used to make predictions. Features are a higher-level data construct created by mathematical transformations of data variables that ML models use during training and inference to make predictions about what is likely to happen next.
- Not all Features are of equal value. Some features are more important than others in driving ML model predictive accuracy and precision, and we can use advanced algorithms like Random Forest, Principal Component Analysis, and Shapley Additive Explanations to make those determinations.
- And maybe most importantly, Features are Economic Assets. Features can be shared, reused, and continuously refined across an unlimited number of ML models to support an unlimited number of business and operational use cases (Feature Stores).
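The idea that some features matter more than others can be illustrated with a small sketch. Note the swap: Random Forest importances and Shapley values require an ML library, so this example uses a much simpler stand-in, ranking candidate features by the absolute Pearson correlation with an outcome. The feature names, the outcome, and all values are invented toy data, and correlation is only a rough first-pass screen, not a replacement for the algorithms named above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_features(features, outcome):
    """Return (name, |correlation|) pairs, strongest first."""
    scores = {name: abs(pearson(values, outcome)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: one row per customer (all values are made up).
candidate_features = {
    "visits_last_90_days": [1, 2, 3, 4, 5, 6],
    "support_tickets": [3, 1, 4, 1, 5, 2],
    "account_age_months": [12, 12, 12, 12, 12, 12],
}
renewal_spend = [10, 20, 30, 40, 50, 60]

ranking = rank_features(candidate_features, renewal_spend)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A screen like this only catches linear, one-feature-at-a-time relationships, which is exactly why the heavier techniques are worth the effort once the candidate list is narrowed.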
What I find interesting about the feature discovery and feature engineering process is that, in many cases, that process starts with questions! Yes, questions play a critical role in identifying potential variables and metrics (features) that might be better predictors of performance (Figure 3).
Figure 3: Questions to Features to Decisions Value Stream
In Figure 4, we can see how questions captured during a workshop from different stakeholders can be converted into potential features that our data science team can then explore to validate their predictive capabilities.
Figure 4: Question / Assumptions Converted into Potential Predictive ML Features
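As a rough sketch of what that conversion can look like in practice, the snippet below pairs each stakeholder question with a candidate feature and a function that computes it from raw records. The questions, feature names, record layout (a list of visits with a `spend` field), and the `CandidateFeature` class are all hypothetical; they are not taken from Figure 4.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Visit = Dict[str, float]  # assumed record layout: one dict per visit


@dataclass
class CandidateFeature:
    question: str                            # what the stakeholder asked
    name: str                                # feature name handed to data science
    compute: Callable[[List[Visit]], float]  # how to derive it from raw data


def visit_count(visits: List[Visit]) -> float:
    return float(len(visits))


def avg_spend_per_visit(visits: List[Visit]) -> float:
    if not visits:
        return 0.0
    return sum(v["spend"] for v in visits) / len(visits)


# Hypothetical workshop output: each question becomes a testable feature.
CANDIDATES = [
    CandidateFeature("Do frequent visitors renew more often?", "visit_count", visit_count),
    CandidateFeature("Do big spenders renew more often?", "avg_spend_per_visit", avg_spend_per_visit),
]


def build_feature_row(visits: List[Visit]) -> Dict[str, float]:
    """Compute every candidate feature for one customer's visit history."""
    return {c.name: c.compute(visits) for c in CANDIDATES}


guest_visits = [{"spend": 40.0}, {"spend": 60.0}]
feature_row = build_feature_row(guest_visits)
print(feature_row)
```

Keeping the original question attached to each feature makes it easy to go back to the stakeholder who asked it once the data science team has tested the feature's predictive relevance.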
Important Role of Decisions
You monetize better decisions, not better questions
Decisions are a critical linkage point between business stakeholders and the data science team. And the collaborative envisioning process to identify, validate, value, and prioritize the key decisions that the organization needs to make (in the context of the organization’s top priority business initiatives) is the key to helping organizations get more value from their data.
Ultimately, decisions, along with their desired outcomes and the KPIs and metrics against which we will measure decision effectiveness from the perspective of the different stakeholders, form the foundation for a Use Case. And it is around Use Cases (such as increasing cross-sell effectiveness, reducing O&E inventory, reducing out-of-stocks, improving 4-year graduation rates, etc.) that we will focus the organization’s data, analytics, and economic value creation efforts (Figure 5).
Figure 5: Anatomy of a Use Case
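One way to picture those parts fitting together is as a simple record. The sketch below is an assumption about how a use case might be captured, not a reproduction of Figure 5: the `UseCase` class, its field names, the stakeholder names, and the KPIs are hypothetical, while the use case and the decision are taken from the examples earlier in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UseCase:
    name: str
    decisions: List[str]            # the choices stakeholders must make
    desired_outcomes: List[str]     # what "better" looks like
    # KPIs differ by stakeholder, so they are keyed by who is measuring.
    kpis_by_stakeholder: Dict[str, List[str]] = field(default_factory=dict)

    def kpis_for(self, stakeholder: str) -> List[str]:
        """KPIs a given stakeholder uses to judge decision effectiveness."""
        return self.kpis_by_stakeholder.get(stakeholder, [])


graduation = UseCase(
    name="Improve 4-year graduation rates",
    decisions=["What prospects to offer what scholarships for what amounts"],
    desired_outcomes=["More students graduating within four years"],  # hypothetical
    kpis_by_stakeholder={  # hypothetical stakeholders and KPIs
        "Admissions": ["scholarship acceptance rate"],
        "Finance": ["scholarship cost per graduate"],
    },
)

print(graduation.kpis_for("Finance"))
```

Keying KPIs by stakeholder reflects the point above that decision effectiveness is measured from the perspective of the different stakeholders, who will not all agree on what success means.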
Summary: Role of Decisions and Questions
Decisions! Decisions! Decisions! Yes, I know that I’m sounding like a broken record. But identifying, validating, prioritizing, and optimizing decisions is the key for organizations that are struggling to get value from their data.
And while Decisions are the key to creating new sources of customer, product, service, and operational value, Questions play a vital role in helping identify those potential predictive variables and metrics (features) that we can leverage in our ML models to optimize those decisions.
Note: if interested, you can read the entire three-part series on the important role of features here:
- Features Part 1: Are Features the New Data? https://www.datasciencecentral.com/features-are-the-new-data/
- Features Part 2: Clarifying the Data-Features-Use Case Value Topology https://www.datasciencecentral.com/features-part-2-clarifying-the-data-features-use-case-value/
- Features Part 3: Features as Economic Assets https://www.datasciencecentral.com/features-part-3-features-as-economic-assets/