Question: Do you know which of your data sources are most valuable to the business?
A recent paper by PwC titled “Putting a value on data” raised the “Spectre” (“Bond, James Bond”) for organizations unable to answer this question. As the paper states: “The value of information assets has never been greater. According to the European Commission, by 2020 the value of personalized data – just one class of data – will be one trillion euros, almost 8% of the EU’s GDP.”
Question: Do you know which of your data sources are most valuable to the business?
Yea, a simple question to ask, but a dang hard question for many organizations to answer. To help organizations get a handle on this question, let’s review some of Schmarzo’s “Laws of Data Valuation”.
Data Valuation Law #1: Can’t Determine Value of Data in Isolation of the Business
Data has latent value; that is, data has potential value that has not yet been realized. The possession of data in and of itself provides zero economic value. In fact, possessing data carries storage, management, security, and backup costs, plus potential regulatory and compliance liabilities. That’s the classic challenge of Stage 1 of the Data Monetization Roadmap (Figure 1).
Figure 1: Data Monetization Roadmap
Data must be “activated” or put into use in order to convert that latent (potential) value of data into kinetic (realized) value. The key is getting the business stakeholders to envision where and how to apply data (and analytics) to create new sources of customer, product, service, and operational value.
The good news is that most organizations are very clear as to where and how they create value. For example, the annual reports (and analyst calls) of most organizations spell out how they are seeking to create value over the next 12 to 18 months (Figure 2).
Figure 2: Determining How Your Organization Creates Value
Fact: in order to determine the value of your organization’s data, you must understand how your organization creates value, including identifying the KPIs and metrics against which the organization measures value creation effectiveness.
Data Valuation Law #2: Value of the Data is Tied to Business Outcomes
The value of the organization’s data is tied directly to its ability to support quantifiable business outcomes or Use Cases. For example, Use Cases supporting an organization’s “Reduce Customer Attrition” business initiative might include “Accelerating Flagging High-risk Customers”, “Predicting Customer Lifetime Value”, “Improving Retention Campaign Effectiveness”, “Improving Customer Satisfaction”, “Improving Employee Satisfaction”, “Improving Customer Net Promoter Score”, “Improving Customer Social Media Advocacy” and “Improving Customer Likelihood to Recommend”.
In another example, if your business initiative is to increase same store sales over the next 12 months, then the use cases that might support that business initiative can be seen in Figure 3.
Figure 3: Brainstorming Use Cases that Support Organization’s Business Initiative
Collaboration between business stakeholders and the data science team is critical in identifying, validating, valuing, and prioritizing the decisions – and their supporting KPIs – that comprise the business and operational use cases. Decisions are a powerful “catalyst” around which to align the business and data science (data and analytic) resources because:
- Decisions are Easily Identifiable. Business stakeholders understand the decisions that they make every day. To a certain extent, the decisions haven’t changed in years. What’s changed, courtesy of big data and advanced analytics, are the answers.
- Decisions are Actionable. Decisions are actions of deciding something and then acting. Note: Decisions are different from questions, which are useful for validation and exploration but are not necessarily actionable.
- Decisions Yield Quantifiable Value. One can quantify the business impact of improving decision effectiveness. For example, what’s the value of improving customer retention by 2%? What’s the value of reducing operational downtime by 2%?
- Decisions are Optimizable. It is around decisions that data science teams can apply data engineering, feature engineering, and advanced analytics (AI / ML) to blend different data elements (features) to optimize decisions.
Data Valuation Law #3: Not all Data Sources are of Equal Value
Once we have identified the use cases that are the sources of value creation, there are three steps required to determine the value of your data.
Step 1: Determine Business Value of Each Use Case. Perform value engineering to identify the potential business value of each use case. For example, what is the potential revenue impact of increasing store traffic 15% through local events marketing (Figure 4)?
Figure 4: Assessment of Use Case Business Value Potential
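To make the Step 1 arithmetic concrete, here is a minimal sketch of how a traffic increase might translate into revenue. Every number and name below (annual store visits, conversion rate, average basket size, the `traffic_uplift_value` function) is a made-up assumption for illustration and is not taken from Figure 4; a real value engineering exercise would use the organization's own figures and would likely include margin and campaign costs.

```python
def traffic_uplift_value(annual_visits, conversion_rate, avg_basket, uplift_pct):
    """Estimate the extra annual revenue from a percentage increase in store traffic.

    Assumes conversion rate and basket size stay constant as traffic grows,
    which is a simplification.
    """
    extra_visits = annual_visits * uplift_pct / 100
    return extra_visits * conversion_rate * avg_basket


# Hypothetical inputs: 1,000,000 visits a year, 25% of visits convert,
# $40 average basket, and the 15% traffic increase from the example above.
local_events_value = traffic_uplift_value(
    annual_visits=1_000_000,
    conversion_rate=0.25,
    avg_basket=40.0,
    uplift_pct=15,
)
print(f"Estimated use case value: ${local_events_value:,.0f}")
```

Under these assumed inputs, the 150,000 extra visits convert into 37,500 extra purchases, or $1,500,000 of additional annual revenue. That figure becomes the use case value to allocate in Step 3.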
NOTE: I use the term “business value” instead of “financial value” because organizations have broadened their definition of “value” to include non-financial categories of value such as customer satisfaction, employee satisfaction, operational excellence, environmental impact, and societal impact.
Step 2: Determine Data Element Importance. There are analytic techniques such as Principal Component Analysis (PCA), Random Forest, and Shapley Values that can help estimate the relative importance of the data sources to the analytic results. An example of the data sources-to-use cases mapping can be seen in Figure 5, where the check marks represent data sources that the data science team determined were most influential on the analytic results.
Figure 5: Map Data Sources to Use Cases
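As one illustration of how Step 2 might work, the sketch below computes exact Shapley values for a handful of data sources. It is not the method behind Figure 5. The data source names and the `lift` table (the improvement in some use case metric achieved by each combination of data sources) are hypothetical; in practice those numbers would come from training and evaluating models on each combination, and for more than a handful of sources you would approximate rather than enumerate every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small set of players.

    `value` maps a frozenset of players to the payoff of that coalition
    and must contain an entry for every subset, including the empty set.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value[s | {p}] - value[s])
        result[p] = total
    return result


# Hypothetical data sources and hypothetical metric lift per combination.
sources = ["pos", "loyalty", "weather"]
lift = {
    frozenset(): 0.0,
    frozenset({"pos"}): 0.10,
    frozenset({"loyalty"}): 0.06,
    frozenset({"weather"}): 0.02,
    frozenset({"pos", "loyalty"}): 0.14,
    frozenset({"pos", "weather"}): 0.12,
    frozenset({"loyalty", "weather"}): 0.07,
    frozenset({"pos", "loyalty", "weather"}): 0.16,
}

importance = shapley_values(sources, lift)
for source, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{source}: {score:.4f}")
```

A useful property of Shapley values is that the scores add up to the lift achieved with all data sources together, so they can be read directly as each source's share of the result.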
Step 3: Allocate Use Case Business Value to Supporting Data Sources. The final step is allocating the use case business value (calculated in Step 1) to each of the supporting data sources. In Figure 6, I used a straight allocation method (i.e., if there were 4 data sources used to drive the use case outcome, then each data source got one-fourth of the use case value) to allocate the use case business value to the supporting data sources.
Figure 6: Attributing Use Case Business Value to Each Contributing Data Source
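The straight allocation described in Step 3 can be sketched in a few lines. The use case names, dollar values, and data source names below are hypothetical placeholders and do not reproduce the numbers in Figure 6; the function name `allocate_evenly` is also just an illustrative choice.

```python
from collections import defaultdict


def allocate_evenly(use_case_values, use_case_sources):
    """Split each use case's value equally across its supporting data sources
    and sum the shares per data source across all use cases."""
    totals = defaultdict(float)
    for use_case, value in use_case_values.items():
        sources = use_case_sources[use_case]
        share = value / len(sources)
        for source in sources:
            totals[source] += share
    return dict(totals)


# Hypothetical use case values (from Step 1) and source mapping (from Step 2).
use_case_values = {
    "local_events_marketing": 400_000.0,
    "loyalty_program_uplift": 300_000.0,
}
use_case_sources = {
    "local_events_marketing": ["pos", "local_events", "weather", "social_media"],
    "loyalty_program_uplift": ["pos", "loyalty"],
}

source_values = allocate_evenly(use_case_values, use_case_sources)
for source, value in sorted(source_values.items(), key=lambda kv: -kv[1]):
    print(f"{source}: ${value:,.0f}")
```

In this made-up example, the point-of-sale data supports both use cases, so it accumulates $100,000 from the first and $150,000 from the second, which puts it at the top of the ranking.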
Also, if you wanted to get more granular in your value allocation, you could use Random Forest, PCA, or Shapley Values to create a more accurate attribution of use case value to each data source instead of using the straight allocation method.
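A more granular allocation could look like the sketch below, which splits a use case's value in proportion to importance scores rather than evenly. The scores here are invented for illustration; in practice they might come from Random Forest feature importances or Shapley values aggregated per data source, and the function name `allocate_by_importance` is hypothetical.

```python
def allocate_by_importance(use_case_value, importance):
    """Split a use case's value across data sources in proportion to
    their (non-negative) importance scores."""
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive number")
    return {
        source: use_case_value * score / total
        for source, score in importance.items()
    }


# Hypothetical importance scores for one use case worth $400,000.
importance_scores = {
    "pos": 0.5,
    "local_events": 0.25,
    "weather": 0.125,
    "social_media": 0.125,
}

weighted_values = allocate_by_importance(400_000.0, importance_scores)
for source, value in weighted_values.items():
    print(f"{source}: ${value:,.0f}")
```

With an even split, each of the four sources would be credited $100,000. With these assumed weights, the point-of-sale data is credited $200,000 and the weather and social media data $50,000 each, which could change which sources you prioritize.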
Finally, the last column in Figure 6 gives you the aggregated value of each data source across the named use cases. And there are many, many more use cases against which these data sources can be used to deliver even more quantifiable business value.
Summary: What is the Value of Your Data?
Many data management and data governance projects stall out because organizations lack a business-centric methodology for determining which of their data sources are the most valuable. If you don’t understand which data sources are most valuable to the business, then the organization ends up peanut buttering precious data and analytic resources across all the organization’s data sources regardless of their business value.
Some critical business questions that the CIO and the CDO/CDAO must be able to answer about their data include:
- Upon which data sources should we prioritize our data quality and data governance investments?
- Upon which data sources should we prioritize our data enrichment (latency, granularity, standardization, normalization, metadata enrichment) investments?
- What new data sources should we acquire based upon their ability to deliver quantifiably improved business outcomes?
- Which data sources should we put into cold storage – or not even keep at all – due to lack of business value as compared to the regulatory and compliance risks and data storage, management, and protection costs?
Having quantifiable answers to these questions and more enables organizations to “Win an Unfair Game” against those competitors who lack the insights necessary to strategically invest to increase the value of the world’s most valuable resource – data.