Data management (DM) discussions can be frustrating because both those feeling the pain and the consultants who try to help them are, 90+ percent of the time it seems, still using the same old ways. Those ways only go so far, and no farther. Those who reinforce them assume that what remains a provincial IT paradigm will work, that all it needs is a bit of fixing and a few new things bolted on.
Most companies play this short game and say they’re done. Many consultants perpetuate the old ways, viewing old DM as an opportunity to create a perpetual annuity income stream, the gift that keeps on giving.
The problem with playing only the short game is that oversiloing, a habit users have had since the dawn of computing, makes it necessary to repeat QA processes over and over again, silo by silo. For example, traditional data management vendors advocate the data equivalent of a configurable, unpredictable, reverse osmosis filter installed at each data lake or warehouse, not to mention in the operational data environment. It's a kind of purgatory businesses have built for themselves: doing the same thing over and over again, but expecting a different result.
The same issue of trapping resources that should be shared happens with business logic at the application layer. Multiple departments facing the issue try to alleviate the problem differently. Some subscribe to more SaaSes. Others try form builder apps in a SaaS they already subscribe to, creating a form to collect data…in a siloed DB.
Perhaps the designated department problem-solvers believe the data they're hoping to collect this way will magically appear. Perhaps, after a company-wide memo circulates telling all hands to do their bit and input the data about themselves that the company needs, things will improve and co-workers will do what they should.
Thing is, in these cases the overall system hasn't been designed to encourage feedback, collect it, and evolve accordingly. What happens instead is that departments try to solve their own problems after a crisis occurs. What should be clear is that the provincial IT paradigm is obsolete. Moving past it takes company evangelists who see the value of a more modern, knowledge graph and ontology-model-driven paradigm.
Better data by design
The long game that enterprises should be playing involves a system that is desiloed by design, not just another stopgap measure; each stopgap constitutes yet another silo to manage separately.
The desiloed system, ideally knowledge graph or intelligent web based, should be designed to anticipate the future state of how online business should be done.
Almost unnoticed by the mainstream press, a community of people has soldiered on for more than a decade to create the components of this system.
At this point, the systems are still mostly custom-built, but the technology involved isn’t really what’s getting in the way. What’s getting in the way are all the assumptions governing old-style data management, including data silos that have become data cartels.
Everyday techniques that can help create synergies in long-game data enrichment efforts
A little semantics (adding meaning to data so it's more precise, contextualized, machine-interpretable, and shareable) goes a long way, as Jim Hendler, semantic web pioneer, Rensselaer CS professor, and Director of the Institute for Data Exploration and Applications, points out. Enough semantics can be the determining factor in whether or not people can find each other when they need to, or whether or not web-scale integration and even interoperation are feasible.
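To make that concrete, here is a minimal sketch of what "a little semantics" looks like in practice. It uses Python's rdflib library; the example.org namespace and the two people are hypothetical, and the FOAF vocabulary stands in for whatever shared vocabulary your organization might adopt.

```python
# A minimal sketch of "a little semantics," using Python's rdflib.
# The example.org namespace and the two people are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.bind("ex", EX)
g.bind("foaf", FOAF)

# A bare string in a spreadsheet cell carries no machine-readable meaning.
# These triples state what each thing *is* and how it *relates* to others,
# using a shared vocabulary (FOAF) that any RDF-aware tool can interpret.
g.add((EX.alice, RDF.type, FOAF.Person))
g.add((EX.alice, FOAF.name, Literal("Alice Example")))
g.add((EX.alice, FOAF.knows, EX.bob))
g.add((EX.bob, RDF.type, FOAF.Person))
g.add((EX.bob, FOAF.name, Literal("Bob Example")))

# Serialize as Turtle: precise, contextualized, and shareable.
print(g.serialize(format="turtle"))
```

Because the triples use globally meaningful identifiers and a published vocabulary, any RDF-aware tool can merge them with data from elsewhere without a custom mapping step. That is the "finding each other" property Hendler describes.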
Knowledge graphs are the best method of incorporating and managing semantics at scale, for many reasons.
Semantic integration example: Diffbot
Years ago, Diffbot declared its mission to build a map of human knowledge. Merrill Cook of Diffbot explained how knowledge graphs help with semantic integration:
- “One benefit knowledge graphs provide for semantic integration is that each entity within knowledge graphs has a [globally] unique identifier. This allows for disambiguation (e.g. two ‘John Does’ may have different work history fields) even if different data types incorporate the same entity.
- “Knowledge graphs are also fantastic ways to integrate data into other data stores. The added semantics stored in entity ontologies can expand the available fields (and thus source types) that can be integrated into your existing entries.
- “The third way knowledge graphs support semantic integration is through their flexibility. Knowledge graphs provide flexible schemas that can incorporate new fact types ‘on the fly.’ The range of fact types that can be modelled in a graph structure includes concepts, relationships between things, hierarchies, and types of things themselves.”
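A small sketch can illustrate the first and third of Cook's points. Assuming Python with rdflib and an entirely hypothetical example.org namespace (the people, employers, and conference below are invented), two "John Does" stay distinct because each has a globally unique identifier, and a brand-new fact type can be added on the fly with no schema migration:

```python
# A sketch of Cook's first and third points, using Python's rdflib.
# The example.org identifiers and all facts below are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Globally unique identifiers keep two "John Does" disambiguated,
# even though their name fields are identical.
g.add((EX.john_doe_001, RDF.type, EX.Person))
g.add((EX.john_doe_001, EX.name, Literal("John Doe")))
g.add((EX.john_doe_001, EX.employer, Literal("Acme Corp")))

g.add((EX.john_doe_002, RDF.type, EX.Person))
g.add((EX.john_doe_002, EX.name, Literal("John Doe")))
g.add((EX.john_doe_002, EX.employer, Literal("Globex Inc")))

# Flexible schema: a new fact type appears "on the fly."
# No migration, no ALTER TABLE -- just another triple.
g.add((EX.john_doe_002, EX.speaksAtConference, EX.ExampleConf))

# Both John Does come back, each resolvable to a distinct entity.
q = """
SELECT ?person ?employer WHERE {
  ?person ex:name "John Doe" ;
          ex:employer ?employer .
}
"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.person, row.employer)
```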
As Cook's observations make clear, the knowledge graph paradigm points to a system designed to evolve and scale in ways that provincial IT can't.
Revisiting the data lifecycle with a systems design perspective
Data is not monolithic. By that I mean it's diverse, though too often characterized as one homogeneous thing. Some of it is highly perishable: use it or lose it. Some of it is not perishable at all.
The non-perishable data, content, and knowledge in your organization, as well as the business logic, demand reuse.
Knowledge graphs allow the data, content, and knowledge management departments to harness the same graph-based management and development paradigm. The right side of the data lifecycle, the reuse end, is where knowledge graphs will pay off the most.
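As a sketch of that shared paradigm, again assuming Python with rdflib and hypothetical example.org entities, one graph can hold a data record, a content asset, and a piece of durable business knowledge, and a single SPARQL query can serve all three departments:

```python
# A sketch of three departments sharing one graph paradigm.
# The example.org namespace and all facts below are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Data management: an operational customer record.
g.add((EX.cust42, RDF.type, EX.Customer))
g.add((EX.cust42, EX.region, Literal("EMEA")))

# Content management: a product sheet linked to the same entities.
g.add((EX.sheet7, RDF.type, EX.ProductSheet))
g.add((EX.sheet7, EX.describes, EX.productX))

# Knowledge management: durable, reusable business knowledge.
g.add((EX.productX, EX.targetRegion, Literal("EMEA")))

# One query paradigm (SPARQL) serves all three departments:
# which content is relevant to customers in a given region?
q = """
SELECT ?customer ?sheet WHERE {
  ?customer ex:region       ?r .
  ?product  ex:targetRegion ?r .
  ?sheet    ex:describes    ?product .
}
"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.customer, row.sheet)
```

The point of the sketch is not the specific facts but the design choice: because all three asset types live in one graph under one model, no department has to re-implement QA, integration, or query logic in its own silo.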
We're working in a more and more networked way. Much of the work won't be in batch mode, nor should it be. It's good to think in a hyper-networked way, as Ethernet pioneer Bob Metcalfe, who's on the board of knowledge graph + blockchain company OriginTrail (origintrail.io), has pointed out. The big efficiency payoff will be in shared knowledge (data and logic) for applications such as supply chain.
Unless you play this long game and take the trouble to make non-perishable data reusable via something like a knowledge graph, you will be stuck in provincial IT land. So why not play the long game?
In addition to an architect or other systems thinker, you'll need at least one knowledge engineer versed in semantic graphs, as well as existing roles such as a product manager (rather than just a project manager, given the long game you'll be playing) and a UX designer. The UX designer will collaborate with the knowledge engineer on the bit of gamification needed.
This kind of collaboration, among team members with diverse skills who may not be used to working together, is what systems thinking is all about.
Part II will explore involving machines in interactive collaboration with humans inside feedback loops.