This is the second in a series of six posts from David Loshin.
In my last post, I suggested that organic yet siloed application development, focused on functional requirements, has over time provided a breeding ground for the kinds of data variance that undermine data trustworthiness when transaction records and operational data sets are repurposed for reporting and analytics. One approach to addressing these potential inconsistencies is to understand where the variations occur, what their root causes are, and what can be done to address those root causes. This can be facilitated through a strategy for metadata management: collecting and managing the “data about the data” that provides the context for the formats, standards, definitions, and meanings associated with reference data sets, data element concepts, data elements, records, and tables.
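To make that concrete, here is a minimal sketch in Python of what a single repository entry for a data element might capture. The class and field names are illustrative assumptions on my part, not a standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class DataElementMetadata:
    """One repository entry for a single data element.

    The fields here are illustrative assumptions, not a standard schema.
    """
    name: str                 # physical element name, e.g., "cust_status"
    definition: str           # agreed business meaning
    data_type: str            # physical format, e.g., "CHAR(1)"
    value_domain: set[str]    # permitted reference values
    source_system: str        # origination point
    lineage: list[str] = field(default_factory=list)  # upstream sources

# Two siloed applications describing the "same" business concept:
crm_status = DataElementMetadata(
    name="cust_status",
    definition="Current standing of a customer account",
    data_type="CHAR(1)",
    value_domain={"A", "I"},
    source_system="CRM",
)
billing_status = DataElementMetadata(
    name="customer_state",
    definition="Billing eligibility of a customer",
    data_type="VARCHAR(9)",
    value_domain={"ACTIVE", "INACTIVE", "SUSPENDED"},
    source_system="Billing",
)
```

Even this toy pair exposes the kind of variance described above: the same business concept recorded under different names, formats, and value domains in two systems.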
Note that I said a strategy and not a tool for managing metadata. The reason is that a typical metadata management tool can provide a repository for collected information, but if it is only used as a repository (that is, as a storage place) for metadata without considering how that knowledge can be made actionable, the tool’s value is significantly diminished.
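Continuing the illustrative sketch above (the DataElementMetadata class and the crm_status entry are my own hypothetical names), here is one way “actionable” might look in practice: the stored metadata actively checking data inside a pipeline rather than merely describing it.

```python
def validate_value(element: DataElementMetadata, value: str) -> list[str]:
    """Check one value against an element's documented metadata.

    Returns a list of findings; an empty list means the value conforms.
    """
    findings = []
    if element.value_domain and value not in element.value_domain:
        findings.append(
            f"{element.name}: '{value}' is outside the documented "
            f"domain {sorted(element.value_domain)}"
        )
    return findings

# A repository-only tool just stores crm_status; an actionable approach
# runs checks like this in data pipelines and surfaces the findings.
print(validate_value(crm_status, "SUSPENDED"))
# -> ["cust_status: 'SUSPENDED' is outside the documented domain ['A', 'I']"]
```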
The potential resource investment in researching, documenting, and validating data element metadata can be surprisingly large, and questions will be raised about the commitment of capital and time. At the same time, in the absence of any previously identified discrepancies involving conflicting formats, value domains, or definitions, there is little reason to invest the energy in populating the repository.
In other words, we have a conundrum: when the need for the metadata arises, without a prior investment in populating a tool’s repository, none of the critical information will be available to help the consumers. Yet without a demonstrated need, it may be difficult to acquire the resources necessary to capture and document the metadata in the first place!
That is the reason for a metadata strategy that encompasses the finer points of enterprise data usability. Focusing on a tool alone effectively defers the analysis of the metadata until the tool evaluation and acquisition process has been completed. Layering tool acquisition within a more complete strategy allows you to develop guidelines for metadata collection and harmonization, and it drives the determination of the business needs to be addressed by the corresponding technology. In turn, the objective morphs from initiating short-term projects intended to storm through data models and note data element types and lengths into an ongoing program for aligning semantics, so that information is consumed in ways that are consistent and that imbue a level of trust in analytical results.
Collecting information about the definition, structure, origination points, and lineage of data concepts provides a road map for downstream data consumers interested in using the data for alternate purposes. A proper metadata strategy can guide users in determining when entity concepts and their corresponding data elements are compatible and can be combined for reporting and analysis in ways that are consistent and that make sense.
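Continuing the same hypothetical sketch, a compatibility check of this sort is one way such guidance could be automated: the function below simply compares the documented metadata of two elements and flags what a consumer must reconcile before merging them.

```python
def report_compatibility(a: DataElementMetadata, b: DataElementMetadata) -> None:
    """Flag metadata-level differences a data consumer should resolve
    before combining two elements in a single report or analysis."""
    if a.data_type != b.data_type:
        print(f"Format mismatch: {a.name} is {a.data_type}, {b.name} is {b.data_type}")
    if a.value_domain != b.value_domain:
        print(f"Domain mismatch: {sorted(a.value_domain)} vs {sorted(b.value_domain)}")
    if a.definition != b.definition:
        print(f"Definitions differ: '{a.definition}' vs '{b.definition}'")

report_compatibility(crm_status, billing_status)
# Without harmonization (e.g., a shared reference domain mapping "A" to
# "ACTIVE"), rolling these two elements into one customer report would
# silently mix incompatible codes.
```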
More importantly, the strategy introduces processes that bring data consumers and data practitioners together to enumerate potential issues, assign priorities in relation to business demands, and organize communities of interest around particular areas of content. That enables the ecosystem surrounding business metadata to evolve into a framework through which data professionals and business data users communicate, and through which the needs and expectations of the business data consumer community are more effectively articulated.
Want to see more from David Loshin? Watch this webinar: Breaking Down the Fence – Using Business-IT Collaboration to Foster Consistent Enterprise Metadata
Want to learn more about ER/Studio? Try it for yourself free for 14 days!
About the author:
David Loshin, president of Knowledge Integrity, Inc. (www.knowledge-integrity.com), is a recognized thought leader and expert consultant in the areas of analytics, big data, data governance, data quality, master data management, and business intelligence. Along with consulting on numerous data management projects over the past 15 years, David is also a prolific author on business intelligence best practices, with numerous books and papers on data management, including the second edition of “Business Intelligence: The Savvy Manager’s Guide”.