Written by Christopher Bradley, DMA Advisors. Posted by Justina Lister.
In the early days of data modelling, it was mostly aimed at physical data design. I recall developing models top-down where the target was VSAM, DL/1, IDMS and the like. We started with conceptual-level models, then logical models, and then physical data models, and from these physical models we generated DDL. Remember, this wasn’t simply targeted at relational databases; many of us, myself included, started long before relational was a mainstream concept.
However, one of the greatest misconceptions about data modelling is that it’s just for databases. It is not! Data models have considerably more uses than this.
“The main purpose of a data model is not to design a database, but to describe a business.” (C. Bradley PhD CDMP Fellow 2010)
So here are 10 uses of data models that you may not have considered.
The first five illustrate where business-level data models play a significant part in a data strategy, enabling these core strategic considerations.
1. What is the data we need to run our business?
Business level data models describing the core data concepts of the organisation are built and should be linked to the corresponding business level process models.
A data vs process interaction matrix, which can be generated from modelling tools, is an exceptionally useful deliverable here and is used by several other data disciplines such as Master Data Management, Data Governance and Data Security Management.
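As a rough illustration of what such a matrix contains, here is a minimal Python sketch. It assumes the modelling tool can export rows of (process, entity, actions); the process and entity names below are invented for the example and do not come from any particular tool.

```python
from collections import defaultdict

# Hypothetical usage rows, as might be exported from a modelling tool:
# (business process, data entity, CRUD action letters).
USAGES = [
    ("Take Order", "Customer", "R"),
    ("Take Order", "Order", "C"),
    ("Register Customer", "Customer", "C"),
    ("Amend Customer Details", "Customer", "RU"),
    ("Ship Order", "Order", "RU"),
    ("Ship Order", "Product", "R"),
]


def build_crud_matrix(usages):
    """Return {entity: {process: action letters}} from usage rows."""
    matrix = defaultdict(dict)
    for process, entity, actions in usages:
        existing = matrix[entity].get(process, "")
        # Merge action letters and keep them in canonical C, R, U, D order.
        merged = set(existing) | set(actions)
        matrix[entity][process] = "".join(a for a in "CRUD" if a in merged)
    return {entity: dict(row) for entity, row in matrix.items()}


def format_matrix(matrix):
    """Render the matrix as text: entities as rows, processes as columns."""
    processes = sorted({p for row in matrix.values() for p in row})
    width = max(len(name) for name in processes + list(matrix)) + 2
    lines = ["".join(name.ljust(width) for name in [""] + processes)]
    for entity in sorted(matrix):
        cells = [matrix[entity].get(p, "-") for p in processes]
        lines.append("".join(c.ljust(width) for c in [entity] + cells))
    return "\n".join(lines)


crud = build_crud_matrix(USAGES)
print(format_matrix(crud))
```

Reading along a row shows every process that touches a data concept, which is the view that Master Data Management and Data Governance work tends to need.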
2. Do we agree what our data means?
Our business level data models should have clear descriptions and definitions included. These business definitions need to be relevant to the data in its different states throughout its lifecycle.
The definitions themselves should be agreed with the stakeholders, and are frequently repurposed into a business glossary.
The importance of definitions cannot be overstated.
The Greek philosopher Socrates (470 – 399 BC), were he around today, could have been talking about business data models when he reportedly said:
“The beginning of wisdom is the definition of terms.”
3. Do we know where our data is stored?
We “should” be able to define the business-level data concepts in a conceptual data model. However, these concepts, such as Customer, Employee, Product, Oil-Well, etc., will (in all but a few fortunate organisations) exist in the IT systems in considerably more than one location.
We need to understand where in the IT systems these data concepts are to be found. This is a key step in many other disciplines (Master and Reference Data Management, for example) and is mandated by regulations for certain types of data, such as Personally Identifiable Information (PII) under GDPR.
Physical data models and the technical environment details themselves can be repurposed into a technical data dictionary, which is then cross-referenced or linked to the business glossary. This gives full triangulation and understanding of what business data is held where in our systems.
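The cross-reference between glossary and technical data dictionary can be sketched in a few lines of Python. This is only an illustration: the system, table and column names are hypothetical, and a real implementation would read them from the modelling tool or a metadata repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalColumn:
    """One technical data dictionary entry (all values here are hypothetical)."""
    system: str
    table: str
    column: str
    concept: str  # the business glossary term this column implements


# Business glossary: term -> agreed definition (illustrative wording only).
GLOSSARY = {
    "Customer": "A person or organisation that has bought, or may buy, our products.",
    "Product": "An item or service that we offer for sale.",
}

DATA_DICTIONARY = [
    PhysicalColumn("CRM", "CUST_MASTER", "CUST_ID", "Customer"),
    PhysicalColumn("Billing", "ACCOUNT", "CUSTOMER_NO", "Customer"),
    PhysicalColumn("ERP", "ITEM", "ITEM_CODE", "Product"),
    PhysicalColumn("ERP", "ITEM", "LEGACY_FLAG", "Legacy Indicator"),
]


def locations_by_concept(glossary, dictionary):
    """Map each glossary term to the physical places where it is stored.

    Columns whose concept is not in the glossary are returned separately,
    because they point to a gap in the business definitions.
    """
    where = {term: [] for term in glossary}
    unmapped = []
    for col in dictionary:
        if col.concept in where:
            where[col.concept].append(f"{col.system}.{col.table}.{col.column}")
        else:
            unmapped.append(col)
    return where, unmapped


where, unmapped = locations_by_concept(GLOSSARY, DATA_DICTIONARY)
for term, places in where.items():
    print(term, "->", places)
print("Not yet in glossary:", [c.column for c in unmapped])
```

Here Customer turns up in two systems, which is exactly the kind of finding that feeds a PII inventory or a master data initiative.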
4. Have accountabilities with the right skills & processes been allocated to manage our data?
Data Governance is “The process of managing and improving data for the benefit of all stakeholders.” (C. Bradley PhD CDMP Fellow)
Many roles such as the Data Owner, Steward & Custodian (and more) exist in data governance frameworks. Also, several styles of data governance operating models are regularly described in data governance strategies.
The most powerful, and frankly the “gold standard” of these is the “data centric” operating model where significant data concepts are managed across all the processes that use them.
A data vs process matrix (utilising our data models of course) is particularly useful here, especially if we are interested in the notion of Critical Data Elements (CDEs).
5. Is our data fit for purpose?
Here I’m expanding “fit for purpose” beyond just the classic definition used for data quality.
For our data to be fit for purpose, it also has to meet the correct security and access needs – the data vs process matrix (CRUD matrix) is frequently used to determine what data access is required by roles.
Data quality rules, thresholds, etc. are stored against the metadata for our data concepts.
Data classifications (not just security, but also whether the data is PII, business critical, financially sensitive, etc.) are also ascribed to our data as defined in our data models.
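To make this concrete, the sketch below holds classifications and a single data quality threshold against a data concept and then checks some sample values against it. The class name, the classification labels and the completeness rule are all assumptions chosen for illustration; real data quality tooling would support many more rule types.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptMetadata:
    """Metadata held against a data concept (hypothetical structure)."""
    name: str
    classifications: set = field(default_factory=set)  # e.g. {"PII"}
    completeness_threshold: float = 1.0  # minimum share of populated values


def completeness(values):
    """Share of values that are populated (neither None nor empty string)."""
    if not values:
        return 0.0
    filled = sum(1 for v in values if v not in (None, ""))
    return filled / len(values)


def meets_quality_threshold(meta, values):
    """True when the observed completeness reaches the stored threshold."""
    return completeness(values) >= meta.completeness_threshold


email = ConceptMetadata(
    name="Customer Email",
    classifications={"PII", "Business Critical"},
    completeness_threshold=0.75,
)

# Made-up sample values: three of the four are populated.
sample = ["a@example.org", None, "b@example.org", "c@example.org"]
print(email.name, sorted(email.classifications), meets_quality_threshold(email, sample))
```

Because the threshold and the classifications live with the concept rather than with any one system, the same rule can be applied wherever that concept is physically stored.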
6. Data Governance
Data models and the metadata from them are a key enabler for any data governance initiative. Some of the key data governance aspects held in metadata that’s defined during our business data modelling include:
Business Glossary: Defining the business metadata definitions for business data elements. Captured in the Conceptual Data Model & repurposed via a Glossary.
Technical Data Dictionary: Defining the technical metadata definitions for data(base) objects (fields, files, tables, etc.).
Master & Reference Data: If the data concept is Master data, what is the system of record?
Data Ownership & Stewardship: Aligning data stewardship or ownership roles to key data objects.
Data Standards: Metadata standards provide governance and rules for current and future development, based on both business and IT input.
Privacy & Security: Identification of privacy and security levels for business data is a key aspect of data governance and can be managed with metadata definitions.
Traceability & Audit: Understanding of how data is used across the organisation, how key financial figures are calculated, etc.
7. Application Package Selection & Configuration
Whilst it is true that we do not have to build a physical data model when implementing an application package, it is far from true that data modelling has no place with packages. Data models play a significant role in:
- Requirements gathering and assessing package functional & data capabilities
- Identifying gaps in functionality
- Configuration and set-up
- Data migration and take-on
- Master data alignment
- Data lineage (particularly important for GDPR & SOX compliance)
- Reporting (particularly business intelligence and data warehousing)
- Most importantly, integration of the package metadata into your overall information architecture
8. Data Lineage & State Transitions
Data lineage comprises two major aspects:
Transformations, covering what has been done to the data, and
Business processes which show:
- Which business processes can be applied to the data?
- What type of actions do those processes perform (Create, Read, Update, Delete)?
- Audit trail – who has supplied, accessed, updated, approved, and deleted the data and when? And
- Which processes have acted on the data?
State transition modelling is an often underused but very powerful concept. Many entities, such as “Customer”, have several allowable life-cycle states. A state transition diagram depicts the allowable states for an entity, the transitions between them, and the rules governing those transitions.
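The content of a state transition diagram can be captured as a simple lookup of allowed moves. The sketch below is illustrative only: the Customer states (Prospect, Active, Suspended, Closed) and the transitions between them are assumptions, not taken from any particular model.

```python
# Allowed transitions for a hypothetical Customer life cycle:
# current state -> set of states it may move to.
ALLOWED_TRANSITIONS = {
    "Prospect": {"Active"},
    "Active": {"Suspended", "Closed"},
    "Suspended": {"Active", "Closed"},
    "Closed": set(),  # terminal state: no way back
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the model."""


class Customer:
    def __init__(self, name, state="Prospect"):
        if state not in ALLOWED_TRANSITIONS:
            raise ValueError(f"Unknown state: {state}")
        self.name = name
        self.state = state
        self.history = [state]  # simple audit trail of states visited

    def transition_to(self, new_state):
        """Move to new_state only if the model allows it from the current state."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise InvalidTransition(f"{self.state} -> {new_state} is not allowed")
        self.state = new_state
        self.history.append(new_state)


customer = Customer("Example Ltd")
customer.transition_to("Active")
customer.transition_to("Suspended")
customer.transition_to("Closed")
print(customer.history)
```

Keeping the rules in one table, rather than scattered through application code, means every system that handles the entity can enforce the same life cycle.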
9. Master Data Management – identifying the data to be mastered
Many techniques exist for identifying the data areas that are the prime candidates for being mastered in an organisation.
Data models and process models, through their interaction, can yield the CRUD matrix; however, a simpler approach using high-level business data models can be almost as effective and less time-consuming.
Taking your conceptual data model, draw shapes around the entities used in major business area A. Then on another copy, do the same for business area B. Repeat for all the major business areas.
You will be able to identify the data entities that are used by multiple business areas. Anything that is used by several business areas is worth examining further as a potential candidate for mastering.
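The same exercise can be done without drawing anything, as in this small Python sketch. The business areas and entity names are invented for illustration, and the cut-off of two areas is an assumption you would tune for your own organisation.

```python
# Entities used by each major business area (hypothetical example data),
# i.e. the shapes drawn around the conceptual model, one set per area.
AREA_ENTITIES = {
    "Sales": {"Customer", "Product", "Order"},
    "Finance": {"Customer", "Invoice", "Order"},
    "Logistics": {"Product", "Shipment", "Order"},
    "HR": {"Employee"},
}


def mastering_candidates(area_entities, min_areas=2):
    """Return entities used by at least min_areas business areas.

    The result maps each entity to the sorted list of areas that use it,
    ordered by number of areas (most shared first), then by entity name.
    """
    usage = {}
    for area, entities in area_entities.items():
        for entity in entities:
            usage.setdefault(entity, set()).add(area)
    candidates = {
        entity: sorted(areas)
        for entity, areas in usage.items()
        if len(areas) >= min_areas
    }
    return dict(sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0])))


candidates = mastering_candidates(AREA_ENTITIES)
for entity, areas in candidates.items():
    print(f"{entity}: used by {', '.join(areas)}")
```

Entities at the top of the list are the ones shared most widely, so they are the first to examine as potential master data.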
10. XML, JSON and Message based architectures
XML, JSON and several other message data types are hierarchical in structure.
Various NoSQL structures are also hierarchical. ER models, however, are not necessarily hierarchical; they can be, and frequently are, network in structure (i.e., a parent can have many children and a child can have many parents).
In our data models, it’s common to have entities such as Book Authorship, which has more than one parent (Book and Writer) and so is not hierarchical. When we want to use data represented by the concepts in this model within XML, JSON, or other hierarchic NoSQL structures, we must turn the model into a hierarchic representation. We need to decide whether to make Book the parent of Book Authorship or to choose Writer. The semantic notion of Book, Writer, etc. must remain consistent across all formats and all usages though. Having a data model ensures this.
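The choice of parent can be illustrated with a short Python sketch that renders the same Book, Writer and Book Authorship data as two different JSON hierarchies. The titles, names and the "role" attribute are placeholders invented for the example.

```python
import json

# Placeholder data for the three entities (all values are invented).
BOOKS = {"B1": "Data Basics", "B2": "Modelling Notes"}
WRITERS = {"W1": "A. Author", "W2": "B. Writer"}
# Book Authorship has two parents: a Book and a Writer.
BOOK_AUTHORSHIPS = [
    {"book": "B1", "writer": "W1", "role": "lead author"},
    {"book": "B1", "writer": "W2", "role": "co-author"},
    {"book": "B2", "writer": "W2", "role": "lead author"},
]


def nest_under_book():
    """Hierarchy with Book as the parent of Book Authorship."""
    return [
        {
            "title": title,
            "authorships": [
                {"writer": WRITERS[a["writer"]], "role": a["role"]}
                for a in BOOK_AUTHORSHIPS
                if a["book"] == book_id
            ],
        }
        for book_id, title in BOOKS.items()
    ]


def nest_under_writer():
    """Hierarchy with Writer as the parent of Book Authorship."""
    return [
        {
            "name": name,
            "authorships": [
                {"book": BOOKS[a["book"]], "role": a["role"]}
                for a in BOOK_AUTHORSHIPS
                if a["writer"] == writer_id
            ],
        }
        for writer_id, name in WRITERS.items()
    ]


by_book = nest_under_book()
by_writer = nest_under_writer()
print(json.dumps(by_book, indent=2))
print(json.dumps(by_writer, indent=2))
```

Both documents carry the same three authorships; only the nesting differs. The non-chosen parent is repeated inside each document, which is the usual price of flattening a network structure into a hierarchy.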
The techniques for transforming a data model into a hierarchical schema have been around for a very long time. We used to do exactly this when designing DL/1 and IMS databases many decades ago.
As a Monty Python fan, I think of the Romans sketch in “Life of Brian” and have just as many answers to the question, “What has a data model ever done for us?”
The final quotation, however, must go to Sir Tim Berners-Lee, the inventor of the World Wide Web: “Data is a precious thing and will last longer than the systems themselves.”
Therefore, we should do everything we can to manage our data and treat it as a precious asset. Data models facilitate this in so many different ways – they are not just for databases.