This blog post was authored by Todd Schraml.
How Did Data Modeling Get Started?
Back when dinosaurs still ruled the world, or at least when wax figurines of dinosaurs could be made for you while you waited at the pump as the attendant filled your gas tank, data structures were pretty much ad hoc things. If someone thought an element was needed, that data item was soldered into place. Things more or less worked, and businesses did not crumble because of their data processing departments. But yes, random problems arose, usually involving data inconsistencies across those structures, as well as the occasional disaster concerning lost pointers, but I digress. Then a clever lad named E. F. Codd came up with a new way to consider how one might store large quantities of data. His idea required a little dabbling in formal logic: set theory and predicate calculus. The results were impressive. Data structures became more aligned with the semantics of an organization. The possibility of data inconsistencies was greatly diminished, and the data values themselves logically replaced the pointers. All nifty thoughts for the time. Logic has been a tool of civilization for millennia, and a new application of logic only helped the world continue to move ahead. The solution resulted in relational theory, relational databases, normalization, data independence, conceptual/logical/physical data modeling – a landslide of new thoughts.
How Does Data Modeling Benefit From Normalization?
Logical data modeling involves the use of normalization techniques. Normalizing data is accomplished by working through an analysis of the functional dependencies within and across the data. One steps through the various “normal forms” to shape the data into structures reflective of the identified functional dependencies.
In simpler terms, normalization causes one to ask the question, “What does my data mean?” As is often said, the nouns spoken within the business become the entities within the logical data model. This perspective highlights that logical data modeling is a semantic process. The data model should reflect the language and terms used within the business. The goal is not, as some may believe, to arrive at a pure, universal description that is the same for everyone across the globe. The goal of a logical data model is to fit the best possible structures for the semantics of the specific organization within which one is working.
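To make the functional-dependency idea concrete, here is a minimal Python sketch (not from the original post; the table, column names, and sample values are all hypothetical). It checks whether a dependency such as zip_code determining city holds in a set of rows, the kind of dependency that third normal form would move into its own entity:

```python
# Hypothetical order rows. zip_code -> city is a functional dependency
# that normalization (3NF) would split out into a separate entity.
rows = [
    {"order_id": 1, "customer": "Acme", "zip_code": "63101", "city": "St. Louis"},
    {"order_id": 2, "customer": "Bray", "zip_code": "63101", "city": "St. Louis"},
    {"order_id": 3, "customer": "Cole", "zip_code": "60601", "city": "Chicago"},
]

def holds(rows, determinant, dependent):
    """Return True if the functional dependency determinant -> dependent
    holds in this sample: the same determinant value never maps to two
    different dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        val = tuple(row[c] for c in dependent)
        if seen.setdefault(key, val) != val:
            return False
    return True

# zip_code -> city holds in the sample, so we decompose: city moves into
# a lookup keyed by zip_code, and the order rows keep only zip_code.
assert holds(rows, ["zip_code"], ["city"])
zip_city = {r["zip_code"]: r["city"] for r in rows}
orders = [{k: v for k, v in r.items() if k != "city"} for r in rows]
```

This is only an illustration of the analysis a modeler performs mentally or with a modeling tool; in practice the dependencies come from business definitions, not from sampling data.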
Why Are Business Terms Important for Data Modeling?
In gathering the words and terms used, plus obtaining definitions for these items from business experts, the data modeler is learning the business. In acquiring an appreciation of the relationships between terms, one is uncovering the business rules that help the organization operate. During this process an experienced data modeler can begin to evaluate the pain points the experts bring up for discussion. Often these areas of pain are also semantic issues. Terms may have become overloaded across the organization, so that one word has multiple, even conflicting, meanings. Does the business need to add a few more nuanced terms to fully realize its descriptive needs? Does such an expansion of language allow the previously mentioned pain points to be remedied? Through simple linguistic tweaks, a data modeler can serve as a change agent, enhancing an organization’s success.
The give and take of discussions about business terms often results in the business experts, and those influenced by them, learning to trust IT’s intentions. Through the data model, IT demonstrates that it hears what the business is saying. As the pieces move into place within the logical data model, a formalized description of the business comes into being. The logical data model helps reinforce the common language that designers, developers, and business personnel can use. The logical data model itself is a composition of business rules and requirements that is implementation neutral. The logical data model also serves as the shared “world view” of what is within the scope of the organizational solution space. Agreement on a logical data model is an important step in getting everyone involved “rowing in the same direction.” In many ways, building the logical data model is a team-building experience for both IT and the business.
How Can Data Modeling Save So Much Time?
It is often thought that data modeling is a time-consuming task, but that view is false. The data modeling tools in the current marketplace, like ER/Studio, are very powerful and enable productivity by helping the user with the science of the modeling process. Creating a logical data model by building an Entity-Relationship Diagram (ERD) is a task that can be accomplished quickly. However, before one can perform data modeling, it takes time to understand the data. This learning-curve time is often mistaken for “data modeling” time. But coming to understand the data is a separate task, and it is a task that needs to be done regardless of whether one’s goal is to build a data model. If you do not understand your data, what exactly are you doing? And WHY aren’t you stopping to learn your data before you attempt anything else? Not understanding one’s data is an invitation to failure.
Initiatives that sidestep understanding the business often end up with hiccups along the way, as major elements are reworked to recover from earlier misunderstandings. In one place I worked, there were functions dealing with duplicate transactions, and these functions were being re-coded every few months. We locked the business experts in a conference room and did a deep dive into the definition of a duplicate transaction. When we were done, our language had expanded to include internal duplicates, horizontal duplicates, vertical duplicates, and a few more. We updated the data model to account for these new semantics and re-coded the processing to align. The code remained stable and unchanged from that point forward. Language is important, business rules are important, and understanding is critical. These misunderstandings are often avoidable when designers and business experts lay a proper foundation by crafting a logical data model. Learn about your organization, be a good partner for the business, and build a logical data model.
Data stakeholders — including various IT functions, data governance teams and business users — need a comprehensive solution to understand, control and distribute reliable data. ER/Studio does this better than anyone by providing unhindered access to enterprise metadata.
Experience for yourself how ER/Studio can help you use data models to be good business partners by scheduling a product demonstration with one of IDERA’s industry experts.
About the Author
Todd Schraml has over fifteen years’ experience in application development and maintenance, including eleven years focused on data warehousing initiatives and five years in database administration on massively parallel processing database management systems. Positions held include Project Manager, Data Warehouse Architect, Technical Lead, Database Administrator, Business Analyst, Developer, and Teacher.
Todd’s focus is on data analysis and design; implementing databases for both operational applications and data warehouses; using new and emerging technologies; and seeking ways to integrate formalized quality practices into Information Technology arenas.