The Big Data Modeling Question…

by Dec 15, 2015

I was with a client for lunch one day and we asked the server "What do you have on tap?"  She was gone quite a long time, but came back and said "Beer."  It turned out she was right.  But her answer was not that helpful.

One of the odd parts of every new project I have to deal with is getting everyone to understand that the question "Can you help us with a data model?" results in me asking, "What do you mean by data model?" That's right, I have to ask team members what a data model is. You'd think experts would know better.

I have to do this because it seems like everyone has a different definition.  Most DBAs want a reverse-engineered image of a production database.  A business user wants data modeling that results in documentation of all the questions, answers and decisions that were made.  A developer wants a specification of something they can build upon.  An executive wants a high-level view of the data concepts a specific project will be addressing so she can approve scope and budgets.  A data scientist wants a consolidated view of both the physical data objects available to him and a logical definition of what they are.  Finally, a data modeler wants a list of previously modeled entities so that she doesn't have to start from scratch on every project.

It's likely that every role in the organization wants a different data model with a different set of metadata. It's also why we need to have a discussion about conceptual, logical and physical data models.  Even that set of terms has differing definitions. That's nearly unforgivable given that we data modelers preach that we should use consistent definitions. Then to make this even more complex, we need to discuss the primitives in the Zachman Framework as well.

This is why native database tools aren't good enough to solve all those needs.  This is why a simple drawing tool isn't enough.  What an enterprise needs are tools that can author, design, and present all those types of data models without creating duplicate copies of those data concepts. It's also why a data modeler needs to ask the question: What do *you* mean by data model?  You need to ask your team members what they are expecting before you start working.  You may need to negotiate priorities or formats.  You may need to create separate views of your models. It's wonderful, though, to deliver what they need from your data models. It's all good.

It's not that we modelers don't know the 100+ possible answers to that question.  It's that we know there are 100+ answers. That's what data modeling is after all: getting to the right answer for this requirement.