Seven Deadly Sins of Database Design

Jul 31, 2014

If you don’t get the data right, nothing else matters. However, the business focus on applications often overshadows the need for a well-organized database design. Addressing some simple data modeling and design fundamentals can put you on the right path. Here are seven common database design “sins” that are easily avoided, along with ways to correct them in future projects.
Sin #1: Poor or missing documentation for database(s) in production
Documentation for databases usually falls into one of three categories: incomplete, inaccurate, or missing altogether. This leaves developers, DBAs, architects, and business analysts scrambling to get on the same page, forced to rely on their own interpretations of the meaning and usage of the data.
The best approach is to place the data models in a central repository and generate automated reports so that, with minimal effort, everyone benefits. Producing a central store of models is only half the battle, though. Once that is done, running validation and quality metrics will improve the quality of the models over time. As your data management maturity increases, you can extend the metadata captured in the models.
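Even before a full repository is in place, the database catalog itself can seed a generated data dictionary. The sketch below is a minimal example, assuming a platform that exposes the standard INFORMATION_SCHEMA views (SQL Server, MySQL, and PostgreSQL all do); it simply inventories tables and columns as a starting point for documentation.

```sql
-- Minimal starting point for generated documentation: list every column,
-- its data type, and its nullability, in the order it appears in each table.
SELECT c.table_schema,
       c.table_name,
       c.column_name,
       c.data_type,
       c.is_nullable
FROM   information_schema.columns AS c
ORDER BY c.table_schema, c.table_name, c.ordinal_position;
```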
Sin #2: Little or no normalization
There are times to denormalize a database structure to achieve optimized performance, but sacrificing flexibility can paint you into a corner. Despite a long-held belief among developers, one table that stores everything is rarely optimal. Another common mistake is storing repeating values in a table, which greatly decreases flexibility and makes updates harder.
Understanding even the basics of normalization adds flexibility to a design while reducing redundant data. The first three normal forms are usually sufficient for most cases (a short sketch follows the list):

·         First Normal Form: Eliminate repeating groups and duplicate columns; each column should hold a single, atomic value

·         Second Normal Form: Ensure every non-key column depends on the whole primary key, not just part of a composite key

·         Third Normal Form: Ensure every non-key column depends only on the primary key, with no dependencies between non-key columns
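As a simple, hypothetical illustration (the table and column names are invented for this sketch), the first design below repeats customer values and uses a repeating group of item columns, while the normalized design stores each fact once:

```sql
-- Unnormalized: repeating columns and repeated customer values in every row.
CREATE TABLE orders_flat (
    order_id       INT PRIMARY KEY,
    customer_name  VARCHAR(100),   -- repeated on every order for the customer
    customer_phone VARCHAR(20),    -- repeated, so updates must touch many rows
    item1_sku      VARCHAR(20),    -- repeating group: item1..item3
    item2_sku      VARCHAR(20),
    item3_sku      VARCHAR(20)
);

-- Normalized (through third normal form): customer data lives in one place,
-- and order lines become rows instead of repeating columns.
CREATE TABLE customers (
    customer_id    INT PRIMARY KEY,
    customer_name  VARCHAR(100) NOT NULL,
    customer_phone VARCHAR(20)
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT NOT NULL REFERENCES customers (customer_id)
);

CREATE TABLE order_lines (
    order_id  INT NOT NULL REFERENCES orders (order_id),
    line_no   INT NOT NULL,
    sku       VARCHAR(20) NOT NULL,
    PRIMARY KEY (order_id, line_no)
);
```

With the normalized design, changing a customer’s phone number touches one row, and an order can carry any number of lines without altering the schema.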

Sin #3: Not treating the data model like a living, breathing organism
Many customers do the modeling up front, but once the design is in production, all modeling ceases.
To maintain flexibility and ensure consistency when the database changes, those modifications need to find their way back to the model.
Sin #4: Improper storage of reference data
There are two main problems with reference data. It is either stored in many places or, even worse, embedded in the application code.
Reference values provide valuable documentation, which should be communicated in an appropriate location; often the model is the best place for it. The key is to have reference data defined in one place and used everywhere else, as in the sketch below.
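A minimal, hypothetical example of that pattern: the status codes live in a single reference table, and every table that needs them points at it, so neither the values nor their meanings end up hard-coded in application logic.

```sql
-- Reference data defined in exactly one place...
CREATE TABLE order_status (
    status_code  CHAR(3)     PRIMARY KEY,
    description  VARCHAR(50) NOT NULL
);

INSERT INTO order_status (status_code, description)
VALUES ('NEW', 'Order received'),
       ('SHP', 'Shipped'),
       ('CAN', 'Cancelled');

-- ...and used everywhere else via a foreign key, so applications cannot
-- invent values that the reference table (and the model) do not document.
CREATE TABLE orders (
    order_id    INT     PRIMARY KEY,
    status_code CHAR(3) NOT NULL REFERENCES order_status (status_code)
);
```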
Sin #5: Not using foreign keys or check constraints
When reverse engineering databases, customers complain all the time about the lack of referential integrity (RI) or validation checks defined in the database. On older database systems it was thought that foreign keys and check constraints slowed performance, so the RI and validation checks were pushed into the application.
If it is possible to validate the data in the database, you should do it there. Error handling will be drastically simplified and data quality will increase as a result.
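As a brief sketch (the tables are hypothetical), both kinds of rules can be declared once in the schema so that every application writing to these tables gets the same enforcement:

```sql
CREATE TABLE departments (
    department_id   INT          PRIMARY KEY,
    department_name VARCHAR(100) NOT NULL
);

CREATE TABLE employees (
    employee_id   INT           PRIMARY KEY,
    department_id INT           NOT NULL,
    salary        DECIMAL(10,2) NOT NULL,
    -- Referential integrity: an employee must belong to an existing department.
    CONSTRAINT fk_employees_departments
        FOREIGN KEY (department_id) REFERENCES departments (department_id),
    -- Validation check: the database rejects nonsensical salaries outright.
    CONSTRAINT ck_employees_salary_positive
        CHECK (salary > 0)
);
```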
Sin #6: Not using domains and naming standards
Domains and naming standards are probably two of the most important things you can incorporate into your modeling practices. Domains allow you to create reusable attributes so that the same attributes are not created in different places with different properties. Naming standards allow you to clearly identify those attributes consistently.
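How domains are declared depends on the tool or platform; data modeling tools support them as reusable attribute definitions, and some databases expose them directly. As one hedged example, PostgreSQL’s CREATE DOMAIN lets a definition such as a hypothetical email_address be declared once and reused under a consistent name:

```sql
-- Declared once: data type, length, and validation rule travel with the name.
CREATE DOMAIN email_address AS VARCHAR(254)
    CHECK (VALUE LIKE '%_@_%.%');

-- Reused consistently, so the attribute has the same properties everywhere.
CREATE TABLE customers (
    customer_id   INT           PRIMARY KEY,
    contact_email email_address NOT NULL
);

CREATE TABLE suppliers (
    supplier_id   INT           PRIMARY KEY,
    contact_email email_address NOT NULL
);
```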
Sin #7: Not choosing primary keys properly
The simplest principle to remember when picking a primary key is SUM: Static, Unique, Minimal. It is not necessary to delve into the whole natural vs. surrogate key debate; however, it is important to know that although surrogate keys may uniquely identify the record, they do not always uniquely identify the data. There is a time and a place for both, and you can always create an alternate key for natural keys if a surrogate is used as the primary key.
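A small sketch of that alternate-key pattern (the product table is hypothetical): a surrogate key serves as the primary key for joins, while a unique constraint on the natural key keeps the data itself unique:

```sql
CREATE TABLE products (
    product_id   INT          PRIMARY KEY,   -- surrogate key: static, unique, minimal
    sku          VARCHAR(20)  NOT NULL,
    product_name VARCHAR(100) NOT NULL,
    -- Alternate key on the natural identifier so duplicate SKUs cannot appear.
    CONSTRAINT ak_products_sku UNIQUE (sku)
);
```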
Sin-free databases
Hopefully this information has helped you to evaluate your data management habits and assess your current database structure. Eliminating all of these issues may take several steps, but it is important to have a plan to get there. Embarcadero can help you address these concerns; contact us to learn more. You can also read the white paper on this topic: http://forms.embarcadero.com/7_deadlysins_datamodeling.