Cloud Day 2: Understanding Cloud Storage

Nov 9, 2020

Welcome to Cloud Day 2. In this series you will learn about some slightly more advanced topics in the public cloud. We're going to go beyond just deploying a VM with SQL Server installed or deploying an Azure SQL Database. Some of the topics will relate directly to SQL Server, while others (such as this post on storage) will supply the supporting information a DBA needs to make their cloud deployments successful. I hope you enjoy it, and if you have any questions, feel free to put them in the comments.

Storage is a touchy topic for DBAs. For anyone who is relatively new to IT: prior to the widespread use of solid state disks (SSDs), storage was a long-time bottleneck in database performance. For about 20 years, the performance of CPUs and memory increased while storage merely got denser, not significantly faster. With the advent of SSDs and their use in storage area networks (SANs) over the last decade, storage performance has improved by leaps and bounds and has become much less of a concern. Performance isn't the only pain point: in an on-premises world, adding storage can require lead time. You may need to order hardware, and deal with storage admins judging you because you asked for another terabyte of backups.

Then everyone started to move to the cloud. Cloud computing changed the storage paradigm: you effectively have limitless storage available to you, so there's no lead time if you need a petabyte of storage, say, next week. Storage in the cloud also tends to be cheaper than on-premises storage, though your mileage may vary. That's not the only difference. Unlike most SANs, which are attached to physical servers via fibre-optic cables, storage in the cloud uses a technology called object-based storage rather than the traditional block-based storage model (some on-premises SANs also use this approach).

Block storage splits files into evenly sized blocks of data, each having its own address, but with no other information about what the data is. For example, your SAN doesn't know the difference between a block belonging to an operating system disk, one containing SQL Server data, an image file, or a video. The block storage model works well, but at scale, beyond hundreds of petabytes, you can run into durability problems, hard limits in the storage infrastructure, and high management overhead.
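To make that concrete, here is a toy Python sketch (not any vendor's actual implementation) of the block model: a file is chopped into fixed-size chunks, and the storage layer keeps only a numeric address for each one, with no idea what the bytes mean.

```python
# A toy illustration of the block model: fixed-size blocks with
# numeric addresses, and no metadata about what the bytes represent.
BLOCK_SIZE = 4096  # bytes; real systems use a variety of block sizes

def split_into_blocks(data: bytes) -> dict:
    """Map each block address to a raw, anonymous chunk of data."""
    return {
        address: data[offset:offset + BLOCK_SIZE]
        for address, offset in enumerate(range(0, len(data), BLOCK_SIZE))
    }

with open("AdventureWorks.mdf", "rb") as f:  # any file at all
    blocks = split_into_blocks(f.read())
# The storage layer sees only {0: b'...', 1: b'...', ...}; it cannot
# tell a SQL Server data file from an OS disk or a video.
```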

In object storage, everything is a file: an OS disk becomes a virtual disk file, a video is simply a video. In addition to storing the file itself, the object store also holds metadata and a unique identifier (yes, GUIDs really do run the world). Object-based storage is designed as scale-out infrastructure, and capacity is managed by simply adding nodes. Another difference is that object-based storage rejects the traditional redundant array of independent disks (RAID) model, and instead persists data by maintaining three copies of each object (or file). This redundancy allows for easy corruption checking via simple checksum calculations, and in case of failure, the storage fabric creates an additional copy from one of the remaining good copies.
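For contrast with the block sketch above, here is an equally illustrative sketch of an object record: the payload travels with metadata, a unique identifier, and a checksum the storage fabric can use to detect a corrupt copy (real object stores use their own formats and hash algorithms; this just shows the idea).

```python
import hashlib
import uuid

# Illustrative object-store record: payload plus metadata, a unique
# identifier, and a checksum for corruption detection.
def make_object(data: bytes, content_type: str) -> dict:
    return {
        "id": str(uuid.uuid4()),       # GUIDs really do run the world
        "content_type": content_type,  # the store knows what this is
        "checksum": hashlib.md5(data).hexdigest(),
        "data": data,
    }

def is_corrupt(obj: dict) -> bool:
    """Recompute the checksum; on a mismatch the fabric would rebuild
    this copy from one of the remaining good copies."""
    return hashlib.md5(obj["data"]).hexdigest() != obj["checksum"]
```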

It is not critical that you understand the fundamental architecture of cloud storage, but it does help you understand durability, and why the cloud providers have built the storage offerings they have. One thing you should note: while some of these offerings appear to be the same as traditional on-premises offerings, they are all built on the aforementioned object-based technology.

First, let's talk about file storage. You can effectively think of this as a file server as a service. You have scale and performance options, and while this option can potentially be used to host database files, that is not a common configuration. It is relatively expensive compared to Azure Blob or Amazon S3 storage, but it offers an easy replacement for on-premises solutions and can allow for authentication using Windows Active Directory credentials.
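File storage is also easy to script against. The sketch below uses the azure-storage-file-share Python SDK to upload a file to an Azure Files share; the connection string, share name, and file names are placeholders, not a prescribed setup.

```python
# Sketch: upload a file to an Azure Files share.
# Requires: pip install azure-storage-file-share
# Connection string, share name, and paths are placeholders.
from azure.storage.fileshare import ShareFileClient

file_client = ShareFileClient.from_connection_string(
    conn_str="<your-storage-account-connection-string>",
    share_name="dbascripts",
    file_path="maintenance/index_rebuild.sql",
)

with open("index_rebuild.sql", "rb") as source:
    file_client.upload_file(source)  # creates the file in the share
```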

The original implementation of cloud storage is fully object-based blob storage. In Azure this service is called Azure Blob Storage; in Amazon it's S3. In fact, in the early days of Azure, when you created a disk in a VM you had to create a virtual hard disk (VHD) file in a storage account and map it to the VM. While that is no longer the case for VMs, object storage can be used by all sorts of applications for storing data. Whether it's database backup files (if you are in Azure, you can take advantage of SQL Server's backup to URL functionality) or images that you are hosting on a website, object storage offers you an extremely cost-effective way of storing data. Both S3 and Azure Blob Storage offer service tiers, which balance performance and usage against pricing. You should note that for cool and archive tiers, while the storage costs can be much lower, the data retrieval costs can be very expensive, so use these tiers with care.
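As a rough illustration of the blob workflow, here is how pushing a backup file to Azure Blob Storage and then moving it to a cheaper tier might look with the azure-storage-blob Python SDK. The container, blob name, and connection string are placeholders, and the same archive-retrieval-cost warning applies.

```python
# Sketch: upload a backup file to Azure Blob Storage, then move it to
# the Archive tier. Requires: pip install azure-storage-blob
# Container, blob name, and connection string are placeholders.
from azure.storage.blob import BlobClient

blob = BlobClient.from_connection_string(
    conn_str="<your-storage-account-connection-string>",
    container_name="backups",
    blob_name="AdventureWorks_20201109.bak",
)

with open("AdventureWorks_20201109.bak", "rb") as data:
    blob.upload_blob(data)

# Archive storage is cheap to keep but requires rehydration to read,
# which can be slow and expensive; only archive backups you rarely
# expect to restore.
blob.set_standard_blob_tier("Archive")
```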

VM storage comes in a couple of different flavors. VMs on most platforms have two types of storage: local flash storage, which is ephemeral (erased when the VM is powered off and deallocated), and network-attached storage, which is the core storage for VMs and itself comes in a few different performance tiers. The locally attached storage is the lowest-latency option; however, the only practical use for it with SQL Server is TempDB files, which are recreated on instance startup. You can't safely store data and log files on these drives (unless you build a highly available solution like Azure SQL Database uses), but they are a good fit for a busy TempDB workload. The network-attached storage (called Managed Disks in Azure and Elastic Block Store in AWS) is what is used for most database workloads. In Azure you should use Premium Storage or Ultra Disk, and in Amazon, Provisioned IOPS or NVMe volumes. Standard storage, and even standard SSD disks, from both cloud providers do not offer the performance needed for even light database workloads.
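One practical consequence of the ephemeral drive being wiped on deallocation: the TempDB folder has to exist again before SQL Server starts, so this is usually handled with a VM startup task. Here is a minimal sketch of that idea in Python, assuming TempDB lives on a D: ephemeral drive (typical on Azure VMs) and a default instance; production setups often do the same thing in a scheduled task or script of your choosing.

```python
# Sketch of a VM startup task: recreate the TempDB folder on the
# ephemeral drive before the SQL Server service starts. Assumes the
# ephemeral drive is D: (typical on Azure) and a default instance.
import os
import subprocess

TEMPDB_PATH = r"D:\tempdb"  # placeholder; match your TempDB file locations

os.makedirs(TEMPDB_PATH, exist_ok=True)

# Start SQL Server only after the folder exists, or the instance will
# fail to start because it cannot create the TempDB files.
subprocess.run(["net", "start", "MSSQLSERVER"], check=True)
```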

A common pattern to improve disk performance is to add multiple storage volumes and stripe them using Storage Spaces in Windows or Logical Volume Manager in Linux. You should also pay attention to the disk caching options available on your cloud platform; in Azure specifically, SQL Server data file volumes should have read caching enabled. Most SQL Server workloads, even OLTP systems, are 70-80% read operations, so caching reads will benefit nearly all databases. SQL Server does not benefit from write caching, so you shouldn't enable it.
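If you want to verify the read/write mix of your own workload before enabling read caching, a quick query against the sys.dm_io_virtual_file_stats DMV will tell you. Here is a sketch using pyodbc; the connection string is a placeholder.

```python
# Sketch: measure the read/write mix of a SQL Server instance using
# the sys.dm_io_virtual_file_stats DMV. Requires: pip install pyodbc
# The connection string is a placeholder.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"
    "DATABASE=master;Trusted_Connection=yes;"
)
row = conn.execute(
    """
    SELECT SUM(num_of_reads) AS reads, SUM(num_of_writes) AS writes
    FROM sys.dm_io_virtual_file_stats(NULL, NULL)
    """
).fetchone()

read_pct = 100.0 * row.reads / (row.reads + row.writes)
print(f"Reads are {read_pct:.1f}% of I/O operations since instance start")
```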

Storage in the cloud is quite different from what you are used to in an on-premises world. While the ultimate performance of cloud storage might not match high-end on-premises storage, in my experience properly configured cloud storage can meet the needs of 99% of workloads. Learning your options and balancing cost, data volume, and performance are the key to successful cloud database deployments.