The Cheat-Sheet on SharePoint’s Distributed Cache

by Nov 3, 2013

I was in Indiana this past weekend at SharePoint Saturday Indianapolis, and one of the presentations I delivered at the event was a favorite of mine: “‘Caching-In’ for SharePoint Performance.” With SharePoint 2013 hitting the streets, I decided it would be a good idea to update the presentation with changes that are coming in the new version of the platform.

When it comes to the topic of caching and SharePoint 2013, there’s no bigger change than the addition of the Distributed Cache service. In addition to my own questions, I’ve recently seen quite a few posts in #SPHelp on Twitter that suggest others want to learn more. For these reasons, I figured I’d round up all the information I could find on the Distributed Cache service and crunch through it.

What Is It?

The SharePoint Distributed Cache service provides additional caching support beyond the options that already exist and have been carried over from SharePoint 2010 and SharePoint 2007 before it (i.e., the Object Cache, BLOB Cache, and Page Output Cache).

The Distributed Cache service is actually built on top of the Windows Server AppFabric Cache. You don’t know anything about the Windows Server AppFabric Cache, you say? No worries: SharePoint 2013’s prerequisites installer takes care of installing the Windows Server AppFabric and configuring it for cache operations on your behalf. There is no initial action required on the part of administrators to get the Distributed Cache service installed and operational in their SharePoint 2013 environment.

If for some reason you’ve already installed the Windows Server AppFabric on one or more of your SharePoint servers, it is recommended that you uninstall it and let the prerequisite installer take care of installation and configuration for you as part of the SharePoint 2013 setup process. If you really want (or need) to set up the Windows Server AppFabric manually, be sure to follow the steps laid out on TechNet to ensure that it is configured correctly for SharePoint’s usage.

The Distributed Cache Service Is Running On All Of My Servers?
Yes, it is running on all of your SharePoint 2013 Servers by default. Consider the small farm shown on the right. The farm consists of three servers (and please excuse my old terminology and descriptors for the sake of indicating each server’s primary function):

Server A. SharePoint 2013 web front-end (WFE) with 12GB of memory
Server B. SharePoint 2013 service application host with 20GB of memory
Server C. SQL Server 2012 database server with 32GB of memory

Of these three servers, two of them have the SharePoint 2013 bits installed on them: Server A and Server B. These two servers, by extension, are also running the Windows Server AppFabric service. If you were to look at the list of services running on each of these servers, you’d find something similar to the following.

Server A and Server B are known as cache hosts, and together they form a cache cluster of shared memory for use by all SharePoint farm members.

How Much Memory Is In Use?

When SharePoint 2013 is initially set up and the Distributed Cache service is configured to run, it is set to use 10% of the server’s total (physical) memory. In the case of the small farm previously shown this would mean the following:

Server A. 1.2GB memory (10% of 12GB)
Server B. 2.0GB memory (10% of 20GB)
Server C. Not a cache host

Altogether, this yields a total cache cluster size of 3.2GB (1.2GB + 2.0GB) of shared memory for the farm.
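
The arithmetic for the example farm can be sketched in a few lines of Python. This is only an illustration: the server names and memory figures come from the farm described above, while the function name default_cache_size_gb and the DEFAULT_CACHE_FRACTION constant are hypothetical and simply encode the 10% default cited in this post. Nothing here calls into SharePoint.

```python
# Hypothetical sketch: default Distributed Cache sizing for the example farm.
DEFAULT_CACHE_FRACTION = 0.10  # assumption: the 10% default cited in this post


def default_cache_size_gb(total_memory_gb, is_cache_host=True):
    """Return the default cache allocation in GB, or 0 for non-cache hosts."""
    if not is_cache_host:
        return 0.0
    return total_memory_gb * DEFAULT_CACHE_FRACTION


# (server name, physical memory in GB, is a cache host)
farm = [
    ("Server A", 12, True),
    ("Server B", 20, True),
    ("Server C", 32, False),  # SQL Server only, so not a cache host
]

allocations = {name: default_cache_size_gb(mem, host) for name, mem, host in farm}
cluster_size_gb = sum(allocations.values())

for name, size in allocations.items():
    print(f"{name}: {size:.1f}GB")
print(f"Cluster total: {cluster_size_gb:.1f}GB")
```

If the 5% figure from the older TechNet page turned out to be the right default, changing the one constant would halve every number.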

The cache hosts (Server A and Server B) are said to be running in collocated mode in the example because other services are running on the servers in addition to the Distributed Cache service and the underlying Windows Server AppFabric cache. It is possible to configure cache hosts in dedicated mode by shutting down non-essential services.

It is worth noting that the memory allocation for cache hosts can be increased or decreased via the Update-SPDistributedCacheSize PowerShell cmdlet.

Note: You may find contradictory information relating to the Distributed Cache service on TechNet, so pay attention to the date an article or page was last updated. At the time I wrote this, for example, the Update-SPDistributedCacheSize cmdlet page indicates that the default cache size is 5% of total system RAM rather than the 10% I cited earlier in this post. The value of 10% was obtained from a newer TechNet source, and so I consider it to be “more correct.”

Microsoft recommends that you avoid running the following services/applications on servers where the Distributed Cache service is in operation:

SQL Server 2008 or SQL Server 2012
Search Service
Excel Services
Project Server Services

What Goes In The Cache?
Actually, there are a number of different pieces of data that end up in the Distributed Cache. Many of these are tied to SharePoint 2013’s social features, such as tags and document activities. A lot of authentication and security information gets cached for performance, as well – items tied to (Claims) tokens, security trimming, and more. I’d recommend checking out TechNet if you want the complete list.

Why Do We Need The Distributed Cache?
Many of the objects that end up in the Distributed Cache are computationally expensive, time-intensive to fetch, or a combination of the two. Storing objects and data of this nature in a chunk of memory that is spread across servers allows SharePoint to carry out many different types of requests and operations more quickly.

Consider a lengthy process such as authentication in a federated identity scenario. A complex orchestration of network and service calls is required to authenticate a user and build a claims token for use in subsequent authorization operations. Caching such tokens in the Distributed Cache removes the need to constantly re-execute the sequence of (relatively slow) calls and can dramatically improve SharePoint performance.
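
The general idea of caching an expensive result can be sketched as follows. To be clear, this is not how SharePoint or AppFabric implements the Distributed Cache: it is a single-process, in-memory illustration of the cache-aside pattern, and the names TokenCache, get_or_create, and slow_authenticate are all hypothetical stand-ins.

```python
import time


class TokenCache:
    """Tiny in-memory cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user -> (token, expires_at)

    def get_or_create(self, user, build_token):
        entry = self._entries.get(user)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: skip the expensive call
        token = build_token(user)  # cache miss or expired: do the slow work once
        self._entries[user] = (token, now + self._ttl)
        return token


calls = []  # records every time the slow path actually runs


def slow_authenticate(user):
    """Hypothetical stand-in for the federated authentication round trips."""
    calls.append(user)
    return f"token-for-{user}"


cache = TokenCache(ttl_seconds=600)
first = cache.get_or_create("alice", slow_authenticate)
second = cache.get_or_create("alice", slow_authenticate)
print(first == second, len(calls))
```

The second lookup is served from memory, so the slow function runs only once. The real service spreads entries across cache hosts rather than keeping them in a single dictionary, but the payoff is the same: repeated requests avoid the slow sequence of calls.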

Are There Any Guidelines?

Yes, Microsoft has published a handful of guidelines pertaining to the Distributed Cache service. The suggestions are scattered all over the place, but I tried to aggregate them below.

Bear in mind that at the time I’m writing this, the Distributed Cache service is still “new technology” in the SharePoint world. Over time and with much broader use, guidance regarding the Distributed Cache service is likely to evolve and even change. Pay particular attention to publication dates on posts and articles!

Oh, and an important note: if you’re going to make any configuration or service changes, I implore you to pay attention to the following warning:

We don’t often receive warnings that suggest a change or reconfiguration on our part could force a full SharePoint farm rebuild, so be careful!

Without further ado, here’s the guidance I’ve managed to gather and interpret:

1. As the warning above strongly suggests, do not manage the Distributed Cache service through either the Services MMC snap-in or the generic Windows Server AppFabric tools! Use SharePoint Central Administration and the SharePoint PowerShell cmdlets designed for the purpose.

2. Anytime you need to shut down the Distributed Cache service on a cache host (via the Stop-SPDistributedCacheServiceInstance cmdlet), such as to remove a cache host from its cache cluster, use the -Graceful switch to avoid data loss. Although it takes longer to shut the service down this way, cached items are preserved (i.e., transferred to another cache host) and end-users get a better experience.

3. If your SharePoint Servers (specifically, your cache hosts) are virtual machines (VMs), do not use dynamic memory for those VMs. Dynamic memory allows you to squeeze “more” out of a hardware host, but it can cause problems for the Distributed Cache service since the actual physical memory assigned to a VM is variable. For Distributed Cache cache hosts, use fixed memory allocations in your VM configurations.

4. When adding and removing cache hosts to a cache cluster, be aware that the Distributed Cache service depends on Internet Control Message Protocol (ICMP) for operation – likely to ping other cache hosts to determine their availability and readiness. This may require you to make firewall changes in your environment and on your cache hosts.

5. All cache hosts in a cache cluster should be configured with the same Distributed Cache service memory allocation, and that value shouldn’t be less than 8GB per server.

6. Don’t allocate more than 16GB of memory to the Distributed Cache service on any single cache host – even if the system has more RAM available. Allocating more than 16GB of memory may cause the server to stop responding for periods in excess of 10 seconds.

7. The maximum number of cache hosts per cache cluster is 16.

8. The Distributed Cache service on a cache host throttles requests when memory consumption approaches 95%. Until memory utilization drops back to approximately 70%, cache read and write requests are not accepted. Keep an eye on memory usage and the Event Log for signs that a server is memory-starved, and consider adding cache hosts in such circumstances.

9. The SharePoint 2013 Health Analyzer has a few rules that will surface issues with the Distributed Cache service. Keep an eye on Central Administration and Health Analyzer Reports.
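
A few of the numeric guidelines above (5 through 8) lend themselves to a quick sanity check. The sketch below is a rough Python illustration, not a SharePoint tool: check_cluster and next_throttle_state are hypothetical names, the host allocations are supplied by hand, and the 95% and 70% marks are treated as hard thresholds even though the guidance describes them only approximately.

```python
MIN_CACHE_GB_PER_HOST = 8    # guideline 5
MAX_CACHE_GB_PER_HOST = 16   # guideline 6
MAX_HOSTS_PER_CLUSTER = 16   # guideline 7
THROTTLE_ON_PERCENT = 95     # guideline 8: throttling starts near this level
THROTTLE_OFF_PERCENT = 70    # guideline 8: requests resume near this level


def check_cluster(allocations_gb):
    """Return a list of guideline violations for {host name: cache size in GB}."""
    problems = []
    if len(allocations_gb) > MAX_HOSTS_PER_CLUSTER:
        problems.append(f"more than {MAX_HOSTS_PER_CLUSTER} cache hosts")
    if len(set(allocations_gb.values())) > 1:
        problems.append("cache hosts have different memory allocations")
    for host, size in allocations_gb.items():
        if size < MIN_CACHE_GB_PER_HOST:
            problems.append(f"{host} is below {MIN_CACHE_GB_PER_HOST}GB")
        if size > MAX_CACHE_GB_PER_HOST:
            problems.append(f"{host} exceeds {MAX_CACHE_GB_PER_HOST}GB")
    return problems


def next_throttle_state(throttled, memory_percent):
    """Hysteresis: start throttling at the high mark, stop only at the low mark."""
    if not throttled and memory_percent >= THROTTLE_ON_PERCENT:
        return True
    if throttled and memory_percent <= THROTTLE_OFF_PERCENT:
        return False
    return throttled


# The default allocations from the small example farm earlier in this post.
for problem in check_cluster({"Server A": 1.2, "Server B": 2.0}):
    print(problem)

# A made-up series of memory readings to show the throttle behavior.
state = False
for reading in (80, 96, 85, 72, 69, 80):
    state = next_throttle_state(state, reading)
    print(reading, "throttled" if state else "accepting requests")
```

The second function shows why a throttled host does not recover the moment usage dips below 95%: once throttling begins, requests stay blocked until usage falls all the way back to the lower mark.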

References and Resources
Event: SharePoint Saturday Indianapolis
Blog: Presentations on SharePointInterface.com
MSDN: Caching in SharePoint
MSDN: Windows Server AppFabric Caching Physical Architecture Diagram
TechNet: Hardware and Software Requirements for SharePoint 2013
TechNet: Plan for Feeds and the Distributed Cache Service in SharePoint 2013
TechNet: Update-SPDistributedCacheSize
TechNet: Overview of Microblog Features, Feeds, and the Distributed Cache Service in SharePoint Server 2013
MSDN: Claims-Based Architectures
TechNet: Manage the Distributed Cache Service in SharePoint 2013
TechNet: Stop-SPDistributedCacheServiceInstance
TechNet: Hyper-V Dynamic Memory Configuration Guide
Wikipedia: Internet Control Message Protocol
Steve Peschka: Configuring Multiple Distributed Cache Servers in SharePoint 2013
TechNet: Plan and Use the Distributed Cache Service in SharePoint 2013
TechNet: Software Boundaries and Limits for SharePoint 2013