With a unified IT monitoring solution like up.time, one of the more common discussions we have with users is agent versus agentless monitoring. In this post I want to capture some of the relevant discussion points to help those who are having this discussion inside their organizations.
We monitor technology for a variety of reasons. The most popular reasons are to assess performance and health. We want to understand when technology is having problems so that we can fix those problems, or ideally, identify impending problems so we can address them before they happen. The technology we monitor can range from devices, such as servers and network or storage devices, to software and applications.
There are two popular approaches to gathering data from the technology we want to monitor – with an agent and without an agent. Agent based monitoring typically involves installing an agent (small executable) on, or alongside, the technology we want to monitor. Agents can be built in a variety of technologies and can work in different ways. They are often supplied by a vendor with a monitoring solution.
There are also technologies we can leverage to perform agentless monitoring. The term “agentless” is a bit of a misnomer because the agentless technology we leverage is essentially an agent. However, the agentless technology is often provided by the vendor of the technology we want to monitor, and is often based on a defined standard. The term “agentless” really speaks to the idea that a third-party agent is not required. Often the agentless technology we leverage still needs to be installed, configured and enabled, not unlike a vendor-supplied agent.
The most popular technologies we leverage in agentless monitoring are Windows Management Instrumentation (WMI) and Simple Network Management Protocol (SNMP). WMI is most often used to monitor and manage Microsoft Windows and other Microsoft software. SNMP is typically used more generically to monitor and manage things like Linux and Unix systems, network and other devices (it can also be used to monitor/manage Microsoft Windows systems, but often isn’t). A detailed discussion of these technologies is outside of the scope of this post. The discussion that follows assumes some familiarity with agentless monitoring technologies.
Both agent and agentless monitoring have advantages and disadvantages; we will discuss some of the more popular advantages and disadvantages below.
Agent Advantages
One of the advantages of running an agent to gather data is that it can often be extended, depending on who created the agent, beyond its base capabilities. Often an agent can be used to perform specific actions. For example, if we notice while monitoring a server that a service or daemon has stopped, the agent can restart it. Also, agents can often be extended to monitor more than one thing. For example, an agent might gather both operating system data and data related to a specific application. In short, agents can be made to do lots of things.
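To make the "perform specific actions" idea concrete, here is a minimal sketch of one pass of an agent that restarts stopped services. It is not how the up.time agent works; the names ServiceCheck and run_agent_pass are hypothetical, and the status and restart operations are passed in as plain callables so that the sketch stays independent of any particular operating system or service manager.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ServiceCheck:
    """One monitored service (hypothetical structure, for illustration only)."""
    name: str
    is_running: Callable[[], bool]  # returns True if the service is up
    restart: Callable[[], None]     # corrective action to take if it is down


def run_agent_pass(checks: List[ServiceCheck]) -> List[str]:
    """Run one monitoring pass and return the names of restarted services."""
    restarted = []
    for check in checks:
        if not check.is_running():
            check.restart()
            restarted.append(check.name)
    return restarted
```

On a real server the two callables would wrap whatever the platform provides for querying and starting a service. An agentless collector can usually report that the service is down, but the corrective step is where an installed agent tends to be the simpler fit.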
Agent Disadvantages
The primary disadvantage to using an agent is just that – you are using an agent that you have to install, manage, update, etc. Depending on the creator of the agent this can be a more or less time consuming task.
Agentless Advantages
The advantage of agentless monitoring is that nothing from the monitoring vendor has to be installed on the devices or alongside the applications being monitored. This means the monitoring environment has fewer moving parts and is often simpler to manage.
Agentless Disadvantages
There is an assumption that required components for agentless monitoring (WMI, SNMP, etc) are in place and are running. This often is not the case by default. For example, a number of Linux distros no longer come with SNMP installed by default. For users that want to run agentless monitoring it is a good practice to ensure the required components are part of their base server installation.
The other big challenge with agentless monitoring is that it uses well known interfaces that by definition allow remote access. This can create security concerns. It is worth noting that agentless technologies like WMI and SNMP provide more than just access to performance data. They also provide some fairly significant management capabilities, such as being able to reboot a server. Like most security concerns, this can be limited through proper configuration. However, that places an education burden on users, who need to fully understand the agentless technologies they are using.
Choosing the Right Approach
In up.time we support a variety of agent based and agentless monitoring options to try to satisfy the needs of different users. I encourage users to think about their monitoring requirements to assess which options are right for them. For example, if you are looking for the easiest option, you are not concerned about potential security risks, and you don’t need to extend monitoring to perform actions, then agentless monitoring might be right for you. If you need the ability to monitor applications not supported by an agentless platform, then agent based monitoring might be right for you.
Sometimes the choice of which approach to take is dictated by other groups within the organization. For example, I know some users that will absolutely, under no circumstances, enable SNMP on their servers – they see it as a security risk. In those environments agent based monitoring solutions are often the best choice.
Also, keep in mind that sometimes the choice of monitoring approach is dictated by the technology. Some devices don’t allow an agent to be installed. Network and storage devices, for example, often don’t allow anything to be installed. In those cases an agentless approach must be used. Another good example of popular technology that must be monitored without an agent is VMware. VMware provides an API that can be used to extract monitoring data because it no longer allows an agent to be installed on the hypervisor.
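The considerations in this section can be written down as a rough decision helper. This is only a sketch of the reasoning above, not a feature of up.time: the Requirements fields and the suggest_approach function are hypothetical names, and the rule order (technology constraints first, then organizational policy, then feature needs) is my own simplification.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical checklist for one monitored system."""
    agent_can_be_installed: bool = True      # False for many network/storage devices
    remote_interfaces_allowed: bool = True   # False if policy forbids SNMP/WMI
    needs_corrective_actions: bool = False   # e.g. restarting a stopped service
    app_covered_by_agentless: bool = True    # agentless platform exposes the data


def suggest_approach(req: Requirements) -> str:
    """Return 'agent' or 'agentless' as a starting point for discussion."""
    # The technology decides first: no agent possible means agentless.
    if not req.agent_can_be_installed:
        return "agentless"
    # Organizational policy comes next: no SNMP/WMI means an agent.
    if not req.remote_interfaces_allowed:
        return "agent"
    # Feature needs that agentless monitoring typically cannot meet.
    if req.needs_corrective_actions or not req.app_covered_by_agentless:
        return "agent"
    # Otherwise the option with the fewest moving parts wins.
    return "agentless"
```

For example, suggest_approach(Requirements(remote_interfaces_allowed=False)) returns "agent", which matches the case of teams that refuse to enable SNMP on their servers. Real environments usually end up with a mix of both approaches, so a helper like this is best applied per system rather than once for the whole organization.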
This is far from a deep dive into all of the possible considerations for choosing an approach to monitoring. Hopefully, this gives you something to think about when you are evaluating which monitoring approach is right in your environment.