We have a rather large SQLDM configuration here, with 140+ servers / 1000+ databases being monitored simultaneously from one SQLDM instance. However, monitoring feels like it is starting to fail – many servers, even ones that are not that busy, have gaps in their monitoring, varying from 10-15 minutes up to several days.
There is some correlation with server load, with the busiest servers having the largest and most frequent gaps. However, the correlation is not straightforward. Production Server A, our biggest server with the highest average load, maxes out at gaps of about a day, whereas Production Server S has an ongoing gap of nearly four days. Development Server A has gaps of up to 50 minutes even when nothing is running on it. QA Server S has been running heavily for the last few days but only has gaps of up to an hour, whereas QA Server A has larger gaps than Production Server A.
When you are monitoring a specific server, such as Server S, you see values feed into the graphs as time passes. However, if you switch to monitoring another server and then return to S, no snapshots have been recorded for the period you were away.
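To put numbers on gaps like these, the snapshot times for one server could be checked against the expected collection interval. The following is only a minimal sketch: it assumes the timestamps have already been exported from the repository by some means (no SQLDM table or column names are implied), and `find_gaps`, `expected_interval`, and `tolerance` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected_interval, tolerance=2.0):
    """Return (start, end, duration) for every pair of consecutive
    snapshots that are more than tolerance * expected_interval apart."""
    ordered = sorted(timestamps)
    threshold = expected_interval * tolerance
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        delta = curr - prev
        if delta > threshold:
            gaps.append((prev, curr, delta))
    return gaps

# Made-up sample data: snapshots expected every 6 minutes (an assumed interval).
snapshots = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 6),
    datetime(2024, 1, 1, 0, 56),
    datetime(2024, 1, 1, 1, 2),
]

for start, end, duration in find_gaps(snapshots, timedelta(minutes=6)):
    print(f"gap from {start} to {end} ({duration})")
```

Running this per server would make it easier to compare gap sizes against load, rather than judging them from the graphs.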
Does anyone know whether this is just the limit of what Idera can monitor on the hardware we are using? Or has anyone experienced this before and found something to fix or examine?
NOTE: We intend to open a ticket with Idera soon, but I was hoping that some genius here could give us a hint on what we should be looking at.
IDERA SQL Diagnostic Manager Desktop Client 10.2.0.3269
IDERA SQL Diagnostic Manager Repository 10.2
IDERA SQL Diagnostic Manager Management Service 10.2.0.3270
IDERA SQL Diagnostic Manager Collection Service 10.2.0.3269
Microsoft Data Access Component (MDAC) 6.3.9600.16384
Microsoft .Net Framework 4.0.30319.34014
Microsoft Windows Operating System Microsoft Windows NT 6.2.9200.0
SQLDM Mobile and Newsfeed Version 220.127.116.11