hosted services

outline

Firstly, the monitoring tools I had to work with were not my choice. I've used proprietary, open source and in-house tools. This one, OMS, is the worst.

There are several reasons why I say this; I will come to them at the end.

If you find yourself using SCOM, you will notice that the SCOM management system wants to push the agent to the monitored machine. This breaks a lot of best practices. A monitoring server should not double as a system management machine, because that immediately gives the monitoring users full control of the monitored machines: anyone with administration rights on the monitoring machines is now effectively the sysadmin. Bad move, MS.

OMS took this a step further: the MS design allows the agent on the machine to update itself and to own the scripts that it runs. So if you thought you had some control left over your environment, think again, as you cannot easily control when omsagent upgrades itself in test or production.

sudo rules

If you wish to use a custom monitor with SCOM, it is not easy to predict what your sudo rules will be upfront, if at all. The mechanism copies a file from the SCOM manager to the Linux machine, into /etc/opt/microsoft/scom/tmp, and then executes it. In this setup, you have to allow the SCOM manager full control over all Linux machines. My preference would be to permit individual commands, rather than arbitrary files.
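To make "individual commands" concrete, here is a minimal sketch that renders a sudoers fragment from an explicit allowlist and refuses anything that looks like a wildcard. The account name monuser, the alias name and the example commands are all my own placeholders, not anything SCOM ships or requires.

```python
# Characters that are either shell-style wildcards or need escaping in
# sudoers command specs; this sketch simply refuses them.
FORBIDDEN = set("*?[],:=\\")


def render_sudoers(user, commands, alias="MONITOR_CMDS"):
    """Return a sudoers fragment letting `user` run only `commands` as root.

    `user`, `alias` and the commands are hypothetical placeholders.
    """
    for cmd in commands:
        parts = cmd.split()
        if not parts or not parts[0].startswith("/"):
            raise ValueError(f"command must use an absolute path: {cmd!r}")
        if FORBIDDEN & set(cmd):
            raise ValueError(f"wildcard or special character in: {cmd!r}")
    alias_line = f"Cmnd_Alias {alias} = " + ", ".join(commands)
    rule_line = f"{user} ALL=(root) NOPASSWD: {alias}"
    return alias_line + "\n" + rule_line + "\n"


if __name__ == "__main__":
    print(render_sudoers("monuser", ["/usr/bin/df -P", "/usr/bin/uptime"]), end="")
```

A rule that lets the monitoring account execute whatever lands in a temporary directory would be rejected here because of the wildcard, which is the whole point. Anything generated this way should still be checked with visudo -c -f before it goes anywhere near /etc/sudoers.d.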

certificates

SCOM uses CA signing infrastructure for trust relationships. Upon installation, the Linux/Unix monitoring client creates a self-signed certificate that omi presents to inbound connections. Once the SCOM manager is happy, a CA-signed certificate is copied to the machine and moved into place, omi is reloaded, and the machine is then trusted by the SCOM manager.

I'm not entirely sure why this step is needed as the certificate fingerprint could have been trusted at this point without the need for a signing stage.
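What I mean by trusting the fingerprint is roughly the following. This is a sketch of certificate pinning in general, using only the Python standard library; it is not how SCOM works, and the function names are my own.

```python
import hashlib
import hmac
import ssl


def fingerprint(pem_cert):
    """Return the SHA-256 fingerprint (lowercase hex) of a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def is_trusted(pem_cert, pinned_fingerprint):
    """True if the certificate matches a fingerprint recorded earlier."""
    return hmac.compare_digest(fingerprint(pem_cert), pinned_fingerprint.lower())


# Usage sketch (host name and port are assumptions for illustration):
#   pem = ssl.get_server_certificate(("linuxbox.example", 1270))
#   pinned = fingerprint(pem)      # record once, at enrolment
#   ...
#   is_trusted(ssl.get_server_certificate(("linuxbox.example", 1270)), pinned)
```

The manager would record the fingerprint of the self-signed certificate the first time it sees the machine and compare it on every later connection, with no certificate copied back to the client.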

Race conditions in the SCOM manager make it hard to predict at which point during the signing dialog phases the machine will be ready.

multihome

Best avoided.

powershell

It is possible to query the SCOM manager performance database, but I've not spent long enough in this area.

reporting

SCOM does something similar to RRD (Round Robin Database) when it comes to reporting. When you add data in SCOM's monitor, the data will be recorded with high granularity for a little over two and a half days. After that period you will need to query the 'data warehouse' if you want metrics. The sad thing about the data warehouse is that the report often gives you either hourly aggregates or daily averages. This is not terribly helpful. With RRD you can customise the data aperture and determine how accurate you want the data to be and for how long. There are no nasty surprises when the RRD file grows either, since you decide all this ahead of time. Cacti uses this, and you can, too.
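The round-robin idea is simple enough to sketch in a few lines. This is not rrdtool, only an illustration of the principle under my own naming: the storage is sized up front, recent samples are kept raw, and older data survives only as averages.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: recent raw samples plus consolidated averages."""

    def __init__(self, raw_slots, samples_per_average, average_slots):
        # Both buffers have a hard maximum length, so the archive never grows.
        self.raw = deque(maxlen=raw_slots)
        self.averages = deque(maxlen=average_slots)
        self.samples_per_average = samples_per_average
        self._pending = []

    def add(self, value):
        """Record one sample; emit an average once enough have arrived."""
        self.raw.append(value)
        self._pending.append(value)
        if len(self._pending) == self.samples_per_average:
            self.averages.append(sum(self._pending) / len(self._pending))
            self._pending.clear()


if __name__ == "__main__":
    archive = RoundRobinArchive(raw_slots=3, samples_per_average=2, average_slots=2)
    for sample in [1, 3, 5, 7, 9, 11]:
        archive.add(sample)
    print(list(archive.raw), list(archive.averages))
```

The three constructor arguments are the decisions you make ahead of time: how many raw samples to keep, how many samples go into one average, and how many averages to keep. Once they are fixed, the oldest entries simply fall off the end.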

Getting reports from SCOM can seem laborious at first.

  • start operations manager
  • navigate to reporting -> 'Microsoft Generic Report Library'
  • performance
  • data aggregation (daily)
  • objects
  • change...
  • new chart -> new series
  • add object group
  • locate the machine you're interested in, click add
  • click ok
  • click rule -> browse -> management pack (RHEL...) click ok
  • click ok, then run

I prefer command line interfaces myself.