Health-Check & Monitoring

opsi offers a health check that reports the operating status of the opsi server. It can be called, for example, via the web interface of the opsi server, and it provides an overview of the status of the server's various components.

The health check can be integrated into a monitoring system in order to monitor the status of the opsi server.

Health check

opsiconfd provides a health check that examines various settings and versions of opsi components and thus points to possible problems. The health check can be started in different ways. All variants obtain their data from the API call service_healthCheck. The opsi API returns the data in JSON format; such JSON output is particularly useful for support requests.

One way to start a health check is via the admin page, RPC interface tab (see section RPC Interface). The WebGUI also provides quick access to the health check. On the command line, the health check is started with the command opsiconfd health-check. The --help parameter displays a help text; opsiconfd health-check --documentation displays a description of all checks. Without further options, the check runs once and writes its results to stdout.

opsiconfd health-check --list outputs all available checks with name and ID. A specific check can be executed with opsiconfd health-check --checks <ID>. Checks that are not to be executed can be skipped with --skip-checks <ID>. Both options can also be configured in /etc/opsi/opsiconfd.conf or via environment variables (OPSICONFD_CHECKS or OPSICONFD_SKIP_CHECKS).
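As a minimal sketch, the following shell helpers wrap the calls described above. The function names are made up for this illustration, and the argument is a check ID as printed by opsiconfd health-check --list. The last helper assumes that a single check ID is a valid value for the environment variable.

```shell
# Illustrative helpers only; the function names are hypothetical.

# Print all available checks with name and ID.
list_checks() {
    opsiconfd health-check --list
}

# Run only the check with the given ID.
run_only() {
    opsiconfd health-check --checks "$1"
}

# Run all checks except the one with the given ID.
run_without() {
    opsiconfd health-check --skip-checks "$1"
}

# Same as run_without, but the ID is passed via the environment
# (assumption: a single ID is accepted as the variable's value).
run_without_env() {
    OPSICONFD_SKIP_CHECKS="$1" opsiconfd health-check
}
```

Keeping the ID as an argument avoids hard-coding check names, which may differ between opsi versions.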

Figure: You can start the health check in the terminal.

The health check can also be started with the command line tool opsi-cli (see section opsi-cli support). Quick access to a terminal on the opsi server is provided by the admin page via the Terminal tab (see section Opsiconfd Terminal).

Output formats

The health check supports the following output formats:

  • cli (default): Output customized for the command line.

  • json: Output in JSON format.

  • checkmk: Output that can be processed by CheckMK.

The output format is selected with the --format parameter, for example:

opsiconfd health-check --format json
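Because the JSON output is useful for support requests, it can be handy to write it to a file. The following is a small sketch under the assumption that the JSON report is written to stdout; the function name and the target path are hypothetical.

```shell
# Hypothetical helper: write the JSON report to the file given as $1.
save_health_report() {
    out_file=$1
    # Assumption: with --format json the report is written to stdout.
    opsiconfd health-check --format json > "$out_file"
}
```

A call such as save_health_report /tmp/opsi-health-check.json would then create a file that can be attached to a support request.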

Caching

The results of the individual checks are cached in Redis and remain valid for 24 hours. Some operations, such as creating a backup, clear the cache for the backup check.

If the health check is called with the --clear-cache parameter, the cache is emptied and all checks are executed again. The parameter clear_cache can also be passed to the API method service_healthCheck to clear the cache.

{
  "method": "service_healthCheck",
  "params": {
    "clear_cache": true
  }
}
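The request above could be sent from a shell roughly as follows. This is only a sketch: the function names are hypothetical, the "jsonrpc" and "id" fields are added on the assumption that the server expects a complete JSON-RPC 2.0 request, and the RPC URL and credentials must be adapted to the actual opsi server.

```shell
# Build the JSON-RPC request body for service_healthCheck.
# $1 must be the JSON literal true or false.
health_check_payload() {
    printf '{"jsonrpc": "2.0", "id": 1, "method": "service_healthCheck", "params": {"clear_cache": %s}}\n' "$1"
}

# Send the request with curl and print the JSON response.
# Assumptions: $1 is the RPC URL of the opsi server (for example
# https://opsi.example.org:4447/rpc) and $2 is "user:password".
call_health_check() {
    health_check_payload true |
        curl --silent --fail --user "$2" \
             --header 'Content-Type: application/json' \
             --data @- "$1"
}
```

Depending on the server's certificate, curl may additionally need the --cacert option to verify the TLS connection.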

Monitoring

The health check can be integrated into a monitoring system. To do this, the monitoring system calls the health check and evaluates the results.

CheckMK

To use the results of the health check in CheckMK, set up one or more 'local checks'. For this purpose, opsiconfd provides a template shell script, /usr/lib/opsiconfd/opsi_checkmk, which converts the output of the health check into the CheckMK format. To integrate it into CheckMK as a 'local check', copy the template to /usr/lib/check_mk_agent/local/<cache-time> and make it executable; <cache-time> is the interval in seconds after which the check results are updated.

Here is an example:

cp /usr/lib/opsiconfd/opsi_checkmk /usr/lib/check_mk_agent/local/7200/opsi_check
chmod +x /usr/lib/check_mk_agent/local/7200/opsi_check

In this case, the check results are updated every 7200 seconds (2 hours). The documentation for the CheckMK local checks can be found here: https://docs.checkmk.com/latest/de/localchecks.html
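The two commands can be generalized for other cache times. The following sketch uses a hypothetical function, install_local_check, and additionally creates the cache-time directory in case it does not exist yet.

```shell
# Hypothetical helper: install a local check script into a
# cache-time directory of the CheckMK agent.
# Arguments: <template> <local-check-dir> <cache-seconds> <target-name>
install_local_check() {
    template=$1
    local_dir=$2
    cache_seconds=$3
    name=$4
    target_dir="$local_dir/$cache_seconds"
    # Create the directory, copy the template and make it executable.
    mkdir -p "$target_dir" &&
        cp "$template" "$target_dir/$name" &&
        chmod +x "$target_dir/$name"
}
```

The example above would then correspond to: install_local_check /usr/lib/opsiconfd/opsi_checkmk /usr/lib/check_mk_agent/local 7200 opsi_check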

The checks can be customized in the opsi_check script with the parameters --checks or --skip-checks (see section Health check).
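The contents of the shipped template are not reproduced here. As a rough illustration only, a customized health check call inside such a script might look like the following; the function name is hypothetical, and the sketch assumes that the script calls opsiconfd health-check with the checkmk output format.

```shell
# Hypothetical stand-in for the health check call in a local check
# script: CheckMK-formatted output, skipping the check with ID $1.
opsi_local_check() {
    opsiconfd health-check --format checkmk --skip-checks "$1"
}
```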