Health-Check & Monitoring
opsi provides a health check that reports the operating status of the opsi server. It can be run, for example, from the web interface of the opsi server, and it gives an overview of the state of the server's individual components.
The health check can be integrated into a monitoring system in order to monitor the status of the opsi server.
Health check
The opsiconfd provides a health check that checks various settings and versions of opsi components and reports possible problems. The health check can be started in different ways. All variants obtain their data from the API call service_healthCheck. The opsi API returns the data in JSON format. This JSON output is particularly useful for support requests.
One way to start a health check is via the admin page, RPC interface tab (see section RPC Interface). The WebGUI also provides quick access to the health check. On the command line, the health check is started with the command opsiconfd health-check. The parameter --help displays a help text; opsiconfd health-check --documentation displays a description of all checks. Without further options, the check runs once and writes its results to stdout.
opsiconfd health-check --list outputs all available checks with name and ID. A specific check can be executed with opsiconfd health-check --checks <ID>. Checks that should not be executed can be skipped with --skip-checks <ID>.
The checks and skip-checks options can also be set in /etc/opsi/opsiconfd.conf or via the environment variables OPSICONFD_CHECKS and OPSICONFD_SKIP_CHECKS.
Figure: You can start the health check in the terminal.
The health check can also be started with the command line tool opsi-cli (see section opsi-cli support). Quick access to a terminal on the opsi server is provided by the admin page via the Terminal tab (see section Opsiconfd Terminal).
Output formats
The health check can produce the following outputs:
- cli (default): Output customized for the command line.
- json: Output in JSON format.
- checkmk: Output that can be processed by Checkmk.
- zabbix: Output that can be processed by Zabbix.
- nagios: Output that can be processed by Nagios.
The output format is selected with the --format parameter:
opsiconfd health-check --format json
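The options described so far can be combined when the health check is called from a script. The following is a minimal Python sketch, not part of opsi: the function names are hypothetical, only the flags named in this section are used, and it is an assumption that the json format writes nothing but a JSON document to stdout.

```python
import json
import subprocess
from typing import List, Optional

# Output formats listed in this section.
FORMATS = ("cli", "json", "checkmk", "zabbix", "nagios")


def build_health_check_command(
    output_format: str = "cli",
    check_id: Optional[str] = None,
    skip_check_id: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one opsiconfd health-check call."""
    if output_format not in FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    command = ["opsiconfd", "health-check", "--format", output_format]
    if check_id is not None:
        command += ["--checks", check_id]
    if skip_check_id is not None:
        command += ["--skip-checks", skip_check_id]
    return command


def run_health_check_json():
    """Run the health check and parse its output.

    Assumption: with --format json, stdout contains only JSON.
    The exit code is deliberately not evaluated here.
    """
    result = subprocess.run(
        build_health_check_command("json"),
        capture_output=True,
        text=True,
        check=False,
    )
    return json.loads(result.stdout)
```

For example, build_health_check_command("json", check_id="redis") yields the arguments for a JSON run of a single check; the ID redis is only illustrative, the real IDs are shown by opsiconfd health-check --list.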
Caching
The results of the individual checks are stored in Redis and remain valid for 24 hours. Certain operations invalidate cached results; creating a backup, for example, clears the cache for the backup check.
If the health check is called with the --clear-cache parameter, the cache is emptied and all checks are executed again.
The parameter clear_cache can also be passed to the API method service_healthCheck to clear the cache.
{
"method": "service_healthCheck",
"params": {
"clear_cache": true
}
}
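A request body like the one above can also be generated in a script. This is a minimal Python sketch with a hypothetical helper name. It reproduces only the method and params fields shown here; the optional id field is an assumption, and any further fields a client library may add are left out.

```python
import json
from typing import Optional


def build_health_check_request(
    clear_cache: bool = False, request_id: Optional[int] = None
) -> str:
    """Serialize a service_healthCheck request body as JSON text."""
    payload = {
        "method": "service_healthCheck",
        "params": {"clear_cache": clear_cache},
    }
    if request_id is not None:
        # Assumption: an "id" field as commonly used in JSON-RPC requests.
        payload["id"] = request_id
    return json.dumps(payload)


print(build_health_check_request(clear_cache=True))
```

How the body is sent (URL, authentication) depends on the installation and is intentionally not part of the sketch.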
Monitoring
The health check can be integrated into a monitoring system to monitor the health of the opsi server continuously. The monitoring system calls the health check at regular intervals, and the results can be used for alerting, logging, or automating maintenance tasks.
Checkmk
To use the results of the health check in Checkmk, set up one or more "local checks". Checkmk is an extensible monitoring tool for IT infrastructures; with the health check data integrated, problems on the opsi server can be detected at an early stage.
The opsiconfd comes with a template shell script, /usr/share/opsiconfd/opsi_checkmk, that converts the output of the health check into the Checkmk format. The file can be integrated into Checkmk as a "local check" in order to regularly collect and evaluate the health check data.
For integration, copy the script to /usr/lib/check_mk_agent/local/<cache-time>, where <cache-time> is the interval in seconds at which the results are updated. After copying, the script must be made executable.
Here is an example:
cp /usr/share/opsiconfd/opsi_checkmk /usr/lib/check_mk_agent/local/7200/opsi_check
chmod +x /usr/lib/check_mk_agent/local/7200/opsi_check
In this case, the check results are updated every 7200 seconds (2 hours). The documentation for the Checkmk local checks can be found here: https://docs.checkmk.com/latest/de/localchecks.html
With the parameters --checks or --skip-checks the checks can be customized in the opsi_check script (see section Health-Check).
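When customizing the script, it helps to know what a Checkmk local check has to print: one line per service, consisting of a numeric status (0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN), the service name, the metrics (or - if there are none), and a detail text. The following Python sketch illustrates that line format only. It is not the opsi_checkmk script, and the service name, the status names, and the detail text are hypothetical.

```python
# Assumption: the health check reports OK, WARNING or CRITICAL per check.
STATUS_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def local_check_line(service_name: str, status: str, details: str) -> str:
    """Format one result as a Checkmk local check line without metrics."""
    code = STATUS_CODES.get(status.upper(), 3)  # anything else becomes UNKNOWN
    # A local check result occupies one line, so collapse all whitespace.
    one_line = " ".join(details.split())
    return f'{code} "{service_name}" - {one_line}'


print(local_check_line("opsi redis", "OK", "Redis is running"))
```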
Zabbix
To integrate the results of the health check into Zabbix, so-called "UserParameters" can be used.
The opsiconfd provides the shell script /usr/share/opsiconfd/opsi_zabbix for this purpose, which simplifies the integration.
Here is an example of a UserParameter that executes the Redis check:
UserParameter=opsi.redis,/usr/share/opsiconfd/opsi_zabbix redis
To find out which checks are available, use the command opsiconfd health-check --list --detailed.
Detailed information about a specific check can be obtained with opsiconfd health-check --docs.
In the Zabbix configuration, you then create an item that uses this UserParameter.
Select Zabbix agent as the item type, set the type of information to Text, and use the name of the UserParameter as the key, in this case opsi.redis.
The update interval can be chosen flexibly, for example 60 seconds, since the results of the health check are cached by opsiconfd in Redis and thus do not cause high system load.
Set the timeout for the item to 10 seconds.
For alerting, set up triggers that evaluate the text of the item.
You should check whether the result starts with OK, WARNING, or CRITICAL to correctly recognize the status.
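The prefix test described above can be sketched in Python as follows. This only illustrates the logic a trigger has to implement; it is not a Zabbix trigger expression, and both the function name and the UNKNOWN fallback are assumptions.

```python
def classify_status(item_text: str) -> str:
    """Return the status keyword the item text starts with."""
    text = item_text.lstrip()
    for status in ("OK", "WARNING", "CRITICAL"):
        if text.startswith(status):
            return status
    # Assumption: empty or unexpected output is treated as its own state,
    # so that a broken check is not mistaken for a healthy one.
    return "UNKNOWN"


print(classify_status("CRITICAL: example message"))
```

Treating unrecognized text as a separate state is a design choice: a trigger that only looks for CRITICAL stays silent if the item returns nothing at all.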
Downtime
A downtime can be set for certain health checks to allow for temporary failures or maintenance periods without triggering false alarms. The downtime can be set either for a specific check or for the entire monitoring system. This is particularly useful when performing planned maintenance or updates that result in a temporary loss of system availability.
The downtime for a health check can be defined using the following configuration parameters:
- opsi.check.enabled (default: true) - Enables or disables the health check. If false, the check is not executed.
- opsi.check.downtime.start (default: null) - Start time of the downtime in ISO-8601 format (e.g. 2025-03-25T08:00:00).
- opsi.check.downtime.end (default: null) - End time of the downtime in ISO-8601 format. The check is reactivated once this time has passed.
These parameters can be set in the client overview in opsi-configed. In the client details area on the right-hand side, you will find a display with the title Health check active and a cogwheel symbol. You can configure the downtime parameters via this cogwheel icon.
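How the three downtime parameters interact can be illustrated with a short Python sketch. It is a hypothetical model, not the opsi implementation, and it rests on several assumptions: timestamps carry no time zone, a start time without an end time means an open-ended downtime, and an end time without a start time is ignored.

```python
from datetime import datetime
from typing import Optional


def check_is_active(
    now: datetime,
    enabled: bool = True,
    downtime_start: Optional[str] = None,
    downtime_end: Optional[str] = None,
) -> bool:
    """Decide whether a check should currently be evaluated."""
    if not enabled:
        return False  # opsi.check.enabled = false
    if downtime_start is None:
        return True  # no downtime configured
    if now < datetime.fromisoformat(downtime_start):
        return True  # downtime has not started yet
    if downtime_end is None:
        return False  # assumption: downtime without end stays in effect
    # The check is active again once the end time has been reached.
    return now >= datetime.fromisoformat(downtime_end)


window = {
    "downtime_start": "2025-03-25T08:00:00",
    "downtime_end": "2025-03-25T10:00:00",
}
print(check_is_active(datetime(2025, 3, 25, 9, 0, 0), **window))
```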

For a detailed description of the health check settings, refer to the configed chapter of the documentation (see configed).
