The Basic Setup

It is advisable to run the alarms service in a separate domain and to list this domain first in the layout file. That way the alarms service is started first and can catch startup errors reported by the other domains. Because both the httpd service and the alarms service access the storage file generated at /var/lib/dcache/alarms/alarms.xml, the alarms service should be defined on the same host as the httpd service. You can change where this file is placed by setting the property alarms.store.path to a different location.

Add a domain for the alarms service to the layout file where the httpd service is defined.

[alarmserverDomain]
[alarmserverDomain/alarms]
...

[httpdDomain]

If all of the dCache domains run on the same host, the default setting of the alarms.server.host property (localhost) will work.
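
The storage location mentioned above can be changed in the same layout file. The following is only a sketch: it assumes that alarms.store.path takes the full path of the XML file, and the directory /data/dcache/alarms is a hypothetical location chosen for illustration. Check the property's default in your installation before relying on it.

```ini
[alarmserverDomain]
[alarmserverDomain/alarms]
# Assumption: the value is the full path of the storage file.
# /data/dcache/alarms is a hypothetical directory, not a dCache default.
alarms.store.path=/data/dcache/alarms/alarms.xml
```

Because the httpd service reads the same file, it has to be able to reach whatever location is chosen here.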

Configure Where the Alarms Service Is Running

In general, dCache will not be configured to run on a single node. In that case, each node needs to know on which node the alarms service is running. The alarms service and the httpd service run on one of the nodes. On all other nodes, set the alarms.server.host property, either in the /etc/dcache/dcache.conf file or in the layout file, to the host on which the alarms service is running, and then restart dCache.

Example:

Consider a dCache instance that consists of a head node, some door nodes and some pool nodes. Assume that the httpd service and the alarms service are running on the head node. You would then need to set the property alarms.server.host on the pool nodes and on the door nodes to the host name of the head node.

alarms.server.host=<head-node>
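
If you prefer the layout file over dcache.conf, the same property can be placed there. The sketch below shows what the layout file of a pool node might look like. The domain name poolDomain and the host name head-node.example.org are hypothetical, and the pool's own settings are elided. It assumes the usual layout-file convention that a property defined before the first domain declaration applies to every domain on that host.

```ini
# Defined before any domain, so it applies to all domains on this node.
# head-node.example.org is a hypothetical host name.
alarms.server.host=head-node.example.org

[poolDomain]
[poolDomain/pool]
...
```

Restart dCache on the node after the change so that the domains pick up the new value.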

The Defined Alarms

The defined alarms are listed below, grouped by severity. There are four levels of severity: CRITICAL, HIGH, MODERATE and LOW.

CRITICAL
  • SERVICE_CREATION_FAILURE
  • DB_OUT_OF_CONNECTIONS
  • DB_UNAVAILABLE
  • JVM_OUT_OF_MEMORY
  • OUT_OF_FILE_DESCRIPTORS

The affected dCache cannot function (it is down).

HIGH
  • IO_ERROR
  • HSM_READ_FAILURE
  • HSM_WRITE_FAILURE
  • LOCATION_MANAGER_UNAVAILABLE
  • POOL_MANAGER_UNAVAILABLE

The affected functions are not working, or not working properly, even though the dCache domain may be running.

MODERATE
  • POOL_DISABLED
  • CHECKSUM

There is an issue which should be taken care of in the interest of performance or usability, but which is not impeding the functioning of the system as a whole.

LOW

This issue might be worth investigating if it occurs, but is not urgent.

Once an alarm has been triggered, you will find an entry for it in the file /var/lib/dcache/alarms/alarms.xml.

As it is not very convenient to read an XML file, the Alarms Web Page can be used to inspect and manage the generated warnings.