aboutsummaryrefslogtreecommitdiffstats
path: root/modules/xymon/templates/hobbit-alerts.cfg
blob: 1eb8b544ed2b0042df3dab31611da686b16eab84 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
#
# The hobbit-alerts.cfg file controls who receives alerts
# when a status in the BB system goes into a critical
# state (usually: red, yellow or purple).
#
# This file is made up from RULES and RECIPIENTS.
#
# A RULE is a filter made from the PAGE where a host
# is located in BB; the HOST name, the SERVICE name,
# the COLOR of the status, the TIME of day, and the
# DURATION of the event.
#
# A RECIPIENT can be a MAIL address, or a SCRIPT.
#
# Recipients can also have rules associated with them,
# that modify the rules for a single recipient, e.g.
# you can define a rule for alerting, then add an
# extra criteria e.g. so a single recipient does not get
# alerted until after 20 minutes.
#
# A sample rule:
#
# HOST=www.foo.com SERVICE=http
#      MAIL webadmin@foo.com REPEAT=20 RECOVERED
#      MAIL cio@foo.com DURATION>60 COLOR=red
#      SCRIPT /usr/local/bin/sendsms 1234567890 FORMAT=SMS
#
# The first line sets up a rule that catches alerts
# for the host "www.foo.com" and the "http" service.
# There are three recipients for these alerts: The first
# one is the "webadmin@foo.com" - they get alerted 
# immediately when the status goes into an alert state,
# and the alert is repeated every 20 minutes until it
# recovers. When it recovers, a message is sent about
# the recovery.
#
# The second recipient is "cio@foo.com". He gets alerted
# only when the service goes "red" for more than 60 minutes.
#
# The third recipient is a script, "/usr/local/bin/sendsms".
# The real recipient is "1234567890", but it is handled
# by the script - the script receives a set of environment
# variables with the details about the alert, including the
# real recipient. The alert message is preformatted for 
# an SMS recipient.
#
# You can use Perl-compatible "regular expressions" for
# the PAGE, HOST and SERVICE definitions, by putting a "%"
# in front of the regex. E.g.
#
# HOST=%^www.*
#      MAIL webadmin@foo.com EXHOST=www.testsite.foo.com
#
# This sets up a rule so that alerts from any hostname 
# beginning with "www" goes to "webadmin@foo.com", EXCEPT
# alerts from "www.testsite.foo.com"
#
# The following keywords are recognized:
#    PAGE      - rule matching an alert by the name of the
#                page in BB. This is the name following
#                the "page", "subpage" or "subparent" keyword
#                in the bb-hosts file.
#    EXPAGE    - rule excluding an alert if the pagename matches.
#    HOST      - rule matching an alert by the hostname.
#    EXHOST    - rule excluding an alert by matching the hostname.
#    SERVICE   - rule matching an alert by the service name.
#    EXSERVICE - rule excluding an alert by matching the hostname.
#    GROUP     - rule matching an alert by the group ID.
#                (Group ID's are associated with a status through the
#                 hobbit-clients.cfg configuration).
#    EXGROUP   - rule excluding an alert by matching the group ID.
#    COLOR     - rule matching an alert by color. Can be "red",
#                "yellow", or "purple".
#    TIME      - rule matching an alert by the time-of-day. This
#                is specified as the DOWNTIME timespecification
#                in the bb-hosts file (see bb-hosts(5)).
#    DURATION  - Rule matching an alert if the event has lasted
#                longer/shorter than the given duration. E.g.
#                DURATION>10 (lasted longer than 10 minutes) or
#                DURATION<30 (only sends alerts the first 30 minutes).
#    RECOVERED - Rule matches if the alert has recovered from an 
#                alert state.
#    NOTICE    - Rule matches if the message is a "notify" message
#                (typically sent when a status is enabled or disabled).
#    MAIL      - Recipient who receives an e-mail alert. This takes
#                one parameter, the e-mail address.
#    SCRIPT    - Recipient that invokes a script. This takes two
#                parameters: The script filename, and the recipient
#                that gets passed to the script.
#    FORMAT    - format of the text message with the alert. Default
#                is "TEXT" (suitable for e-mail alerts). "SMS" is
#                a short message with no subject for SMS alerts.
#                "SCRIPT" is a brief message template for scripts.
#    REPEAT    - How often an alert gets repeated, in minutes.
#    STOP      - Valid for a recipient: If this recipient gets an
#                alert, recipients further down in hobbit-alerts.cfg
#                are ignored.
#    UNMATCHED - Matches if no alerts have been sent so far.
#
#
# Script get the following environment variables pre-defined so
# that they can send a meaningful alert:
#
#    BBCOLORLEVEL  - The color of the alert: "red", "yellow" or "purple"
#    BBALPHAMSG    - The full text of the status log triggering the alert
#    ACKCODE       - The "cookie" that can be used to acknowledge the alert
#    RCPT          - The recipient, from the SCRIPT entry
#    BBHOSTNAME    - The name of the host that the alert is about
#    MACHIP        - The IP-address of the host that has a problem
#    BBSVCNAME     - The name of the service that the alert is about
#    BBSVCNUM      - The numeric code for the service. From SVCCODES definition.
#    BBHOSTSVC     - HOSTNAME.SERVICE that the alert is about.
#    BBHOSTSVCCOMMAS - As BBHOSTSVC, but dots in the hostname replaced with commas
#    BBNUMERIC     - A 22-digit number made by BBSVCNUM, MACHIP and ACKCODE.
#    RECOVERED     - Is "1" if the service has recovered.
#    DOWNSECS      - Number of seconds the service has been down.
#    DOWNSECSMSG   - When recovered, holds the text "Event duration : N" where
#                    N is the DOWNSECS value.

<%
builder = ['ecosse','rabbit']
builders = builder.map{|x| x + "." + domain }.join(',')
%>
HOST=<%= @builders %> SERVICE=cpu
    MAIL=sysadmin-reports@ml.<%= @domain %> DURATION>6h RECOVERED NOTICE REPEAT=3h STOP
 
HOST=%.*.<%= @domain %>
    MAIL=sysadmin-reports@ml.<%= @domain %> DURATION>5 RECOVERED NOTICE REPEAT=3h