Configuring Sentries and Agents
This page was last modified 06:03, 14 June 2013. From Documentation
Revision as of 05:05, 28 September 2006 Daniels (Talk | contribs) (→Data Logging) ← Previous diff |
Current revision Mike (Talk | contribs) (→Running Agents or Sentries in Test Mode) |
Line 65: | Line 65: | ||
#Select Tables > Agents. The ‘Agents’ window opens, showing details of all the agents defined on this host. | #Select Tables > Agents. The ‘Agents’ window opens, showing details of all the agents defined on this host. | ||
#Select Maintain > Add. The ‘Add Agent’ form opens. | #Select Maintain > Add. The ‘Add Agent’ form opens. | ||
- | #:<code>Figure 19 — ‘Add Agent’ window</code> | + | #: [[Image:sentinelfigure19.jpg|frame|Figure 19 — ‘Add Agent’ window]] |
#Enter these fields: | #Enter these fields: | ||
#;Knowledge Base: The KB that this agent belongs to. Click to choose one of the predefined KBs installed on this host, otherwise leave this field blank if the agent is not associated with a particular KB. | #;Knowledge Base: The KB that this agent belongs to. Click to choose one of the predefined KBs installed on this host, otherwise leave this field blank if the agent is not associated with a particular KB. | ||
Line 121: | Line 121: | ||
such a line is found, portions of the record can be assigned to one or more variables. | such a line is found, portions of the record can be assigned to one or more variables. | ||
- | ;;Figure 20 — Example: options for an agent that monitors the message log | + | [[Image:sentinelfigure20.jpg|frame|Figure 20 — Example: options for an agent that monitors the message log]] |
;Logfile name: The name of the file to be monitored. | ;Logfile name: The name of the file to be monitored. | ||
Line 166: | Line 166: | ||
==== Agent options for SNMPPolled class ==== | ==== Agent options for SNMPPolled class ==== | ||
- | Figure 22 — Example: agent options for an agent in SNMPPolled class | + | [[Image:sentinelfigure22|frame|Figure 22 — Example: agent options for an agent in SNMPPolled class]] |
;MIB file name: The name of the MIB file, which Sentinel3G expects to find in the directory lib/tnm2.1.10/mibs under the COSmanager home directory. If the MIB file is not already stored there, copy it there now. | ;MIB file name: The name of the MIB file, which Sentinel3G expects to find in the directory lib/tnm2.1.10/mibs under the COSmanager home directory. If the MIB file is not already stored there, copy it there now. | ||
Line 179: | Line 179: | ||
;Timeout: (secs) The maximum time to wait for a response from the node being polled. | ;Timeout: (secs) The maximum time to wait for a response from the node being polled. | ||
- | When you have finished setting the SNMPPolled agent options, click Accept to return to the main Add Agent form. | + | When you have finished setting the SNMPPolled agent options, click Accept to return to the main Add Agent form. |
==== Agent options for Text class ==== | ==== Agent options for Text class ==== | ||
- | Figure 23 — Agent options for agents in the Text class | + | [[Image:sentinelfigure23.jpg|frame|Figure 23 — Agent options for agents in the Text class]] |
;Record length: The total number of lines in each record. | ;Record length: The total number of lines in each record. | ||
Line 202: | Line 202: | ||
When you have finished setting the Text agent options, click Accept to return to the main Add Agent form. | When you have finished setting the Text agent options, click Accept to return to the main Add Agent form. | ||
- | ==== Adding Variables Used By the Agent ==== | + | ==== Adding Variables Used by the Agent ==== |
#From the console, select Configure > Host monitor. | #From the console, select Configure > Host monitor. | ||
#If you have not already selected the host to update, Sentinel3G will ask you to choose one now. This is the host that the agent will run on. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host. | #If you have not already selected the host to update, Sentinel3G will ask you to choose one now. This is the host that the agent will run on. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host. | ||
Line 208: | Line 208: | ||
#Select the agent, then select Maintain > Variables. The ‘Variables from Agent <agent_name>’ window opens. | #Select the agent, then select Maintain > Variables. The ‘Variables from Agent <agent_name>’ window opens. | ||
#Select Maintain > Add. The ‘Add Variable’ form opens. | #Select Maintain > Add. The ‘Add Variable’ form opens. | ||
- | Figure 24 — ‘Add variable’ window | + | [[Image:sentinelfigure24.jpg|frame|Figure 24 — ‘Add variable’ window]] |
- | ;Variable name: Enter a name for the variable. The name must not contain spaces. It must not have the same name as another variable belonging to this agent. It doesn’t need to be unique across agents; two agents may have variables with the same name. If the agent is in the API class, the name must match one of the variable names passed by the external application. | + | ;Variable name: Enter a name for the variable. The name must not contain spaces, but it may contain underscores instead. It must not have the same name as another variable belonging to this agent. It doesn’t need to be unique across agents; two agents may have variables with the same name. If the agent is in the API class, the name must match one of the variable names passed by the external application. By convention, agent variable names are lower case. |
;Class: Select one of these options, depending on how the value of the variable will be set: | ;Class: Select one of these options, depending on how the value of the variable will be set: | ||
:;raw: The value is set by the agent. | :;raw: The value is set by the agent. | ||
- | :;derived: The value will be computed from other variables (in the Expression field on this form). This is often used to express | + | :;derived: The value will be computed from other variables (in the Expression field on this form). This is often used to express the value of another variable in a different way, such as a rate, proportion, or percentage. |
- | the value of another variable in a different way, such as a rate, proportion, or percentage. | + | |
:;trigger: If this variable is included in the list of trigger variables for a state, the Expression field will be evaluated when the sentry changes into that state. This is usually used to save the previous value when a new value is received, so that the old and new values can be compared. | :;trigger: If this variable is included in the list of trigger variables for a state, the Expression field will be evaluated when the sentry changes into that state. This is usually used to save the previous value when a new value is received, so that the old and new values can be compared. | ||
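The difference between the derived and trigger classes can be sketched in Python. This is an illustration only, not Sentinel3G syntax: the names `free_kb`, `total_kb`, `derive_pct_free` and `TriggerVariable` are hypothetical, and the real Expression field is evaluated by Sentinel3G itself.

```python
def derive_pct_free(free_kb, total_kb):
    """Derived variable: express one raw value as a percentage of another."""
    if total_kb == 0:
        return 0.0  # avoid dividing by zero when no total is reported
    return 100.0 * free_kb / total_kb


class TriggerVariable:
    """Trigger variable: its value is only refreshed on entry to a state."""

    def __init__(self, initial=None):
        self.value = initial

    def fire(self, current_value):
        # Called when the sentry enters a state that lists this variable,
        # typically to save a value for later comparison.
        self.value = current_value


prev_free_kb = TriggerVariable()
prev_free_kb.fire(2048)            # state entered while free_kb was 2048
pct_free = derive_pct_free(512, 2048)
```

A derived value changes whenever its inputs change, whereas the saved trigger value stays fixed until the sentry next enters a state that lists it.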
Line 220: | Line 219: | ||
:;number: An integer or floating point number. | :;number: An integer or floating point number. | ||
:;string: A text string. | :;string: A text string. | ||
- | :;boolean: Mainly used by agents in the ExitStatus class. If the command returns 0, the boolean value is true, otherwise it is | + | :;boolean: Mainly used by agents in the ExitStatus class. If the command returns 0, the boolean value is true, otherwise it is false. |
- | false. | + | :;date: A date in Functional Database internal format. The date will be stored in the form YYYYMMDD and output as MM/DD/YY (U.S. display format) or DD/MM/YY (European display format). Mainly used by agents in the DB class. |
- | :;date: A date in Functional Database internal format. The date will be stored in the form YYYYMMDD and output as MM/DD/YY (U.S. | + | |
- | display format) or DD/MM/YY (European display format). Mainly used by agents in the DB class. | + | |
:;datetime: A date and time in Functional Database internal format. The date will be stored in the form YYYYMMDD.hhmmss and output as MM/DD/YY-hh:mm (U.S. display format) or DD/MM/YY-hh:mm (European display format). Mainly used by agents in the DB class. | :;datetime: A date and time in Functional Database internal format. The date will be stored in the form YYYYMMDD.hhmmss and output as MM/DD/YY-hh:mm (U.S. display format) or DD/MM/YY-hh:mm (European display format). Mainly used by agents in the DB class. | ||
- | clock A count of the number of seconds since 1 Jan 1970 (GMT). You can subtract from the current value a clock value saved earlier to return a time period (for example how long a process has run). | + | :;clock: A count of the number of seconds since 1 Jan 1970 (GMT). You can subtract from the current value a clock value saved earlier to return a time period (for example how long a process has run). |
;Private: Leave unchecked if you wish this variable to be available for logging and graphing. | ;Private: Leave unchecked if you wish this variable to be available for logging and graphing. | ||
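The stored and displayed forms of the date, datetime and clock types described above can be sketched in Python. This is a hedged illustration, not Sentinel3G code: the function names are hypothetical, and only the format strings (YYYYMMDD, YYYYMMDD.hhmmss, MM/DD/YY, DD/MM/YY) come from the descriptions above.

```python
from datetime import datetime
import time


def display_date(stored, european=False):
    """Convert a stored YYYYMMDD string to MM/DD/YY or DD/MM/YY."""
    parsed = datetime.strptime(stored, "%Y%m%d")
    return parsed.strftime("%d/%m/%y" if european else "%m/%d/%y")


def display_datetime(stored, european=False):
    """Convert a stored YYYYMMDD.hhmmss string to the date-hh:mm display form."""
    parsed = datetime.strptime(stored, "%Y%m%d.%H%M%S")
    return parsed.strftime("%d/%m/%y-%H:%M" if european else "%m/%d/%y-%H:%M")


def elapsed_seconds(saved_clock, now=None):
    """Subtract a previously saved clock value from the current clock value."""
    if now is None:
        now = int(time.time())  # seconds since 1 Jan 1970 (GMT)
    return now - saved_clock


print(display_date("20060928"))                    # 09/28/06
print(display_datetime("20060928.050500", True))   # 28/09/06-05:05
print(elapsed_seconds(1000, now=4600))             # 3600
```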
Line 271: | Line 268: | ||
#Select Configure > Host monitor. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host. | #Select Configure > Host monitor. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host. | ||
#Select Maintain > Add. The ‘Add Sentry’ form opens. | #Select Maintain > Add. The ‘Add Sentry’ form opens. | ||
- | Figure 25 — Sentry details form | + | [[Image:sentinelfigure25.jpg|frame|Figure 25 — Sentry details form]] |
===Knowledge Base=== | ===Knowledge Base=== | ||
Line 301: | Line 298: | ||
;On/Off: Set the initial condition of the sentry. on means the sentry will be operating normally. off means the sentry will not process agent data unless it is switched on manually. | ;On/Off: Set the initial condition of the sentry. on means the sentry will be operating normally. off means the sentry will not process agent data unless it is switched on manually. | ||
;Label: If set, this will be used on the console as the name of the instance. If it is not set, the instance name will be used on the console. | ;Label: If set, this will be used on the console as the name of the instance. If it is not set, the instance name will be used on the console. | ||
+ | ;Group: If the sentry has Instance Groups defined, you can optionally pre-assign the instance to a particular group here. | ||
;Agent data: Agent-specific data for this instance (used by certain agents only). Example: the ProcessInfo agent can be passed a regular expression in this field to match process names. | ;Agent data: Agent-specific data for this instance (used by certain agents only). Example: the ProcessInfo agent can be passed a regular expression in this field to match process names. | ||
Line 348: | Line 346: | ||
=== Advanced options === | === Advanced options === | ||
Click the Advanced button to display some additional options relating to notification and data logging. | Click the Advanced button to display some additional options relating to notification and data logging. | ||
- | Figure 26 — Advanced sentry options form | + | |
+ | [[Image:sentinelfigure26.jpg|frame|Figure 26 — Advanced sentry options form]] | ||
;Notification type: Select none to turn off notification for this sentry. Select default to use the global NotifyList and | ;Notification type: Select none to turn off notification for this sentry. Select default to use the global NotifyList and | ||
Line 403: | Line 402: | ||
== Adding States == | == Adding States == | ||
The next step is to define the states that the sentry can be in. | The next step is to define the states that the sentry can be in. | ||
- | 1. If the console is not in Host View, select Go > Hosts. | + | #If the console is not in Host View, select Go > Hosts. |
- | 2. Select the host that the agent will run on. | + | #Select the host that the agent will run on. |
- | 3. Select Configure > Host monitor. The ‘All Sentries’ window opens, | + | #Select Configure > Host monitor. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host. |
- | showing details of all the sentries defined on this host. | + | #Select the sentry, then select Maintain > States. The ‘States for sentry <sentry_name>’ window opens. |
- | 4. Select the sentry, then select Maintain > States. The ‘States for sentry | + | #:States are listed in the order in which the entry conditions are evaluated. By default, this is in order of decreasing severity, with the most severe at the top. The first state whose entry condition evaluates to true will cause the sentry to enter that state. It follows that states with a NULL entry condition (usually the ‘normal’ state) should be last, for example: |
- | <sentry_name>’ window opens. | + | #:;critical $pct_free == 0 |
- | States are listed in the order in which the entry conditions are evaluated. By | + | #:;severe $pct_free < 5 |
- | default, this is in order of decreasing severity, with the most severe at the | + | #:;alarm $pct_free < 10 |
- | top. The first state whose entry condition evaluates to true will cause the | + | #:;warning $pct_free < 15 |
- | sentry to enter that state. It follows that states with a NULL entry condition | + | #:;normal <null> |
- | (usually the ‘normal’ state) should be last, for example: | + | #:You can then drag the states into a different order using the Order > Reorder menu option. |
- | You can the drag the states into a different order using the Order > | + | #There are three ways in which you can add states: |
- | Reorder menu option. | + | #*If the sentry was created by cloning, a separate copy of the original sentry’s states is made. If these states are exactly as required, you don’t need to do anything more. If you need to modify a state in some way, select it now and use Maintain > Change to make the changes (see [[Maintain State Details]]). |
- | 5. There are three ways in which you can add states: | + | #*If there is another sentry that already has a set of states that are similar or identical to what is required, you can copy that sentry’s states: select Maintain > Copy states, then choose the sentry. |
- | • If the sentry was created by cloning, a separate copy of the original sentry’s | + | #*If the states are exactly as required, you don’t need to do anything more. If you need to modify a state in some way, select it now and use Maintain > Change to make the changes (see [[Maintain State Details]]). |
- | states is made. | + | #*You can add each state manually: select Maintain > Add. The ‘Add States’ form opens, as shown in Figure 27. |
- | If these states are exactly as required, you don’t need to do anything more. If | + | [[Image:figure27.jpg|frame|Figure 27 — State details form ]] |
- | you need to modify a state in some way, select it now and use Maintain > | + | ---- |
- | Change to make the changes (see Maintain State Details on page 122). | + | ;Note: Copy states is only available if the sentry has no states defined. If the new sentry already has states and you wish to use the states of another sentry instead, remove the states from the new sentry first. |
- | • If there is another sentry that already has a set of states that are similar or | + | ---- |
- | identical to what is required, you can copy that sentry’s states: select | + | |
- | Maintain > Copy states, then choose the sentry. | + | |
- | Note Copy states is only available if the sentry has no states defined. If | + | |
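The ordering rule for states (the first entry condition that evaluates to true wins, so a NULL condition must come last) can be sketched in Python. This is a minimal illustration, not how Sentinel3G is implemented: the `STATES` list and `select_state` function are hypothetical, and real entry conditions are TCL expressions rather than Python functions.

```python
# Ordered most severe first; None stands for a NULL entry condition.
STATES = [
    ("critical", lambda v: v["pct_free"] == 0),
    ("severe", lambda v: v["pct_free"] < 5),
    ("alarm", lambda v: v["pct_free"] < 10),
    ("warning", lambda v: v["pct_free"] < 15),
    ("normal", None),
]


def select_state(states, variables):
    """Return the name of the first state whose entry condition is true."""
    for name, condition in states:
        # A NULL condition always matches, so it acts as a catch-all.
        if condition is None or condition(variables):
            return name
    return None


print(select_state(STATES, {"pct_free": 7}))    # alarm
print(select_state(STATES, {"pct_free": 50}))   # normal

# With 'normal' moved to the top, its NULL condition masks every other state.
misordered = [STATES[-1]] + STATES[:-1]
print(select_state(misordered, {"pct_free": 0}))  # normal
```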
- | the new sentry already has states and you wish use the states of | + | === Maintain State Details === |
- | another sentry instead, remove the states from the new sentry first. | + | |
- | critical $pct_free == 0 | + | |
- | severe $pct_free < 5 | + | |
- | alarm $pct_free < 10 | + | |
- | warning $pct_free < 15 | + | |
- | normal <null> | + | |
- | 122 Configuring Sentries and Agents | + | |
- | If the states are exactly as required, you don’t need to do anything more. If | + | |
- | you need to modify a state in some way, select it now and use Maintain > | + | |
- | Change to make the changes (see Maintain State Details on page 122). | + | |
- | • You can add each state manually: select Maintain > Add. The ‘Add | + | |
- | States’ form opens, as shown in Figure 27. | + | |
- | Maintain State Details | + | |
To define the sentry’s attributes and appearance while in this state, enter these fields: | To define the sentry’s attributes and appearance while in this state, enter these fields: | ||
- | Figure 27 — State details form | + | |
- | State Enter a unique name for the state. The name must not contain | + | ;State: Enter a unique name for the state. The name must not contain spaces. |
- | spaces. | + | ;Severity: Select the severity level for this state. The severity determines how the sentry will look while in this state (that is, its color and the color and type of any associated overlay icon). Note that this will result in a notification message being sent if the new severity is at or above the notification level specified globally or for this sentry. |
- | Severity Select the severity level for this state. The severity determines how | + | |
- | the sentry will look while in this state (that is, its color and the color | + | :The options are listed in order of increasing severity from normal to critical. disabled is a special severity that indicates the |
- | and type of any associated overlay icon). Note that this will result | + | sentry is ‘down’ or otherwise unavailable, but doesn’t require attention (for example, a device that has been taken offline for |
- | in a notification message being sent if the new severity is at or | + | |
- | above the notification level specified globally or for this sentry. | + | |
- | Configuring Sentries and Agents 123 | + | |
- | The options are listed in order of increasing severity from normal | + | |
- | to critical. disabled is a special severity that indicates the | + | |
- | sentry is ‘down’ or otherwise unavailable, but doesn’t require | + | |
- | attention (for example, a device that has been taken offline for | + | |
maintenance). | maintenance). | ||
- | Description Enter a description for the state. | + | ;Description: Enter a description for the state. |
- | Entry condition Enter a conditional expression. This is a TCL expression made up | + | ;Entry condition: Enter a conditional expression. This is a TCL expression made up of any combination of agent variables, constants, text strings, numbers, history variables, boolean values, and TCL functions. |
- | of any combination of agent variables, constants, text strings, | + | <pre>Examples: |
- | numbers, history variables, boolean values, and TCL functions. | + | |
- | Examples: | + | |
$Status == "Off" && $PID != "-1" | $Status == "Off" && $PID != "-1" | ||
$pct_free < $LOW | $pct_free < $LOW | ||
- | [hist_avg @cpu_idle] | + | [hist_avg @cpu_idle]</pre> |
- | If the entry condition is left blank it evaluates to true. For the | + | |
- | correct syntax to refer to variables, see Expressions on page 23. | + | If the entry condition is left blank it evaluates to true. For the correct syntax to refer to variables, see [[Expressions]]. |
- | Console | + | |
+ | ==== Console ==== | ||
These fields define how the sentry will appear on the console while in this state: | These fields define how the sentry will appear on the console while in this state: | ||
- | Text Enter a text string to provide extra information about the current | + | |
- | state of the sentry. The string will be displayed in the status area of | + | ;Text: Enter a text string to provide extra information about the current state of the sentry. The string will be displayed in the status area of the console, and may contain both informative messages and the values of variables. Example: HTTP hits: &HttpHits; HTTP errors: &HttpErrors… |
- | the console, and may contain both informative messages and the | + | |
- | values of variables. Example: | + | where HttpHits and HttpErrors are the names of variables returned by the primary agent or a secondary agent. |
- | HTTP hits: &HttpHits; HTTP errors: &HttpErrors | + | |
- | … where HttpHits and HttpErrors are the names of variables returned by the | + | |
- | primary agent or a secondary agent. | + | |
If you prefix a variable name with & it will be formatted for output on the console. | If you prefix a variable name with & it will be formatted for output on the console. | ||
- | For example, numeric variables will be displayed with the correct number of decimal | + | |
- | places and with the units after the value, while date/time values will be formatted as | + | For example, numeric variables will be displayed with the correct number of decimal places and with the units after the value, while date/time values will be formatted as readable dates or times. If you prefix a variable name with $ the raw value will be displayed on the console. |
- | a readable dates or times. If you prefix a variable name with $ the raw value will be | + | |
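The difference between the & and $ prefixes in the Text field can be sketched in Python. This is an assumption-laden illustration, not Sentinel3G's actual formatter: the `DISPLAY` table, the `render_text` function, the unit strings and the spacing between value and units are all hypothetical.

```python
import re

# Hypothetical per-variable display settings: (decimal places, units).
DISPLAY = {"HttpHits": (0, "hits"), "cpu_idle": (1, "%")}


def render_text(template, values, display=DISPLAY):
    """Expand &name as a formatted value and $name as the raw value."""

    def expand(match):
        prefix, name = match.group(1), match.group(2)
        if name not in values:
            return match.group(0)  # leave unknown names untouched
        raw = values[name]
        if prefix == "$" or name not in display:
            return str(raw)        # raw value, no formatting applied
        places, units = display[name]
        return f"{raw:.{places}f} {units}"

    return re.sub(r"([&$])([A-Za-z_]\w*)", expand, template)


values = {"HttpHits": 1234, "cpu_idle": 97.3}
print(render_text("HTTP hits: &HttpHits; idle: &cpu_idle", values))
print(render_text("HTTP hits: $HttpHits; idle: $cpu_idle", values))
```

The first call prints formatted values with units; the second prints the raw values exactly as stored.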
- | displayed on the console. | + | You can change the appearance of the sentry’s icon while it is in this state, by specifying a different icon or by overlaying the normal icon with an additional indicator icon to modify its appearance. |
- | 124 Configuring Sentries and Agents | + | |
- | You can change the appearance of the sentry’s icon while it is this state, by specifying | + | |
- | a different icon or by overlaying the normal icon with an additional indicator | + | ---- |
- | icon to modify its appearance. | + | ;Note: To add new icons, see [[Add Icons]]. |
- | Note To add new icons, see Add Icons on page 154. | + | ---- |
- | Icon Click to choose a different icon to represent this sentry while it | + | |
- | is in this state. Example: the sentry normally has a 32x32 icon | + | |
- | representing a remote system. When it is in network_down | + | ;Icon: Click to choose a different icon to represent this sentry while it is in this state. Example: the sentry normally has a 32x32 icon representing a remote system. When it is in network_down state, you choose to use another icon that has a red X indicating a problem with the network connection. |
- | state, you choose to use another icon that has a red X indicating a | + | :If Icon is left blank the default sentry icon will be used. |
- | problem with the network connection. | + | ;Indicator: Click to choose a 16x16 indicator icon. This is an additional small icon that overlays the main icon in the top right hand corner. |
- | If Icon is left blank the default sentry icon will be used. | + | :You can use this to modify the appearance of the icon while the sentry is in this state. |
- | Indicator Click to choose a 16x16 indicator icon. This is an additional | + | :If Indicator is left blank the overlay specified in the sentry (thermometer or pie chart) will be used. If neither overlay is specified in the sentry, the default overlay icon and color for the current severity is used (see [[Overlays and Indicator Icons]]). |
- | small icon that overlays the main icon in the top right hand corner. | + | ;Notes file: Enter the file name only (without the path) of a notes file in the Sentinel3G doc directory. These notes will be available from the console to operators when monitoring or responding to alerts relating to this sentry. |
- | You can use this to modify the appearance of the icon while the | + | |
- | sentry is in this state. | + | ==== Trigger variables ==== |
- | If Indicator is left blank the overlay specified in the sentry | + | Click next to the Trigger vars field to list the trigger variables for this sentry. Choose one or more variables whose values you want to reset when the sentry enters this state (see [[Trigger Variables]]). |
- | (thermometer or pie chart) will be used. If neither overlay | + | |
- | is specified in the sentry, the default overlay icon and color for the | + | ==== Responses ==== |
- | current severity is used (see Overlays and Indicator Icons on page 16). | + | The Responses button lets you define several responses that can be run automatically while the sentry is in this state. |
- | Notes file Enter the file name only (without the path) of a notes file in the | + | ;Responses: Click to see the options for notification, escalation and running automatic responses. |
- | Sentinel3G doc directory. These notes will be available from | + | |
- | the console to operators when monitoring or responding to alerts | + | [[Image:sentinelfigure28.jpg|frame|Figure 28 — Defining the responses for a state]] |
- | relating to this sentry. | + | |
- | Trigger variables | + | The Response form contains three blocks of response fields which are run in turn if the sentry remains in this state for longer than the specified period. Each response block specifies a waiting period, then three actions Sentinel3G can take at the end of each period. The actions are: changing the severity level, which in turn changes the appearance of the sentry on the console; sending a notification message; and running a command. These options are not mutually exclusive; you can specify any or all actions in each response block. |
- | Click next to the Trigger vars field to list the trigger variables for this sentry. | + | |
- | Choose one or more variables whose values you want to reset when the sentry | + | The final group, the Escalation/acknowledgement block, specifies a period after which the sentry will be forced into a different state. See [[Escalation/acknowledgement]]. |
- | enters this state (see Trigger Variables on page 43). | + | |
- | Configuring Sentries and Agents 125 | + | A typical response is to run a command to fix the problem, and if it succeeds to return the sentry to a normal state. If the command doesn’t fix the problem you may choose to leave the sentry in that state, and specify a later response to run another command or to notify someone. |
- | Responses | + | |
- | The Responses button lets you define several responses that can be run automatically | + | These responses are all optional. You can specify all three, or one, or even none. If no responses or escalation period are specified here, the sentry will remain in this state until and unless the evaluation condition evaluates to a different severity level. |
- | while the sentry is in this state. | + | |
- | Responses Click to see the options for notification, escalation and running | + | If a sentry changes state while it is waiting to process a response, then all responses for that state are cancelled, and any responses for the new state are started. |
- | automatic responses. | + | |
- | Figure 28 — Defining the responses for a state | + | ;Wait: The length of time to wait after running the previous response, or if this is the first response, after entering this state. Each period is cumulative. In other words the wait period for Response #2 is counted from the end of the period for Response #1. |
- | The Response form contains three blocks of response fields which are run in turn if | + | :By default the wait time is in seconds. However an optional suffix of either <b>secs</b>, <b>mins</b>, <b>hrs</b> or <b>polls</b> can also be used to change the unit of time. For example, if the primary agent's poll time is 120 seconds, "2 polls" will wait for 240 seconds before activating. |
- | the sentry remains in this state for longer than the specified period. Each response | + | :If the wait time of Response #1 is set to 0, the response will occur as soon as the sentry enters this state. |
- | block specifies a waiting period, then three actions Sentinel3G can take at the | + | ;New severity: After the wait period, change the appearance of the sentry on the console to this new severity level. This would usually be done to trigger a global or sentry-level notification message or to increase the apparent urgency of the event by making the icon flash or change color. Note that changing the severity does not change the state of the sentry. Select unchanged if you wish to leave the severity level as it is. |
- | end of each period. The actions are: changing the severity level, which in turn | + | ;Notification: Click to specify who will be notified if the sentry is still in this state at the end of the wait period. |
- | changes the appearance of the sentry on the console; sending a notification message; | + | :In the Type field, select one of the following: |
- | and running a command. These options are not mutually exclusive; you can | + | :;default: Use the default notification list for this sentry. |
- | specify any or all actions in each response block. | + | :;specify: In the Who field, choose the names of one or more users. |
- | The final group, the Escalation/acknowledgement block, specifies a period after | + | :These are in addition to the default notification level for this sentry (see [[Advanced options]]). |
- | which the sentry will be forced into a different state. See Escalation/acknowledgement | + | :If you don’t want to do any additional notification for this response, select none. |
- | on page 127. | + | |
- | A typical response is to run a command to fix the problem, and if it succeeds to | + | ;Command: Enter a command to be run by the Host Monitor. This would usually attempt to fix the problem. Example: a ‘free disk space’ sentry could archive files to an offline storage device or remove files such as core and *.o that are deemed expendable. |
- | return the sentry to a normal state. If the command doesn’t fix the problem you may | + | ;Fire agent: Choose an agent to be run. This agent will be polled immediately at the end of the wait period. To poll a specific instance only, enter the instance name in brackets after the agent name. Example: Filesystem(/tmp). This is useful to cause the sentry's states to be re-evaluated quickly after running a response, rather than having to wait for the next poll. |
- | choose to leave the sentry in that state, and specify a later response to run another | + | |
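The cumulative wait periods and unit suffixes described for the Wait field can be worked through in Python. This is a sketch under stated assumptions, not Sentinel3G's own parser: the function names are hypothetical, and it assumes the suffix is separated from the number by a space, as in the "2 polls" example.

```python
UNIT_SECONDS = {"secs": 1, "mins": 60, "hrs": 3600}


def wait_to_seconds(spec, poll_seconds):
    """Convert a wait value such as '90', '5 mins' or '2 polls' to seconds."""
    parts = str(spec).split()
    amount = float(parts[0])
    unit = parts[1] if len(parts) > 1 else "secs"  # seconds are the default
    if unit == "polls":
        return amount * poll_seconds  # scaled by the primary agent's poll time
    return amount * UNIT_SECONDS[unit]


def activation_times(waits, poll_seconds):
    """Return when each response fires, in seconds after entering the state."""
    elapsed = 0.0
    times = []
    for spec in waits:
        elapsed += wait_to_seconds(spec, poll_seconds)  # waits are cumulative
        times.append(elapsed)
    return times


# Response #1 fires at once, #2 after 2 polls, #3 a further 5 minutes later.
print(activation_times(["0", "2 polls", "5 mins"], poll_seconds=120))
# [0.0, 240.0, 540.0]
```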
- | command or to notify someone. | + | ==== Escalation/acknowledgement ==== |
- | These responses are all optional. You can specify all three, or one, or even none. If | + | Another way to respond to an alert is simply to wait for a while to see if the problem corrects itself or more information is received, then to change to another state at the end of that period. |
- | no responses or escalation period are specified here, the sentry will remain in this | + | [[Image:sentinelfigure29.jpg|frame|Figure 29 — Defining the escalation condition for a state]] |
- | state until and unless the evaluation condition evaluates to a different severity level. | + | |
- | 126 Configuring Sentries and Agents | + | Change to a more severe state if the problem would normally be expected to resolve itself either spontaneously or by the running of the automatic responses. If the sentry is still in this state at the end of the waiting period it suggests some other action |
- | If a sentry changes state while it is waiting to process a response, then all responses | + | |
- | for that state are cancelled, and any responses for the new state are started. | + | |
- | Wait (secs) The length of time to wait after running the previous response, or | + | |
- | if this is the first response, after entering this state. Each period is | + | |
- | cumulative. In other words the period for Response #2 is counted | + | |
- | from the end of the period for Response #1. | + | |
- | New severity After the wait period, change the appearance of the sentry on the | + | |
- | console to this new severity level. This would usually be done to | + | |
- | trigger a global or sentry-level notification message or to increase | + | |
- | the apparent urgency of the event by making the icon flash or | + | |
- | change color. Note that changing the severity does not change the | + | |
- | state of the sentry. Select unchanged if you wish to leave the | + | |
- | severity level as it is. | + | |
- | Notification Click to specify who will be notified if the sentry is still in this | + | |
- | state at the end of the wait period. | + | |
- | In the Type field, select one of the following: | + | |
- | default Use the default notification list for this sentry. | + | |
- | specify In the Who field, choose the names of one or more users. | + | |
- | These are in addition to the default notification level for this sentry | + | |
- | (see under Advanced options on page 116). | + | |
- | If you don’t want to do any additional notification for this | + | |
- | response, select none. | + | |
- | Command Enter a command to be run by the Host Monitor. This would | + | |
- | usually attempt to fix the problem. Example: a ‘free disk space’ | + | |
- | sentry could archive files to an offline storage device or remove | + | |
- | files such as core and *.o that are deemed expendable. | + | |
- | Fire agent Choose an agent to be run. This agent will be polled immediately | + | |
- | at the end of the wait period. To poll a specific instance only, enter | + | |
- | the instance name in brackets after the agent name. Example: | + | |
- | Filesystem(/tmp). | + | |
- | Configuring Sentries and Agents 127 | + | |
- | Escalation/acknowledgement | + | |
- | Another way to respond to an alert is simply to wait for a while to see if the problem | + | |
- | corrects itself or more information is received, then to change to another state at the | + | |
- | end of that period. | + | |
- | Figure 29 — Defining the escalation condition for a state | + | |
- | Change to a more severe state if the problem would normally be expected to resolve | + | |
- | itself either spontaneously or by the running of the automatic responses. If the sentry | + | |
- | is still in this state at the end of the waiting period it suggests some other action | + | |
must be taken. | must be taken. | ||
- | Change to a less severe state (typically, normal state) if the problem appears to | + | |
- | have been a one-off event. For example, the Bad_SU sentry goes into warning | + | Change to a less severe state (typically, normal state) if the problem appears to have been a one-off event. For example, the Bad_SU sentry goes into warning state if a failed su attempt is detected. If no further failed su attempts are detected by the end of the waiting period no action need be taken and the sentry can be returned to normal state. |
- | state if a failed su attempt is detected. If no further failed su attempts are detected | + | |
- | by the end of the waiting period no action need be taken and the sentry can be | + | The change of state may depend on manual confirmation from an operator (Acknowledgement) or it may happen automatically (Escalation). |
- | returned to normal state. | + | ;Wait: The length of time to wait after running the previous response, or if there are no previous responses, after entering this state. The format and meaning of this field are the same as for <b>Responses</b>, above. |
- | The change of state may depend on manual confirmation from an operator | + | ;Go to state: Choose the new state to change to. |
- | (Acknowledgement) or it may happen automatically (Escalation). | + | ;Type: Select acknowledge if the change of state depends on manual confirmation from an operator. Select escalation if the change of state should happen automatically at the end of the waiting period. Typically "acknowledgement" is a change to a less severe state, and "escalation" is to a more severe state. |
- | Wait (secs) The length of time to wait after running the previous response, or | + | |
- | if there are no previous responses, after entering this state. If the | + | When you have finished defining responses and escalation details, click Accept to return to the main Add State form. |
- | waiting period is set to 0 seconds, the response (either escalation | + | |
- | or the appearance of the acknowledgement icon on the | + | If you have finished defining this state, click Accept to save it and return to the ‘State Details’ window. |
- | console) will occur as soon as the sentry enters this state. | + | |
- | Go to state Choose the new state to change to. | + | ==== Constants ==== |
- | Type Select acknowledge if the change of state depends on manual | + | Click next to the Constants field to maintain the list of constants and thresholds for this sentry. You can add a new constant, change the details of an existing constant, or adjust the threshold values at which the sentry changes from one state to another. For details about all these tasks, see [[Maintaining Constants and Thresholds]]. |
- | confirmation from an operator. Select escalation if the change | + | |
- | of state should happen automatically at the end of the waiting | + | == Adding Instance Groups == |
- | period. | + | A multi-instance sentry can optionally have a set of <b>Instance Groups</b>, a feature which allows certain instances to use different configurations from that defined in the sentry. The configurations that can be set at the Instance Group level are: |
- | 128 Configuring Sentries and Agents | + | * How the instance appears on the console (icon and label) |
- | When you have finished defining responses and escalation details, click Accept to | + | * Constants / Threshold values used in state conditions |
- | return to the main Add State form. | + | * The user(s) to notify for that instance |
- | If you have finished defining this state, click Accept to save it and return to the | + | An instance is assigned to an Instance Group in one of two ways: |
- | ‘State Details’ window. | + | * Statically, using the <b>Group</b> field in the Instance form, or |
- | Constants | + | * Dynamically, using the "Assign if" condition. |
- | Click next to the Constants field to maintain the list of constants and thresholds | + | Once an instance is assigned to an Instance Group, the configuration for that group is used, overriding the configuration defined in the sentry. |
- | for this sentry. You can add a new constant, change the details of an existing | + | |
- | constant, or adjust the threshold values at which the sentry changes from one state | + | <b>Notes</b>: |
- | to another. For details about all these tasks, see Maintaining Constants and Thresholds | + | # The assignment to an instance group happens only once, when the Host Monitor starts. |
- | on page 140. | + | # If an instance is not assigned to a group, the default configuration from the sentry is used. |
- | Configuring Sentries and Agents 129 | + | |
- | Adding an Action or Report | + | To maintain the Instance Groups of a sentry: |
- | A sentry can have several associated actions, which an operator can choose to run | + | # From the ‘Sentries’ window, select the sentry. |
- | from the console. Actions may either be tied to particular states, or can be made | + | # Select Maintain > Instance groups. The ‘Instance Groups of sentry <sentry_name>’ window opens. |
- | available when the sentry is in any state. There are two types: actions typically are used | + | # Select Maintain > Add. The ‘Add instance group (Sentry <sentry_name>)’ form opens. |
- | to try to fix a problem; reports display output on the screen and help the operators to | + | |
- | diagnose the problem. You can assist operators by explaining in the monitoring | + | ;Group: Enter the name of the Instance Group. |
- | notes for the sentry or state when and how each action should be used. | + | ;Description: Enter the description of the group. |
- | When designing an action for a multi-instance sentry you can set it up to run for | + | ;Icon: To use a different icon on the Sentinel3G console for instances of this group, enter it here. |
- | selected instances or for every instance in a parent folder. For example, you can set | + | ;Label: To have a different label on the Sentinel3G console for instances of this group, enter it here. The label may contain the variables $Instance, $Group or any "raw" agent variable. |
- | up an action so that the output for all instances is combined into one report. | + | ;Assign if: If instances are to be dynamically assigned to the group, enter a Tcl boolean expression. The expression is evaluated for each instance of the sentry, and when true, the instance is assigned to this group. The expression may contain the variable $Instance and/or any "raw" agent variable. |
- | 1. From the ‘All Sentries’ window, select the sentry. | + | ;Constants: Click this field to override some or all of the constants (thresholds). |
- | 2. Select Maintain > Actions. The ‘Actions for sentry <sentry_name>’ | + | ;Notification: Click this field to override the users to be notified about instances in this group. |
- | window opens. | + | |
- | 3. Select Maintain > Add. The ‘Add actions for sentry <sentry_name>’ | + | Click Accept to save the group. |
- | form opens.(Tip: If a similar action has already been defined for a sentry | + | |
- | that uses the same agent, it may be faster to use Maintain > Copy.) | + | == Adding an Action or Report == |
- | Figure 30 — Action details form | + | A sentry can have several associated actions, which an operator can choose to run from the console. Actions may either be tied to particular states, or can be made available when the sentry is in any state. There are two types: actions typically are used to try to fix a problem; reports display output on the screen and help the operators to diagnose the problem. You can assist operators by explaining in the monitoring notes for the sentry or state when and how each action should be used. |
- | 130 Configuring Sentries and Agents | + | |
- | Action Enter a name for the action. This is the name that will appear in | + | When designing an action for a multi-instance sentry you can set it up to run for selected instances or for every instance in a parent folder. For example, you can set up an action so that the output for all instances is combined into one report. |
- | the list of actions that the operator can select from. | + | #From the ‘Sentries’ window, select the sentry. |
- | Type Select whether or not you want the output from the command to | + | #Select Maintain > Actions. The ‘Actions for sentry <sentry_name>’ window opens. |
- | be displayed on the operator’s screen: | + | #Select Maintain > Add. The ‘Add actions for sentry <sentry_name>’ form opens. (Tip: If a similar action has already been defined for a sentry that uses the same agent, it may be faster to use Maintain > Copy.) |
- | action Simply runs the command without displaying any output. Example: | + | |
- | starting a service when it is stopped. | + | [[Image:sentinelfigure30.jpg|frame|Figure 30 — Action details form]] |
- | report Displays the command’s output on the screen. | + | |
- | Command Enter the command, using UNIX shell syntax. The command can | + | ;Action: Enter a name for the action. This is the name that will appear in the list of actions that the operator can select from. |
- | make use of the variables $Sentry, $Host, and $Action, | + | ;Type: Select whether or not you want the output from the command to be displayed on the operator’s screen: |
- | which will be set in the environment when the action is run. For | + | :;action: Simply runs the command without displaying any output. Example: starting a service when it is stopped. |
- | multi-instance sentries you can also refer to $Instance, which | + | :;report: Displays the command’s output on the screen. |
- | contains the instance name. To use agent variables in the | + | ;Command: Enter the command, using UNIX shell syntax. The command can make use of the variables $Sentry, $Host, and $Action, |
- | command, select Uses agent data? below. | + | which will be set in the environment when the action is run. For multi-instance sentries you can also refer to $Instance, which |
- | This example shows how to define a simple report for a singleinstance | + | contains the instance name. To use agent variables in the command, select Uses agent data? below. |
- | sentry: | + | :This example shows how to define a simple report for a single-instance sentry: |
- | echo -n "Report '$Action' "; date; echo " | + | <code>echo -n "Report '$Action' "; date; echo "</code> |
- | Sentry: $Sentry"; echo " Host: $Host"; | + | <code>Sentry: $Sentry"; echo " Host: $Host";</code> |
- | When the report is run it will display the name of this action, the | + | :When the report is run it will display the name of this action, the date, and the name of the sentry and the host it runs on. For more details and examples of actions and reports, see [[Actions]]. |
- | date, and the name of the sentry and the host it runs on. For more | + | ;Display command: If Type is report, enter a command to display the output from the Command (examples: scroll, db_scroll, db_graph). The default is Sentinel3G’s own browser widget. If the action is run on several instances, all the output from all the |
- | details and examples of actions and reports, see Actions on page 24. | + | |
- | Display command | + | |
- | If Type is report, enter a command to display the output from | + | |
- | the Command (examples: scroll, db_scroll, db_graph). | + | |
- | The default is Sentinel3G’s own browser widget. If the | + | |
- | action is run on several instances, all the output from all the | + | |
commands will be piped to the same display command. | commands will be piped to the same display command. | ||
- | Display command is optional if Command handles the | + | :Display command is optional if Command handles the displaying of the data itself. |
- | displaying of the data itself. | + | ;Run as user: Choose a user name from the password file on this host. The command will run with the privileges of this user. The default account is root. Example: some RDBMS packages require that certain administrative commands be run from a special DB admin |
- | Run as user Choose a user name from the password file on this host. The | + | |
- | command will run with the privileges of this user. The default | + | |
- | account is root. Example: some RDBMS packages require that | + | |
- | certain administrative commands be run from a special DB admin | + | |
account. | account. | ||
- | Configuring Sentries and Agents 131 | + | ;In state(s): You can make this action available in only certain states. Example: the Services sentry has Stop and Restart actions that are available when a service is in a state that indicates it is running, and a Start action when it is not running. |
- | In state(s) You can make this action available in only certain states. Example: | + | :Click to choose one or more states. Leave this field blank to make the action available at all times. |
- | the Services sentry has Stop and Restart actions that are | + | ;Access role: If set, only users with the specified role can perform this action. If blank, all Sentinel3G users who have the action capability may perform this action. |
- | available when a service is in a state that indicates it is running, and | + | ;Authenticate?: Tick this field to ask for the operator’s password before running the action. |
- | a Start action when it is not running | + | ;Uses agent data?: Tick this checkbox if you wish to use any of the agent variables in the Command or Display command fields. This gives the commands access to the same primary agent variables as the sentry. |
- | Click to choose one or more states. Leave this field blank to | + | ;Reads from STDIN?: If Uses agent data? is ticked, use this field to specify where the action can find the data: |
- | make the action available at all times. | + | :;no: Command will expect the variables to be set in the environment and accessed by name (e.g. $pct_free). |
- | Access role If set, only users with the specified role can perform this action. If | + | :;yes: The data will be passed from STDIN in Functional Database format. Use this option if you wish to manipulate the data using the Functional Toolset. |
- | blank, all Sentinel3G users who have the action capability may | + | ;Fire agent: Tick this checkbox if you wish the sentry’s agent to be polled after the action has been run. |
- | perform this action. | + | ;Export to parent?: Tick this checkbox if you wish the action to be available from the parent folder of this sentry. If the action is exported, operators will be able to choose this action both in relation to a selected sentry and for all sentries in the parent folder. |
- | Authenticate? Tick this field to ask for the operator’s password before running | + | :Example: a Free Space report that displays details for a selected filesystem (single sentry) or all user filesystems on a host (parent folder). |
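As an illustration of the Command field for a report, the following is a minimal sketch only. It assumes the variables Sentry, Host, Action and Instance are present in the environment as described above, and it uses pct_free as a hypothetical agent variable that would be available only when Uses agent data? is ticked and Reads from STDIN? is set to no. The function name free_space_report is invented for this example.

```shell
#!/bin/sh
# Sketch of a report command for a multi-instance sentry.
# Sentry, Host, Action and Instance are expected in the environment when the
# action runs (see the Command field description).
# pct_free is an ASSUMED agent variable, used here purely for illustration.
free_space_report() {
    printf "Report '%s'\n" "${Action:-unknown}"
    printf 'Sentry: %s\n' "${Sentry:-unknown}"
    printf 'Host: %s\n' "${Host:-unknown}"
    # Instance is only meaningful for multi-instance sentries.
    if [ -n "${Instance:-}" ]; then
        printf 'Instance: %s\n' "$Instance"
    fi
    # Print the agent value only if it was passed in the environment.
    if [ -n "${pct_free:-}" ]; then
        printf 'Free: %s%%\n' "$pct_free"
    fi
}

date
free_space_report
```

Falling back to a default such as unknown keeps the report readable if it is run by hand outside Sentinel3G, where none of these variables are set.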
- | the action. | + | |
- | Uses agent data? | + | Click Accept to save the action. |
- | Tick this checkbox if you wish to use any of the agent variables in | + | |
- | the Command or Display command fields. This gives the | + | To test the action from the console, select the sentry and then select Sentry > Action. |
- | commands access to the same primary agent variables as the | + | |
- | sentry. | + | == Adding a Realtime Graph == |
- | Reads from STDIN? | + | |
- | If Uses agent data? is ticked, use this field to specify where | + | |
- | the action can find the data: | + | |
- | no Command will expect the variables to be set in the environment | + | |
- | and accessed by name (e.g. $pct_free). | + | |
- | yes The data will be passed from STDIN in Functional Database format. | + | |
- | Use this option if you wish to manipulate the data using the | + | |
- | Functional Toolset. | + | |
- | Fire agent Tick this checkbox if you wish the sentry’s agent to be polled after | + | |
- | the action has been run. | + | |
- | Export to parent? | + | |
- | Tick this checkbox if you wish the action to be available from the | + | |
- | parent folder of this sentry. If the action is exported, operators will | + | |
- | be able to choose this action both in relation to a selected sentry | + | |
- | and for all sentries in the parent folder. | + | |
- | 132 Configuring Sentries and Agents | + | |
- | Example: a Free Space report that displays details for a selected | + | |
- | filesystem (single sentry) or all user filesystems on a host (parent | + | |
- | folder). | + | |
- | 4. Click Accept to save the action. | + | |
- | To test the action from the console, select the sentry and then select Sentry > | + | |
- | Action. | + | |
- | Configuring Sentries and Agents 133 | + | |
- | Adding a Realtime Graph | + | |
Realtime graphs plot recent values returned by selected variables for a sentry. | Realtime graphs plot recent values returned by selected variables for a sentry. | ||
- | 1. From the ‘All Sentries’ window, select the sentry. | + | |
- | 2. Select Maintain > Realtime graphs. The ‘Realtime graphs for | + | [[Image:sentinelfigure31.jpg|frame|Figure 31 — Example: displaying free disk space as a stack graph]] |
- | sentry <sentry_name>’ window opens. | + | |
- | 3. Select Maintain > Add. The ‘Add realtime graphs for sentry | + | #From the ‘All Sentries’ window, select the sentry. |
- | <sentry_name>’ form opens. | + | #Select Maintain > Realtime graphs. The ‘Realtime graphs for sentry <sentry_name>’ window opens. |
- | (Tip: If a similar graph has already been defined for a sentry that uses the | + | #Select Maintain > Add. The ‘Add realtime graphs for sentry <sentry_name>’ form opens. |
- | same agent, it may be faster to use Maintain > Copy.) | + | |
- | Figure 31 — Example: displaying free disk space as a stack graph | + | ---- |
- | 134 Configuring Sentries and Agents | + | ;Tip: If a similar graph has already been defined for a sentry that uses the same agent, it may be faster to use Maintain > Copy. |
- | Specify attributes of the graph | + | ---- |
- | Graph type Select the type of graph you wish to use to display the data: | + | |
- | line line graphs are useful for gauging trends. | + | === Specify attributes of the graph === |
- | bar bar graphs are useful for comparing variables within one observation | + | |
- | or comparing adjacent observations. | + | ;Graph type: Select the type of graph you wish to use to display the data: |
- | stack stack graphs are typically used where all values add up 100%, | + | :;line: line graphs are useful for gauging trends. |
- | Example: CPU usage = %user + %system + %idle | + | :;bar: bar graphs are useful for comparing variables within one observation or comparing adjacent observations. |
+ | :;stack: stack graphs are typically used where all values add up to 100%. Example: CPU usage = %user + %system + %idle | ||
+ | |||
Figure 32 shows the same data presented using each type of graph: | Figure 32 shows the same data presented using each type of graph: | ||
- | Figure 32 — Sample disk space data shown as line, bar, and stack graphs | + | |
- | Polls displayed The number of values to display across the X-axis of the graph. | + | [[Image:sentinelfigure32.jpg|frame|Figure 32 — Sample disk space data shown as line, bar, and stack graphs]] |
- | For example, if you enter 3, values from the last three polls will be | + | |
- | displayed. | + | ;Polls displayed: The number of values to display across the X-axis of the graph. For example, if you enter 3, values from the last three polls will be displayed. |
- | Line graph… | + | |
- | …Stack graph | + | The first two fields control the scale on the graph’s Y-axis. If Min value and Max value are not specified, the Y-axis will be sized to the current minimum and maximum data value. This means the scale may change as new values are graphed. To |
- | Bar graph… | + | |
- | Configuring Sentries and Agents 135 | + | |
- | The first two fields control the scale on the graph’s Y-axis. If Min value and Max | + | |
- | value are not specified, the Y-axis will be sized to the current minimum and maximum | + | |
- | data value. This means the scale may change as new values are graphed. To | + | |
keep the scale constant, set both Min value and Max value. | keep the scale constant, set both Min value and Max value. | ||
- | Use close minimum and maximum values if you want to focus on relatively small | + | |
- | differences among data values. For example, if a set of variables is mainly of interest | + | Use close minimum and maximum values if you want to focus on relatively small differences among data values. For example, if a set of variables is mainly of interest when the values are clustered near 100%, a minimum value of 90 will help to separate them. |
- | when the values are clustered near 100%, a minimum value of 90 will help to separate | + | |
- | them. | + | ;Min value: The minimum value to display next to the Y-axis. If the values will always be positive, set Min value=0. |
- | Min value The minimum value to display next to the Y-axis. If the values will | + | ;Max value: The maximum value to display next to the Y-axis. |
- | always be positive, set Min value=0. | + | ;Scale to max?: (For stack graphs only) Tick this checkbox if you want the values to be scaled so that their sum equals Max value. This is useful where the total adds up approximately to Max value. Scaling ensures a flat top to the stack. |
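The effect of Scale to max? can be pictured with a little arithmetic. The sketch below assumes the scaling is proportional (each value is multiplied by Max value divided by the sum of the values); this page does not state the exact method Sentinel3G uses, so treat it as an illustration only. The function name scale_to_max is invented for the example.

```shell
#!/bin/sh
# scale_to_max MAX v1 v2 ... : print each value rescaled so the values sum to MAX.
# ASSUMPTION: proportional scaling; the product's actual method is not documented here.
scale_to_max() {
    max=$1
    shift
    printf '%s\n' "$@" | awk -v max="$max" '
        { v[NR] = $1; sum += $1 }          # remember every value and the total
        END {
            for (i = 1; i <= NR; i++)
                printf "%.1f\n", (sum > 0 ? v[i] * max / sum : 0)
        }'
}

# Three CPU percentages that add up to 98 rather than 100:
scale_to_max 100 62 25 11
```

With these inputs the three scaled values add up to 100, which is what gives the stack its flat top.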
- | Max value The maximum value to display next to the Y-axis. | + | |
- | Scale to max? (For stack charts only) Tick this checkbox if you want the values | + | === Specify variables === |
- | to be scaled so that their sum equals Max value. This is useful | + | |
- | where the total adds up approximately to Max Value. Scaling | + | You can now choose the names of up to five agent variables whose values are to be graphed. |
- | ensures a flat top to the stack. | + | |
- | Specify variables | + | Click next to the Variable details field to specify the attributes of the chosen variables. For example, Figure 33 shows the details for two variables called MBfree and MBused. |
- | You can now choose the names of up to five agent variables whose values are to be | + | |
- | graphed. | + | [[Image:sentingfigure33.jpg|frame|Figure 33 — Example of a realtime graph that plots two variables]] |
- | Click next to the Variable details field to specify the attributes of the | + | |
- | chosen variables. For example, Figure 33 shows the details for two variables called | + | |
- | MBfree and MBused. | + | |
- | 136 Configuring Sentries and Agents | + | |
- | Figure 33 — Example of a realtime graph that plots two variables | + | |
For each variable, enter the following details: | For each variable, enter the following details: | ||
- | Color Select the color to be used to display this variable. | + | ;Color: Select the color to be used to display this variable. |
- | Label If you wish you can change the default label displayed for this | + | ;Label: If you wish, you can change the default label displayed for this variable. For example, if you wish to scale down a value for free disk space by a factor of 1000, you could also change the label to read GB (gigabytes) instead of MB (megabytes). |
- | variable. For example, if you wish to scale down a value for free | + | ;Scale by: This is an optional scaling factor. The values displayed will be multiplied by this factor. Use this to convert very large or small numbers to more manageable units. Example: specify 0.001 to divide the reported values by 1000. |
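As a quick check of the Scale by arithmetic, the following sketch multiplies a reported value by the factor, which is all the field description implies. The function name scale_by and the sample value are invented for this example.

```shell
#!/bin/sh
# scale_by FACTOR VALUE : print VALUE multiplied by FACTOR, to two decimal places.
scale_by() {
    awk -v f="$1" -v v="$2" 'BEGIN { printf "%.2f\n", v * f }'
}

# A hypothetical MBfree reading of 20480 MB, relabelled as GB with a factor of 0.001:
scale_by 0.001 20480
```

Here 20480 is displayed as 20.48, so the label for the variable would be changed from MB to GB to match.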
- | disk space by a factor of 1000, you could also change the label to | + | |
- | read GB (gigabytes) instead of MB (megabytes). | + | |
- | Scale by This is an optional scaling factor. The values displayed will be | + | |
- | multiplied (scaled up) by this factor. Use this to convert very large | + | |
- | or small numbers to more manageable units. Example: specify | + | |
- | 0.001 to divide the reported values by 1000. | + | |
Click Return to save the variable details. | Click Return to save the variable details. | ||
- | Configuring Sentries and Agents 137 | + | |
- | Specify threshold markers | + | === Specify threshold markers === |
You can now specify up to four markers to be superimposed over the data values. | You can now specify up to four markers to be superimposed over the data values. | ||
- | Each marker is displayed as a colored horizontal line and represents a state threshold | + | |
- | or other significant value. You can specify both constants associated with this | + | Each marker is displayed as a colored horizontal line and represents a state threshold or other significant value. You can either choose constants associated with this sentry or enter arbitrary integers or floating-point numbers (such as 20, 40, 60, |
- | sentry and enter arbitrary integers or floating-point numbers (such as 20, 40, 60, | + | |
80). | 80). | ||
- | Click next to the Threshold markers field to specify up to four markers. | + | Click next to the Threshold markers field to specify up to four markers. For each marker, specify these details: |
- | For each marker, specify these details: | + | ;At value: Enter a floating point number, or click to choose one of the constant values defined for this sentry. See [[Maintaining Constants and Thresholds]]. |
- | At value Enter a floating point number, or click to choose one of the | + | ;Color: Select the color to be used to display this threshold. Remember to use a different color from those used to graph the variables. Use the Test graph option to see which colors show up best. |
- | constant values defined for this sentry. See Maintaining Constants and | + | :If the threshold is equivalent to a boundary between states, it may be helpful to use the color of the severity level for the higher state. For example, if the constant LOW is the boundary between normal and warning state, and the sentry goes orange when it is in warning state, use orange as the color of the threshold marker. |
- | Thresholds on page 140. | + | |
- | Color Select the color to be used to display this threshold. Remember to | + | [[Image:sentinelfigure34.jpg|frame|Figure 34 — Generating a Test graph to check colors and thresholds]] |
- | use a different color from those used to graph the variables. Use | + | |
- | the Test graph option to see which colors show up best. | + | When you have finished specifying markers, click Return to return to the ‘Add realtime graphs for sentry <sentry_name>’ form. |
- | If the threshold is equivalent to a boundary between states, it may | + | |
- | be helpful to use the color of the severity level for the higher state. | + | === Test the graph === |
- | For example, if the constant LOW is the boundary between | + | Now you can test the appearance of the graph. Click next to the Test graph field to generate a graph based on the settings in the form and the most recent data returned by the agent on this host. |
- | normal and warning state, and the sentry goes orange when it is in | + | |
- | warning state, use orange as the color of the threshold marker. | + | ---- |
- | 138 Configuring Sentries and Agents | + | ;Note: The host monitor must be running and the agent must be returning valid data. |
- | Figure 34 — Generating a Test graph to check colors and thresholds | + | ---- |
- | When you have finished specifying markers, click Return to return to the ‘Add | + | |
- | realtime graphs for sentry <sentry_name>’ form. | + | |
- | Test the graph | + | You can display several graphs at once by experimenting with different settings and clicking Test graph again. If this is a multi-instance sentry you can test different instances. |
- | Now you can test the appearance of the graph. Click next to the Test graph | + | |
- | field to generate a graph based on the settings in the form and the most recent data | + | |
- | returned by the agent on this host. | + | |
- | Note The host monitor must be running and the agent must be returning | + | |
- | valid data. | + | |
- | You can display several graphs at once by experimenting with different settings and | + | |
- | clicking Test graph again. If this is a multi-instance sentry you can test different | + | |
- | instances. | + | |
When you are finished with each graph press F3 to dismiss it. | When you are finished with each graph press F3 to dismiss it. | ||
- | Save the graph details | + | |
- | Graph name Enter a unique name to identify this graph. | + | === Save the graph details === |
- | Threshold | + | |
- | markers | + | ;Graph name: Enter a unique name to identify this graph. |
- | 0.75 | + | ;Description: Enter a description that explains what this graph will show or when it should be used. This will help operators to select the correct graph to diagnose problems. |
- | 0.5 | + | ;Title: The title that appears in the heading of the graph. It can contain plain text, a variable such as $Instance (for a multi-instance sentry, the name of this instance), or a combination of the two. |
- | Configuring Sentries and Agents 139 | + | ;Export to parent?: Tick this checkbox if you wish the graph to be available from the parent folder of this sentry. If the graph is exported, operators will be able to choose this graph when they select the parent folder. |
- | Description Enter a description that explains what this graph will show or | + | |
- | when it should be used. This will help operators to select the | + | Click Accept to save the graph. |
- | correct graph to diagnose problems. | + | |
- | Title The title that appears in the heading of the graph. It can contain | + | To test the graph from the console, restart the host monitor, select the sentry and then select Report > Realtime graph. |
- | plain text, a variable such as $Instance (for a multi-instance | + | |
- | sentry, the name of this instance), or a combination of the two. | + | == Maintaining Constants and Thresholds == |
- | Export to parent? | + | Constants are like variables, but they are associated with a sentry rather than an agent. You can use constants in a state’s Entry condition field to define thresholds between states, and as a visual aid on realtime graphs. |
- | Tick this checkbox if you wish the graph to be available from the | + | |
- | parent folder of this sentry. If the graph is exported, operators will | + | Example of use: You create a sentry and its states. Some of the states have an entry condition that compares the current data value from the agent with a constant such as VERY_LOW. You clone the sentry. The same set of states is shared between the old and new sentry, but you set the constant VERY_LOW to different a value in each sentry. |
- | be able to choose this graph when they select the parent folder. | + | |
- | 4. Click Accept to save the graph. | + | === To display the constants for a sentry === |
- | To test the graph from the console, restart the host monitor, select the sentry and | + | #From the ‘All Sentries’ window, select the sentry. |
- | then select Report > Realtime graph. | + | #Select Maintain > Constants. The ‘Constants for sentry <sentry_name>’ window opens. |
- | 140 Configuring Sentries and Agents | + | |
- | Maintaining Constants and Thresholds | + | === To add a constant for a sentry === |
- | Constants are like variables, but they are associated with a sentry rather than an | + | #Select Maintain > Add. The ‘Add Constants’ form opens. |
- | agent. You can use constants in a state’s Entry condition field to define | + | #Enter the following fields: |
- | thresholds between states, and as a visual aid on realtime graphs. | + | #;Constant: Enter a name for the constant. The convention is to use uppercase letters and underscores only (e.g., HALF_FULL). The name must be different from other constants belonging to this sentry, though another sentry can have a constant with the same name. |
- | Example of use: You create a sentry and its states. Some of the states have an entry | + | #;Value: Set the value of the constant (examples: 3; 0.5; true). |
- | condition that compares the current data value from the agent with a constant such | + | #;Comment: Enter an optional comment. |
- | as VERY_LOW. You clone the sentry. The same set of states is shared between the | + | #;Group override?: Can an instance group override the value of this constant? If this option is set to yes, the value of this constant always applies to any instance that uses it. If this option is set to no, the value set in an instance group can override the value set here. |
- | old and new sentry, but you set the constant VERY_LOW to different a value in each | + | #Click Accept to save the constant. |
- | sentry. | + | === To adjust the values of a sentry’s constants === |
- | To display the constants for a sentry | + | You can adjust the values of all the constants belonging to a sentry. You can use this to fine tune the thresholds at which a sentry changes from one state to another. |
- | 1. From the ‘All Sentries’ window, select the sentry. | + | #From the ‘Constants for sentry <sentry_name>’ window, select Maintain > Change values. |
- | 2. Select Maintain > Constants. The ‘Constants for sentry | + | #Change the values next to any of the constants. |
- | <sentry_name>’ window opens. | + | #Click Accept to save the new values. |
- | To add a constant for a sentry | + | |
- | 1. Select Maintain > Add. The ‘Add Constants’ form opens. | + | <b>Note</b>: If the sentry has Instance Groups defined, you may also need to change the values of constants defined there. |
- | 2. Enter the following fields: | + | |
- | Configuring Sentries and Agents 141 | + | == Running Agents or Sentries in Test Mode == |
- | Constant Enter a name for the constant. The convention is to use uppercase | + | When you are developing sentries, you leave them “switched off” until you are ready to move them to production mode. You can use the command "hostmon -A" to test agents, and the command "hostmon -T" to test sentries (and their agents), even if they are off or are in KBs that are off. |
- | letters and underscores only (e.g., HALF_FULL). The name | + | |
- | must be different from other constants belonging to this sentry, | + | Running the sentries in test mode will attempt to start all the agents required by the selected sentries and display status messages, including any error messages. You can correct any configuration problems and retest the sentries. When you are satisfied that the sentries will work correctly, you can change their condition to on. |
- | though another sentry can have a constant with the same name. | + | |
- | Value Set the value of the constant (examples: 3; 0.5; true). | + | |
- | Comment Enter an optional comment. | + | |
- | Group override? Can an instance group override the value of this constant? If this | + | |
- | option is set to yes, the value of this constant always applies to any | + | |
- | instance that uses it. If this option is set to no, the value set in an | + | |
- | instance group can override the value set here. | + | |
- | 3. Click Accept to save the constant. | + | |
- | To adjust the values of a sentry’s constants | + | |
- | You can adjust the values of all the constants belonging to a sentry. You can use this | + | |
- | to fine tune the thresholds at which a sentry changes from one state to another. | + | |
- | 1. From the ‘Constants for sentry <sentry_name>’ window, select | + | |
- | Maintain > Change values. | + | |
- | 142 Configuring Sentries and Agents | + | |
- | 2. Change the values next to any of the constants. | + | |
- | 3. Click Accept to save the new values. | + | |
- | Configuring Sentries and Agents 143 | + | |
- | Running Sentries in Test Mode | + | |
- | When you are developing sentries, you leave them “switched off ” until you are | + | |
- | ready to move them to production mode. You can use the hostmon -T command | + | |
- | to test sentries even if they are off, or are in KBs that are off. | + | |
- | Running the sentries in test mode will attempt to start all the agents required by the | + | |
- | selected sentries and display status messages including an error messages. You can | + | |
- | correct any configuration problems and retest the sentries. When you are satisfied | + | |
- | that the sentries will work correctly you can change their condition to on. | + | |
To test sentries, start a Sentinel3G shell then run hostmon -T <sentries…>. | To test sentries, start a Sentinel3G shell then run hostmon -T <sentries…>. | ||
+ | |||
Example: | Example: | ||
+ | <pre> | ||
cos sentinel -c bash | cos sentinel -c bash | ||
hostmon -T Clients Swap_Size | hostmon -T Clients Swap_Size | ||
- | [[Image:Example.jpg]] | + | </pre> |
+ | |||
+ | To test only an agent and verify that its variables are being set as expected, run hostmon -A <agent>. | ||
+ | |||
+ | Example: | ||
+ | <pre> | ||
+ | hostmon -A db_agent | ||
+ | </pre> |
Current revision
This chapter describes how to define sentries to monitor resources and respond to events. The main procedures, in order, are:
- define an agent to collect data about the resource you wish to monitor
- define raw and derived variables to pass key parts of this data to the sentry
- define a sentry to represent the resource on the console, and specify what variables are to be logged
- define all the different states that the sentry may be in, including any automatic responses that may be run to solve problems
- define actions, reports, and graphs that operators may choose to run at the time of an alert
The first topic lists some questions you should look into when preparing to configure sentries.
- Note
- You must have the Admin or Manager role to configure sentries.
Planning your Sentry
When preparing to build sentries it is useful to consider these questions:
What object or resource do you wish to monitor?
What is the purpose of monitoring this resource:
- To alert operations staff when a threshold is reached?
- To display useful information?
- To record information for future analysis?
How can the status of the resource be queried:
- By running a command?
- By checking in a log file?
- By querying a database?
- By querying an SNMP MIB?
- By receiving data from an application via an API?
Does running one command return data about several instances of an object or resource? Multiple instances imply a ‘cloning’ sentry, in which case one of the attributes or variables that uniquely identifies each instance must be designated as the key column. Will every instance be monitored, or should selected instances be filtered out?
What part of the output from the agent do you wish to capture?
- The exit status?
- Some part of a message in a log file?
- Some part of the text output from STDOUT or STDERR?
How can a message of interest be identified (e.g., by its position? by searching for a unique string or pattern)? How can portions of the data destined to be saved in variables be separated from the rest of the output? Can the output be simplified by discarding header lines or a message prefix?
Will there be ‘spikes’ or gaps in the data that must be allowed for? In other words, when the sentry tests a variable, is it sufficient to test just the current value, or would it be more realistic to take an average of several recent values?
How often does the agent need to collect data? You may need to consider a tradeoff between running the command too frequently and affecting system performance, and not running it frequently enough, in which case the information provided may not be current.
What agent data (that is, variables) do you want to appear on the console? How should they be formatted?
What are the thresholds or triggers that cause a sentry to move from one state to another? If the sentry remains in a state for a long time, does that indicate that the problem is getting more serious? Or that the problem may have resolved itself?
Should the sentry be changed to another state after a certain period?
How many different states are of interest? Be careful not to multiply states unnecessarily. For example, if you would not expect Sentinel3G or an operator to respond differently when a sentry is in either of two states, then it’s probably safe to merge one of the states into the other. You can even define an ‘information only’ sentry with no states, which simply logs and displays on the console information supplied by the agent. On the other hand, you may wish to define separate states for reporting purposes. For example, a printer can be idle or printing and both are considered of normal severity, but if you are interested in the ratio of time spent idle to printing, you would need two states to represent this information.
What additional information would be useful to help an operator understand, diagnose, or fix a problem when the sentry is in each state? Is there a command, report, or graph that could be offered to operators to help diagnose the problem?
Is there a standard procedure that should be followed by operators? If so, consider attaching a monitoring notes file to the sentry, state, or agent.
Adding an Agent
Each sentry tests variables supplied by an agent. If the agent has not already been defined you must add it first, then define its variables, before adding the sentry.
- From the console, select Configure > Host monitor.
- If you have not already selected the host to update, Sentinel3G will ask you to choose one now. This is the host that the agent will run on. The ‘All Sentries’ window is displayed, showing details of all the sentries defined on this host.
- Select Tables > Agents. The ‘Agents’ window opens, showing details of all the agents defined on this host.
- Select Maintain > Add. The ‘Add Agent’ form opens.
- Enter these fields:
- Knowledge Base
- The KB that this agent belongs to. Click to choose one of the predefined KBs installed on this host, otherwise leave this field blank if the agent is not associated with a particular KB.
- Agent name
- Enter a unique name for the agent. The name must not contain spaces.
- Class
- Choose the TCL handler. This tells Sentinel3G how to parse the output from the agent command. See Agent Classes and Variables for more details about selecting an agent class.
- API
- An external application sends data via the Sentinel3G API.
- DB
- The agent returns data in Functional Database format, typically as a result of a query on a Functional Database table.
- ExitStatus
- The agent returns the exit status of the command.
- LogFile
- The agent searches in a log file for messages that match a pattern.
- In the Agent options form you specify a select pattern to select records of interest, and an extract pattern for each text string in the record that must be assigned to a variable.
- SNMPPolled
- The agent polls for the current status of a managed object in an SNMP MIB, such as a device or port.
- Text
- The agent returns text data to STDOUT. You can use the Agent options form to filter out extraneous text such as blank lines, header lines, and labels.
- Note Additional classes may be listed depending on which KBs are installed.
- See the documentation accompanying the KB for more information.
- Description
- Enter a description for the agent.
- Command
- If the Class is DB, ExitStatus, or Text, enter the command to be run. Other agent classes don’t require a command as they obtain their data by different means.
- Poll time (secs)
- There are two ways to use the poll time setting.
- A short-running command runs once per poll and terminates immediately after returning its data. Poll time is the time the agent waits before rerunning the command. Note that this is the time between the end of the previous run and the start of the next, so the command won’t run exactly this often. In other words, if the poll time is set to 60 seconds and the command takes about 5 seconds, on average the command will run about every 65 seconds.
- A persistent or ‘long-running’ command simply returns to the agent the latest data accumulated at an interval determined by the poll time. The command must accept the polling frequency as an argument, which you pass in the environment variable $PollTime. Example: you wish a command ntping to return data to the agent every 60 seconds. ntping has a flag -p that sets the polling frequency. Set Poll time (secs) to 60 and the Command field to: exec ntping -sentinel -p$PollTime <other_arguments>
- Note Avoid running the command too frequently if testing shows that it may degrade system performance.
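The timing of a short-running command described above can be sketched as simple arithmetic. This is only an illustration in Python, not Sentinel3G code; the function names are hypothetical.

```python
def effective_interval(poll_time_secs: float, command_run_secs: float) -> float:
    """Approximate time between successive starts of a short-running command.

    The agent waits poll_time_secs after the command ends before rerunning
    it, so each cycle lasts roughly the wait plus the command's run time.
    """
    return poll_time_secs + command_run_secs


def runs_per_hour(poll_time_secs: float, command_run_secs: float) -> float:
    """Approximate number of command runs per hour for the same settings."""
    return 3600.0 / effective_interval(poll_time_secs, command_run_secs)


# Poll time of 60 seconds with a command that takes about 5 seconds.
print(effective_interval(60, 5))
print(round(runs_per_hour(60, 5), 1))
```

A calculation like this can help when weighing how often to poll against the load the command places on the system.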
Multi-instance agents
- Some agents can return a matrix of values, with each row of the matrix potentially supplying data for one sentry. One example would be an agent that runs the df command to list details of mounted filesystems and returns a sentry instance for each one. An agent that can return a matrix of values and can support multiple sentry instances is called a multi-instance agent. If this is a multi-instance agent, tick the field Multi-instance? and enter these fields also:
- Instance variable
- The instance variable is the key field that uniquely identifies each row. In the df example, the instance variable would be the filesystem name. You cannot add an agent’s variables until the agent itself exists, so type in the variable name now and continue to the next field. You can add the details of the variable later. See Adding Variables Used By the Agent.
- Instance type
- This setting specifies how the list of instances is generated.
- explicit
- Each sentry will explicitly list the names of the instances. For example, a sentry that monitors log files would list the names of the files it will monitor. The agent will ‘gather’ all the instance names from its associated sentries and pass them to the Command as $Instances.
- cloning
- The agent creates or ‘clones’ instances for each row of data returned by the command.
- both
- The final list of instances can include both instances specified explicitly by sentries and instances discovered by the agent. For example, the ProcessInfo agent can be set up always to monitor specific system processes, but also to discover arbitrary application or user processes.
If Instance type is set to cloning or both, you can specify patterns for instance names to be included in or excluded from the list returned by the agent. If Instance type is explicit, Include and Exclude are disabled.
Include and Exclude work in much the same way: the agent gets some data for a particular instance, then, if Include is set, it tests whether the instance name matches any of the patterns. If not it rejects the instance. If the name did match one of the patterns, or if Include is not set, it then matches the instance against the patterns in the Exclude field. If there is a match, the instance is rejected.
- Include
- An optional list of patterns used to select instances to be included.
- Exclude
- An optional list of patterns used to select instances to be excluded. To stop a sentry being created for a particular instance even if a row is returned, enter its name here. For example if the purpose of the agent is to monitor available disk space, you can exclude rows that represent read-only filesystems such as a CD-ROM drive.
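The order in which Include and Exclude are tested can be sketched as follows. This is a minimal Python illustration rather than the agent's actual code. It assumes shell-style glob patterns, because this section does not state which pattern syntax Sentinel3G uses, and the function name and sample instance names are hypothetical.

```python
from fnmatch import fnmatchcase


def keep_instance(name, include=None, exclude=None):
    """Return True if the instance passes the Include and Exclude tests.

    ASSUMPTION: patterns are treated as shell-style globs here; the pattern
    syntax that Sentinel3G itself uses is not stated in this section.
    """
    # If Include is set, the name must match at least one of its patterns.
    if include and not any(fnmatchcase(name, p) for p in include):
        return False
    # Exclude is tested next; any match rejects the instance.
    if exclude and any(fnmatchcase(name, p) for p in exclude):
        return False
    return True


# Hypothetical instance names, as a df-based agent might report them.
filesystems = ["/", "/home", "/var", "/mnt/cdrom"]
kept = [fs for fs in filesystems if keep_instance(fs, exclude=["/mnt/cdrom"])]
print(kept)
```

Because Exclude is tested after Include, an instance that matches both lists is rejected.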
Agent options
You can pass flags and other arguments recognized by this agent class.
- Agent options
- Click to see the available options. These depend on the Class:
- If the class is LogFile, see Agent options for LogFile class.
- If the class is SNMPPolled, see Agent options for SNMPPolled class.
- If the class is Text, see Agent options for Text class.
If no options are available for this class the button is disabled.
- Discovery pgm
- An optional command that is run before the agent starts. Its job is to return an exit status of true or false based on the existence or status of a resource. If the discovery program returns false, this agent and its associated sentries will not be started. For more details and examples, see Discovery program.
- Notes file
- Enter the file name only (without the path) of a notes file in the Sentinel3G doc directory. These notes will be available from the console to operators when monitoring or responding to alerts relating to this agent. Typically the notes file would describe the variables returned by the agent.
Click Accept to save the agent.
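As an illustration of the Discovery pgm field, a discovery check might look something like the sketch below. It assumes that an exit status of 0 is read as true, following the usual shell convention. The function name and the idea of testing for a program on the PATH are hypothetical, so check the Discovery program topic for the exact rules.

```python
import shutil
import sys


def discovery_status(program_name):
    """Return 0 ("true") if program_name can be found, otherwise 1 ("false").

    ASSUMPTION: an exit status of 0 is read as true, following the usual
    shell convention; this section does not spell out the exact rule.
    """
    return 0 if shutil.which(program_name) else 1


# The Python interpreter itself is used as a resource that certainly exists.
print(discovery_status(sys.executable))
```

To use a check like this as a standalone discovery program, pass the value it returns to sys.exit so that it becomes the exit status of the script.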
Agent options for LogFile class
The LogFile agent class checks the contents of an ASCII file (usually a log file) for lines that match a pattern. A line is defined as a string of characters terminated by a newline character. One record in a log file may comprise one or more lines. When a matching line is found, portions of the record can be assigned to one or more variables.
- Logfile name
- The name of the file to be monitored.
- Select pattern
- A regular expression that is used to select records from the log file. Lines matching this pattern will be returned by the agent.
- Record length
- The total number of lines in each record.
- Record offset
- The number of lines by which the first line of the record precedes the matching line.
- Strip initial chars
- Ignore this number of characters at the start of every line. Use this to discard a fixed-length prefix (such as a time-stamp) if it will not be used by the sentry.
- Clear pattern
- Remove any string that matches this regular expression and replace it with a tab. Use this to discard any text or fields that will not be passed by the agent as a variable, or to simplify the Select pattern by removing extraneous or variable-length text from the middle of a line.
- Split data by
- How should the agent assign parts of each matching line to variables? In this field you specify how to break the data into columns. See Assigning Text and Log File Data to Variables. Later when adding variables for this agent you will specify which columns to assign to each variable – see Adding Variables Used By the Agent.
- column
- Don’t split the data into fields. Instead each line will be treated as a string of characters, with the first character being column 1, the second character column 2, and so on. You will use the "c <col>-<col>" format in the Column field to define each variable.
- whitespace
- Break the line into a series of columns separated by white space. Each column is numbered in turn, starting from column 1.
- tab
- Break the line into a series of tokens separated by a single tab character (two tabs in a row define a NULL field between them). Each column is numbered in turn, starting from column 1. Clear pattern should be used to replace an unwanted string with a single tab character before the fields are split.
- pattern
- Specify a regular expression containing at least one extract pattern, each of which is contained in parentheses. The first extract pattern is treated as column 1, the second extract pattern column 2, and so on.
If you selected pattern, specify one or more regular expressions to match patterns in each line of the record.
- Pattern (line 1)
- Extract matching variable(s) from the first line in the record.
- More patterns
- Click to extract data from additional lines in the record. For example, specify a regular expression in the Pattern (line 2) field to extract matching variables from the second line in the record.
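The four Split data by options can be pictured with the following Python sketch. It is an illustration only, not the agent's parser. The function name is hypothetical, and the column range is passed as a simple pair of numbers rather than in the "c <col>-<col>" format used in the Column field.

```python
import re


def split_line(line, mode, pattern=None, col_range=None):
    """Split one line of agent output into values, numbered from column 1.

    mode is one of "column", "whitespace", "tab", or "pattern", mirroring the
    Split data by options. For "column", col_range is a hypothetical
    (first, last) pair of 1-based, inclusive character positions.
    """
    if mode == "column":
        first, last = col_range
        return [line[first - 1:last]]
    if mode == "whitespace":
        # Runs of spaces or tabs count as a single separator.
        return line.split()
    if mode == "tab":
        # Two tabs in a row produce an empty (NULL) field between them.
        return line.split("\t")
    if mode == "pattern":
        match = re.search(pattern, line)
        # Each parenthesized group becomes one column, in order.
        return list(match.groups()) if match else []
    raise ValueError(f"unknown split mode: {mode}")


print(split_line("a\t\tc", "tab"))
print(split_line("17 transactions flagged", "pattern",
                 pattern=r"([0-9]*) transactions flagged"))
```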
Example: selecting a multi-line record from a log file
Here is a fragment from a log file, showing one record of five lines.
    INFILE:/data/MFLA/2001Q2/DACCin/reg3.dat
    Validating file......
    DACCedit V4.1 © 1991–99 TransDACC Ltd.
    17 transactions flagged
    1128 transactions passed
The unique string that identifies this record is the program name, DACCedit, at the start of the third line. The first line containing a variable we wish to keep (the input file) is two lines earlier. If any transactions were flagged we wish the agent to report the file name and the number of transactions that were flagged.
- Set Select pattern to DACCedit
- Set Record length to 5
- Set Record offset to 2 (the first line in the record is two lines before the line containing the select pattern)
- Set Split data by to pattern
- Set Pattern (line 1) to: INFILE:(.*)
- Leave Pattern (line 2) blank
- Leave Pattern (line 3) blank
- Set Pattern (line 4) to: ([0-9]*) transactions flagged
When you have finished setting the LogFile agent options, click Accept to return to the main Add Agent form.
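The settings in this example can be pictured with the following Python sketch, which applies a select pattern, a record length, an offset, and per-line extract patterns to the log fragment shown above. It is a rough illustration of the idea, not the LogFile agent's implementation, and the function and parameter names are hypothetical.

```python
import re

LOG_LINES = [
    "INFILE:/data/MFLA/2001Q2/DACCin/reg3.dat",
    "Validating file......",
    "DACCedit V4.1 © 1991–99 TransDACC Ltd.",
    "17 transactions flagged",
    "1128 transactions passed",
]


def extract_records(lines, select, length, offset, line_patterns):
    """Yield one list of extracted values per record found in lines.

    select        -- regular expression that identifies a record
    length        -- total number of lines in each record
    offset        -- how many lines before the matching line the record starts
    line_patterns -- {line number within record: extract pattern}
    """
    for i, line in enumerate(lines):
        if not re.search(select, line):
            continue
        start = i - offset
        if start < 0 or start + length > len(lines):
            continue  # incomplete record, skip it
        record = lines[start:start + length]
        values = []
        for line_no in sorted(line_patterns):
            match = re.search(line_patterns[line_no], record[line_no - 1])
            values.extend(match.groups() if match else [None])
        yield values


records = list(extract_records(
    LOG_LINES,
    select="DACCedit",
    length=5,
    offset=2,
    line_patterns={1: r"INFILE:(.*)", 4: r"([0-9]*) transactions flagged"},
))
print(records)
```

Lines 2 and 3 of the record have no pattern, so nothing is extracted from them, which matches leaving those Pattern fields blank.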
Agent options for SNMPPolled class
- MIB file name
- The name of the MIB file, which Sentinel3G expects to find in the directory lib/tnm2.1.10/mibs under the COSmanager home directory. If the MIB file is not already stored there, copy it there now.
- Multi-host?
- Select yes if this agent will query multiple hosts, each specified by a separate instance. If this agent does not support multiple instances (that is, if the Multi-instance? field on the Add Agent form is set to no), this field will be disabled.
- IP address
- The IP address or hostname of the host to be queried.
- Port
- The UDP port number or service name used for SNMP queries (usually specified in /etc/services).
- SNMP table
- The name of a table contained in the MIB, which specifies a set of sequences relating to a device being monitored. The agent will “walk the tree” specified in this table to obtain details of each component of the device. For example, if this table specifies that a switch has multiple ports, the agent queries the switch to get the specified details of each port.
- SNMP version
- The version of SNMP that the MIB file conforms to.
- Community
- The SNMP community to identify and validate the sender of SNMP messages (SNMPv1 and SNMPv2c only).
- User
- The user name to identify the sender of SNMP messages (SNMPv2u only).
- Password
- Password corresponding to the user name (SNMPv2u only).
- Timeout (secs)
- The maximum time to wait for a response from the node being polled.
When you have finished setting the SNMPPolled agent options, click Accept to return to the main Add Agent form.
Agent options for Text class
- Record length
- The total number of lines in each record.
- Skip initial lines
- How many lines to skip at the start of the record. You can use this field to skip a repeating title or header.
- Skip initial records
- When the agent starts up, some spurious alerts may be generated from the first couple of polls. For example if the agent is being started during the system boot procedure, the resource being monitored may be under an unusual load from all the other user processes being started, or a large number of events may have accumulated while the agent was not running. You can choose not to process the data collected by the agent in the first few polls. Examples: enter 2 to skip the first two polls; enter 1 to skip only the first poll.
Sentinel3G can extract variables from up to four lines of data. If a record contains more than four lines, you can use the next two fields to discard lines that don’t contain data needed by a sentry, such as blank lines and headers.
- Skip blank lines
- Select yes to discard blank lines.
- Skip pattern
- Skip lines containing a match for this pattern. Use this to discard lines that won’t be used to set variables.
- Skip initial chars
- Ignore this number of characters at the start of every line. Use this to discard a fixed-length prefix (such as a time-stamp) if it will not be used by the sentry.
- Clear pattern
- Remove any string that matches this regular expression and replace it with a tab. Use this to discard any text or fields that will not be passed by the agent as a variable, or to simplify the Select pattern by removing extraneous or variable-length text from the middle of a line.
- Split data by
- How should the agent assign parts of each matching line to variables? In this field you specify how to break the data into columns. See Assigning Text and Log File Data to Variables. Later, when adding variables for this agent, you will specify which columns to assign to each variable. See Adding Variables Used By the Agent.
- column
- Don’t split the data into fields. Instead each line will be treated as a string of characters, with the first character being column 1, the second character column 2, and so on. You will use the "c <col>-<col>" format in the Column field to define each variable.
- whitespace
- Break the line into a series of columns separated by white space. Each column is numbered in turn, starting from column 1.
- tab
- Break the line into a series of tokens separated by a single tab character (two tabs in a row define a NULL field between them). Each column is numbered in turn, starting from column 1. Clear pattern should be used to replace an unwanted string with a single tab character before the fields are split.
- pattern
- Specify a regular expression containing at least one extract pattern, each of which is contained in parentheses. The first extract pattern is treated as column 1, the second extract pattern column 2, and so on.
If you selected pattern, specify one or more regular expressions to match patterns in each line of the record.
- Pattern (line 1)
- Extract matching variable(s) from the first line in the record.
- More patterns
- Click to extract data from additional lines in the record. For example, specify a regular expression in the Pattern (line 2) field to extract matching variables from the second line in the record.
When you have finished setting the Text agent options, click Accept to return to the main Add Agent form.
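The clean-up options for the Text class can be pictured with the following Python sketch. It is an illustration only. The order in which the options are applied here is an assumption, because this section does not state the order that Sentinel3G uses, and the function name, parameter names, and sample output are hypothetical.

```python
import re


def clean_lines(lines, skip_initial_lines=0, skip_blank=False,
                skip_pattern=None, skip_initial_chars=0, clear_pattern=None):
    """Apply Text-class style clean-up options to one record of output lines.

    ASSUMPTION: the options are applied in the order shown here; the order
    that Sentinel3G itself uses is not stated in this section.
    """
    cleaned = []
    for line in lines[skip_initial_lines:]:
        if skip_blank and not line.strip():
            continue
        if skip_pattern and re.search(skip_pattern, line):
            continue
        line = line[skip_initial_chars:]
        if clear_pattern:
            # Replace each unwanted string with a single tab.
            line = re.sub(clear_pattern, "\t", line)
        cleaned.append(line)
    return cleaned


# Hypothetical command output: a header, a blank line, and time-stamped rows.
raw = [
    "Filesystem report",
    "",
    "12:00:01 /home used=71%",
    "12:00:01 --- separator ---",
    "12:00:01 /var used=48%",
]
result = clean_lines(raw, skip_initial_lines=1, skip_blank=True,
                     skip_pattern=r"separator", skip_initial_chars=9,
                     clear_pattern=r" used=")
print(result)
```

After this clean-up each remaining line contains two tab-separated fields, so Split data by could be set to tab.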
Adding Variables Used by the Agent
- From the console, select Configure > Host monitor.
- If you have not already selected the host to update, Sentinel3G will ask you to choose one now. This is the host that the agent will run on. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host.
- Select Tables > Agents. The ‘Agents’ window opens, showing details of all the agents defined on this host.
- Select the agent, then select Maintain > Variables. The ‘Variables from Agent <agent_name>’ window opens.
- Select Maintain > Add. The ‘Add Variable’ form opens.
- Variable name
- Enter a name for the variable. The name must not contain spaces, but it may contain underscores instead. It must not have the same name as another variable belonging to this agent. It doesn’t need to be unique across agents; two agents may have variables with the same name. If the agent is in the API class, the name must match one of the variable names passed by the external application. By convention, agent variable names are lower case.
- Class
- Select one of these options, depending on how the value of the variable will be set:
- raw
- The value is set by the agent.
- derived
- The value will be computed from other variables (in the Expression field on this form). This is often used to express the value of another variable in a different way, such as a rate, proportion, or percentage.
- trigger
- If this variable is included in the list of trigger variables for a state, the Expression field will be evaluated when the sentry changes into that state. This is usually used to save the previous value when a new value is received, so that the old and new values can be compared.
- Type
- The internal data type.
- number
- An integer or floating point number.
- string
- A text string.
- boolean
- Mainly used by agents in the ExitStatus class. If the command returns 0, the boolean value is true, otherwise it is false.
- date
- A date in Functional Database internal format. The date will be stored in the form YYYYMMDD and output as MM/DD/YY (U.S. display format) or DD/MM/YY (European display format). Mainly used by agents in the DB class.
- datetime
- A date and time in Functional Database internal format. The date will be stored in the form YYYYMMDD.hhmmss and output as MM/DD/YY-hh:mm (U.S. display format) or DD/MM/YY-hh:mm (European display format). Mainly used by agents in the DB class.
- clock
- A count of the number of seconds since 1 Jan 1970 (GMT). Subtracting a clock value saved earlier from the current value gives an elapsed time (for example, how long a process has run).
- Private
- Leave unchecked if you wish this variable to be available for logging and graphing.
- Description
- A longer description of the source or purpose of this variable.
- Column
- Enter the field name or the column number(s) in the output that you want to assign to this variable. The format depends on the agent class:
- Text or LogFile
- Enter the column number(s). See Assigning Text and Log File Data to Variables.
- DB
- Enter the column name from the Functional Database dictionary entry.
- SNMPPolled
- Enter the object ID from the MIB.
- API
- Leave this field blank. The variable name and value will be passed explicitly by the external application.
- ExitStatus
- Leave this field blank.
- On NULL
- How should the variable be set if the agent doesn’t return a valid value? The options are:
- zero
- set the variable to zero
- null
- set the variable to null
- ignore
- leave the value of the variable unchanged from the previous poll
- History
- Should recent values be stored for use in state conditions and realtime graphing?
- none
- don’t keep historical values; the previous value will be overwritten each time the agent polls.
- time
- keep all values collected within a time period.
- count
- keep this number of recent values, one for each time the agent has run.
- Keep the last
- If History is set to time, enter a number of seconds to store all values collected within this period. If History is set to count, enter a number of values to be stored.
- Expression
- An expression (using TCL EXPR syntax) that calculates, modifies, or reformats the current value of the variable (see Expressions). How the expression is used depends on the variable class:
- derived
- Reformulate the value of another variable as a rate, proportion, or percentage. Any variables attached to the same sentry (including history variables) may be used in the expression.
- trigger
- Trigger variables are used to save a previous value that would otherwise be overwritten when the agent receives new data. If this is a trigger variable, use the Expression field to copy the value of another variable you want to save. Any variables attached to the same sentry (including history variables) may be used in the expression.
- raw
- An optional expression to post-process the data received from the agent (contained in $data), for example to change units from KB to MB. Note that for raw variables the only variable that can be used in the expression is $data, which is the value returned by the agent.
- Note
- To return a floating point number, put .0 at the end of any constant values. This is because TCL will do an integer calculation if both parameters are integers. Example: if $data is an integer, $data / 1024.0 will return a floating-point value; $data / 1024 will return an integer.
- Initial value
- An expression (using TCL EXPR syntax) that, when evaluated, returns an initial value for the variable. This can be used to set a starting value before the first time the agent polls, or to initialize a trigger variable before it is set with a real value. This is important if, for example, this variable is used elsewhere in an arithmetic expression, to avoid the calculation generating a data error.
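The History and Keep the last settings described above can be pictured with a small Python model. This is only a sketch, not Sentinel3G code: the `History` class and its method names are hypothetical, timestamps are passed in explicitly, and treating a value exactly on the time boundary as still "within the period" is an assumption.

```python
from collections import deque


class History:
    """Toy model of the History setting: 'none', 'time' or 'count'."""

    def __init__(self, mode="none", keep_last=0):
        if mode not in ("none", "time", "count"):
            raise ValueError("mode must be 'none', 'time' or 'count'")
        self.mode = mode
        self.keep_last = keep_last   # seconds for 'time', number of values for 'count'
        self.values = deque()        # (timestamp, value) pairs, oldest first

    def record(self, timestamp, value):
        """Store the value from one poll and discard anything no longer kept."""
        self.values.append((timestamp, value))
        if self.mode == "none":
            # Only the current value survives; the previous one is overwritten.
            while len(self.values) > 1:
                self.values.popleft()
        elif self.mode == "count":
            while len(self.values) > self.keep_last:
                self.values.popleft()
        else:  # 'time'
            while self.values and self.values[0][0] < timestamp - self.keep_last:
                self.values.popleft()

    def recent(self):
        """Return the stored values, oldest first."""
        return [value for _, value in self.values]


# Four polls, one per minute (timestamps in seconds).
by_count = History("count", 3)
by_time = History("time", 120)
for minute, value in enumerate([10, 20, 30, 40]):
    by_count.record(minute * 60, value)
    by_time.record(minute * 60, value)
```

With a 60-second poll, keeping the last 3 values and keeping the last 120 seconds happen to retain the same three readings; with a slower or irregular poll the two modes would diverge.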
The last two fields affect how the variable will appear on the console.
- Units
- Choose from the table of descriptive units. To add a new type of unit, see Maintain List of Numeric Units.
- Decimal places
- The value will be rounded to this number of decimal places.
Click Accept to save the variable.
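The note about integer and floating-point division, together with the Decimal places field, can be illustrated with a short Python sketch. It only imitates the behaviour the note describes for non-negative values; `tcl_divide` is a hypothetical helper, the sample value is made up, and the exact rounding rule Sentinel3G applies is not stated in this page.

```python
def tcl_divide(a, b):
    """Mimic the division rule from the note, for non-negative operands."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b    # both operands are integers: integer result
    return a / b         # at least one float: floating-point result


data = 5000                           # hypothetical agent value, in KB
whole_mb = tcl_divide(data, 1024)     # like  $data / 1024
exact_mb = tcl_divide(data, 1024.0)   # like  $data / 1024.0
shown_mb = round(exact_mb, 2)         # like setting Decimal places to 2
```

Here `whole_mb` loses the fractional part entirely, which is why the note recommends writing the constant as 1024.0.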
Adding a Sentry
- If the console is not in Host View, select Go > Hosts.
- Select the host that the agent will run on.
- Select Configure > Host monitor. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host.
- Select Maintain > Add. The ‘Add Sentry’ form opens.
- Knowledge Base
- The KB that this sentry belongs to. Click to choose one of the predefined KBs installed on this host, or leave this field blank if the sentry is not associated with a particular KB.
- Class/Folder
- Choose the name of the folder in which the sentry will appear on the console.
- Sentry
- Enter a name for the sentry. The name must not contain spaces.
- Host
- The host to which this sentry applies. If you leave the field blank it defaults to being the Host Monitor host. If the agent is running remotely from the Host Monitor, you may want the icon for a resource to appear under a different host. If so, you can enter the name of the remote host here.
- On/Off
- Set the initial condition of the sentry. on means the sentry will be operating normally: the agent will be running and setting variables to be tested by the sentry. off means the agent is not required to collect data on behalf of this sentry, in which case the agent will not be running (unless it also happens to be collecting data for another sentry). You can switch the sentry off if you wish to test it before running it on a production system (see Running Sentries in Test Mode).
- Description
- Enter a description for the sentry.
- Primary agent
- Click to choose the main agent whose variables supply this sentry with data. If other agents also supply variables, list them in the Secondary agents field on the Advanced options form below. Variables from the primary agent can be referenced by their name alone. Note that if a primary agent and a secondary agent both have a variable with the same name, the primary agent’s variable is used.
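The precedence rule for variable names shared by a primary and a secondary agent can be sketched in Python as a layered lookup. The variable names and values below are hypothetical, and this is an illustration of the rule only, not of how Sentinel3G stores variables.

```python
from collections import ChainMap

primary = {"pct_free": 12, "size": 800}            # hypothetical primary agent variables
secondary = {"pct_free": 40, "inodes_free": 95}    # hypothetical secondary agent variables

# ChainMap searches its mappings left to right, so the primary agent wins
# whenever both agents define a variable with the same name.
variables = ChainMap(primary, secondary)
```

Looking up `pct_free` returns the primary agent's value, while `inodes_free` is still found in the secondary agent.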
Instance details
If the primary agent supports multiple instances, click the Instance details button to specify the instance details. There are three ways to define sentry instances:
- by cloning one instance for each key value returned by the primary agent
- by explicitly defining each instance
- by running a command to ‘discover’ the list of instances
- Clone
- Tick this checkbox to clone (create another instance of) this sentry for each instance returned by the multi-instance agent.
- Clone if
- If Clone is ticked, you can specify an optional TCL expression. New sentries will only be cloned if this expression evaluates to true. Example: You can create two almost identical sentries, one for small filesystems ( < 1GB) and one for large filesystems ( >= 1GB) with different thresholds. Both would use the same agent, but each would have Clone if set to "$size < 1000" and "$size >= 1000" (where $size is in MB) respectively. Sentinel3G will then clone the appropriate sentry only. The two sentries can even have the same name so that they look indistinguishable on the console. If this field is left blank, new sentries will always be cloned.
- Discover insts
- This is an optional command that is run when the Host Monitor starts up. For example, the command could return a list of object names. A sentry instance will be generated for each line of data returned by the command.
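The Clone if example above (one sentry for small filesystems, one for large) can be modelled in Python as follows. This is a sketch under stated assumptions: the function, the two sentry names and the sample sizes are hypothetical, and the Tcl conditions are stood in for by Python lambdas.

```python
def sentries_to_clone(instances, clone_rules):
    """Return {sentry_name: [instance, ...]} for the rules that accept each instance.

    instances   -- {instance_name: {variable: value}} as returned by the agent
    clone_rules -- {sentry_name: predicate or None}; None models a blank Clone if
    """
    result = {name: [] for name in clone_rules}
    for instance, variables in instances.items():
        for name, clone_if in clone_rules.items():
            # A blank Clone if means the sentry is always cloned.
            if clone_if is None or clone_if(variables):
                result[name].append(instance)
    return result


filesystems = {"/tmp": {"size": 500}, "/home": {"size": 20000}}   # sizes in MB
rules = {
    "Free_Space_small": lambda v: v["size"] < 1000,     # stands in for "$size < 1000"
    "Free_Space_large": lambda v: v["size"] >= 1000,    # stands in for "$size >= 1000"
}
cloned = sentries_to_clone(filesystems, rules)
```

Because the two conditions are mutually exclusive, each filesystem ends up under exactly one of the two sentries.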
Instances
The Instances window enables you to define sentries explicitly. If the Clone field is ticked, you can predefine some of the attributes of the cloned instances (for example, to specify a different label or to turn the instance off).
- Instances
- Click to maintain the list of instance names. Select Maintain > Add to add up to four instances at a time.
- Instance
- The name of a specific instance, which the sentry passes to the agent through the $Instances variable in the agent command.
- On/Off
- Set the initial condition of the sentry. on means the sentry will be operating normally. off means the sentry will not process agent data unless it is switched on manually.
- Label
- If set, this will be used on the console as the name of the instance. If it is not set, the instance name will be used on the console.
- Group
- If the sentry has Instance Groups defined, you can optionally pre-assign the instance to a particular group here.
- Agent data
- Agent-specific data for this instance (used by certain agents only). Example: the ProcessInfo agent can be passed a regular expression in this field to match process names.
The Instances of Sentry … window includes options to turn off instances or assign them to instance groups.
The Turn on and Turn off options on the Instance menu simply turn the selected instance on or off. These options work on either cloning or explicit instance sentries. For example, to suppress one filesystem in the cloning sentry Free_Space, add an instance for that filesystem and turn it off.
Assign to group lets you add selected instances to a previously defined instance group.
Maintain > Instance groups brings up the instance groups defined for this sentry.
When you have finished filling in the Instances form, click Accept to save the instances. When you have finished adding instances, press F3 in the Instances form to return to the Sentry Instance Details form.
- Note
- Changes to instances will not be processed until you exit the Instances window and restart the Host Monitor.
- Instance label
- If set, this will be used on the console as the name of the instance. You can use a raw variable to give a more meaningful label (for example: use $printer to label each instance with the printer name). If this field is blank the instance will not have a label.
- Separate Logs?
- Tick this checkbox to create a separate log file for each instance. Leave the checkbox blank to write entries for all instances to a combined log file.
When you have finished filling in the Sentry Instance Details form, click Return to complete the remaining fields in the main Add Sentry form.
Console fields
Define how the sentry should appear on the console:
- Text
- Enter a text string to provide extra information about the current state of the sentry. The string will be displayed in the status area of the console, and may contain both informative messages and the values of variables returned by the primary agent or a secondary agent. If a variable name is prefixed with $, the raw or unformatted value will be displayed. If a variable name is prefixed with & (ampersand), the formatted value will be displayed. The formatted value appends the Units field, if specified, and converts some variable types such as dates from internal storage format to display format. Example: HTTP hits: &HttpHits; HTTP errors: &HttpErrors
- Icon
- Click to choose an icon to represent this sentry on the console. To see what each icon looks like or to add new icons, see Add Icons.
- Indicator
- Select a type of overlay icon to represent the current state of the sentry or to give a rough indication (to within 10 percent) of the current data value returned by Variable.
- default
- Represent the current state of the sentry with the default overlay icon for that state or severity.
- pie chart
- Represent the current percentage value returned by Variable as a small pie chart.
- thermometer
- Represent the current percentage value returned by Variable as a thermometer.
- Variable
- Click to choose a variable to be represented next to the icon for this sentry. You specify in the Indicator field whether to show the value of the variable as a pie chart or a thermometer icon.
- Note
- For the indicator to work properly, the variable you choose must always be in the range 0 to 100.
- Default action
- Click to specify the action that is to be performed by default when an operator double-clicks this sentry on the console.
- Variables
- Display the contents of the variables returned by the agent.
- Graph
- Draw a realtime graph. Click to choose a predefined graph.
- Action
- Run a predefined action command. Click to choose a predefined action.
- Logged_data
- Generate a logged data report. Click to choose a predefined report. Click Return to complete the remaining fields in the main Add Sentry form.
- Notes file
- Enter the file name only (without the path) of a notes file in the Sentinel3G doc directory. These notes will be available from the console to operators when monitoring or responding to alerts relating to this sentry.
Advanced options
Click the Advanced button to display some additional options relating to notification and data logging.
- Notification type
- Select none to turn off notification for this sentry. Select default to use the global NotifyList and NotifySeverity settings (see Maintain Notification Settings). Select specify to use the Whom to notify and On severity fields to override the global settings.
- Whom to notify
- Choose the name of one or more users to be notified when this sentry changes state. This overrides the global NotifyList setting.
- On severity
- Select a threshold level at which the user(s) listed in the Whom to notify field should be notified. Notification will happen when the sentry changes into a state with this severity or higher. This overrides the global NotifySeverity setting.
- Show variables
- Click to choose the variables collected by the agent that are used by this sentry. You can do this to shorten the list of variables that will be displayed when an operator double-clicks on the sentry, and the list of variables available for graphing and reporting, or for running actions. This is useful for an agent that returns a large number of variables, such as sar performance statistics which may include CPU, memory, and network, all from the one agent. When an operator double-clicks on a CPU sentry, you can arrange for them to see only the variables relating to CPU statistics. Leave this field blank to show all variables associated with the primary agent.
Data Logging
The fields in the Data Logging frame are used to specify what data will be collected for the Logged Data Report. This report can be used to check the events leading up to an alert, and for longer-term trend analysis and capacity planning.
If you wish to be able to generate a Logged Data Report for this sentry, you need to specify now which numeric variables must be logged and how often. There are two logging methods: time-based (variables are logged at the specified time interval) and state-based (variables are logged after every poll while the sentry is in a specified state or higher).
- Note
- The period for time-based logging is approximate only. The value is actually taken from the next poll after the interval. It follows that there is no benefit to logging data more often than the polling frequency.
Avoid specifying too short a period, otherwise the log files can grow large very quickly. It’s better to log data only occasionally during periods of normal operation, then increase the logging frequency during alerts using the Or on severity field.
- Enable logging
- Select this option to enable collection of logging data for this sentry.
- Log variables
- Click to choose which numeric variables are to be logged. A snapshot of the current values of all these variables will be added to the log file at the frequency specified in the Every field.
- Default settings?
- Select this option to use the global DefLogTime and DefLogSeverity settings. Leave this option unselected to specify a period and minimum severity level manually.
- Every
- Enter a period in minutes. Examples: enter 10 to log data every 10 minutes; enter 60 to log data every hour. Enter 0 to log data every time the agent polls. This field overrides the global DefLogTime setting.
- Or on severity
- You can also log data while the sentry is in a particular state or any state of a higher severity. The variables are logged every time the agent polls. This is a way to selectively log more data during alerts. For example, select severe to log every poll while the sentry is in severe or critical state. To switch off state-based logging, select never. This field overrides the global DefLogSeverity setting.
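The interaction between the Every and Or on severity fields can be summarised as a decision made at each poll. The Python sketch below is a model only: the function name is hypothetical, the severity order between normal and critical is inferred from the state examples in this guide, and logging on the very first poll is an assumption.

```python
# Assumed order of increasing severity (only normal..critical is stated in the text).
SEVERITIES = ["normal", "warning", "alarm", "severe", "critical"]


def should_log(now, last_logged, every_mins, severity, on_severity="never"):
    """Decide, at poll time, whether to write a snapshot to the log.

    now, last_logged -- times in seconds (last_logged is None before the first log)
    every_mins       -- the Every field; 0 means log on every poll
    on_severity      -- the Or on severity field; "never" disables state-based logging
    """
    # State-based logging: log every poll at or above the chosen severity.
    if on_severity != "never":
        if SEVERITIES.index(severity) >= SEVERITIES.index(on_severity):
            return True
    # Time-based logging: log on the first poll after the interval has passed.
    if every_mins == 0 or last_logged is None:
        return True
    return now - last_logged >= every_mins * 60
```

Since the check runs only when the agent polls, a 10-minute period with a 4-minute poll would in practice log every 12 minutes, which matches the note that the period is approximate.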
Secondary agents
- Secondary agents
- If the sentry needs to use variables collected by an agent other than the Primary agent, click to specify these secondary agents.
From the Secondary Agents window, select Maintain > Add then enter the following fields:
- Agent
- Click to choose the secondary agent.
- Instance
- Enter the name of the instance. Leave this field blank to use the same instance as the sentry.
- Trigger sentry?
- Select this option to force the sentry’s state to be reevaluated when the agent returns new data.
Click Accept to save the details of this secondary agent.
When you have finished specifying secondary agents, press F3 to return to the Advanced Sentry Details form.
- Note
- Changes to secondary agents will not be processed until you exit the Secondary Agents window.
- No-data state
- Click to choose a default state to be used if the agent doesn’t return data for the sentry. An instance of this sentry will change to this state if the agent stops returning data for that instance. Example: when a filesystem is unmounted and df no longer returns details about that filesystem, then the sentry for that instance only will be put into the No-data state. A Delete state is often used in cases like this, where multiple instances of a sentry are created by “cloning”, and you wish to selectively suppress any instance while it is not returning data.
- Discovery pgm
- This is an optional command that is run during Host Monitor startup. Its job is to return an exit status of true or false based on the existence or status of a resource. If the discovery program returns false, this sentry will not be started.
When you have finished setting the advanced sentry options, click Accept to return to the main Add Sentry form.
If you have finished defining the sentry, click Accept to save it and return to the ‘All Sentries’ window.
Adding States
The next step is to define the states that the sentry can be in.
- If the console is not in Host View, select Go > Hosts.
- Select the host that the agent will run on.
- Select Configure > Host monitor. The ‘All Sentries’ window opens, showing details of all the sentries defined on this host.
- Select the sentry, then select Maintain > States. The ‘States for sentry <sentry_name>’ window opens.
- States are listed in the order in which the entry conditions are evaluated. By default, this is in order of decreasing severity, with the most severe at the top. The first state whose entry condition evaluates to true will cause the sentry to enter that state. It follows that states with a NULL entry condition (usually the ‘normal’ state) should be last, for example:
- critical $pct_free == 0
- severe $pct_free < 5
- alarm $pct_free < 10
- warning $pct_free < 15
- normal <null>
- You can then drag the states into a different order using the Order > Reorder menu option.
- There are three ways in which you can add states:
- If the sentry was created by cloning, a separate copy of the original sentry’s states is made. If these states are exactly as required, you don’t need to do anything more. If you need to modify a state in some way, select it now and use Maintain > Change to make the changes (see Maintain State Details).
- If there is another sentry that already has a set of states that are similar or identical to what is required, you can copy that sentry’s states: select Maintain > Copy states, then choose the sentry.
- If the states are exactly as required, you don’t need to do anything more. If you need to modify a state in some way, select it now and use Maintain > Change to make the changes (see Maintain State Details).
- You can add each state manually: select Maintain > Add. The ‘Add States’ form opens, as shown in Figure 27.
- Note
- Copy states is only available if the sentry has no states defined. If the new sentry already has states and you wish to use the states of another sentry instead, remove the states from the new sentry first.
Maintain State Details
To define the sentry’s attributes and appearance while in this state, enter these fields:
- State
- Enter a unique name for the state. The name must not contain spaces.
- Severity
- Select the severity level for this state. The severity determines how the sentry will look while in this state (that is, its color and the color and type of any associated overlay icon). Note that this will result in a notification message being sent if the new severity is at or above the notification level specified globally or for this sentry.
- The options are listed in order of increasing severity from normal to critical. disabled is a special severity that indicates the sentry is ‘down’ or otherwise unavailable, but doesn’t require attention (for example, a device that has been taken offline for maintenance).
- Description
- Enter a description for the state.
- Entry condition
- Enter a conditional expression. This is a TCL expression made up of any combination of agent variables, constants, text strings, numbers, history variables, boolean values, and TCL functions.
Examples:
- $Status == "Off" && $PID != "-1"
- $pct_free < $LOW
- [hist_avg @cpu_idle]
If the entry condition is left blank it evaluates to true. For the correct syntax to refer to variables, see Expressions.
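The evaluation rule described under Adding States (states are tested in listed order, the first true entry condition wins, and a blank condition is always true) can be sketched in Python. The function name is hypothetical, Python lambdas stand in for the Tcl entry conditions, and returning `None` when nothing matches is an assumption, since this page does not say what happens in that case.

```python
def evaluate_state(states, variables):
    """Return the name of the first state whose entry condition is true.

    states -- list of (name, condition) in evaluation order; a condition of
              None models a blank entry condition, which always evaluates to true
    """
    for name, condition in states:
        if condition is None or condition(variables):
            return name
    return None   # no state matched (behaviour assumed, not documented here)


# The free-space example from Adding States, most severe first.
states = [
    ("critical", lambda v: v["pct_free"] == 0),
    ("severe",   lambda v: v["pct_free"] < 5),
    ("alarm",    lambda v: v["pct_free"] < 10),
    ("warning",  lambda v: v["pct_free"] < 15),
    ("normal",   None),
]
```

If the normal state were moved to the top of the list, its blank condition would match first and the sentry could never leave normal, which is why states with a blank entry condition belong last.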
Console
These fields define how the sentry will appear on the console while in this state:
- Text
- Enter a text string to provide extra information about the current state of the sentry. The string will be displayed in the status area of the console, and may contain both informative messages and the values of variables. Example: HTTP hits: &HttpHits; HTTP errors: &HttpErrors…
where HttpHits and HttpErrors are the names of variables returned by the primary agent or a secondary agent.
If you prefix a variable name with & it will be formatted for output on the console.
For example, numeric variables will be displayed with the correct number of decimal places and with the units after the value, while date/time values will be formatted as readable dates or times. If you prefix a variable name with $ the raw value will be displayed on the console.
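The difference between the & and $ prefixes can be illustrated with a small Python sketch. It is a model only: `render_text` is a hypothetical function, the sample values are made up, placing a space between the value and its units is an assumption, and date/time formatting is not modelled.

```python
import re


def render_text(template, variables):
    """Expand $name (raw value) and &name (formatted value) in a Text string.

    variables -- {name: (value, decimal_places, units)}; units may be ""
    """
    def expand(match):
        prefix, name = match.group(1), match.group(2)
        if name not in variables:
            return match.group(0)        # leave unknown names untouched
        value, places, units = variables[name]
        if prefix == "$":
            return str(value)            # raw value, no formatting
        text = f"{value:.{places}f}"     # apply the Decimal places setting
        return f"{text} {units}" if units else text

    return re.sub(r"([$&])([A-Za-z_]\w*)", expand, template)


sample = {
    "HttpHits": (1523, 0, ""),
    "HttpErrors": (7, 0, ""),
    "used": (4.8828125, 2, "MB"),
}
status = render_text("HTTP hits: &HttpHits; HTTP errors: &HttpErrors", sample)
detail = render_text("Used: &used (raw $used)", sample)
```

In `detail`, the same variable appears once rounded with its units and once as the unformatted number.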
You can change the appearance of the sentry’s icon while it is in this state, by specifying a different icon or by overlaying the normal icon with an additional indicator icon to modify its appearance.
- Note
- To add new icons, see Add Icons.
- Icon
- Click to choose a different icon to represent this sentry while it is in this state. Example: the sentry normally has a 32x32 icon representing a remote system. When it is in network_down state, you choose to use another icon that has a red X indicating a problem with the network connection.
- If Icon is left blank the default sentry icon will be used.
- Indicator
- Click to choose a 16x16 indicator icon. This is an additional small icon that overlays the main icon in the top right hand corner.
- You can use this to modify the appearance of the icon while the sentry is in this state.
- If Indicator is left blank the overlay specified in the sentry (thermometer or pie chart) will be used. If neither overlay is specified in the sentry, the default overlay icon and color for the current severity is used (see Overlays and Indicator Icons).
- Notes file
- Enter the file name only (without the path) of a notes file in the Sentinel3G doc directory. These notes will be available from the console to operators when monitoring or responding to alerts relating to this sentry.
Trigger variables
Click next to the Trigger vars field to list the trigger variables for this sentry. Choose one or more variables whose values you want to reset when the sentry enters this state (see Trigger Variables).
Responses
The Responses button lets you define several responses that can be run automatically while the sentry is in this state.
- Responses
- Click to see the options for notification, escalation and running automatic responses.
The Response form contains three blocks of response fields which are run in turn if the sentry remains in this state for longer than the specified period. Each response block specifies a waiting period, then three actions Sentinel3G can take at the end of each period. The actions are: changing the severity level, which in turn changes the appearance of the sentry on the console; sending a notification message; and running a command. These options are not mutually exclusive; you can specify any or all actions in each response block.
The final group, the Escalation/acknowledgement block, specifies a period after which the sentry will be forced into a different state. See Escalation/acknowledgement.
A typical response is to run a command to fix the problem, and if it succeeds to return the sentry to a normal state. If the command doesn’t fix the problem you may choose to leave the sentry in that state, and specify a later response to run another command or to notify someone.
These responses are all optional. You can specify all three, or one, or even none. If no responses or escalation period are specified here, the sentry will remain in this state until and unless the evaluation condition evaluates to a different severity level.
If a sentry changes state while it is waiting to process a response, then all responses for that state are cancelled, and any responses for the new state are started.
- Wait
- The length of time to wait after running the previous response, or if this is the first response, after entering this state. Each period is cumulative. In other words the wait period for Response #2 is counted from the end of the period for Response #1.
- By default the wait time is in seconds. However an optional suffix of either secs, mins, hrs or polls can also be used to change the unit of time. For example, if the primary agent's poll time is 120 seconds, "2 polls" will wait for 240 seconds before activating.
- If the wait time of Response #1 is set to 0, the response will occur as soon as the sentry enters this state.
- New severity
- After the wait period, change the appearance of the sentry on the console to this new severity level. This would usually be done to trigger a global or sentry-level notification message or to increase the apparent urgency of the event by making the icon flash or change color. Note that changing the severity does not change the state of the sentry. Select unchanged if you wish to leave the severity level as it is.
- Notification
- Click to specify who will be notified if the sentry is still in this state at the end of the wait period.
- In the Type field, select one of the following:
- default
- Use the default notification list for this sentry.
- specify
- In the Who field, choose the names of one or more users.
- These are in addition to the default notification level for this sentry (see Advanced options).
- If you don’t want to do any additional notification for this response, select none.
- Command
- Enter a command to be run by the Host Monitor. This would usually attempt to fix the problem. Example: a ‘free disk space’ sentry could archive files to an offline storage device or remove files such as core and *.o that are deemed expendable.
- Fire agent
- Choose an agent to be run. This agent will be polled immediately at the end of the wait period. To poll a specific instance only, enter the instance name in brackets after the agent name. Example: Filesystem(/tmp). This is useful to cause the sentry's states to be re-evaluated quickly after running a response, rather than having to wait for the next poll.
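The rules for the Wait field (seconds by default, optional secs, mins, hrs or polls suffix, and cumulative periods) can be worked through in a short Python sketch. The function names are hypothetical, and the parser assumes the suffix is separated from the number by a space, as in the "2 polls" example.

```python
def wait_seconds(wait, poll_time):
    """Convert a Wait value such as '90', '5 mins' or '2 polls' to seconds."""
    parts = str(wait).split()
    number = float(parts[0])
    unit = parts[1] if len(parts) > 1 else "secs"    # seconds by default
    factor = {"secs": 1, "mins": 60, "hrs": 3600, "polls": poll_time}[unit]
    return number * factor


def response_times(waits, poll_time):
    """Seconds after entering the state at which each response runs."""
    elapsed, times = 0, []
    for wait in waits:
        elapsed += wait_seconds(wait, poll_time)     # each wait follows the previous one
        times.append(elapsed)
    return times


# Three response blocks for a sentry whose primary agent polls every 120 seconds.
schedule = response_times(["0", "5 mins", "2 polls"], poll_time=120)
```

The first response runs immediately on entering the state, the second 300 seconds later, and the third a further 240 seconds after that.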
Escalation/acknowledgement
Another way to respond to an alert is simply to wait for a while to see if the problem corrects itself or more information is received, then to change to another state at the end of that period.
Change to a more severe state if the problem would normally be expected to resolve itself either spontaneously or by the running of the automatic responses. If the sentry is still in this state at the end of the waiting period it suggests some other action must be taken.
Change to a less severe state (typically, normal state) if the problem appears to have been a one-off event. For example, the Bad_SU sentry goes into warning state if a failed su attempt is detected. If no further failed su attempts are detected by the end of the waiting period no action need be taken and the sentry can be returned to normal state.
The change of state may depend on manual confirmation from an operator (Acknowledgement) or it may happen automatically (Escalation).
- Wait
- The length of time to wait after running the previous response, or if there are no previous responses, after entering this state. The format and meaning of this field is the same as for Responses, above.
- Go to state
- Choose the new state to change to.
- Type
- Select acknowledge if the change of state depends on manual confirmation from an operator. Select escalation if the change of state should happen automatically at the end of the waiting period. Typically "acknowledgement" is a change to a less severe state, and "escalation" is to a more severe state.
When you have finished defining responses and escalation details, click Accept to return to the main Add State form.
If you have finished defining this state, click Accept to save it and return to the ‘State Details’ window.
Constants
Click next to the Constants field to maintain the list of constants and thresholds for this sentry. You can add a new constant, change the details of an existing constant, or adjust the threshold values at which the sentry changes from one state to another. For details about all these tasks, see Maintaining Constants and Thresholds.
Adding Instance Groups
A multi-instance sentry can optionally have a set of Instance Groups, a feature which allows certain instances to use different configurations from that defined in the sentry. The configurations that can be set at the Instance Group level are:
- How the instance appears on the console (icon and label)
- Constants / Threshold values used in state conditions
- The user(s) to notify for that instance
An instance is assigned to an Instance Group one of two ways:
- Statically, using the Group field in the Instance form, or
- Dynamically, using the "Assign if" condition.
Once assigned to an Instance Group, the configuration for that group is used, overriding those defined in the sentry.
Notes:
- The assignment to an instance group happens once only when the Host Monitor starts.
- If an instance is not assigned to a group, the default configuration from the sentry is used.
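The two assignment methods can be modelled in Python as a single lookup performed once per instance. This is a sketch with several assumptions that the text above does not confirm: that a static Group takes priority over any "Assign if" condition, and that the first group whose condition is true wins. The function, group names and sample data are hypothetical.

```python
def assign_group(instance, static_group, groups, variables):
    """Pick the Instance Group for one instance, or None for the sentry defaults.

    static_group -- value of the Group field on the Instance form ("" if unset)
    groups       -- list of (group_name, assign_if); assign_if is a predicate
                    taking (instance, variables), or None when left blank
    """
    if static_group:
        return static_group                  # static assignment (assumed to win)
    for name, assign_if in groups:
        if assign_if is not None and assign_if(instance, variables):
            return name                      # dynamic assignment
    return None                              # use the sentry's own configuration


groups = [
    ("scratch", lambda inst, v: inst.startswith("/tmp")),
    ("big",     lambda inst, v: v["size"] >= 1000),
]
```

An instance that matches no condition and has no Group set returns `None`, corresponding to the default configuration from the sentry.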
To maintain the Instance Groups of a sentry:
- From the ‘Sentries’ window, select the sentry.
- Select Maintain > Instance groups. The ‘Instance Groups of sentry <sentry_name>’ window opens.
- Select Maintain > Add. The ‘Add instance group (Sentry <sentry_name>)’ form opens.
- Group
- Enter the name of the Instance Group.
- Description
- Enter the description of the group.
- Icon
- To use a different icon on the Sentinel3G console for instances of this group, enter it here.
- Label
- To have a different label on the Sentinel3G console for instances of this group, enter it here. The label may contain the variables $Instance, $Group or any "raw" agent variable.
- Assign if
- If instances are to be dynamically assigned to the group, enter a TCL boolean expression. The expression is evaluated for each instance of the sentry, and when true, the instance is assigned to this group. The expression may contain the variable $Instance and/or any "raw" agent variable.
- Constants
- Click this field to override some or all of the constants (thresholds).
- Notification
- Click this field to override the users to be notified about instances in this group.
Click Accept to save the group.
Adding an Action or Report
A sentry can have several associated actions, which an operator can choose to run from the console. Actions may either be tied to particular states, or can be made available when the sentry is in any state. There are two types: actions typically are used to try to fix a problem; reports display output on the screen and help the operators to diagnose the problem. You can assist operators by explaining in the monitoring notes for the sentry or state when and how each action should be used.
When designing an action for a multi-instance sentry you can set it up to run for selected instances or for every instance in a parent folder. For example, you can set up an action so that the output for all instances is combined into one report.
- From the ‘Sentries’ window, select the sentry.
- Select Maintain > Actions. The ‘Actions for sentry <sentry_name>’ window opens.
- Select Maintain > Add. The ‘Add actions for sentry <sentry_name>’ form opens. (Tip: If a similar action has already been defined for a sentry that uses the same agent, it may be faster to use Maintain > Copy.)
- Action
- Enter a name for the action. This is the name that will appear in the list of actions that the operator can select from.
- Type
- Select whether or not you want the output from the command to be displayed on the operator’s screen:
- action
- Simply runs the command without displaying any output. Example: starting a service when it is stopped.
- report
- Displays the command’s output on the screen.
- Command
- Enter the command, using UNIX shell syntax. The command can make use of the variables $Sentry, $Host, and $Action, which will be set in the environment when the action is run. For multi-instance sentries you can also refer to $Instance, which contains the instance name. To use agent variables in the command, select Uses agent data? below.
- This example shows how to define a simple report for a single-instance sentry:
echo -n "Report '$Action' "; date; echo " Sentry: $Sentry"; echo " Host: $Host";
- When the report is run it will display the name of this action, the date, and the name of the sentry and the host it runs on. For more details and examples of actions and reports, see Actions.
- Display command
- If Type is report, enter a command to display the output from the Command (examples: scroll, db_scroll, db_graph). The default is Sentinel3G’s own browser widget. If the action is run on several instances, all the output from all the commands will be piped to the same display command.
- Display command is optional if Command handles the displaying of the data itself.
- Run as user
- Choose a user name from the password file on this host. The command will run with the privileges of this user. The default account is root. Example: some RDBMS packages require that certain administrative commands be run from a special DB admin account.
- In state(s)
- You can make this action available in only certain states. Example: the Services sentry has Stop and Restart actions that are available when a service is in a state that indicates it is running, and a Start action when it is not running.
- Click to choose one or more states. Leave this field blank to make the action available at all times.
- Access role
- If set, only users with the specified role can perform this action. If blank, all Sentinel3G users who have the action capability may perform this action.
- Authenticate?
- Tick this field to ask for the operator’s password before running the action.
- Uses agent data?
- Tick this checkbox if you wish to use any of the agent variables in the Command or Display command fields. This gives the commands access to the same primary agent variables as the sentry.
- Reads from STDIN?
- If Uses agent data? is ticked, use this field to specify where the action can find the data:
- no
- Command will expect the variables to be set in the environment and accessed by name (e.g. $pct_free).
- yes
- The data will be passed from STDIN in Functional Database format. Use this option if you wish to manipulate the data using the Functional Toolset.
- Fire agent
- Tick this checkbox if you wish the sentry’s agent to be polled after the action has been run.
- Export to parent?
- Tick this checkbox if you wish the action to be available from the parent folder of this sentry. If the action is exported, operators will be able to choose this action both in relation to a selected sentry and for all sentries in the parent folder.
- Example: a Free Space report that displays details for a selected filesystem (single sentry) or all user filesystems on a host (parent folder).
Click Accept to save the action.
To test the action from the console, select the sentry and then select Sentry > Action.
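As a rough sketch of how a report command might combine the built-in variables with agent data, the script below assumes a multi-instance sentry with Uses agent data? ticked and Reads from STDIN? set to no, so that an agent variable (here pct_free, borrowed from the example above) is available in the environment. The function name free_space_report and the fallback values are illustrative assumptions; Sentinel3G supplies the real values when the action runs.

```shell
#!/bin/sh
# Hypothetical report command for a multi-instance sentry.
# $Action, $Sentry, $Host and $Instance are set in the environment when the
# action runs; the ${var:-...} fallbacks only make the script testable by hand.
free_space_report() {
    echo "Report '${Action:-unknown}'"
    echo "  Sentry:   ${Sentry:-unknown}"
    echo "  Host:     ${Host:-unknown}"
    echo "  Instance: ${Instance:-unknown}"
    # pct_free is an assumed agent variable, read from the environment by name
    echo "  Free:     ${pct_free:-n/a}%"
}

free_space_report
```

Run by hand, with none of the variables set, the report prints the fallback values, which is a quick way to check the layout before defining the action.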
Adding a Realtime Graph
Realtime graphs plot recent values returned by selected variables for a sentry.
- From the ‘All Sentries’ window, select the sentry.
- Select Maintain > Realtime graphs. The ‘Realtime graphs for sentry <sentry_name>’ window opens.
- Select Maintain > Add. The ‘Add realtime graphs for sentry <sentry_name>’ form opens.
- Tip
- If a similar graph has already been defined for a sentry that uses the same agent, it may be faster to use Maintain > Copy.
Specify attributes of the graph
- Graph type
- Select the type of graph you wish to use to display the data:
- line
- Line graphs are useful for gauging trends.
- bar
- Bar graphs are useful for comparing variables within one observation or comparing adjacent observations.
- stack
- Stack graphs are typically used where all values add up to 100%. Example: CPU usage = %user + %system + %idle
Figure 32 shows the same data presented using each type of graph:
- Polls displayed
- The number of values to display across the X-axis of the graph. For example, if you enter 3, values from the last three polls will be displayed.
The Min value and Max value fields control the scale on the graph’s Y-axis. If they are not specified, the Y-axis will be sized to the current minimum and maximum data values. This means the scale may change as new values are graphed. To keep the scale constant, set both Min value and Max value.
Use close minimum and maximum values if you want to focus on relatively small differences among data values. For example, if a set of variables is mainly of interest when the values are clustered near 100%, a minimum value of 90 will help to separate them.
- Min value
- The minimum value to display next to the Y-axis. If the values will always be positive, set Min value=0.
- Max value
- The maximum value to display next to the Y-axis.
- Scale to max?
- (For stack graphs only) Tick this checkbox if you want the values to be scaled so that their sum equals Max value. This is useful where the total adds up approximately to Max value. Scaling ensures a flat top to the stack.
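The arithmetic behind Scale to max? can be sketched in shell as follows. This assumes simple proportional scaling (each value multiplied by Max value divided by the sum of the values), which is one natural reading of the description above and not necessarily how Sentinel3G implements it. The function name scale_to_max and the sample figures are made up for illustration.

```shell
#!/bin/sh
# Scale a set of values proportionally so that their sum equals MAX.
# Usage: scale_to_max MAX value1 value2 ...
scale_to_max() {
    max=$1; shift
    echo "$@" | awk -v max="$max" '{
        sum = 0
        for (i = 1; i <= NF; i++) sum += $i
        # Print each value rescaled to one decimal place, space separated
        for (i = 1; i <= NF; i++)
            printf "%s%.1f", (i > 1 ? " " : ""), (sum ? $i * max / sum : 0)
        print ""
    }'
}

# %user, %system and %idle that add up to 98 instead of 100
scale_to_max 100 61 25 12
```

With these sample values the function prints 62.2 25.5 12.2, so the stack reaches the top of the graph even though the raw values only add up to 98.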
Specify variables
You can now choose the names of up to five agent variables whose values are to be graphed.
Click next to the Variable details field to specify the attributes of the chosen variables. For example, Figure 33 shows the details for two variables called MBfree and MBused.
For each variable, enter the following details:
- Color
- Select the color to be used to display this variable.
- Label
- If you wish, you can change the default label displayed for this variable. For example, if you scale down a value for free disk space by a factor of 1000, you could also change the label to read GB (gigabytes) instead of MB (megabytes).
- Scale by
- This is an optional scaling factor. The values displayed will be multiplied by this factor. Use this to convert very large or very small numbers to more manageable units. Example: specify 0.001 to divide the reported values by 1000.
Click Return to save the variable details.
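To check a Scale by factor before entering it, the multiplication can be tried out in the shell. This is only a sketch of the arithmetic described above; the function name scale_by and the sample values (free space in MB) are assumptions.

```shell
#!/bin/sh
# Multiply each value by a scaling factor, mirroring the "Scale by" field.
# Usage: scale_by FACTOR value1 value2 ...
scale_by() {
    factor=$1; shift
    for value in "$@"; do
        awk -v v="$value" -v f="$factor" 'BEGIN { printf "%g\n", v * f }'
    done
}

# Convert two MBfree readings to GB using a factor of 0.001
scale_by 0.001 52000 1500
```

A factor of 0.001 turns 52000 MB into 52 and 1500 MB into 1.5, which is why relabeling the variable as GB makes sense in that case.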
Specify threshold markers
You can now specify up to four markers to be superimposed over the data values.
Each marker is displayed as a colored horizontal line and represents a state threshold or other significant value. A marker can be a constant associated with this sentry or an arbitrary integer or floating-point number (such as 20, 40, 60, 80). Click next to the Threshold markers field to specify the markers. For each marker, specify these details:
- At value
- Enter a floating-point number, or click to choose one of the constant values defined for this sentry. See Maintaining Constants and Thresholds.
- Color
- Select the color to be used to display this threshold. Remember to use a different color from those used to graph the variables. Use the Test graph option to see which colors show up best.
- If the threshold is equivalent to a boundary between states, it may be helpful to use the color of the severity level for the higher state. For example, if the constant LOW is the boundary between normal and warning state, and the sentry goes orange when it is in warning state, use orange as the color of the threshold marker.
When you have finished specifying markers, click Return to return to the ‘Add realtime graphs for sentry <sentry_name>’ form.
Test the graph
Now you can test the appearance of the graph. Click next to the Test graph field to generate a graph based on the settings in the form and the most recent data returned by the agent on this host.
- Note
- The host monitor must be running and the agent must be returning valid data.
You can display several graphs at once by experimenting with different settings and clicking Test graph again. If this is a multi-instance sentry you can test different instances.
When you are finished with each graph, press F3 to dismiss it.
Save the graph details
- Graph name
- Enter a unique name to identify this graph.
- Description
- Enter a description that explains what this graph will show or when it should be used. This will help operators to select the correct graph to diagnose problems.
- Title
- The title that appears in the heading of the graph. It can contain plain text, a variable such as $Instance (for a multi-instance sentry, the name of this instance), or a combination of the two.
- Export to parent?
- Tick this checkbox if you wish the graph to be available from the parent folder of this sentry. If the graph is exported, operators will be able to choose this graph when they select the parent folder.
Click Accept to save the graph.
To test the graph from the console, restart the host monitor, select the sentry and then select Report > Realtime graph.
Maintaining Constants and Thresholds
Constants are like variables, but they are associated with a sentry rather than an agent. You can use constants in a state’s Entry condition field to define thresholds between states, and as a visual aid on realtime graphs.
Example of use: You create a sentry and its states. Some of the states have an entry condition that compares the current data value from the agent with a constant such as VERY_LOW. You clone the sentry. The same set of states is shared between the old and new sentry, but you set the constant VERY_LOW to a different value in each sentry.
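The effect of giving two sentries different values for the same constant can be illustrated in plain shell. This is not Sentinel3G entry-condition syntax, which is not shown in this section. The function name is_very_low, the sentry labels, and the numbers are all invented for the illustration.

```shell
#!/bin/sh
# Illustration only: the same comparison evaluated against a per-sentry constant.
# is_very_low VALUE VERY_LOW  succeeds when VALUE is below the constant.
is_very_low() {
    awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v < limit) }'
}

# Same data value from the agent, different VERY_LOW in each sentry
pct_free=8
if is_very_low "$pct_free" 5;  then echo "sentry A: very low"; else echo "sentry A: ok"; fi
if is_very_low "$pct_free" 10; then echo "sentry B: very low"; else echo "sentry B: ok"; fi
```

Here the shared condition leaves sentry A in its normal state while sentry B crosses the threshold, because only the value of the constant differs.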
To display the constants for a sentry
- From the ‘All Sentries’ window, select the sentry.
- Select Maintain > Constants. The ‘Constants for sentry <sentry_name>’ window opens.
To add a constant for a sentry
- Select Maintain > Add. The ‘Add Constants’ form opens.
- Enter the following fields:
- Constant
- Enter a name for the constant. The convention is to use uppercase letters and underscores only (e.g., HALF_FULL). The name must be different from other constants belonging to this sentry, though another sentry can have a constant with the same name.
- Value
- Set the value of the constant (examples: 3; 0.5; true).
- Comment
- Enter an optional comment.
- Group override?
- Can an instance group override the value of this constant? If this option is set to no, the value of this constant always applies to any instance that uses it. If this option is set to yes, the value set in an instance group can override the value set here.
- Click Accept to save the constant.
To adjust the values of a sentry’s constants
You can adjust the values of all the constants belonging to a sentry. You can use this to fine tune the thresholds at which a sentry changes from one state to another.
- From the ‘Constants for sentry <sentry_name>’ window, select Maintain > Change values.
- Change the values next to any of the constants.
- Click Accept to save the new values.
Note: If the sentry has Instance Groups defined, you may also need to change the values of constants defined there.
Running Agents or Sentries in Test Mode
When you are developing sentries, you leave them “switched off” until you are ready to move them to production mode. You can use the command "hostmon -A" to test agents, and the command "hostmon -T" to test sentries (and their agents) even if they are off, or are in KBs that are off.
Running the sentries in test mode attempts to start all the agents required by the selected sentries and displays status messages, including any error messages. You can correct any configuration problems and retest the sentries. When you are satisfied that the sentries will work correctly, you can change their condition to on.
To test sentries, start a Sentinel3G shell then run hostmon -T <sentries…>.
Example:
cos sentinel -c bash
hostmon -T Clients Swap_Size
To test just an agent and verify that its variables are being set as expected, run hostmon -A <agent>.
Example:
hostmon -A db_agent
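When several agents need checking, the hostmon -A call can be wrapped in a small loop that keeps each agent's output for later review. The sketch below is an assumption-laden convenience, not part of Sentinel3G: it assumes it is run from a Sentinel3G shell, that each hostmon -A run returns on its own, and that the function name test_agents, the LOGDIR and HOSTMON variables, and the log location suit your site.

```shell
#!/bin/sh
# Run "hostmon -A" for each named agent and save the output per agent.
# HOSTMON may be overridden (for example HOSTMON=echo) for a dry run;
# LOGDIR defaults to an arbitrary directory chosen for this sketch.
test_agents() {
    logdir=${LOGDIR:-/tmp/agent-tests}
    mkdir -p "$logdir" || return 1
    for agent in "$@"; do
        echo "Testing agent $agent"
        ${HOSTMON:-hostmon} -A "$agent" > "$logdir/$agent.log" 2>&1
    done
}

# Example (from a Sentinel3G shell): test_agents db_agent
```

Setting HOSTMON=echo writes the command arguments to the log instead of running the host monitor, which is a safe way to check the loop before using it for real.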