Sentinel3G FAQ
Revision as of 05:43, 4 October 2006 Daniels (Talk | contribs) (→How do I fix this GLIBC error when starting sentinel3G under RedHat 9.0?) ← Previous diff |
Revision as of 05:46, 4 October 2006 Daniels (Talk | contribs) (→How do I print the contents of a window?) Next diff → |
Revision as of 05:46, 4 October 2006
General Questions
How do I fix this GLIBC error when starting sentinel3G under RedHat 9.0?
When starting the sentinel3G GUI under RedHat 9.0 I get a message like:
couldn't load file /home/COS_4.2/lib/tnm2.1.10.so : /home/COS_4.2/lib/tnm2.1.10.so: symbol _res, version GLIBC_2.0 not defined in file libc.so.6 with link time reference
What causes this, and how do I fix it?
The variable _res in the system GLIBC 2.3.2 (kernel 2.4.20 and probably higher) is no longer global. Thus you get an error when starting the sentinel3G Host Monitor or Event Manager regarding an undefined symbol _res. This results in the Sentinel daemons failing to start, and a message in the sentinel3G logfile.
According to some documents found on the Web, the changes to the DNS code were made to make it ‘thread safe’, which meant changing this global variable to a per-thread variable. This makes the library incompatible with binaries linked against earlier libraries (as is the case with sentinel3G).
Luckily there is a simple workaround:
- Set the environment variable LD_ASSUME_KERNEL to 2.4.18 (or lower) and the libraries will dynamically link the ‘old’ way using the _res global variable.
- The simplest way to get this to happen in sentinel3G is to edit the file ~cosmos/bin/COSstartup, and near the top add the lines:
LD_ASSUME_KERNEL=2.4.18
export LD_ASSUME_KERNEL
Restart cos. You should now be able to start the sentinel3G daemons correctly.
How do I print the contents of a window?
If you are accessing sentinel3G from a PC running our GUI, you can export the contents of a window to the associated application (normally Notepad for text files) and then print it.
If you are able to print the contents of a window, a small printer icon is displayed next to the search (magnifying glass) icon on the button bar of the window. If the printer icon is not displayed, check the following:
- Config > COSmanager configuration > Settings > Default printer - by default, Default printer is set to either $PRINTER or $LPDEST (depending on platform).
- Check that the printer set by Default printer exists.
- If it is set to $PRINTER or $LPDEST, check that these variables are set in your environment. You may have to set them in .profile (or similar).
Once you know the global setting Default printer is set correctly, exit COSmanager completely and rerun cos.
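The environment check in the list above can be sketched in shell. This is only an illustration: the function name show_default_printer is hypothetical and is not part of COSmanager, and the printer name in the comment is a placeholder.

```shell
# Hypothetical helper: report which printer variable (if any) is set
# in the current environment. Returns non-zero when neither is set.
show_default_printer() {
    if [ -n "${PRINTER:-}" ]; then
        echo "PRINTER=$PRINTER"
    elif [ -n "${LPDEST:-}" ]; then
        echo "LPDEST=$LPDEST"
    else
        echo "neither PRINTER nor LPDEST is set"
        return 1
    fi
}

# To set a variable permanently, lines like these could go in .profile
# ("myprinter" is a placeholder for a real print queue name):
#   PRINTER=myprinter
#   export PRINTER
```

Run show_default_printer as the user who starts cos, since that is the environment COSmanager inherits.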
How do I print a graph?
With COSmanager version 4.2, graphs cannot be printed directly. If data is being collected and logged, it can be exported to a CSV file and imported into a spreadsheet application (e.g. Microsoft Excel), where a graph can be generated and printed.
If you are accessing sentinel3G from a PC running our GUI, you can export the contents of a window to the associated application (normally Excel for CSV files) and then either print the raw data, or use the application's graphing facility.
The export feature is also a great way to get information into your reports.
How do I download icons to my PC or XWindows GUI?
An icon only needs to be downloaded if you see a sentry represented by a square box with a ‘?’ inside. This can occur when Knowledge Bases are installed on Host Monitors but not on the Event Manager hosts, or when using the PC GUI.
Config > COSmanager configuration > Other tables > icon - select the icon represented by the square box with a ‘?’, then Icon > Download.
How do I upload icons from the Host Monitor to the Event Manager?
If an icon looks incorrect on the sentinel3G console, it may be that it only exists on the Host Monitor but not the Event Manager. This could be due to an installed Knowledge Base that has supplied new icons.
From the console, change to the Host View, then select the Host Monitor host and click Configure > Global tables.
Double click on Other tables, then double click on the Icon table.
Highlight the icon you want to upload to the Event Manager, then click Icon > Upload and choose the Event host.
Note: The COSmanager GUI system is distributed to allow greater flexibility and efficiency. If you know the icon exists on the Host Monitor host but it still displays incorrectly, you may need to download it to your GUI host (normally your PC).
Why are my sentries not sorted by their instance name?
Sentries are displayed in the order in which the agent receives its data at the time the Host Monitor (HM) is started. The order in which they are displayed on the console will not change, even if subsequent agent polls return their data in a different order.
To force sentries to be displayed in a desired order, ensure your agent command sorts its output correctly.
Remember, sorting can be a performance issue.
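As a rough illustration of sorting agent output, the sketch below pipes a command through sort. The output format (one instance per line, instance name in the first field) and both function names are assumptions made for the example, not the format of any particular sentinel3G agent.

```shell
# Hypothetical stand-in for an agent command: one instance per line,
# with the instance name in the first whitespace-separated field.
sample_agent_output() {
    printf '%s\n' '/var 71' '/boot 12' '/home 45'
}

# Sort on the first field so instances arrive in name order.
# LC_ALL=C keeps the ordering independent of the locale.
sorted_agent_output() {
    sample_agent_output | LC_ALL=C sort -k1,1
}
```

Because the sort runs on every poll, keep it to the minimum needed, in line with the performance note above.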
Why don't the threshold lines show when I select more than one sentry?
Threshold lines are not drawn when graphing multiple sentries, because each sentry may have different threshold values.
How do I configure SSH communications?
By default COSmanager uses the rsh protocol for network communication. As this is not always installed or enabled due to security concerns, ssh can be used as an alternative.
When COSmanager is installed a cosmos user is created, and this user needs to be configured at an OS level to use ssh. The end result is that as the cosmos user on HostA you can run ‘ssh HostB <command>’ (and vice versa) without entering a passphrase.
The COSmanager Framework version 4.1 and later optionally supports ssh as its communications method rather than rsh (Remote Shell). However, its configuration requires more effort, because keys for the ‘cosmos’ user must be generated and copied to the remote hosts. This process must be repeated for each host on which COSmanager is installed.
The following commands assume that you are running a reasonably modern version of ssh which supports ‘protocol version 2’, and the ‘DSA algorithm’. They also assume that the ssh package has been installed correctly on your hosts, and that the sshd daemon process is running on each.
To allow COSmanager on host ‘hostA’ to run remote COSmanager commands on ‘hostB’, follow these instructions:
- On ‘hostB’, login as root, and run: su cosmos
- Type the command ‘id’ to make sure that your user ID is ‘cosmos’.
- Generate a dsa public key/private key pair: ssh-keygen -t dsa or ssh-keygen -d
If you get the message ‘not found’, check that the SSH ‘bin’ directory is in your shell's path, and if not add it.
This command should generate a ‘dsa’ private/public key pair for user ‘cosmos’. Hit ENTER to accept the default value when it asks for the file name in which to save the key, and also hit ENTER each time you are asked for a passphrase (we do NOT want to use a passphrase).
This should create the files:
- ~cosmos/.ssh/id_dsa
- ~cosmos/.ssh/id_dsa.pub
- ~cosmos/.ssh
Copy the file ‘id_dsa.pub’ to a temporary directory on ‘hostA’. This file will be accessed later when you log onto hostA.
Log in as your normal user ID (assumed to be registered to COSmanager as a ‘Manager’), and run:
cos cosmos -C
This should bring up the ‘COSmanager Global Configuration’ window.
Double-click on ‘Hosts’, and a list of COSmanager hosts should appear in another window. Double-click on ‘hostA’, and when the form appears, change the ‘Comm method’ field to ‘ssh’. Hit Accept to save the change. If ‘hostA’ is NOT in the list, select ‘Maintain > Add’, and add an entry for hostA, with the ‘Comm method’ field set to ‘ssh’.
Log in to hostA as root, and run:
su cosmos
Then run the command:
ssh-keygen -t dsa
or
ssh-keygen -d
to generate the keys for hostA.
Change to the ~cosmos/.ssh directory and append the copy of id_dsa.pub copied from hostB (NOT the one in the local directory which was just created by the ssh-keygen command) to the file ‘authorized_keys2’. If ‘authorized_keys2’ does not exist simply copy the id_dsa.pub file from hostB to it. This allows the ‘cosmos’ user on hostB to run a command as user ‘cosmos’ on hostA.
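The append step can be sketched as a small shell function. The function name append_pubkey and its arguments are hypothetical; the sketch assumes the public key file holds a single line, as ssh-keygen normally writes it, and it skips the append when that exact line is already present so that repeating the step does not create duplicates.

```shell
# Hypothetical helper: append a public key file to authorized_keys2 in
# the given .ssh directory, unless the identical line is already there.
append_pubkey() {
    pubkey_file=$1                      # e.g. the copy of hostB's id_dsa.pub
    ssh_dir=$2                          # e.g. ~cosmos/.ssh
    auth_file="$ssh_dir/authorized_keys2"

    touch "$auth_file"                  # create the file if it does not exist
    # -F fixed strings, -x whole-line match, -f read patterns from the key file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

For example, if hostB's key was copied to a temporary directory of your choosing, you would call append_pubkey with that copy and the cosmos user's .ssh directory.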
ssh is very fussy about the permissions and ownerships of the files in the .ssh directory, and the ~cosmos & ~cosmos/.ssh directories themselves. Ensure that they are all owned by user "cosmos" and that the permissions are:
~cosmos                         drwxr-xr-x (755)
~cosmos/.ssh                    drwxr-xr-x (755)
~cosmos/.ssh/authorized_keys2   -rw------- (600)
~cosmos/.ssh/id_dsa             -rw------- (600)
~cosmos/.ssh/id_dsa.pub         -rw-r--r-- (644)
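A minimal sketch of applying these permissions follows. The function name set_cosmos_ssh_perms is hypothetical, and the function takes the cosmos home directory as an argument so that it can be tried on a scratch copy first. Changing ownership normally requires root, so that step is shown only as a comment.

```shell
# Hypothetical helper: apply the permissions listed above to the files
# under the given cosmos home directory.
set_cosmos_ssh_perms() {
    home_dir=$1
    chmod 755 "$home_dir" "$home_dir/.ssh"
    chmod 600 "$home_dir/.ssh/authorized_keys2" "$home_dir/.ssh/id_dsa"
    chmod 644 "$home_dir/.ssh/id_dsa.pub"
}

# Ownership must be fixed separately (as root), for example:
#   chown -R cosmos ~cosmos/.ssh
```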
Copy the file id_dsa.pub from the .ssh directory to a temporary directory on hostB.
Now go back to hostB, log in as root, run: su cosmos and then try the command: ssh hostA pwd (of course the ‘ssh’ program must be in a directory in your search path). When run for the first time to a new host, ssh may say that the host cannot be authenticated, and will ask you if you want to connect. Reply "yes". ssh should add hostA to the list of known hosts (file known_hosts2), and you should never be asked again.
If the ~cosmos directory name is NOT returned, then you have a problem! It may be that the public key was not copied into authorized_keys2 correctly, or that the permissions or ownerships of some ssh files are not correct. The easiest way of debugging is to kill the sshd process on the other host (hostA), and run it in debug mode (as root):
sshd -Dddd
This will NOT start it as a daemon, and will give a lot of debugging information, which should help you pinpoint the problem. Note: sshd will probably NOT be in your search path. Some common locations for it are:
/sbin
/usr/sbin
/usr/local/sbin
You will probably need to use the full pathname when running it. Note: killing sshd is probably not a good idea if other people or applications are using ssh to that host.
Finally, you must go back to hostA, and repeat steps e & f, this time using hostB rather than hostA. Then login as ROOT, run ‘su cosmos’ and repeat steps i through l, using hostB instead of hostA.
Congratulations, COSmanager is now configured to use ssh in both directions between hostA & hostB.
Note: If you are using an older version of ssh, which does not support protocol version 2, you should follow the above instructions, except the filenames under the .ssh directory are different:
id_dsa            ==  identity
id_dsa.pub        ==  identity.pub
authorized_keys2  ==  authorized_keys
known_hosts2      ==  known_hosts
There is also an ssh configuration file often found in /etc/ssh/ssh_config. In this file you can effectively force ssh to use either protocol 1 or protocol 2 by specifying the identity file. Normally no identity file should be specified as ssh is smart enough to determine which to use in any given situation. If you are having problems configuring ssh to use protocol 2 (symptoms include falling back to password authentication even though the keys have been exchanged correctly), check this file and comment out the IdentityFile line:
# IdentityFile ~/.ssh/identity
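To see whether a configuration file still contains an active (uncommented) IdentityFile line, a sketch like the following may help. The function name active_identity_lines is hypothetical; the path to pass in depends on your installation (the text above mentions /etc/ssh/ssh_config as a common location).

```shell
# Hypothetical helper: print any uncommented IdentityFile lines in an ssh
# client configuration file. Prints nothing (and returns non-zero) when
# every IdentityFile line is commented out or absent.
active_identity_lines() {
    # ssh_config keywords are case-insensitive, hence -i
    grep -Ei '^[[:space:]]*IdentityFile[[:space:]=]' "$1"
}
```

If the function prints a line, comment it out as shown above and retry the ssh test.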
Once this is complete you can configure COSmanager to use ssh:
- Start COSmanager configuration (cos cosmos -C from the command line, or click on Config > COSmanager configuration) and double click on the hosts icon.
- Select the remote host that you wish to access via ssh and double click it to modify its configuration.
- Change the ‘comm meth’ field to ssh.
- Test your connectivity by clicking on the Planet icon on the button bar and selecting ‘Remote’.
- Choose the host you just configured. You should see the button bar for the remote host.
- Repeat this process to allow access from HostB back to HostA.
How can I see all actions available for a sentry folder?
To display all the actions for a sentry, select that sentry and run Configure > Actions.
Why does ‘no data to display’ show for quarterly and monthly service level reports, but not for today?
The monthly and quarterly reports end on the last day of the previous month.
The weekly and fortnightly reports end today.
The ‘no data to display’ message appears because logging only started this month, so a monthly or quarterly report quite rightly has nothing to display.
How do I stop a filesystem (or filesystems) from being monitored?
Change the agent to exclude the filesystem.
For example, you may be running web cache software such as Squid, which monitors its own disk usage. Let us assume Squid is running, and has two filesystems configured for its cache, /squid1 and /squid2.
- From the console, select one of the filesystems you are monitoring.
- Choose ‘Configure > Agent’.
- In the ‘Exclude’ field, add your filesystems (/squid1 and /squid2).
The ‘Exclude’ field is just a list of instances to ignore; it is not a condition.
How do I monitor a particular filesystem using different thresholds to the others?
Clone the sentry and change its cloning condition.
For example, the /boot filesystem is typically static, so the normal thresholds do not apply.
- From the console, choose ‘Configure > Host monitor...’.
- Right click on the ‘Free Space’ sentry and choose ‘Clone’
- Change the description to ‘Free disk space (/boot static filesystem)’.
- Enter a condition in ‘Clone if’: $Filesystem == "/boot"
- Hit Accept to write the changes.
You should now have two copies of the Free_Space sentry (hint: read the description to tell them apart).
- Right click on the original sentry and choose ‘Change’
- Change the description to ‘Free disk space (excluding /boot filesystem)’.
- Enter a condition in ‘Clone if’: $Filesystem != "/boot"
- Hit Accept to write the changes.
- Exit the ‘Sentry Details’ window.
- From the console, restart the Host Monitor and ensure all the filesystems appear correctly.
- If not, look at the ‘Host log’ for any error messages. The most common problem is entering the condition incorrectly. Remember the condition is case sensitive and the double-quotes are important.
Now the /boot filesystem is a different sentry, and you can modify its thresholds (Configure > Constants) without affecting the other filesystems.
The states for these two sentries are shared, so if you want to change them you will be prompted to first copy the states to the new sentry, or continue to share them.
You must restart the Host Monitor before any changes to the constants will be applied.
Note: If you had two filesystems that you wanted to separate out, all you need to do is change the conditions slightly.
For example, on some operating systems the /usr filesystem is also quite static.
To add /usr to the /boot sentry just change the "Clone if" conditions.
You should understand the boolean operators ‘||’ (logical or) and ‘&&’ (logical and) before attempting this.
On the sentry that is monitoring /boot and /usr the condition should be:
$Filesystem == "/boot" || $Filesystem == "/usr"
And on the sentry that is monitoring the other filesystems the condition should be:
$Filesystem != "/boot" && $Filesystem != "/usr"
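The two conditions are exact complements, so every filesystem lands in one sentry and only one. The shell sketch below is only an analogue of that logic: sentinel3G evaluates the ‘Clone if’ conditions itself, and the function names here are hypothetical.

```shell
# Shell analogue of: $Filesystem == "/boot" || $Filesystem == "/usr"
is_static_fs() {
    Filesystem=$1
    [ "$Filesystem" = "/boot" ] || [ "$Filesystem" = "/usr" ]
}

# Shell analogue of: $Filesystem != "/boot" && $Filesystem != "/usr"
is_other_fs() {
    Filesystem=$1
    [ "$Filesystem" != "/boot" ] && [ "$Filesystem" != "/usr" ]
}

# Show which sentry each filesystem would belong to.
for fs in /boot /usr /home /var; do
    if is_static_fs "$fs"; then
        echo "$fs: static sentry"
    elif is_other_fs "$fs"; then
        echo "$fs: general sentry"
    fi
done
```

Note that negating an ‘||’ condition requires ‘&&’ between the negated parts; using ‘||’ in the second condition would match every filesystem.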
Why is my customized notification not working?
If a COSmanager user is configured to use a notification method other than email, an address for that user must be specified - even if the method does not require an address.
If no address is specified, the default email method will be used.
Linux KB
Why is one or more of my services not showing in the services folder?
Services to be monitored are discovered by parsing the startup scripts in the system startup directory (either /etc/rc.d/init.d or /etc/init.d).
For a service to be monitored, the header comments of its startup script must contain the following directive: # processname:
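To find startup scripts that lack the directive, something like the following sketch could be used. The function name is hypothetical, and the pattern is an assumption about how the directive is written; the exact matching rules used by the Linux KB are not described here.

```shell
# Hypothetical helper: list scripts in a startup directory that do not
# contain a "# processname:" header comment.
scripts_missing_processname() {
    init_dir=$1                 # e.g. /etc/rc.d/init.d or /etc/init.d
    for script in "$init_dir"/*; do
        [ -f "$script" ] || continue
        grep -q '^#[[:space:]]*processname:' "$script" || echo "$script"
    done
}
```

Any script it lists is a candidate explanation for a service that does not appear in the services folder.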
How do I monitor failed attempts to ‘su’ to another user ID?
By default, the Bad su sentry is turned off because different versions and flavours of Linux log failed su attempts differently.
By default the sentry uses the standard log file agent to monitor the messages file (/var/log/messages) for entries like:
Jun 26 10:43:08 bink PAM_pwdb[12444] : 1 authentication failure; marks (uid=667) -> root for su service
It does this by matching the following regular expression:
n failure.*su service
It then extracts columns using the () operators in another regular expression:
uid=([^\)]*)\) -> (.*) for su service
This sets the first column to the uid and the second column to the target user.
First, check whether the default configuration works on your system:
- From the sentinel3G console, configure the host monitor on which you want the sentry to be run (Configure > Host Monitor)
- Find the Bad_SU sentry and turn it on
- Restart the host monitor
- Generate a failed su attempt (as your own user id, run su and enter an incorrect password)
Within a few seconds a new folder (Security) should appear under that host on the console, with an icon indicating the user that failed to su. If this does not occur, you will need to reconfigure the agent:
- Identify which log file failed su attempts are written to (by default this will be the messages file: /var/log/messages) and find the message generated
- Construct a regular expression (see the regex manual page on your system for more details) that will match the given line. For example, if instead the line was: Jun 26 10:43:08 bink PAM_pwdb[12444]: failed su; marks(uid=667) -> root our pattern could be as simple as: failed su
- Create a second regular expression to extract the data we are interested in, with round brackets around the data we want to see. For our example it would be: uid=([^\)]*)\) -> (.*)
To configure the agent, start from the console and follow: Configure > Host Monitor > Select Bad_SU sentry > Right Click > Agent > Agent Options. You then need to configure the select pattern and the variable assignment pattern.
Restart the host monitor and generate another failed su to test your configuration.
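Before changing the agent, it can help to try both patterns against a sample log line from the command line. The sketch below uses grep and sed for this; it assumes their regular expression dialect is close enough to the one the log file agent uses, which may not hold on every system. The function names are hypothetical, and the sample line is the default one quoted earlier.

```shell
sample_line='Jun 26 10:43:08 bink PAM_pwdb[12444] : 1 authentication failure; marks (uid=667) -> root for su service'

# Select pattern: does the line look like a failed su at all?
matches_select_pattern() {
    printf '%s\n' "$1" | grep -q 'n failure.*su service'
}

# Assignment pattern: print the two bracketed groups (uid, target user).
extract_su_columns() {
    printf '%s\n' "$1" |
        sed -E -n 's/.*uid=([^)]*)\) -> (.*) for su service.*/\1 \2/p'
}

if matches_select_pattern "$sample_line"; then
    extract_su_columns "$sample_line"
fi
```

For the sample line this prints the uid and the target user separated by a space (667 root). Substitute a line from your own log file and adjust the two patterns until both succeed.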
Solaris KB
sentinel3G is saying I have negative RAM available
The fix is to change the definition of the mem_total variable from:
$SETTING(PhysicalPagSize) * $SETTING(PhysicalPages) / (1024 * 1024)
to:
$SETTING(PhysicalPagSize) / 1024.0 * $SETTING(PhysicalPages) / 1024.0
The latest version of the Solaris KB (2.2) with a build date of 20060227 includes this fix.
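The text above does not say why the original expression goes negative. One plausible explanation, offered here purely as an assumption, is that the integer product of page size and page count exceeds the 32-bit signed range before the division happens, whereas dividing by 1024.0 first keeps the intermediate values small and in floating point. The sketch below emulates that wraparound with hypothetical values (8192-byte pages, 393216 pages, i.e. 3 GB); it assumes a shell with 64-bit arithmetic.

```shell
page_size=8192          # hypothetical page size in bytes
pages=393216            # hypothetical page count (3 GB in total)

# Full product; this needs 64-bit shell arithmetic to hold 3221225472.
product=$((page_size * pages))

# Emulate what a 32-bit signed integer would hold after overflow.
if [ "$product" -gt 2147483647 ]; then
    wrapped=$((product - 4294967296))
else
    wrapped=$product
fi
old_mb=$((wrapped / (1024 * 1024)))

# Revised expression: divide by 1024.0 before multiplying.
new_mb=$(awk -v s="$page_size" -v p="$pages" \
    'BEGIN { printf "%d", s / 1024.0 * p / 1024.0 }')

echo "original expression with 32-bit wraparound: $old_mb MB"
echo "revised expression: $new_mb MB"
```

With these values the emulated original yields -1024 MB while the revised form yields 3072 MB.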
How do I add a service to be monitored?
sentinel3G ships with a list of services that may run on Solaris. This list is far from exhaustive, so you may need to add services to be monitored.
Select the Services folder and run action.
Select Add new service to monitor
Squid KB
Why am I having problems accessing squid statistics using SNMP?
The Squid knowledge base uses SNMP to monitor the proxy server. The installation notes for this KB describe how to configure Squid to enable SNMP, but sometimes there can be networking problems that interfere.
This problem is reported in the Host Monitor Log as ‘no SNMP response’
In the Squid configuration file (squid.conf under your squid installation directory) there is a set of access control lists which are used in the sentinel3G specific configuration. By default the localhost acl is as follows:
acl localhost src 127.0.0.1/255.255.255.255
If the hostname localhost does not resolve to 127.0.0.1 on the system (for example localhost is configured with a different IP address in /etc/hosts), this acl will not be correctly recognized. The localhost hostname should preferably resolve to 127.0.0.1, but if this is not desired, you can change the acl in the squid.conf file to match the correct IP address.
For example, if you run "ping localhost" and the IP address that localhost resolves to is 10.0.0.1, then you should change the acl in the squid.conf file to:
acl localhost src 10.0.0.1/255.255.255.255
Then, once you restart squid and the monitor, it should be monitored correctly.
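As an alternative to ping, the address can be read straight from a hosts file and turned into the matching acl line. This sketch only covers the /etc/hosts case; if localhost is resolved some other way on your system, the ping check above is the more reliable test. Both function names are hypothetical.

```shell
# Hypothetical helper: print the first address mapped to "localhost"
# in a hosts-format file (defaults to /etc/hosts).
localhost_address() {
    hosts_file=${1:-/etc/hosts}
    awk '
        /^[[:space:]]*#/ { next }               # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # trailing comment starts
                if ($i == "localhost") { print $1; exit }
            }
        }
    ' "$hosts_file"
}

# Hypothetical helper: build the acl line for a given address.
squid_localhost_acl() {
    echo "acl localhost src $1/255.255.255.255"
}
```

For example, squid_localhost_acl "$(localhost_address)" prints the line to place in squid.conf.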
What does the message ‘snmp_port TAG incorrect or non-existent’ in the Host Monitor Log mean?
By default the SNMP port for squid is configured to be 3041.
If the Squid server is running and SNMP is configured correctly, the snmp_port tag in the squid.conf file is not required.
However, if the Squid server is not running, or there are other problems with the SNMP configuration, you may see this message.
To correct this problem, specify the snmp_port tag in the squid.conf file.
In most cases this is as simple as uncommenting the line:
#snmp_port 3041
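A minimal sketch of the uncommenting step follows. The function name uncomment_snmp_port is hypothetical. It writes the modified configuration to standard output rather than editing squid.conf in place, so the result can be reviewed before it replaces the real file.

```shell
# Hypothetical helper: print the given squid.conf with any commented-out
# snmp_port line uncommented. The original file is not modified.
uncomment_snmp_port() {
    sed 's/^#[[:space:]]*\(snmp_port[[:space:]]\)/\1/' "$1"
}
```

Compare the output with the original (for example using diff), install it as squid.conf if it looks right, then restart squid and the Host Monitor.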