Nagios
Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes.
Check nagios configuration
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# status service on CentOS
service nagios status
# restart service on CentOS
service nagios restart
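The verification and restart commands above can be chained so that a broken configuration never reaches a restart. This is a minimal sketch, assuming that `nagios -v` exits with a non-zero status when it finds configuration errors. The function name `restart_if_valid` is hypothetical, and both commands are passed as arguments so the sketch can be tried on a machine without Nagios.

```shell
# Hypothetical helper: run a verification command and run the restart
# command only if the verification exits with status 0.
restart_if_valid() {
    # $1: verification command, $2: restart command
    # (simple commands; they are split on spaces when expanded)
    if $1 >/dev/null 2>&1; then
        $2
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Intended use on the Nagios server (paths as in the commands above):
# restart_if_valid "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" "service nagios restart"
```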
Sample service config
define service{
    use                  generic-service
    service_description  Secure shell
    check_command        check_ssh
}
define service{
    use                  generic-service
    service_description  Web server
    check_command        check_http
}
define service{
    use                  generic-service
    service_description  RDP Windows
    check_command        check_tcp!3389
}
Check command
/usr/local/nagios/libexec/check_nrpe -H 192.168.1.1 -c check_metric
Sample /etc/nagios/nrpe.cfg config file on CentOS 6.3 64 bit, host checked by Nagios
log_facility=daemon
pid_file=/var/run/nrpe/nrpe.pid
server_port=5666
nrpe_user=nrpe
nrpe_group=nrpe
#nagios server 192.168.1.2
allowed_hosts=192.168.1.2,127.0.0.1
dont_blame_nrpe=0
debug=1
command_timeout=60
connection_timeout=300
include_dir=/etc/nrpe.d/
command[check_users]=/usr/lib64/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200
check_oracle_health shows warning when OK
http://forums.meulie.net/viewtopic.php?f=62&t=6282
Nagios shows Warning when the result is OK
Run the command locally on the target server as the user nrpe:
- su nrpe
- mycommand.py
- echo $?
If permission problems or similar errors appear when running as the user nrpe, fix them. Always test commands as the user nrpe.
If the command creates or uses files and was previously tested as root, delete those files so that they are recreated with the right permissions.
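The value printed by `echo $?` is what Nagios turns into a state, so it helps to translate it while testing. The sketch below maps the standard plugin return codes (0, 1, 2, 3) to their state names. The function name `nagios_state` is hypothetical and is not part of Nagios or NRPE.

```shell
# Hypothetical helper: translate a plugin exit status into the
# corresponding Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "NON-STANDARD ($1)" ;;   # anything outside 0-3
    esac
}

# Intended use, as the user nrpe, right after running the check:
# mycommand.py; nagios_state $?
```

If the command prints an OK message but `nagios_state` reports WARNING, the exit status of the script is the thing to fix, not its output text.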
NRPE: Unable to read output (CentOS 6.3)
Make sure the following is set:
- setenforce 0
- nano /etc/sysconfig/selinux # change to SELINUX=disabled
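The two steps above act at different times: `setenforce 0` switches the running system to permissive mode until the next reboot, while the `SELINUX=` line in the file applies from the next boot. The sketch below prints the mode configured in a given file so the edit can be confirmed. The function name `selinux_configured_mode` is hypothetical.

```shell
# Hypothetical helper: print the value of the SELINUX= setting in the
# given file. Comment lines and SELINUXTYPE= lines are ignored; if the
# setting appears more than once, the last one is printed.
selinux_configured_mode() {
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$1" | tail -n 1
}

# Intended use on the monitored host:
# getenforce                                       # mode of the running system
# selinux_configured_mode /etc/sysconfig/selinux   # mode configured for the next boot
```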
Change the process-count thresholds of the check_total_procs command
If too many warning messages about the total number of processes are arriving, do the following:
- ps uax | wc -l # count the number of active processes on the machine, nrProcs
- Edit the file /etc/nagios/nrpe.cfg
- In command[check_total_procs], set the warning threshold -w to nrProcs + 20 and the critical threshold -c to nrProcs + 20 + 30
- Save the file
- service nrpe restart
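The threshold arithmetic from the steps above can be sketched as a small shell function. The name `proc_thresholds` is hypothetical, and the margins (20 and 30) are the ones given in the steps, not values recommended by Nagios.

```shell
# Hypothetical helper: derive check_procs thresholds from a process count.
# warning = count + 20, critical = count + 20 + 30
proc_thresholds() {
    warn=$(( $1 + 20 ))
    crit=$(( warn + 30 ))
    echo "-w $warn -c $crit"
}

# Intended use on the monitored host:
# proc_thresholds "$(ps uax | wc -l)"
```

For a machine with 130 processes this gives `-w 150 -c 180`, which would then replace the `-w 150 -c 200` arguments in the sample nrpe.cfg.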