Sitworld: Adding Environmental Data to Action Command Emails


John Alvord, IBM Corporation

jalvord@us.ibm.com

Inspiration

A customer created a situation to detect a dangerously full paging space condition on an AIX system. The formula used the KPX Paging Space attribute, and the test looked like this: (Used Pct > 90).

The situation’s action command created an email that was sent to the operations staff, who monitored such events by reading email on a smartphone. The email did not contain enough information to decide how to handle the issue, because the attributes were system-wide rather than specific to a process. The smartphone had no remote terminal program for logging on to the system.

The customer needed to send environmental information in the body of the email.

Solution

Here is an action command that gathers additional information for this case. It applies only to AIX because the svmon command is specific to that Unix version, but the general scheme can be used in many other circumstances. The action command uses Linux/Unix/Windows metacharacters such as ( … ) to create sub-shells. Some of the information comes from the agent attributes and some from the environment. The command looks like this as one long string. Don’t be scared; a careful explanation follows.

(echo Paging Space on &{KPX_PAGING_SPACE.Node} is at &{KPX_PAGING_SPACE.Used_Pct}%"\n\n"; (echo "     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB"; ps -ef | tail -n +3 | awk '{print $2;}' > /tmp/procs.txt; svmon -Pt20 | grep -f /tmp/procs.txt | sort -r -n -k5)) | mailx -s "Paging Space WARNING Alert" [additional parameters to specify target etc]

To avoid eyestrain I have split the command line out into logical sections:

(

==> Begin sub-shell level 1

echo Paging Space on &{KPX_PAGING_SPACE.Node} is at &{KPX_PAGING_SPACE.Used_Pct}%"\n\n";

==> First line of email body with an extra blank line

(

==> Begin sub-shell level 2

echo "     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB";

==> Title line for svmon extract results

ps -ef | tail -n +3 | awk '{print $2;}' > /tmp/procs.txt;

==> Extract all the process IDs, skipping the title line and process ID 1

svmon -Pt20 | grep -f /tmp/procs.txt | sort -r -n -k5

==> Get top 20 by virtual storage, select lines containing one of the process IDs, sort by Pgsp (column 5) in descending order

)

==> End sub-shell level 2

) |

==> End sub-shell level 1 and pipe all standard output to the next program

mailx -s "Paging Space WARNING Alert" [parameters to specify email target etc]

==> Example batch email command. The standard input becomes the body of the email. The -s option sets the subject line.
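The filtering stage of the breakdown above can be tried on any Unix-like system, even one without svmon. The sketch below is not part of the original action command: sample_svmon and filter_by_pgsp are hypothetical helper names, the rows are copied from the example output in this post, and the -w option is an added assumption (available in AIX and GNU grep) so that a short PID such as 14 does not match inside a longer number.

```shell
#!/bin/sh
# Sketch only: svmon output is simulated so this runs without AIX.

# Hypothetical stand-in for "svmon -Pt20"; rows taken from the example output.
sample_svmon() {
    cat <<'EOF'
  163934 shlap64          56297    13141     3993    60459      Y     N     N
 1777814 httpd            27199    13150     2014    28799      N     Y     N
  274600 STAFProc         52330    13153     4664    57060      Y     Y     N
EOF
}

# Hypothetical helper: read svmon-style rows on standard input, keep the
# rows that contain a PID listed in the file named by $1, and sort them
# by column 5 (Pgsp), largest first.
# -w (whole-word match) is an addition to the original command.
filter_by_pgsp() {
    grep -w -f "$1" | sort -r -n -k5
}

# Demonstration: a per-run PID file instead of the fixed /tmp/procs.txt.
pidfile=/tmp/procs.$$
printf '%s\n' 163934 274600 > "$pidfile"
sample_svmon | filter_by_pgsp "$pidfile"
rm -f "$pidfile"
```

With this sample data the httpd row is dropped because its PID is not in the list, and the STAFProc row is printed ahead of the shlap64 row because its Pgsp value is larger.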

Example of added data

     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB

  274600 STAFProc         52330    13153     4664    57060      Y     Y     N

  163934 shlap64          56297    13141     3993    60459      Y     N     N

  147590 aixmibd64        50288    13141     3777    54453      Y     N     N

  192632 snmpmibd64       50246    13141     3744    54399      Y     N     N

  172146 hostmibd64       50626    13141     3731    54730      Y     N     N

 1937562 kux_vmstat       50619    13141     3601    54480      Y     N     N

 1810656 kdsmain          87582    13171     3601    75734      Y     Y     N

 1736806 stat_daemon      52404    13141     3601    56310      Y     N     N

 1609822 nfs_stat         50490    13141     3601    54435      Y     N     N

 1245352 kuxagent         56406    13192     3601    60021      Y     Y     N

 1237192 mount_stat       50485    13141     3601    54430      Y     N     N

 1126448 ifstat           50517    13141     3601    54467      Y     N     N

  938208 java            124375    13217     3601   107358      Y     Y     N

  909508 KfwServices     178686    13226     3601   173988      Y     Y     N

  905322 kcawd            51341    13148     3601    55292      Y     Y     N

  847954 java             70769    13167     3601    69446      Y     Y     N

  630926 java             59948    13156     3601    59414      Y     Y     N

  425998 cms              50755    13144     3601    54619      Y     Y     N

 1777814 httpd            27199    13150     2014    28799      N     Y     N

Summary

The operations staff could now decide which team was needed to handle the problem condition without having to log on to the system reporting trouble.

This general scheme could be extended to run a shell script inside a sub-shell, and that script could access local files, databases, or almost anything else you want.
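As a rough illustration of that extension, the data gathering could move into a script that the action command calls with the attribute substitutions as arguments. Everything named here is an assumption for the sake of the sketch: the script path, the paging_report function, and the printf-based header (used instead of echo with "\n\n", since not every shell's echo interprets backslash escapes).

```shell
#!/bin/sh
# Hypothetical script, for example saved as /opt/scripts/paging_report.sh.
# The action command would then shrink to something like:
#   /opt/scripts/paging_report.sh &{KPX_PAGING_SPACE.Node} &{KPX_PAGING_SPACE.Used_Pct} | mailx -s "Paging Space WARNING Alert" [parameters to specify email target etc]

# Hypothetical function: $1 is the node name, $2 is the used percentage.
paging_report() {
    node=$1
    used_pct=$2

    # First line of the email body followed by one blank line.
    printf 'Paging Space on %s is at %s%%\n\n' "$node" "$used_pct"

    if command -v svmon >/dev/null 2>&1; then
        # Same pipeline as the original action command, with a per-run PID file.
        pidfile=/tmp/procs.$$
        echo "     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB"
        ps -ef | tail -n +3 | awk '{print $2;}' > "$pidfile"
        svmon -Pt20 | grep -f "$pidfile" | sort -r -n -k5
        rm -f "$pidfile"
    else
        # Not on AIX: say so rather than send an empty report.
        echo "svmon is not available on this system"
    fi
}

# Produce a report only when both arguments were supplied.
if [ "$#" -ge 2 ]; then
    paging_report "$1" "$2"
fi
```

Keeping the logic in a script makes the action command itself short, and further sections (local log excerpts, database queries) can be added to the function without touching the situation definition.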

Sitworld: Table of Contents

Note: A single hummingbird.

 
