Sitworld: Summarization and Pruning Audit

Bandit_on_Keyboard

Version 0.51000 23 March 2020

John Alvord, IBM Corporation

jalvord@us.ibm.com


Inspiration

ITM has a marvelous facility for storing historical data. It includes logic to discard old data and to summarize data into longer time periods. Recording more data is easy; balancing that against the limited database storage capacity is much harder. A recent customer had almost exhausted the 6 gigabytes available. Had the database run out of storage, historical data at the agents would no longer have been transferred to the database, and the agent file systems would have been at risk of gradually filling up.

The S&P logs contain much of the needed information, but it is scattered about. S&P runs multiple processes at the same time, and since there can be hundreds or thousands of agents, manual extraction of the data is almost impossible.

Preparing for the install

Perl is usually pre-installed on Linux/Unix systems. For Windows you may need to install it, for example Strawberry Perl from strawberryperl.com, or from any other source. The program uses only Perl core services; no CPAN modules are needed.

SP Audit has been tested with

This is perl 5, version 28, subversion 1 (v5.28.1) built for MSWin32-x64-multi-thread

zLinux with Perl v5.8.7

The zip file spaudit.0.51000 contains one file, spaud.pl. Install it somewhere convenient.

Run Time Options

Options:

-v Produce some progress messages on STDERR

The remaining parameter is a log file specification. This needs to be a single file, like this:

sutlpar71_sy_java_5a68cc39-01.log

Ideally this should be a selection of the log that represents a single S&P processing run. A run can span several diagnostic log sections, and a single log section can contain multiple processing runs. I have not yet found a way to automate this selection, but am still looking. View the sy_java.inv inventory file to see which diagnostic logs are current; the top line is the most recent log.

To isolate a segment, search for “Trace resumed” for the starting point of a run and “Trace paused” for the ending point. Save the lines from the starting point through the ending point into a separate file for processing.
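The manual search just described can be roughed out in a few lines of Python. This is only a sketch and not part of spaud.pl: the function name split_runs and the sample lines are made up for illustration, and it assumes the two marker strings appear as plain text somewhere in the log line.

```python
def split_runs(lines, start_marker="Trace resumed", end_marker="Trace paused"):
    """Group log lines into runs.

    Each run is the list of lines from a line containing start_marker
    through the next line containing end_marker, inclusive.
    A run that never reaches an end marker is dropped.
    """
    runs = []
    current = None  # None while we are outside a run
    for line in lines:
        if current is None:
            if start_marker in line:
                current = [line]
        else:
            current.append(line)
            if end_marker in line:
                runs.append(current)
                current = None
    return runs


# Made-up lines, only to exercise the function (not real S&P log content).
sample = [
    "startup noise\n",
    "... Trace resumed ...\n",
    "work for run 1\n",
    "... Trace paused ...\n",
    "idle\n",
    "... Trace resumed ...\n",
    "work for run 2\n",
    "... Trace paused ...\n",
]
runs = split_runs(sample)
print(len(runs), "runs found")
```

If a run spans several diagnostic log sections, the lines of those files would have to be concatenated in order before calling the function. Each element of runs can then be written to its own file and passed to spaud.pl.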

SP Audit Reports

All three reports are keyed by the attribute group name [for example AIX_Network_Adapters].

sp_sum.csv – Summary Report – Sorted in descending order by Aggregate_size

Summarization and Pruning log Audit Report
Table,Nodes,Aggregate_Rows,Pruned,SQLs,Time,Aggregate_Size_Bytes,SizePC,TotSizePC,
Top_Memory_Processes,88,400360,0,923755,293742,746271040,12.75%,12.75%,
Network,88,403639,12096,198870,630066,645822400,11.03%,23.78%,

Table: The attribute group name
Aggregate_Rows: gathered from “Rows read” lines
Pruned: gathered from “Rows pruned” lines
SQLs: gathered from “For table” lines
Time: gathered from “Elapsed time” lines
Aggregate_Size_Bytes: Aggregate_Rows * row size, where the row size comes from a table built into the program. The report is sorted on this column.
SizePC: This table's size as a percent of the total size
TotSizePC: Cumulative percent of table sizes
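Here is a small Python sketch of the arithmetic behind the last three columns. It is not the code in spaud.pl. The function name size_report is invented, and the row sizes 1864 and 1600 are inferred by dividing Aggregate_Size_Bytes by Aggregate_Rows in the two sample rows above, not taken from the program's built-in table.

```python
def size_report(tables):
    """tables: list of (name, aggregate_rows, row_size) tuples.

    Returns a list of (name, size_bytes, size_pc, cumulative_pc),
    sorted by size_bytes in descending order.
    """
    sized = [(name, rows * row_size) for name, rows, row_size in tables]
    sized.sort(key=lambda item: item[1], reverse=True)
    total = sum(size for _, size in sized)
    report = []
    running = 0
    for name, size in sized:
        running += size
        size_pc = 100.0 * size / total if total else 0.0
        cum_pc = 100.0 * running / total if total else 0.0
        report.append((name, size, size_pc, cum_pc))
    return report


# Row sizes are inferred from the sample report, see the note above.
tables = [
    ("Network", 403639, 1600),
    ("Top_Memory_Processes", 400360, 1864),
]
report = size_report(tables)
for name, size, size_pc, cum_pc in report:
    print(f"{name},{size},{size_pc:.2f}%,{cum_pc:.2f}%")
```

The byte sizes match the sample report. The percentages will not, because here the total covers only these two tables, while the real report divides by the total over every attribute group in the log.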

sp_det.csv – Detail Report

Summarization and Pruning log Detail
Table,Nodes,Aggregate,Pruned,SQLs,Time,
KVA_NETWORK_ADAPTERS_TOTALS,8,63805,6912,1358,39109,1,1,1,1,1,1,1,1,1,1,7,4008,255730440,0,
KVA_NETWORK_ADAPTERS_TOTALS_Y,0,0,0,0,1294,1,1,1,1,1,1,1,1,1,1,7,4008,0,0,
KVA_NETWORK_ADAPTERS_TOTALS_Q,0,0,0,0,131,1,1,1,1,1,1,1,1,1,1,7,4008,0,0,

Mostly used to diagnose the summary report, so it is described only minimally here:

Table
Nodes
Aggregate
Pruned
SQLs
Time
Various unnamed columns, which include the type of summarization and pruning and the row size.

sp_err.csv – Error Report – Track Down Issues

Summarization and Pruning Error Detail
AttributeGroup,Node,Line,SQL_exception,Batch_First_Exception,Exception,
KVA_PROCESSES_DETAIL,shtppvm01-vios1:VA,4633,SQL State = null , SQL Error Code = -4229,com.ibm.db2.jcc.am.SqlTransactionRollbackException: Error for batch element #1: The current transaction was rolled back because of error “-289”.. SQLCODE=-1476, SQLSTATE=40506, DRIVER=3.63.123,Failed to create aggregates for node: (shtppvm01-vios1:VA),

AttributeGroup: The attribute group name
Node: Agent Name
Line: line number in diagnostic log
SQL_exception: SQL state and SQL error code
Batch_First_Exception: lots of details
Exception: Summary of action
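One thing to watch when post-processing sp_err.csv: as the sample row shows, the exception text contains commas of its own, so a plain CSV split will produce too many columns. A hedged Python sketch that splits off only the first three fields and counts errors per attribute group follows. The function name count_errors_by_group is made up, and the data line is abbreviated from the sample above.

```python
from collections import Counter


def count_errors_by_group(lines):
    """Count sp_err.csv data lines per attribute group.

    Only the first three commas are treated as separators, because the
    exception text in the rest of the line contains unquoted commas.
    Lines whose third field is not a number (title, header, blank lines)
    are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split(",", 3)
        if len(parts) < 4 or not parts[2].strip().isdigit():
            continue
        counts[parts[0]] += 1
    return counts


# Title, header, and one data line abbreviated from the sample report.
sample = [
    "Summarization and Pruning Error Detail\n",
    "AttributeGroup,Node,Line,SQL_exception,Batch_First_Exception,Exception,\n",
    "KVA_PROCESSES_DETAIL,shtppvm01-vios1:VA,4633,SQL State = null , "
    "SQL Error Code = -4229,Failed to create aggregates for node: "
    "(shtppvm01-vios1:VA),\n",
]
counts = count_errors_by_group(sample)
for group, n in counts.most_common():
    print(group, n)
```

This assumes the attribute group, node name, and line number never contain commas themselves, which holds for the sample row but is not something I have verified for every agent type.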

Sometimes the errors are obvious. If not, involve IBM support to resolve the issue. Sometimes it is a database issue and sometimes it is agent application support.

Summary

Report on Summarization and Pruning processing.

Versions:

This project is also maintained at github.com/jalvo2014/spaudit and will sometimes be more up to date [and less tested] compared to the point releases. You can also use this github distribution to review history and propose changes via pull requests.

Here are recently published versions. In case there is a problem at one level you can always back up.

spaudit.0.51000

Correct spelling in titles


Note: Bandit, a Maine Coon cat dreaming of a musical career

 
