Sitworld: Table of Contents

DaffidolsRescued

John Alvord, IBM Corporation

jalvord@us.ibm.com


Inspiration

As the number of blog posts grows, it becomes harder to locate the ones of interest. The first section lists the six posts I consider most important.

The second section lists all the posts, each with a very short comment.

Top 6 By Importance [My Prejudiced View]

ITM Database Health Checker

It is common to see a TEMS database [often called EIB] with problems that cause confusion or sometimes a loss of monitoring. This project identifies and documents 50+ advisories; acting on them will make things better.

Best Practice TEMS Database Backup and Recovery

The most costly support cases are those where a customer does not have a proper backup. One memorable case came after a Storage Area Network device lost power and the most recent backup was over a year old. I talk to people every day who use TSM to make copies of the TEMS database files, and that is almost always insufficient. This post was written jointly by a top L3 engineer and myself. If everyone followed it, the time to recover would drop substantially.

MS_Offline – Myth and Reality

MS_Offline type situations are extremely weighty and cause problems “at a distance”. For example, a recent case with 9545 agents and 22 MS_Offline situations with a 5 minute sampling interval spawned multiple IBM Support interactions, and they all came back to this one issue. When Persist>1 is set, the problems are much worse. The blog photo shows a California Condor [VERY LARGE VULTURE] lurking outside a window. Treat MS_Offline type situations as dangerous creatures and you will reduce your risk of injury and pain.

TEMS Audit Process and Tool

This has been available since 2012. It is a perfect way to examine the dynamic impact of workload [Situations, SOAP, real time data requests, etc.] on a TEMS. With that knowledge you can make changes to avoid problem conditions. I have one customer who runs this on every TEMS each weekend and, if “advisory messages” are present [noted via a non-zero exit code], sends the report to an analyst for review. The rate of emergency IBM Support meetings has dropped to near zero… at least for this area.
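The weekend routine described above can be sketched in a few lines of Python. This is only an illustration, not that customer's actual script: the function name `audit_and_route`, the `notify` callback, and the report path are all hypothetical, and the audit command line is passed in by the caller rather than assumed here. The one thing taken from the post is that a non-zero exit code signals advisory messages.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Sequence


def audit_and_route(command: Sequence[str], report: Path,
                    notify: Callable[[Path], None]) -> bool:
    """Run one audit command; forward the report if the exit code is non-zero.

    `command` is whatever command line runs the audit at your site
    (hypothetical here). `notify` is a caller-supplied function that
    delivers the report, for example by email.
    """
    result = subprocess.run(command, check=False)
    needs_review = result.returncode != 0  # non-zero: advisories present
    if needs_review:
        notify(report)
    return needs_review


if __name__ == "__main__":
    # Stand-in command that exits non-zero, so the report gets "sent".
    fake_audit = [sys.executable, "-c", "import sys; sys.exit(1)"]
    audit_and_route(fake_audit, Path("tems_audit_report.txt"), print)
```

Passing the command and the notifier in as arguments keeps the sketch independent of any particular scheduler, mail system, or install path.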

ITM Agent Health Survey

This tool provides a view of agents which are online but possibly non-responsive. In cases like this, real time data response is slow and partially missing, situations are not running, and historical data is not being recorded. These are things everyone should worry about. This identifies the guard dog that doesn’t bark.

ITM Situation Audit

This tool performs a static analysis on all distributed situations and produces a report of warning messages. It also reports which situations need TEMS filtering [instead of Agent filtering], which is a prime performance killer. Together with TEMS Audit you can really increase efficiency, reducing the cost of monitoring. It also gives early warning of situations with problems. Surprisingly, 50 of 51,000 situations studied actually had syntax errors, like VALUE instead of *VALUE. I expect this to be an important tool over time.
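To give a feel for the kind of check a static analysis can make, here is a toy Python sketch that flags the keyword VALUE when it is not preceded by an asterisk. It is not how the Situation Audit tool works. The function name and the sample formula text are made up for illustration, and a real checker would have to parse the formula properly, for instance to skip quoted strings and attribute names.

```python
import re

# Match the bare word VALUE only when no "*" sits directly in front of it.
BARE_VALUE = re.compile(r"(?<!\*)\bVALUE\b")


def find_bare_value(formula: str) -> list[int]:
    """Return the character offsets of each VALUE lacking its leading '*'."""
    return [m.start() for m in BARE_VALUE.finditer(formula)]


if __name__ == "__main__":
    # Illustrative formula text only; the table and attribute names are made up.
    good = "*IF *VALUE Some_Table.Some_Attribute *EQ 'x'"
    bad = "*IF VALUE Some_Table.Some_Attribute *EQ 'x'"
    print(find_bare_value(good))
    print(find_bare_value(bad))
```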

Sitworld All Posts – Most recent first

Sitworld: Eliminating Duplicate Agents 5/29/2020 Eliminating Duplicate Agents
Sitworld: Summarization and Pruning Audit 3/23/2020 Summarization and Pruning Audit
Sitworld: ITM Permanent Configuration Best Practices 1/17/2020 ITM Permanent Configuration Best Practices
Sitworld: Scrubbing Out Windows Agent Malconfiguration Remotely 2/6/2019 Scrubbing Out Windows Agent Malconfiguration Remotely
Sitworld: Agent Diagnostic Log Communications Summary 8/20/2018 Agent Diagnostic Log Communications Summary
Sitworld: Adventures in Communications #1 7/2/2018 Adventures in Communications #1
Event History #15 High Results Situation to No Purpose 5/25/2018 High Results Situation to No Purpose
Event History #14 Lodging Problems 5/21/2018 Lodging Problems
Event History #13 Delay Delay Delay 5/10/2018 Delay Delay Delay
Event History #12 High Impact Situations And Much More 5/1/2018 High Impact Situations And Much More
Event History #11 Detailed Attribute differences on first two merged results 4/27/2018 Detailed Attribute differences on first two merged results
Event History #10 lost events because DisplayItem missing or null Atoms 4/24/2018 lost events because DisplayItem missing or null Atoms
Event History #9 Two Open Or Close Events In A Row 4/22/2018 Two Open Or Close Events In A Row
Event History #8 Situation Events Opening And Closing Frequently 4/21/2018 Situation Events Opening And Closing Frequently
Event History #7 Events Created But Not Forwarded 4/19/2018 Events Created But Not Forwarded
Event History #6 Lost events with Multiple Results with same DisplayItem at same TEMS second 4/17/2018 Lost events with Multiple Results with same DisplayItem at same TEMS second
Event History #5 Multiple Results Same DisplayItem Same Second 4/16/2018 Multiple Results Same DisplayItem Same Second
Event History #4 Conflict Between DisplayItem and Attributes 4/13/2018 Conflict Between DisplayItem and Attributes
Event History #3 Lost Events Because DisplayItem has Duplicate Atoms 4/13/2018 DisplayItem has Duplicate Atoms
Event History #2 Duplicate DisplayItems At Same Second 4/10/2018 Duplicate DisplayItems At Same Second
Event History #1 The Situation That Fired Oddly 4/4/2018 The Situation that cried Wolf
Event History Audit 4/3/2018 Examine Event History in detail
Policing the Hatfields and the Mccoys 6/5/2016 Advanced Base/Until Sits
TEMS Audit Tracing Guide Tracing Guide Appendix 7/7/2017 TEMS Audit Tracing
ITM 6 Interface Guide Using KDEB_INTERFACELIST 6/30/2017 Document usage of KDEB_INTERFACELIST
ITM Agent Historical Data Export Survey 5/4/2017 Detect historical export issues at agents
FTO Configuration Audit 3/9/2017 Detect FTO configuration issues
Portal Client [TEP] on Windows Using a Private Java Install 12/28/2016 Avoid issues with system Java updates
TEMS Database Repair 11/18/2016 Recover from some broken TEMS database files
The Encyclopedia of ITM Tracing and Trace Related Controls 9/19/2016 Document tracing controls
ITM2SQL Database Utility 6/19/2016 Create TEMS database table report files
Real Time Detection of Duplicate Agent Names 3/23/2016 Duplicate Agent Live Detection
Portal Client Java Web Start JNLP File Cloner 3/18/2016 Create JNLP clone files for different types of TEP users
TEPSI Interface Guide 3/18/2016 Learn about TEPS Interfaces
Diagnostic Snapshot Utility 1/4/2016 Capture diagnostics on the fly
tacmd logs summary 12/31/2015 Summarize tacmd diagnostic logs
Restore Usability to ITCAM YN Custom Situations 12/24/2015 Fix some user custom situation affinities
TEPS Audit 9/15/2015 Report on Potential Duplicate Agent names
Re-re-re-mem-ember Situation Status Cache Growth Analysis 8/1/2015 Identify pure situation w/changing DisplayItems
Attribute and Catalog Health Survey 4/19/2015 Check for missing or mis-used cat/atr files
ITM Database Health Checker 3/24/2015 Check TEMS database for issues
Suppressing Situation Events By Time Schedule 3/13/2015 Simple example of Until with timer schedule
Alerting on Daylight Savings Time Truants 2/27/2015 Situation alert when time differences
Report on Daylight Savings Time Truants 2/20/2015 Report on Daylight Savings Time problems
Situation Formula with Calculations 1/28/2015 How to effectively calculate a formula
ITM Agent Census Scorecard 11/24/2014 Report avoidable TEMA defects
ITM Protocol Usage and Protocol Modifiers 10/21/2014 How to increase SOAP ports and much more
Agent Workload Audit 10/08/2014 What is actually happening at Agents
Situation Distribution Report 7/11/2014 What Situations are running where
CPAN Library for Perl Projects 7/11/2014 Using Perl without changing system
ITM Virtual Table Termite Control Project 6/17/2014 Recover from Performance Issue
ITM TEMS Health Survey 6/9/2014 Verify TEMS central services are working
The Situation That Cried Wolf 6/1/2014 Craft a situation for good practical results
Statistics After 50,000 Views 5/19/2014 Summary to date
*MIN and *MAX – the Little Column Functions That Couldn’t 5/15/2014 Two broken Column functions
A Situation By Any Other Name… 4/28/2014 Discovering situation names
Do It Yourself TEMS Table Display 4/28/2014 Do It Yourself – Run SQL
Running TEMS without SITMON 4/7/2014 Recovery when TEMS very broken
ITM Situation Audit 3/20/2014 Compiler or Lint for Situation Formulas
SOAP Flash Flood 2/1/2014 tacmd bulkexportsit -d stresses TEMS
Sample EIF Listener project 1/17/2014 Do It Yourself Event listener
Situation Limits 12/31/2013 Situations have many limits
Put Your Situations on a Diet Using Indexed Attribute 12/19/2013 Performance boost for some Situations
Sampled Situations and Until Situations 11/25/2013 Until Processing expose
TEMS Audit Process and Tool 11/16/2013 Measure Agent stress on TEMS
Detector/Recycler for ITM Windows OS Agent 11/2/2013 Windows OS Agent recycler high CPU
1997 Kasparov vs. Deep Blue Chess Match 9/17/2013 Virtual Table hub Update hidden issue
ITM Agent Health Survey 9/6/2013 Discover unhealthy agents
Sampled Situation Blinking Like a Neon Light 9/4/2013 When situation events auto-close
Sampling Interval and Time Tests 8/24/2013 Sampled situations and time to event
TEMS Audit Advisory Messages 8/13/2013 Included in TEMS Audit Process and Tool
Situations Caused Domain Name Server Overload 7/24/2013 Situation generated emails hurt DNS
Configuring a Stable SOAP Port 7/16/2013 Best Practice when SOAP is vital
Best Practice TEMS Database Backup and Recovery 7/12/2013 If you don’t have a backup plan read this
Action Command Wars – A New Beginning 7/9/2013 Running lots of action commands
Detecting and Recovering from High Agent CPU Usage 7/1/2013 Linux/Unix OS Agent High CPU recover
An Efficient Design for Starting a Background Process 6/20/2013 Elegant hack
Adding Environmental Data to Action Command Emails 6/12/2013 When attributes are not enough
Situation Managing Other Situations 6/5/2013 Situation creates MSL
Mixed Up Situations 5/28/2013 Multiple Attribute Situation issues
Efficient Situation for Two Missing Processes 5/22/2013 Elegant efficiency solution
Getting a Good Nights Sleep 5/15/2013 Creating events to keep operators happy
Rational Choices for Situation Sampling Intervals 5/8/2013 Best Practice Interval choices
The Derivative Log Pattern 5/1/2013 Two stage situation logic
Super Duper Situations 4/28/2013 Understanding _Z_ situations
MS_Offline – Myth and Reality 4/17/2013 Everything about MS_Offlines
Auditing TEMS for Improved Performance 4/4/2013 Included in TEMS Audit Process and Tool
ITM Silver Blaze – Agent Responsiveness Checker 3/28/2013 replaced by ITM Agent Health Survey
ITM TEMS Stress Tester Experiment 3/20/2013 ITM Analytics experiment

Summary

Wonderful World of Situations Table of Contents.

Photo Note: Daffodils rescued from Big Sur house fire garden – February 2014
