John Alvord, IBM Corporation
A customer reported that tacmd login had failed. The tacmd functions were an important component of their ITM automated operations. As documented in the ITM Troubleshooting Manual, I had them stop all ITM processes, start the hub TEMS, and then start the other ITM processes. The customer reported that all was now well, but they were very concerned about the interruption to normal operations.
For the new solution skip directly to “A New Solution” below.
In the 1990s the Simple Object Access Protocol was developed, SOAP for short. In the early 2000s the name became simply SOAP and was no longer treated as an acronym. SOAP can be used to access data over a SOAP service. The tacmd login and many other tacmd functions use the SOAP service running on the TEMS. Many customers have created automation solutions which use tacmd functions or SOAP in Perl or Java or any other language.
Every ITM process in the default configuration has an internal web server. The default listening ports are 1920 [HTTP] and 3661 [HTTPS]. The TEMS SOAP process is closely linked to the internal web server. If TEMS is the only ITM process running on a system, the results are straightforward. SOAP is available if the TEMS is running. SOAP is not available if the TEMS is not running. In the following only port 1920 is referenced, but the same process applies to port 3661 or to any other non-default port if configured that way.
When multiple ITM processes are running the situation is more complicated. Each individual web server attempts to bind to port 1920. One wins and owns the port. The other web servers fail to bind, so they connect to the winner and register their information. If the 1920 owner stops, the socket connections to the winner fail and then all the remaining processes attempt to get the port. Again there will be one winner and every other web server connects to the winner. When the original 1920 owner starts up again, it tries for 1920, fails, and then connects to the winner.
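The bind race described above can be illustrated with a minimal Python sketch. This is not ITM code: the function names are hypothetical, the registration step is reduced to a plain TCP connection, and the only point is that exactly one process can hold a listening port while the others fall back to connecting to the winner.

```python
import socket


def try_claim_port(port, host="127.0.0.1"):
    """Try to become the owner of `port`; return the listening socket or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))  # only one process can hold the port
        sock.listen()
        return sock
    except OSError:
        sock.close()  # another process already owns the port
        return None


def start_web_server(port, host="127.0.0.1"):
    """Return ("owner", sock) if the port was won, else ("registered", sock).

    In the second case the socket is a client connection to the current
    owner; a real process would register its services over that connection.
    """
    owner = try_claim_port(port, host)
    if owner is not None:
        return "owner", owner
    client = socket.create_connection((host, port), timeout=5)
    return "registered", client
```

If the owner exits, the client connection held by each registered process breaks, and each of them would call start_web_server again; whichever binds first becomes the new owner.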
You can see this environment by browsing to the service index page http://server:1920. The list of services you see are the ones from the 1920 owner and include the registered data from the other web servers. If you rest the cursor on a link, “IBM Tivoli Monitoring Web Services” for example, you can see the URL in the status area. If that link has a port of 1920, that service is running on the 1920 owner. If not, it is running on one of the web servers that have registered.
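The same check can be made in a script instead of by hovering over links. The following is a small sketch using only the Python standard library; the function name and the example URLs are made up for illustration and are not real ITM service paths.

```python
from urllib.parse import urlsplit


def runs_on_port_owner(link, owner_port=1920):
    """Return True if the link's explicit port matches the port owner's port."""
    # urlsplit(...).port is None when the URL carries no explicit port.
    return urlsplit(link).port == owner_port


# Hypothetical links as they might appear on a service index page.
links = [
    "http://server:1920/example-service",
    "http://server:45123/example-service",
]
for link in links:
    where = "port owner" if runs_on_port_owner(link) else "registered web server"
    print(link, "->", where)
```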
If there are no firewall rules involved, tacmd login works smoothly. The tacmd program reads an XML file equivalent to the service index page. Then it calculates what the actual SOAP port is and uses it.
If there are firewall rules in place to limit which ports can be used, trouble arises. Let's say that only 1920 and 1918 are allowed ports. If the TEMS starts first, then SOAP is on port 1920 and all is well. If the TEMS is recycled, port 1920 migrates to another ITM process and SOAP is on an ephemeral port. The tacmd program figures out that new port and tries to use it, but the firewall rules prevent that access.
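When diagnosing this condition it can help to probe from the client side whether a given port is reachable at all. The sketch below is a generic TCP connect test in Python, not anything tacmd does internally; the function name is hypothetical, and a blocked port may show up either as a refusal or as a timeout depending on the firewall.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling port_reachable("server", 1920) from the machine that runs tacmd, and then again with the ephemeral port shown on the service index page, would show whether the firewall is what stands between tacmd and the SOAP service.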
This condition has existed since the beginning. See this topic in the Troubleshooting Guide.
A New Solution
Almost any solution is better than shutting down all ITM processes, including the hub TEMS, and then starting them up again. Recently a good customer figured out a better solution: configure the ITM processes so that only the hub TEMS uses port 1920.
First make sure the TEPS is not running on the same system as the hub TEMS. TEPS also needs the internal web server in default mode.
The hub TEMS configuration is not changed.
Look at an agent diagnostic log. This will have a name looking like this – <hostname>_<pc>_<taskname>_<hextime>-01.log, for example
Scan and look for a line like this:
(50E350B1.0013-1:kbbssge.c,52,"BSS1_GetEnv") KDE_TRANSPORT=KDC_FAMILIES="ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:1920"
For this purpose, we want to stop the agent from using port 1920 and so the communications string must look like this:
ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:0
This means that the internal web server will not listen on HTTP.
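With many agents to check, the scan-and-edit step can be scripted. The sketch below is an illustration in Python, not an IBM tool: both function names are hypothetical, it assumes the log line has the shape shown above, and it assumes that appending HTTP:0 is the right thing to do when the string carries no HTTP setting at all.

```python
import re


def extract_families(log_line):
    """Return the KDC_FAMILIES value from a BSS1_GetEnv log line, or None."""
    marker = "KDC_FAMILIES="
    start = log_line.find(marker)
    if start < 0:
        return None
    value = log_line[start + len(marker):]
    # Drop surrounding whitespace and straight or curly double quotes.
    return value.strip().strip('"\u201c\u201d').strip()


def disable_http(families):
    """Return the communications string with the HTTP port set to 0."""
    pattern = r"\bHTTP:\d+"  # does not match HTTPS:... or http_server:...
    if re.search(pattern, families):
        return re.sub(pattern, "HTTP:0", families)
    # Assumption: no HTTP setting present, so add one at the end.
    return families + " HTTP:0"


line = ('(50E350B1.0013-1:kbbssge.c,52,"BSS1_GetEnv") KDE_TRANSPORT=KDC_FAMILIES='
        '"ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:1920"')
print(disable_http(extract_families(line)))
```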
Linux/Unix Solution on ITM 623 and following
For every ITM process except the TEMS, create an environment file for a permanent override. If one already exists you can just use it. For example, if you ran a Unix OS Agent you would create ux.environment. The file must be in the <installdir>/config directory and must have the same attributes/owner/group as the ux.ini file. Add to that file this line
KDC_FAMILIES=ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:0
This is the ITM development design for permanent customer configuration changes. You will need one such file for each agent on the system running the hub TEMS.
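If several agents share the system, creating these files by hand gets repetitive. The following Python sketch shows one way to do it; it is an illustration under stated assumptions rather than a supported procedure. The function name is hypothetical, it assumes the matching <pc>.ini file already exists in the config directory, and it replaces any earlier KDC_FAMILIES line in an existing environment file instead of adding a second one.

```python
import os
import stat
from pathlib import Path


def write_environment_override(config_dir, product_code, families):
    """Create or update <pc>.environment next to <pc>.ini; return its path."""
    config = Path(config_dir)
    ini_file = config / f"{product_code}.ini"
    env_file = config / f"{product_code}.environment"

    existing = env_file.read_text().splitlines() if env_file.exists() else []
    # Keep unrelated settings; drop any previous KDC_FAMILIES line.
    kept = [ln for ln in existing if not ln.startswith("KDC_FAMILIES=")]
    kept.append(f"KDC_FAMILIES={families}")
    env_file.write_text("\n".join(kept) + "\n")

    # Give the new file the same permissions, owner, and group as the ini file.
    ini_stat = ini_file.stat()
    os.chmod(env_file, stat.S_IMODE(ini_stat.st_mode))
    os.chown(env_file, ini_stat.st_uid, ini_stat.st_gid)
    return env_file
```

For the Unix OS Agent the call would pass the config directory under the installation directory, the product code "ux", and the HTTP:0 communications string shown above. Changing the owner of a file to another user normally requires root.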
Linux/Unix Solution on ITM 622 and earlier
For every ITM process except the TEMS, create an override file. For example, if you ran a Unix OS Agent you would create ux.override. The file must be in the <installdir>/config directory and must have the same attributes/owner/group as the ux.ini file. The contents of the file would be
KDC_FAMILIES='ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:0'
The only difference from the ITM 623 solution is the single quotes. Also update the ini file, adding a source include line like this.
That is a period followed by a space followed by the fully qualified name of the override file. The path name would be different if your installation directory was different. You could use the same file for multiple agents.
Solution on Windows
In Windows you use the Manage Tivoli Monitoring Services Application.
- Right click on the agent line
- Select Advanced
- Select Edit Variables…
- Click Add…
- In Variable enter KDC_FAMILIES
- In Value enter ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n HTTP:0
- OK out
Usage Notes and Variations
After the changes have been made and all ITM processes are recycled, including the TEMS, only the TEMS will be listening on port 1920. That will be the permanent SOAP port. If the hub TEMS is not running then the tacmd login will fail, but that is expected and reasonable.
For an FTO environment with two hub TEMS, make this change on both hub TEMS systems.
Using this design you can still access the Service Index page by using https://server:3661.
Another alternative for the non-TEMS processes would be to set the following communications string:
http_server:n ip.pipe port:1918 ip use:n ip.spipe use:n sna use:n
which stops the internal web server from even starting.
I saw a recent case where the root userid which ran the agent had a .profile which included
In this case KDE_TRANSPORT took precedence, so all the KDC_FAMILIES changes were ignored. This was eventually spotted by reviewing the <installdir>/logs/ux.env file, which included both KDE_TRANSPORT and KDC_FAMILIES.
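A check for this situation is easy to automate. The sketch below is a Python illustration with hypothetical function names; it assumes the .env file holds one NAME=value setting per line, which may not match every ITM level.

```python
from pathlib import Path

TRANSPORT_VARS = ("KDE_TRANSPORT", "KDC_FAMILIES")


def find_transport_settings(env_path):
    """Return {name: value} for the transport variables found in the file."""
    found = {}
    for raw in Path(env_path).read_text().splitlines():
        name, sep, value = raw.strip().partition("=")
        if sep and name in TRANSPORT_VARS:
            found[name] = value
    return found


def kdc_families_shadowed(env_path):
    """Return True if both variables are set, so KDC_FAMILIES may be ignored."""
    found = find_transport_settings(env_path)
    return all(name in found for name in TRANSPORT_VARS)
```

Running kdc_families_shadowed against the ux.env file in the logs directory after a restart would flag the case described above before it turns into a long debugging session.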
This post shows how to configure a hub TEMS with a stable permanent port to access SOAP services. This change will increase the availability of SOAP services.
Notes: Spider Web after Foggy Night – 14 June 2013