John Alvord, IBM Corporation
Draft #1 – 14 December 2020 – Level 1.00000
The ITM protocol modifier EPHEMERAL:Y is wonderful when it is used correctly. Used badly, it can cause serious problems: communications malfunction, and ITM processes such as the TEMS can fail and require a recycle to continue.
If you need a refresher on Protocol Modifiers, please see this document
And how to make such changes
The marvelous feature of an EPHEMERAL:Y configuration is that the ITM agent does not require any ITM listening ports. This greatly increases security and also insulates the ITM process from port scanning tests, which sometimes damage ITM processing badly enough that process recycles are needed.
Most ITM agents require TCP listening ports. In a default configuration, an agent makes a TCP connection to the TEMS it is configured to use. It also makes a TCP connection when sending historical data to a Warehouse Proxy Agent [WPA]. Finally, it opens a listening port so the TEMS can send it alerts [like the location of the WPA] and requests for real time data. These are the port requirements for ITM agent processes. Of course, an agent may have non-ITM communication ports, which are not covered here.
EPHEMERAL:Y is added to the KDC_FAMILIES environment variable [KDE_TRANSPORT on z/OS]. Usually it is added at the beginning as a global modifier. In this mode, a single agent-to-TEMS TCP socket connection is made. There is no separate connection to a WPA and there is no listening port.
This logic works like a software router. The three kinds of traffic are multiplexed over the one connection. In certain diagnostic messages, you can make out the virtual ports used to manage the data transmission.
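As a minimal sketch of the global-modifier placement described above, the shell function below prepends EPHEMERAL:Y to an existing KDC_FAMILIES value unless the modifier is already present. The function name add_ephemeral and the sample protocol string are assumptions made for illustration and are not part of ITM. How the resulting value is written back to the agent configuration depends on the platform and is not shown.

```shell
#!/bin/sh
# Hypothetical helper (not an ITM tool): print a KDC_FAMILIES value with
# EPHEMERAL:Y as a leading global modifier. It does not modify any files.
add_ephemeral() {
    families=$1
    # Case-insensitive check for an existing EPHEMERAL:Y modifier.
    if printf '%s\n' "$families" | grep -qi 'EPHEMERAL:Y'; then
        # Already present: return the value unchanged.
        printf '%s\n' "$families"
    else
        # Not present: add it at the beginning as a global modifier.
        printf 'EPHEMERAL:Y %s\n' "$families"
    fi
}

# Sample value, assumed for illustration only.
add_ephemeral 'ip.pipe port:1918'
```

Remember the ALL or NONE rule before applying a change like this: every ITM process on the system needs the same setting.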
One issue: the Warehouse Proxy Agent must be installed on the same system as the TEMS the agent connects to. The agent is told to connect to a WPA at a certain ip_address and a certain port. However, the ip_address is ignored and the communication goes to the TEMS system. That isn't bad, since the best practice for large environments is to install a WPA with every hub and remote TEMS, which provides a nice workload balance. The alternative is to collect historical data at the TEMS [not great, since it is slower and is a single point of failure if the TEMS is down or there is a communications outage]. You can always choose not to collect historical data, but many customers view that data as critical.
A second issue: the rule for EPHEMERAL:Y is ALL or NONE. If some ITM processes on a system use it and others do not, you will experience communication failures.
A third issue: the TEPS and the TEMS cannot run with EPHEMERAL:Y, because such servers need to contact the agents to request service. Because of the ALL or NONE rule, no other ITM process on the same system as the TEMS/TEPS can use EPHEMERAL:Y. In this case the only alternative is to not run the other processes on the same system. This is true even if the other ITM process uses a different installation directory.
The communication outages are not frequent, as in every few minutes. Instead, they usually show up as an issue every couple of days or once a week. A recycle often clears things up temporarily, but the problem always returns. Follow the ALL or NONE rule and the issue is avoided.
EPHEMERAL:Y Bad Experience
This mixed-usage misconfiguration often occurs when different teams install agents separately. The original agents all run without EPHEMERAL:Y. The new agents use that setting, and processing gradually destabilizes. In one case, 5,500 Windows systems needed correction one by one.
On Linux/Unix, one easy way to check is to log in to the system and run
grep -i KDC_FAMILIES *.env
If you see some with and some without EPHEMERAL:Y, you know the trouble exists. On occasion this gives a false alert because the xxx.env file belongs to an agent that is no longer running.
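The manual grep check can also be sketched as a small script that sorts the .env files into those with and without the modifier and reports the result. This is only a sketch: the function name check_ephemeral and the result words all, none, mixed and no-config are illustrative choices, not ITM terminology. Like the grep, it cannot tell whether an .env file belongs to an agent that is no longer running.

```shell
#!/bin/sh
# Hypothetical helper (not an ITM tool): classify EPHEMERAL:Y usage across
# the *.env files in one directory. Prints all, none, mixed, or no-config.
check_ephemeral() {
    dir=$1
    with=0
    without=0
    for f in "$dir"/*.env; do
        # Skip the unexpanded pattern when no .env files exist.
        [ -f "$f" ] || continue
        # Ignore files that never mention KDC_FAMILIES.
        line=$(grep -i 'KDC_FAMILIES' "$f") || continue
        if printf '%s\n' "$line" | grep -qi 'EPHEMERAL:Y'; then
            with=$((with + 1))
        else
            without=$((without + 1))
        fi
    done
    if [ "$with" -gt 0 ] && [ "$without" -gt 0 ]; then
        echo mixed       # violates the ALL or NONE rule
    elif [ "$with" -gt 0 ]; then
        echo all
    elif [ "$without" -gt 0 ]; then
        echo none
    else
        echo no-config   # no file mentions KDC_FAMILIES
    fi
}

# Usage: run from the directory that holds the agent .env files.
check_ephemeral .
```

A result of mixed is the case to investigate. Confirm which agents are actually running before changing anything.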
On Windows you usually have to review the agent diagnostic log files.
There is a limited detection capability using TEMS Audit.
Set the following trace at the TEMS the agents are connecting to:
./tacmd login -s ….. [to hub TEMS]
./tacmd settrace -m <remotetemsnodeid> -p KBB_RAS1 -o 'error (comp:kde,unit:kdebp0r,Entry="receive_vectors" all er)(comp:kde,unit:kdeprxi,Entry="KDEP_ReceiveXID" all er)'
To turn the trace off or reset it later:
./tacmd login -s ….. [to hub TEMS]
./tacmd settrace -m <remotetemsnodeid> -p KBB_RAS1 -r
With that trace set, TEMS Audit will detect some of the cases. Detection usually depends on whether the agent(s) are making new connections or not.
In practical terms, you will discover a few misconfigured agents, then need to determine why it happened [like another group installing agents], and then correct them at each agent system. And of course you need to educate the teams doing agent installs.
This document describes how to use the EPHEMERAL:Y protocol modifier correctly.
History and Earlier versions