Solution: Ambari Agent lost heartbeat after Java or JDK upgrade
Ambari Agents lose their heartbeat after a JDK upgrade, or fail to register with the Ambari Server when a new cluster is installed using JDK 8.
COMPONENT: Apache Ambari
VERSION: All Ambari Versions
Details for Agent log:
INFO 2018-07-17 15:28:44,399 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'machine1.cloud.com' using socket.getfqdn().
INFO 2018-07-17 15:28:44,463 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-07-17 15:28:44,465 main.py:437 - Connecting to Ambari server at https://machine1.cloud.com:8440 (18.104.22.168)
INFO 2018-07-17 15:28:44,465 NetUtil.py:70 - Connecting to https://machine1.cloud.com:8440/ca
ERROR 2018-07-17 15:28:44,467 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)
ERROR 2018-07-17 15:28:44,467 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions. Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2018-07-17 15:28:44,467 NetUtil.py:124 - Server at https://machine1.cloud.com:8440 is not reachable, sleeping for 10 seconds...
PROBLEM: Recent JDK and Python updates have introduced behavior changes that can affect the Ambari Server to Ambari Agent registration process. The Ambari Server and Ambari Agent use TLS to register with each other securely. Recent changes have been made to the JDK and Python to increase security by eliminating the use of insecure cipher suites and protocols. These changes ensure that more secure protocols and cipher suites are used by the Ambari Server when setting up its TLS sockets. As a result, these changes also require the Ambari Agent’s Python client be configured to use later versions of the TLS protocol to communicate with the Ambari Server.
IMPACT: After the JDK is upgraded to a recent Java 1.8 update, Ambari Agents may stop sending heartbeats and fail to register with the Ambari Server.
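Which of the two solutions below applies depends on whether the Python interpreter on the agent host can speak TLSv1.2 at all. The following is a minimal sketch for checking this, not part of Ambari itself; the helper name resolve_protocol is hypothetical. It only inspects the constants exposed by Python's standard ssl module and should run on both Python 2 and Python 3.

```python
import ssl


def resolve_protocol(name):
    """Return the ssl module constant called `name`, or None if this Python lacks it."""
    return getattr(ssl, name, None)


# PROTOCOL_TLSv1_2 is absent from the ssl module shipped with Python 2.6.
for protocol_name in ("PROTOCOL_TLSv1", "PROTOCOL_TLSv1_2"):
    status = "available" if resolve_protocol(protocol_name) is not None else "missing"
    print("%s: %s" % (protocol_name, status))

# OPENSSL_VERSION does not exist on very old Pythons, so fall back gracefully.
print("OpenSSL build: %s" % getattr(ssl, "OPENSSL_VERSION", "unknown"))
```

If PROTOCOL_TLSv1_2 is reported as missing, the agent-side setting cannot take effect on that host and the server-side workaround is the relevant one.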
Solution for CentOS 7, Debian 7, Ubuntu 14 & 16, or SLES 12 (Python 2.7)
To solve this problem, configure the Ambari Agent to use TLSv1.2 when communicating with the Ambari Server. Edit /etc/ambari-agent/conf/ambari-agent.ini on each agent host and add the following configuration property to the [security] section:
    [security]
    force_https_protocol=PROTOCOL_TLSv1_2
After this configuration change has been made, the Ambari Agent must be restarted (for example, with ambari-agent restart). After the restart, the ERROR entries should no longer appear in the Ambari Agent logs, and the Ambari Server UI should show all Ambari Agents heartbeating again.
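On clusters with many agents, the edit described above can be scripted. The following is a minimal sketch, assuming the property to set is force_https_protocol=PROTOCOL_TLSv1_2 in the [security] section. The function name force_tls12 and the sample file contents are hypothetical and only stand in for a real ambari-agent.ini.

```python
import configparser
import io

# Hypothetical, trimmed-down stand-in for /etc/ambari-agent/conf/ambari-agent.ini.
SAMPLE_INI = """\
[server]
hostname = machine1.cloud.com
url_port = 8440

[security]
keysdir = /var/lib/ambari-agent/keys
"""


def force_tls12(ini_text):
    """Return ini_text with force_https_protocol set in the [security] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("security"):
        parser.add_section("security")
    parser.set("security", "force_https_protocol", "PROTOCOL_TLSv1_2")
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


updated = force_tls12(SAMPLE_INI)
print(updated)
```

Note that configparser discards comments when it writes a file back, so on a real host it is safer to keep a backup of ambari-agent.ini, or to add the line by hand, than to overwrite the file with this output.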
Solution for CentOS 6 or SLES 11 (Python 2.6)
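The ssl module in Python 2.6 does not support TLSv1.2, so the agent-side setting is not an option here. In this scenario, the only way forward is to edit the java.security file of the JDK used by the Ambari Server (in a JDK 8 installation, typically under jre/lib/security/) and make the following changes:
Find the jdk.tls.disabledAlgorithms property and remove the 3DES_EDE_CBC entry from its value.
Save the file and restart the Ambari Server.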
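The java.security edit described above can be sketched in code as follows. This is an illustration only: the function name remove_algorithm is hypothetical, and the sample property value is made up for the example, so the list in a real java.security file will differ between JDK updates. The sketch handles a value that is continued across lines with a trailing backslash.

```python
# Illustrative value only; the real list depends on the JDK update in use.
SAMPLE = (
    "jdk.tls.disabledAlgorithms=SSLv3, RC4, MD5withRSA, DH keySize < 1024, \\\n"
    "    EC keySize < 224, DES40_CBC, RC4_40, 3DES_EDE_CBC\n"
)


def remove_algorithm(text, prop="jdk.tls.disabledAlgorithms", algo="3DES_EDE_CBC"):
    """Return text with `algo` removed from the comma-separated value of `prop`."""
    lines = text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(prop + "="):
            logical = line
            # Fold backslash-continued lines into one logical line.
            while logical.rstrip().endswith("\\") and i + 1 < len(lines):
                i += 1
                logical = logical.rstrip()[:-1] + lines[i].strip()
            values = [v.strip() for v in logical.split("=", 1)[1].split(",")]
            kept = [v for v in values if v and v != algo]
            line = prop + "=" + ", ".join(kept)
        out.append(line)
        i += 1
    return "\n".join(out) + "\n"


result = remove_algorithm(SAMPLE)
print(result)
```

Only the exact entry 3DES_EDE_CBC is dropped; similarly named entries such as DES40_CBC are left in place, and commented-out lines are not touched because they do not start with the property name.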