dataproc/Check Port Exhaustion
Verify whether port exhaustion has occurred.
Product: Cloud Dataproc
Step Type: COMPOSITE STEP
Description
None
Failure Reason
Found log messages related to “{log}” on the cluster: {cluster_name}.
Failure Remediation
This issue occurs when a Spark job is unable to find an available port after 1000 retries. Connections stuck in the CLOSE_WAIT state are a possible cause. To identify any CLOSE_WAIT connections, analyze the netstat output:
- netstat -plant >> open_connections.txt
- cat open_connections.txt | grep "CLOSE_WAIT"
If the blocked connections are due to a specific application, restarting that application is recommended. Alternatively, restarting the master node will also release the affected connections.
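To see which application holds the CLOSE_WAIT connections, the saved netstat output can be summarized per process. The following is a minimal sketch, not part of the runbook itself: the function name `count_close_wait` is hypothetical, and it assumes the usual `netstat -plant` column layout, where the sixth field is the TCP state and the seventh is `PID/Program name`.

```shell
# Summarize CLOSE_WAIT connections per owning process from saved netstat output.
# Assumption: the file uses the `netstat -plant` column layout, where field 6 is
# the TCP state and field 7 is "PID/Program name" ("-" if the owner is hidden).
count_close_wait() {
  # Count matching lines per process, then list the largest counts first.
  awk '$6 == "CLOSE_WAIT" { n[$7]++ } END { for (p in n) print n[p], p }' "$1" \
    | sort -rn
}

# Example usage on the affected node, with the file produced by the steps above:
#   count_close_wait open_connections.txt
```

Each output line is a connection count followed by the owning process. A process with a large count is a likely candidate for a restart. Running netstat without root privileges may show `-` instead of the process name.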
Success Reason
Did not find log messages related to “{log}” on the cluster: {cluster_name}.