dataproc/WARN/2024_001

HDFS NameNode Safemode is disabled.

Product: Cloud Dataproc
Rule class: WARN - Something that is possibly wrong

Description

When HDFS NameNode Safemode is enabled, the HDFS filesystem is in read-only mode and no changes are allowed.

The NameNode enters the Safemode due to different reasons:

  • The DataNode(s) did not send the block report to the NameNode.
  • Not enough space in the NameNode directory specified in dfs.namenode.resource.du.reserved.
  • Not enough memory in the NameNode to load the FsImage and EditLog.
  • Unavailable HDFS blocks due to Datanode(s) down.
  • Corrupted HDFS blocks.
  • A user manually enabled the Safemode.

Remediation

Check the current Safemode status, use the following command from the Master node: hdfs dfsadmin -safemode get.

If Safemode is OFF, HDFS is working as expected.

If Safemode is ON:

  • Inspect the NameNode logs to understand the cause.
  • Check the HDFS filesystem status by running the following command from the Master node: hdfs fsck /.
  • Check HDFS related metrics such as dataproc.googleapis.com/cluster/hdfs/unhealthy_blocks and dataproc.googleapis.com/cluster/hdfs/storage_utilization.

After the cause has been identified and fixed, i.e. the NameNode and all the HDFS DataNodes and blocks are healthy (hdfs dfsadmin -report), leave the Safemode using the following command: hdfs dfsadmin -safemode leave.