datafusion/ERR/2022_009

Cloud Dataproc Service Account has a Cloud Data Fusion Runner role

Product: Cloud Data Fusion
Rule class: ERR - Something that is very likely to be wrong

Description

Granted to the Dataproc service account so that Dataproc is authorized to communicate the pipeline runtime information such as status, logs, and metrics to the Cloud Data Fusion services running in the tenant project. The service account used by the Dataproc needs to be granted with the roles/datafusion.runner role. The role is needed in order for the job in Dataproc be able to talk back to Data Fusion

Remediation

Add an IAM policy binding to a Cloud Dataproc service account by specifying a role. The Service Account cannot be created without a role. For example, this can be done using the GCP Console or by running the following gcloud tool command :

gcloud projects add-iam-policy-binding PROJECT_ID --member='serviceAccount:SERVICE_ACCOUNT' --role='roles/datafusion.runner'

where SERVICE_ACCOUNT could be either Compute Engine default service account or the App Engine default service account or instead a user-specified service account.

Further information