dataflow/WARN/2023_001

Dataflow job does not have a hot key

Product: Dataflow
Rule class: WARN - Something that is possibly wrong

Description

A Dataflow job might have a hot key, which limits Dataflow's ability to process elements in parallel and increases execution time.

You can search in the Logs Explorer for such jobs with the logging query:

resource.type="dataflow_step"
log_id("dataflow.googleapis.com/worker") OR log_id("dataflow.googleapis.com/harness")
severity>=WARNING
textPayload=~"A hot key(\s''.*'')? was detected in step" OR "A hot key was detected"

Remediation

To resolve this issue, check that your data is evenly distributed. If a key has disproportionately many values, consider the following courses of action:
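As an illustration of checking key distribution and spreading a skewed key, the following is a minimal sketch in plain Python rather than Dataflow or Apache Beam code. The helper names `find_hot_keys` and `salted_sum`, the 50% threshold, and the shard count are all hypothetical choices made for this example. It flags keys that account for a disproportionate share of elements, then aggregates in two stages: partial sums per (key, shard), followed by a merge that drops the shard.

```python
import random
from collections import Counter, defaultdict


def find_hot_keys(pairs, threshold=0.5):
    """Return keys whose share of all elements exceeds `threshold`.

    `threshold` is an illustrative value, not a Dataflow setting.
    """
    counts = Counter(key for key, _ in pairs)
    total = sum(counts.values())
    return {key for key, count in counts.items() if total and count / total > threshold}


def salted_sum(pairs, hot_keys, num_shards=4, rng=random):
    """Sum values per key, spreading hot keys over `num_shards` sub-keys."""
    # Stage 1: partial sums per (key, shard). Only hot keys get a random shard,
    # so their values are split across several independent groups.
    partial = defaultdict(int)
    for key, value in pairs:
        shard = rng.randrange(num_shards) if key in hot_keys else 0
        partial[(key, shard)] += value

    # Stage 2: drop the shard and merge the partial sums back per original key.
    totals = defaultdict(int)
    for (key, _shard), subtotal in partial.items():
        totals[key] += subtotal
    return dict(totals)


# Hypothetical sample data: key "a" holds 8 of the 10 elements.
pairs = [("a", 1)] * 8 + [("b", 1), ("c", 1)]
hot = find_hot_keys(pairs)
totals = salted_sum(pairs, hot)
print(hot, totals)
```

In a real pipeline the first stage would run in parallel across workers, which is where the benefit comes from. This two-stage pattern only applies to aggregations that can be merged from partial results, such as sums and counts. Apache Beam offers a comparable built-in option for combiners (`Combine.PerKey.withHotKeyFanout` in the Java SDK); check the SDK documentation for your version before relying on it.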

Further information