vertex/Workbench Instance Stuck In Provisioning
Product: Vertex AI
Kind: Debugging Tree
Description
This runbook investigates why a Workbench Instance is stuck in the provisioning state.
Areas Examined:
- Workbench Instance State: Checks the instance’s current state to confirm that it is stuck in provisioning and is not stopped or active.
- Workbench Instance Compute Engine VM Boot Disk Image: Checks whether the instance was created with a custom container, the official ‘workbench-instances’ images, Deep Learning VM images, or an unsupported image that might cause the instance to be stuck in the provisioning state.
- Workbench Instance Custom Scripts: Checks whether the instance uses custom scripts that alter its default configuration, for example by changing the Jupyter port or breaking dependencies, which might cause the instance to be stuck in the provisioning state.
- Workbench Instance Environment Version: Checks whether the instance is on the latest environment version by checking whether it is upgradable. Older versions are sometimes the root cause of an instance being stuck in the provisioning state.
- Workbench Instance Compute Engine VM Performance: Checks the VM’s current performance to confirm that it is not impaired by high CPU usage, insufficient memory, or disk space issues that might disrupt normal operations.
- Workbench Instance Compute Engine Serial Port Logging: Checks whether the instance has serial port logs, which can be analyzed to confirm that Jupyter is running on 127.0.0.1:8080, which is mandatory.
- Workbench Instance Compute Engine SSH and Terminal access: Checks whether the instance’s Compute Engine VM is running so that the user can SSH in and open a terminal to check disk usage in ‘home/jupyter’. If no space is left, the instance may be stuck in the provisioning state.
- Workbench Instance External IP Disabled: Checks whether the external IP is disabled. An incorrect networking configuration may cause the instance to be stuck in the provisioning state.
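Some of the areas above can also be spot-checked by hand. The following is a minimal sketch rather than what the runbook itself executes. It assumes the gcloud CLI is installed and authenticated, the PROJECT, INSTANCE and ZONE values are placeholders, and the function names are hypothetical. It also assumes the Jupyter home directory is mounted at /home/jupyter.

```shell
#!/usr/bin/env bash
# Manual spot-checks that mirror a few of the areas the runbook examines.
# PROJECT, INSTANCE and ZONE are placeholder values (assumptions); replace them.
PROJECT="my-project"
INSTANCE="my-workbench-instance"
ZONE="us-central1-a"

# Instance state: prints the state field, e.g. PROVISIONING while stuck.
instance_state() {
  gcloud workbench instances describe "$INSTANCE" \
    --project="$PROJECT" --location="$ZONE" --format='value(state)'
}

# Serial port output, filtered for lines that mention port 8080
# (Jupyter is expected to listen on 127.0.0.1:8080).
serial_jupyter_lines() {
  gcloud compute instances get-serial-port-output "$INSTANCE" \
    --project="$PROJECT" --zone="$ZONE" | grep '8080'
}

# Disk usage of the Jupyter home directory, checked over SSH.
# The path /home/jupyter is an assumption about the default layout.
jupyter_disk_usage() {
  gcloud compute ssh "$INSTANCE" --project="$PROJECT" --zone="$ZONE" \
    --command='df -h /home/jupyter'
}
```

After sourcing the script, call `instance_state`, `serial_jupyter_lines`, or `jupyter_disk_usage` individually. The SSH check only works when the underlying VM is running and reachable.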
Executing this runbook
gcpdiag runbook vertex/workbench-instance-stuck-in-provisioning \
-p project_id=value \
-p instance_name=value \
-p zone=value \
-p start_time_utc=value \
-p end_time_utc=value
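As an illustration of filling in the placeholders, the sketch below assembles the command from arguments and prints it for review instead of running it. The function name `build_runbook_cmd` and all sample values are hypothetical. The timestamp format shown in the usage note is an assumption, so check `gcpdiag runbook --help` for the accepted format.

```shell
#!/usr/bin/env bash
# Build the runbook invocation from arguments and print it (dry run).
# Arguments: project_id instance_name zone [start_time_utc] [end_time_utc]
build_runbook_cmd() {
  local project="$1" instance="$2" zone="$3"
  local start="${4:-}" end="${5:-}"

  printf 'gcpdiag runbook vertex/workbench-instance-stuck-in-provisioning'
  printf ' -p project_id=%s -p instance_name=%s -p zone=%s' \
    "$project" "$instance" "$zone"

  # start_time_utc and end_time_utc are optional, so add them only when given.
  if [ -n "$start" ]; then
    printf ' -p start_time_utc=%s' "$start"
  fi
  if [ -n "$end" ]; then
    printf ' -p end_time_utc=%s' "$end"
  fi
  printf '\n'
}
```

For example, `build_runbook_cmd my-project my-workbench-instance us-central1-a 2024-01-01T00:00:00Z 2024-01-01T06:00:00Z` prints the full command line, which can be inspected before it is copied into a terminal.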
Parameters
Name | Required | Default | Type | Help |
---|---|---|---|---|
project_id | True | None | str | The Project ID of the resource under investigation |
instance_name | True | | str | Name of the Workbench Instance |
zone | True | us-central1-a | str | Zone of the Workbench Instance. e.g. us-central1-a |
start_time_utc | False | None | datetime | Start time of the issue |
end_time_utc | False | None | datetime | End time of the issue |
Get help on available commands
gcpdiag runbook --help