gke/BP/2021_001
GKE logging and monitoring enabled.
GKE clusters are regional.
GKE clusters are using unique subnets.
GKE cluster is not near end of life
GKE clusters should have HTTP load balancing enabled to use GKE ingress.
GKE network policy enforcement
Stateful workloads not run on preemptible node
GKE clusters are VPC-native.
Enable gateway resources through Gateway API.
Google Groups for RBAC enabled.
GKE maintenance windows are defined
GKE clusters are private clusters.
GKE nodes service account permissions for logging.
GKE nodes service account permissions for monitoring.
App-layer secrets encryption is activated and Cloud KMS key is enabled.
GKE nodes aren’t reporting connection issues to apiserver.
GKE nodes aren’t reporting connection issues to storage.google.com.
GKE Autoscaler isn’t reporting scaleup failures.
GKE service account permissions.
Google APIs service agent has Editor role.
Version skew between cluster and node pool
Check internal peering forwarding limits which affect GKE.
ip-masq-agent not reporting errors
Node pool service account exists and is not disabled.
GKE cluster firewall rules are configured.
GKE masters of private clusters can reach the nodes.
GKE connectivity: node to pod communication.
GKE connectivity: pod to pod communication.
GKE nodes of private clusters can access Google APIs and services.
GKE connectivity: load balancer to node communication (ingress).
Missing request for memory resources.
Container File System API quota not exceeded
GKE private clusters are VPC-native.
containerd config.toml is valid.
GKE ingresses are well configured.
Workloads not reporting misconfigured CNI plugins
GKE Gateway controller reporting misconfigured annotations in Gateway resource
GKE Gateway controller reporting missing or invalid resource references in Gateway resource
Missing request for CPU resources.
GKE Cluster does not have any pods in CrashLoopBackOff state.
NodeLocal DNSCache timeout errors.
GKE Metadata Server isn’t reporting errors for pod IP not found
Checking for no Pod Security Admission violations in the project.
GKE Webhook failures can seriously impact GKE Cluster.
GKE nodes don’t use the GCE default service account.
GKE Workload Identity is enabled
GKE master version available for new clusters.
GKE nodes version available for new clusters.
GKE cluster size close to maximum allowed by pod range
GKE system workloads are running stable.
GKE nodes have good disk performance.
GKE nodes aren’t reporting conntrack issues.
GKE nodes have enough free space on the boot disk.
Istio/ASM version not deprecated nor close to deprecation in GKE
GKE nodes use a containerd image.
GKE clusters with workload identity are regional.
GKE metadata concealment is not in use
GKE service account permissions to manage project firewall rules.
Cloud Logging API enabled when GKE logging is enabled
NVIDIA GPU device drivers are installed on GKE nodes with GPU
GKE NAP nodes use a containerd image.
GKE nodes need Storage API access scope to retrieve build artifacts
GKE connectivity: possible DNS timeout in some GKE versions.
Container File System has the required scopes for Image Streaming
GKE workload timeout to Compute Engine metadata server.
Cloud Monitoring API enabled when GKE monitoring is enabled
A Node Pool doesn’t have too low maxPodsPerNode number
GKE Node Auto Provisioning scales nodes to match workload demands.
Number of KSAs in the workload Identity-enabled clusters.
Ingress creation is successful if service is correctly mapped