google-cloud-observability — Google Cloud observability tooling

View on GitHub: applications/google-cloud-observability, Application template

Type: Helm

Namespace: google-cloud-observability

Argo CD Project: monitoring

Environments:

  • idfdev (values, Argo CD)
  • idfint (values, Argo CD)
  • idfprod (values, Argo CD)
  • roundtable-dev (values, Argo CD)
  • roundtable-prod (values, Argo CD)

Google provides a managed service for Prometheus. In Phalanx environments provisioned on GKE, we want to use it as much as possible to avoid the effort of running our own metrics and monitoring infrastructure. Unfortunately, the managed kube-state-metrics package does not provide the kube_pod_container_status_last_terminated_reason or kube_pod_container_status_restarts_total metrics, both of which are needed to alert reliably on container OOM kills. This application installs our own kube-state-metrics and configures the Google Cloud managed service for Prometheus to scrape it.
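
As a rough illustration of the scrape configuration described above, managed collection in the Google Cloud managed service for Prometheus is usually pointed at a workload with a PodMonitoring resource. The following is a minimal sketch, not the manifest this application actually ships. The resource name, the pod label in the selector, the port name, and the scrape interval are all assumptions and should be checked against the chart's templates and values.

```yaml
# Hypothetical PodMonitoring telling managed collection to scrape
# a self-hosted kube-state-metrics deployment.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: kube-state-metrics              # assumed resource name
  namespace: google-cloud-observability
spec:
  selector:
    matchLabels:
      # Assumed pod label; use whatever labels the kube-state-metrics
      # pods in this namespace actually carry.
      app.kubernetes.io/name: kube-state-metrics
  endpoints:
    - port: http                        # assumed name of the metrics container port
      interval: 30s                     # assumed scrape interval
```

A PodMonitoring resource only selects pods in its own namespace, so in this sketch it would need to live alongside kube-state-metrics in the google-cloud-observability namespace.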

Prerequisites

  • Managed service for Prometheus is installed in the GKE cluster. This is probably configured in the idf_deploy repo.

Guides