Write a Helm chart for an application#

Argo CD manages applications in the Rubin Science Platform through a set of Helm charts. Which Helm charts to deploy in a given environment is controlled by the values.yaml and values-environment.yaml files in /environments.

Each application's directory under applications defines templates in its templates directory, and supplies the values that resolve those templates in values.yaml and values-environment.yaml files, which customize the application for each environment. For first-party charts, the templates directory is generally richly populated.

Here are instructions for writing a Helm chart for a newly-developed application. If you are using an external third-party chart to deploy part of the application, also see Adding an external Helm chart.

In some cases where there is a lot of internal duplication between multiple Phalanx applications, those applications should share a subchart that encapsulates that duplication. See Sharing subcharts between applications if you think that may be the case for your application.

Start from a template#

Ensure that your local Phalanx development environment is set up following the instructions in Setting up a Phalanx development environment.

Then, create the files for the new application, including the start of a Helm chart:

phalanx application create <application>

Replace <application> with the name of your new application, which will double as the name of the Helm chart. The application name must start with a lowercase letter and consist only of lowercase letters, numbers, and hyphens (-).

By default, this will create a Helm chart for a FastAPI web service. Use the --starter flag to specify a different Helm chart starter. There are two options:

web-service

Use this starter if the new Helm application is a web service, such as a new Safir FastAPI service. This is the default.

empty

Use this starter for any other type of application. This will create an empty Helm chart, to which you can add resources or external charts.

You will be prompted for a short description of the application. Keep it succinct, ideally just a few words, and do not add a period at the end.

Write the Chart.yaml#

After the previous step, there will be a new Helm chart for your application in applications/<application>. A basic Chart.yaml was created by phalanx application create. A few additional fields need to be filled out.

Chart versioning#

For charts that deploy a Rubin-developed application, set appVersion to the application’s Docker image tag (which is typically the version tag). For charts that do not deploy an application (for example, charts that are only used to manage subcharts as described in Adding an external Helm chart), delete the appVersion field.

Note

The chart also has a version field, which will be set to 1.0.0. This field does not need to be changed. The top-level charts defined in the applications directory are used only by Argo CD and are never published as stand-alone Helm charts. Their versions are therefore irrelevant, so we use 1.0.0 for all such charts.
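As a rough sketch, a completed Chart.yaml for a Rubin-developed application might look like the following. The description and the appVersion value are placeholders rather than values from a real chart, and the exact set of fields written by phalanx application create may differ.

```yaml
apiVersion: v2
name: <application>
description: Short description of the application
# Fixed for all top-level Phalanx charts; does not need to change.
version: 1.0.0
# Docker image tag of the application (hypothetical value). Delete this
# field for charts that do not deploy an application.
appVersion: "1.2.3"
```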

Write the Kubernetes resource templates#

Put all Kubernetes resource templates that should be created by your chart in the templates subdirectory. See the Helm chart template developer’s guide for general instructions on how to write Helm templates.

Three aspects of writing a Helm chart are specific to Phalanx:

  • Some values will be automatically injected by Argo CD in the global.* namespace. See Values injected by Argo CD for more information.

  • All secrets must come from VaultSecret resources, not Kubernetes Secret resources. Set the path of the VaultSecret to the value of the global.vaultSecretsPath configuration option followed by a slash and the name of your application. Phalanx’s secret management requires that you use a Vault secret with exactly this name. global.vaultSecretsPath will be injected by Argo CD with the correct value for the environment in which your application is deployed. See Define the application secrets for more information about secrets.

  • Applications providing a web API should be protected by Gafaelfawr and require an appropriate scope. This normally means using a GafaelfawrIngress object rather than an Ingress object. If you use the web service starter, this is set up for you by the template using a GafaelfawrIngress resource in templates/ingress.yaml, but you will need to customize the scope required for access, and may need to add additional configuration. You will also need to customize the path under which your application should be served. See the Gafaelfawr documentation for more details.
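To illustrate the secret naming rule above, here is a minimal sketch of a VaultSecret template for the application's own secret. It follows the same pattern as the pull secret example later on this page. The choice of Opaque as the secret type is an assumption that fits a generic key/value secret, and the resource generated by the chart starter may differ in detail.

```yaml
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: <application>
  labels:
    {{- include "<application>.labels" . | nindent 4 }}
spec:
  # Must be exactly global.vaultSecretsPath, a slash, and the application name.
  path: "{{ .Values.global.vaultSecretsPath }}/<application>"
  type: Opaque
```

Replace <application> with the name of your application.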

/tmp#

The web-service starter creates a deployment with an entirely read-only file system. This is ideal for security, since it denies an attacker the ability to create new local files, which makes some attacks harder.

Some applications, however, need working scratch space. For those applications, you may need to mount a writable /tmp file system. Here is how to do that:

  1. Add a volumes section to the pod spec part of the Deployment (or add a new element if a volumes section is already there) that creates a volume for temporary files:

    deployment.yaml#
    volumes:
      - name: "tmp"
        emptyDir: {}
    
  2. Mount that volume by adding a volumeMounts section to the main container in the Deployment (or add it to the volume mounts if there already are others):

    deployment.yaml#
    volumeMounts:
      - name: "tmp"
        mountPath: "/tmp"
    

Warning

Files written to this temporary directory are stored in node ephemeral storage, which is shared between all pods running on that node. Writing excessive amounts of data to this directory may exhaust node resources and cause problems for other applications in the cluster.

This type of temporary directory should therefore only be used for small files. Applications that need large amounts of temporary space should allocate and mount a persistent volume instead.
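Putting the two steps together, the relevant part of the Deployment might look like the following sketch. The container name and the securityContext setting are assumptions about what the starter generates. The sizeLimit field is an optional Kubernetes setting that is not part of the instructions above, and its value here is purely illustrative. It caps how much data the volume may hold, and Kubernetes evicts the pod if the cap is exceeded, which limits the damage described in the warning.

```yaml
# templates/deployment.yaml (excerpt)
spec:
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            # The rest of the file system stays read-only.
            readOnlyRootFilesystem: true
          volumeMounts:
            - name: "tmp"
              mountPath: "/tmp"
      volumes:
        - name: "tmp"
          emptyDir:
            # Optional cap on ephemeral storage use (illustrative value).
            sizeLimit: "100Mi"
```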

Pull secrets#

If your application image resides at a Docker repository which requires authentication (either to pull the image at all or to raise the pull rate limit), then you must tell any pods deployed by your application to use a pull secret named pull-secret, and you must create a VaultSecret resource for that pull secret.

If your container image is built through GitHub Actions and stored at ghcr.io (the recommended approach), there is no rate limiting (as long as your container image is built from a public repository, which it should be). There is therefore no need for a pull secret and you can skip the rest of this section.

If your container image is stored at Docker Hub, you should use a pull secret, because we have been (and will no doubt continue to be) rate-limited at Docker Hub. Strongly consider moving your container image to the GitHub Container Registry (ghcr.io) instead.

If your container image is pulled from a private repository, you may need authentication and therefore a pull secret.

If you do need a pull secret, add a block like the following to the pod specification for any resource that creates pods.

deployment.yaml#
imagePullSecrets:
  - name: "pull-secret"

If you are using an external chart, see its documentation for how to configure pull secrets.

Then, add the following VaultSecret to your application templates to put a copy of pull-secret in your application’s namespace:

vault-secrets.yaml#
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: pull-secret
  labels:
    {{- include "<application>.labels" . | nindent 4 }}
spec:
  path: "{{- .Values.global.vaultSecretsPath }}/pull-secret"
  type: kubernetes.io/dockerconfigjson

Replace <application> with the name of your application. If you already have another VaultSecret resource, put a line containing only --- between them. (This is the standard YAML syntax for putting multiple objects in the same file.)

The pull secret itself is managed globally for the environment, usually by the environment administrator. See Updating the pull secret for an environment for details on how to modify the pull secret if necessary.

Restarting deployments when config maps change#

If your application is configured using a ConfigMap resource, you normally should arrange to restart the application when the ConfigMap changes. The easiest way to do this is to add a checksum of the config map to the annotations of the deployment, thus forcing a change to the deployment that will trigger a restart.

For more details, see Automatically roll deployments in the Helm documentation.
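The pattern described in the Helm documentation looks roughly like this. The sketch assumes the ConfigMap is defined in templates/configmap.yaml; adjust the file name to match your chart.

```yaml
# templates/deployment.yaml (excerpt)
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        # Hash of the rendered ConfigMap template. Any change to the
        # ConfigMap changes this value and so triggers a rolling restart.
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```

The annotation goes on the pod template, not on the Deployment itself, because only changes to the pod template cause the pods to be replaced.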

Tying service accounts to workload identity#

If your application will access Google Cloud services when running on Google Kubernetes Engine, it should use workload identity to authenticate to those services. This allows applications running in Kubernetes pods to authenticate as Google service accounts without worrying about key management or separate secrets.

To use workload identity, your application must run as a specific, named Kubernetes service account. Do not use the default service account created for each namespace.

Start by creating a Kubernetes service account for your application:

serviceaccount.yaml#
apiVersion: v1
kind: ServiceAccount
metadata:
  name: <application>
  labels:
    {{- include "<application>.labels" . | nindent 4 }}
  annotations:
    iam.gke.io/gcp-service-account: {{ required "serviceAccount must be set to a valid Google service account" .Values.serviceAccount | quote }}

Replace <application> with the name of your application.

Note the annotation. This tells Kubernetes which Google service account your application will be authenticating as. Once the Google service account has been created, you will add the appropriate service account name for each environment to your values-environment.yaml files.

Then, in your Deployment, and any other Kubernetes resource that creates pods that need to talk to Google services, configure Kubernetes to run the pod with that service account:

deployment.yaml#
template:
  spec:
    serviceAccountName: <application>

Also ensure that automountServiceAccountToken is set to true or not set. If the application uses the Google Cloud libraries, no further application configuration is required. The Google Cloud libraries will automatically recognize that workload identity is in use and will make the necessary API calls to get Google Cloud credentials.

These examples configure the application to use workload identity unconditionally. If the application may be deployed either under Google Kubernetes Engine or in other Kubernetes deployments, you will want to make workload identity conditional. Do that by adding {{- if .Values.serviceAccount }} or similar conditional blocks around both the ServiceAccount resource and around the serviceAccountName setting.
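A conditional version of the two resources might look like the following sketch, which reuses the serviceAccount setting from the examples above. Since the ServiceAccount is only rendered when the setting is present, the required check is dropped here; that is a design choice for this sketch, not something Phalanx mandates.

```yaml
# templates/serviceaccount.yaml
{{- if .Values.serviceAccount }}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: <application>
  labels:
    {{- include "<application>.labels" . | nindent 4 }}
  annotations:
    iam.gke.io/gcp-service-account: {{ .Values.serviceAccount | quote }}
{{- end }}
---
# templates/deployment.yaml (excerpt)
template:
  spec:
    {{- if .Values.serviceAccount }}
    serviceAccountName: <application>
    {{- end }}
```

In environments that leave serviceAccount unset, the pods then run as the default service account for the namespace.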

The above examples use serviceAccount as the values.yaml setting. If the service account is only for CloudSQL, normal practice is to name the setting cloudsql.serviceAccount and make workload identity conditional on whether cloudsql.enabled is true. If your application uses workload identity for other purposes, you can either use a top-level values setting as shown here, or put the setting wherever seems most appropriate (associated with one specific part of your application, for instance).

Finally, for each environment where you want to use workload identity, the Phalanx environment administrator must create a Google service account for your application and associate it with the namespace and Kubernetes service account name used by your application. See Set up workload identity. They will then tell you what service account name to use for each environment.

This is a simple configuration for an application that uses only one service account. If you have a more complex application that also needs Kubernetes permissions, you may need multiple service accounts with more specific names than just the name of your overall application. For an example of a more complicated configuration with multiple service accounts, see giftless’s Helm chart.

Write the values.yaml file#

The values.yaml file contains the customizable settings for your application. As a general rule, only use values.yaml settings for things that may vary between Phalanx environments. If something is the same in every Phalanx environment, it can be hard-coded into the Kubernetes resource templates.

If your application uses workload identity (see Tying service accounts to workload identity), remember to add a setting to configure the Google service account to use.
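For instance, a top-level setting in values.yaml matching the serviceAccount name used in the workload identity examples could be declared like this. The wording of the comments is illustrative.

```yaml
# -- Google service account to use for workload identity
# @default -- None, must be set for each environment
serviceAccount: null
```

Each values-environment.yaml file would then override this with the service account name supplied by the environment administrator. Google service account names have the form <name>@<project>.iam.gserviceaccount.com.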

Injected values#

Three values will always be injected by Argo CD into your application automatically as globals, and therefore do not need to be set for each environment. These are global.baseUrl, global.host, and global.vaultSecretsPath and are taken from the global settings for each environment.

These should be mentioned for documentation purposes at the bottom of your values.yaml file with empty defaults. This is done automatically for you by the chart starters.
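The block at the bottom of values.yaml looks roughly like the following sketch. The exact comment wording and the form of the empty defaults produced by the chart starters may differ.

```yaml
# The following will be set by parameters injected by Argo CD and should not
# be set in the individual environment values files.
global:
  # -- Base URL for the environment
  # @default -- Set by Argo CD
  baseUrl: ""

  # -- Host name for ingress
  # @default -- Set by Argo CD
  host: ""

  # -- Base path for Vault secrets
  # @default -- Set by Argo CD
  vaultSecretsPath: ""
```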

It is possible to inject other values from the environment configuration. For more details, see Values injected by Argo CD.

Documentation#

Phalanx uses helm-docs to automate generating documentation for the values.yaml settings.

For this to work correctly, each setting must be immediately preceded by a comment that starts with # -- and is followed by documentation for that setting in Markdown. This documentation may be wrapped to multiple lines.

Add a blank line between settings, before the helm-docs comment for the next setting.
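As a small illustration of these conventions, here are two hypothetical settings (the names are not taken from a real chart), one of which has documentation wrapped across two comment lines:

```yaml
# -- Number of web deployment pods to start
replicaCount: 1

# -- Resource limits and requests for the application pod. This description
# is long enough that it has been wrapped onto a second comment line.
resources: {}
```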

The default value is included in the documentation. The documentation of the default value can be overridden with a comment starting with # @default --. This can be helpful when the default value in values.yaml is not useful (if, for instance, it’s a placeholder). For example:

# -- Tag of Gafaelfawr image to use
# @default -- The appVersion of the chart
tag: ""

For large default values or default values containing a lot of structure, the default behavior of helm-docs is to reproduce the entire JSON-encoded default in the generated documentation. This is often not useful and can break the HTML formatting of the resulting table. Therefore, for settings with long or complex values, use the following convention in a comment immediately before the setting:

# -- Description of the field.
# @default -- See the `values.yaml` file.
setting:
  - Some long complex value

Referring to Docker images#

To allow automated dependency updates to work, ensure that any Docker image deployed by your Helm chart uses values.yaml settings for the repository and current tag. These fields must be named repository and tag, respectively, and are conventionally nested under a key named image along with any other image properties that may need to be customized (such as pullPolicy).

Using this format will allow Mend Renovate to detect newer versions and create PRs to update Phalanx.
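A values.yaml block following this convention might look like the sketch below. The repository path is a placeholder, not a real image location.

```yaml
image:
  # -- Image to use in the deployment
  repository: "ghcr.io/<organization>/<application>"

  # -- Pull policy for the image
  pullPolicy: "IfNotPresent"

  # -- Tag of image to use
  # @default -- The appVersion of the chart
  tag: ""
```

Leaving tag empty lets the resource template fall back to the chart's appVersion.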

The main deployment (or stateful set, or cron job, etc.) for a Helm chart should use the appVersion in Chart.yaml as the default value for the image tag. This is done in the Kubernetes resource template. For example:

image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"

Checking the chart#

Most of the testing of your chart will have to be done by deploying it in a test Kubernetes environment. See Add a new application to Phalanx for more details about how to do that. However, you can check the chart for basic syntax and some errors in Helm templating before deploying it.

To check your chart, run:

phalanx application lint <application>

Replace <application> with the name of your new application. Multiple applications may be listed to lint all of them.

This will run helm lint on the chart with the appropriate values files and injected settings for each environment for which it has a configuration and report any errors. helm lint does not check resources against their schemas, alas, but it will at least diagnose YAML and Helm templating syntax errors.

You can limit the linting to a specific environment by specifying an environment with the --environment (or -e or --env) flag.

This lint check will also be done via GitHub Actions when you create a Phalanx PR, and the PR cannot be merged until this lint check passes.

You can also ask for the fully-expanded Kubernetes resources that would be installed in the cluster when the chart is installed. Do this with:

phalanx application template <application> <environment>

Replace <application> with the name of your application and <environment> with the name of the environment for which you want to generate its resources. This will print to standard output the expanded YAML Kubernetes resources that would be created in the cluster by this chart.

Examples#

Existing Helm charts that are good examples to read or copy are:

  • hips (fairly simple)

  • mobu (also simple)

  • gafaelfawr (complex, including CRDs and multiple pods)

Next steps#

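Be aware that Phalanx tests will not pass until you have also defined the secrets needed by your application (see Define the application secrets) and added the application to Phalanx (see Add a new application to Phalanx).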