Science Platform operations
The Rubin Science Platform is described in LDM-542. This document contains operational notes for administrators of the Science Platform and for maintainers of services deployed via the Science Platform. It is not intended for users.
For user documentation of the Notebook Aspect of the Rubin Science Platform, see nb.lsst.io.
The Science Platform uses Argo CD to manage its Kubernetes resources. The Argo CD configuration and this documentation are maintained on GitHub.
A phalanx is a SQuaRE deployment (SQuaRE is Science Quality and Reliability Engineering, the team responsible for the Rubin Science Platform). Phalanx is how we ensure that all of our services work together as a unit.
Overview
For service maintainers
General development and operations
- Create a new service
- Add a secret with 1Password and VaultSecret
- Updating a secret stored in 1Password and VaultSecret
- Add a new service to Phalanx
- Adding an external Helm chart
- Set up a local development environment with minikube
- Syncing Argo CD
- Upgrading a service
- Changing charts and phalanx together
Specific tasks
For Science Platform administrators
Services
Bootstrapping
Infrastructure
Troubleshooting
- Troubleshooting the Rubin Science Platform
- PostgreSQL cannot mount its persistent volume
- Spawner menu missing images, cachemachine stuck pulling the same image
- Spawner menu shows empty parentheses after recommended rather than image tag
- Spawning a notebook fails with a pending error
- User gets permission denied from services
- You need privileged access to the filestore
- User pods don’t spawn, reporting “permission denied” from Moneypenny