K3s cluster

| Name | Usage | Accessibility | Host | DB type | Additional data | Backup configuration | Loki integration | Prometheus integration | Secret management | Status | Standalone migration |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Traefik | Reverse proxy and load balancer | Public & Private | Socrates & Pythagoras-b | - | - | - | Configured | Configured | - | Completed [5] | Backbone |
| ArgoCD | Declarative GitOps CD | Private | Pythagoras-b | - | - | - | Configured | Not configured | - | Partial | Backbone |
| Vaultwarden | Password manager | Public | Pythagoras-b | PostgreSQL | - | 4AM K8s CronJob | Configured | Not available | Configured | Completed | Completed |
| Gitlab | Version control system | Public | Pythagoras-b | PostgreSQL | User created content | 5AM internal CronJob | Configured | Configured | Not configured | Partial [4] | Awaiting |
| Radarr | Movie collection manager | Private | Plato | PostgreSQL | - | - | Configured | Configured | Not configured | Partial | Awaiting |
| Flaresolverr | Cloudflare proxy | Private | Plato | - | - | - | - | - | - | Completed | Awaiting |
| Prometheus | Metrics aggregator | Private | Pythagoras-b | TBD | - | Not configured | Configured | Configured | Not configured | Partial | Awaiting |
| Loki | Log aggregator | Private | Pythagoras-b | TBD | - | Not configured | Configured | Configured | Not configured | Partial | Awaiting |
| Grafana | Graph visualizer | Public | Pythagoras-b | - | - | Not configured | Configured | Configured | Configured | Partial | Awaiting |
| Sonarr | TV shows collection manager | Private | Plato | SQLite | - | Not configured | Configured | Configured | Not configured | Partial | Awaiting |
| Prowlarr | Torrent indexer | Private | Plato | PostgreSQL | - | Not configured | Configured | Not available | Not configured | Partial | Awaiting |
| Jellyfin | Media streaming | Public | Archimedes | SQLite** | - | - | Configured | Configured | Configured [6] | Completed | Awaiting |
| Jellyseerr | Media requesting WebUI | Public | Pythagoras-b | - | - | - | Not configured | Not available | Configured [7] | Awaiting configuration | Awaiting |
| Adguard | DNS ad blocker and custom DNS server | Private | Socrates | - | - | - | Not configured | Not configured | Not configured | Pending configuration [1] | Awaiting |
| Owncloud Infinity Scale | File hosting webUI | Public | Plato | ? | Drive files | Not configured | Configured | Not available | Not configured | Pending configuration [2] | Awaiting |
| Synapse | Matrix server - Message centralizer | Public | Pythagoras-b | PostgreSQL | User medias | 4AM K8s CronJob | Configured | Configured | Not configured | Pending configuration [3] | Awaiting |
| therbron.com | Personal website | Public | Socrates | - | - | - | Not configured | Not configured | - | Awaiting configuration | Awaiting |
| Home assistant | Home automation and monitoring | Private | Pythagoras-a | MariaDB | - | Not configured | Not configured | Not configured | Not configured | Awaiting configuration | Awaiting |
| Vikunja | To-do and Kanban boards | Public | Pythagoras-b | - | - | - | Not configured | Not configured | - | Migrate to Gitlab | Awaiting |
| Wiki | Documentation manager | Public | Pythagoras-b | - | - | - | Not configured | Not configured | - | Migrate to VuePress and Gitlab | Awaiting |
| PaperlessNG | PDF viewer and organiser | Public | Pythagoras-b | PostgreSQL | - | - | Not configured | Not configured | - | Research migration into OCIS | Awaiting |
| Deluge | Torrent client | Private | Plato | - | ? | - | Not configured | Not configured | Not configured | Awaiting configuration | Awaiting |
| Minecraft | Vanilla minecraft server for friends | Public | Archimedes | - | Game map | Not configured | Not configured | Not configured | - | Awaiting configuration | Awaiting |
| Satisfactory | Satisfactory server for friends | Public | Archimedes | - | Game map | Not configured | Not configured | Not configured | - | Not needed for v1 | Awaiting |
| Space engineers | Space engineers server for friends | Public | Archimedes | - | Game map | Not configured | Not configured | Not configured | - | Not needed for v1 | Awaiting |
| Raspsnir | Bachelor memorial website | Public | Pythagoras-b | PostgreSQL | - | Not configured | Not configured | Not configured | - | Not needed for v1 | Awaiting |

* Configuration panel only available internally
** The current implementation only supports SQLite, which makes manual backups necessary
1 Missing automated configuration pipeline for environment variable injection
2 Missing configuration for NAS volume mounting (over network)
3 Missing Longhorn scheduling for saving media_store and secret management
4 Backup management is not handled by k3s but by an internal cronjob rule (change the image name when moving to production)
5 Missing dashboard configuration

Backup management

Databases

All services that need a database come with a sidecar pod running a crontab that automates that service's database backups. These backups are saved to a Longhorn volume so that they can benefit from general snapshots later on. Each sidecar pod can only mount the backup folder it is linked to and cannot see other services' backups.
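
A minimal sketch of the folder isolation described above, assuming a single Longhorn-backed claim shared by the backup jobs. It is written as a K8s CronJob (the form the services table lists for Vaultwarden) rather than as a crontab inside a sidecar; the `subPath` mount would work the same way in a sidecar container. The resource names, the `postgres:16` image, the `longhorn-backups` claim and the `vaultwarden-db-credentials` secret are hypothetical and not taken from this repository.

```yaml
# Hypothetical sketch: every name below is an assumption, not the repository's manifest.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vaultwarden-db-backup
spec:
  schedule: "0 4 * * *"          # every day at 04:00
  concurrencyPolicy: Forbid      # never run two dumps at the same time
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16
              command: ["/bin/sh", "-c"]
              args:
                # pg_dump reads PGHOST, PGUSER, PGPASSWORD and PGDATABASE from the environment
                - pg_dump > "/backups/$(date +%F).sql"
              envFrom:
                - secretRef:
                    name: vaultwarden-db-credentials   # assumed to provide the PG* variables
              volumeMounts:
                - name: backups
                  mountPath: /backups
                  subPath: vaultwarden   # only this service's folder is visible in the container
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: longhorn-backups   # assumed claim, must live in the same namespace
```

Because the container only mounts the `vaultwarden` sub-directory, it cannot list or read the dumps written by other services on the same volume.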

Additional data

All additional data that needs to be backed up is mounted on a Longhorn volume so that it also benefits from scheduled backups.

Example:

longhorn
└───backups
    ├───vaultwarden
    │   ├───<backup_date>.sql
    │   └───...
    └───gitlab
        ├───<backup_date>.sql
        └───...

TODO

  • Migrate Vaultwarden & Homeassistant to PostgreSQL instead of MariaDB
  • Deploy a PostgreSQL cluster using an operator for database HA and easy maintenance
  • Change host/deployment-specific variables to use environment variables (using Kustomize)
  • Write CI/CD pipeline to create environment loaded files (done with the Kustomize migration)
  • Write CI/CD pipeline to deploy cluster
  • Set up internal Traefik with NodePort as reverse proxy for internal-only services (done through double ingress class and LB)
  • Set up DB container sidecars for automated backups to Longhorn volume
  • Set up secrets configuration through CI/CD variable injection (using Kustomize)
  • Explore permission issues when issuing OVH API keys (not working for wildcard and beta.halia.dev subdomain)
  • Set up default users for deployments
  • Set up log and metric monitoring
  • Define namespaces through YAML files
  • Look into CockroachDB for redundant database (judged too complicated; moving to a one-to-one relationship between services and databases)
  • Configure IP range accessibility through Traefik, internal vs external services (impossible because of flannel ip-masq)
  • Schedule Longhorn S3 backups
  • Move secrets to a separate, private Git repository?
  • Configure NFS connection for media library
  • Research IPv6 configuration for outsider node (impossible in Denmark for now while using YouSee as an ISP: no IPv6 support)

Notes

Cluster base setup

Set up the cluster's backbone:

kubectl apply -k environment/dev

DO NOT FORGET TO INSTALL THE SOPS PART

NOTE: The MetalLB IP range and the Traefik LoadBalancerIPs may also need to be updated.
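
For orientation, an overlay such as `environment/dev` could look roughly like the sketch below. The `../../base` directory and the two patch file names are assumptions made for illustration; the actual layout of this repository may differ.

```yaml
# environment/dev/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base                           # assumed: manifests shared by every environment

patches:
  - path: metallb-ip-range.yaml          # assumed file: overrides the MetalLB address range
  - path: traefik-loadbalancer-ip.yaml   # assumed file: overrides the Traefik LoadBalancerIPs
```

Keeping the network-specific values in per-environment patches means the shared manifests do not have to change when the cluster moves to another IP range.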

Convert helm chart to k3s manifest

helm template chart stable/chart --output-dir ./chart

Gitlab backup process

Because Gitlab does not offer a way to back up a container's data from an external container, a cronjob has been implemented in the custom image used for deployment.

VPN configuration for Deluge

Instead of adding an extra networking layer to the whole cluster, it seems better to integrate a WireGuard connection directly into the Deluge image and to build that image ourselves in the Gitlab registry. This image could use Kubernetes secrets, including a "torrent-vpn" secret produced by the initial WireGuard configuration done via Ansible. The Ansible script could create one or more additional clients depending on the inventory configuration, and keep the "torrent-vpn" configuration in a k3s-formatted file inside the auto-applied directory on the control-plane node (CP).
Cf.: https://docs.k3s.io/advanced#auto-deploying-manifests
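
As an illustration of that idea, the file written by Ansible could be a plain Secret manifest like the sketch below. The file name, namespace and `wg0.conf` key are assumptions, and every value in angle brackets is a placeholder to be filled in by the playbook. On a default k3s install the auto-applied directory is `/var/lib/rancher/k3s/server/manifests`.

```yaml
# Assumed location: /var/lib/rancher/k3s/server/manifests/torrent-vpn.yaml
apiVersion: v1
kind: Secret
metadata:
  name: torrent-vpn
  namespace: default        # assumption: use the namespace Deluge is deployed in
type: Opaque
stringData:
  wg0.conf: |               # hypothetical key, mounted as a file by the Deluge pod
    [Interface]
    PrivateKey = <client_private_key>
    Address = <client_address>

    [Peer]
    PublicKey = <server_public_key>
    Endpoint = <server_endpoint>
    AllowedIPs = 0.0.0.0/0
```

The Deluge deployment could then mount this secret as a volume so that the WireGuard client inside the image finds its configuration at start-up.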

Development domains

To access a service publicly when developing, the domain name should be *.beta.halia.dev. To only expose a service internally, the domain name should be *.beta.entos.

Ingresses

To separate external and internal services, two Traefik ingresses are implemented and selected through the ingress class annotation. traefik-external only allows external access to a given service, while traefik-internal restricts it to internal-only access.
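
A minimal sketch of an Ingress routed through the external class, assuming the class is selected with the standard `kubernetes.io/ingress.class` annotation. The service name, port and host name are hypothetical; the host only follows the development domain pattern described above.

```yaml
# Hypothetical example: swap the annotation value to traefik-internal for an internal-only service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  annotations:
    kubernetes.io/ingress.class: traefik-external
spec:
  rules:
    - host: grafana.beta.halia.dev   # assumed host name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana        # assumed Service name
                port:
                  number: 80         # assumed Service port
```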

Secret management

All secrets are encrypted using SOPS and stored in a private secret repository. Secrets are decrypted on the fly by the SOPS Operator when they are applied to the cluster.

Inject the AGE key into the cluster to allow the operator to decrypt secrets:

kubectl create secret generic age-key --from-file=<path_to_file> -n sops
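
On the encryption side, a `.sops.yaml` file at the root of the secret repository could tell SOPS which AGE recipient to use. This is only a sketch: the path pattern is an assumption, `<age_public_key>` is a placeholder for the public half of the key injected above, and which fields must be encrypted depends on the SOPS Operator in use.

```yaml
# .sops.yaml (hypothetical; adjust the pattern to the repository's naming)
creation_rules:
  - path_regex: '.*secret.*\.yaml$'   # assumed naming convention for secret files
    age: <age_public_key>             # placeholder: the AGE public key (recipient)
```

With such a file in place, `sops --encrypt --in-place <file>` picks the recipient from the first matching rule.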