Compare commits: base_setup...Upgrade/Ku (16 commits)

Commits: 82e41f8edf, 6fb2c5f27f, 48cf4b3aec, 8ac8d541f1, f14675935d, 801c504757, b23f4087b3, e0d3141580, e3acd566c8, 3053904aa2, b5508eab97, 5f3c3b0e91, b367990028, 4606dd3cf5, 96fd561257, 950137040f
Migrations/02-Say_HI_to_Proxmox/PrometheusVolume.yaml (new file, +12 lines)
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus-storage
  namespace: observability
spec:
  storageClassName: slow-nfs-01
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
Migrations/02-Say_HI_to_Proxmox/README.md (new file, +690 lines)
|
||||
|
||||
- This time I won't be doing a "walkthrough" of the process, but rather a progress list.
|
||||
|
||||
The plan is to replace the `srv` server, currently used as a standalone Docker/NFS server, with a Proxmox instance, as that would allow some more flexibility.
|
||||
|
||||
My current requirements are:
|
||||
|
||||
- I need an NFS server (Proxmox can do that)
|
||||
|
||||
- Jenkins agent
|
||||
|
||||
## NFS
|
||||
|
||||
While I configure the NFS entries, the Kubernetes services will be down.
|
||||
|
||||
## Jenkins
|
||||
|
||||
The idea is to eventually replace Jenkins with ArgoCD, so for the moment it will be a 🤷.
|
||||
|
||||
## Core Services
|
||||
|
||||
They will be moved to the Kubernetes cluster.
|
||||
|
||||
### Jellyfin
|
||||
|
||||
Will need to wait until:
|
||||
|
||||
- NFS is set up
- A Kubernetes worker node is set up (the cluster is currently ARM64-only).
|
||||
|
||||
### Home DHCP
|
||||
|
||||
I'm so good that I was already building a DHCP image for both `amd64` and `arm64`.
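For context, that boils down to a multi-arch build along these lines (a sketch only; the image name and registry are placeholders, not my actual ones):

```shell
# Hypothetical multi-arch build of the DHCP image; tag/registry are placeholders
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t registry.example.com/dhcp-server:latest \
  --push .
```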
|
||||
|
||||
### Registry
|
||||
|
||||
- Wait until NFS is set up
|
||||
|
||||
### Tube
|
||||
|
||||
- Wait until NFS is set up
|
||||
- A Kubernetes worker node is set up (the cluster is currently ARM64-only).
|
||||
|
||||
### QBitTorrent
|
||||
|
||||
- Wait until NFS is set up
|
||||
|
||||
### CoreDNS
|
||||
|
||||
- Will be deleted.
|
||||
|
||||
### Gitea
|
||||
|
||||
- Wait until NFS is set up
|
||||
|
||||
## Extra notes
|
||||
|
||||
Could create a new NFS pool for media-related data, especially since some data could be stored on an HDD and other data on an SSD.
|
||||
|
||||
# Steps
|
||||
|
||||
## Make the DHCP server work in/from the Kubernetes cluster
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Confirm how can I create a NFS server in Proxmox
|
||||
|
||||
https://www.reddit.com/r/Proxmox/comments/nnkt52/proxmox_host_as_nfs_server_or_guest_container_as/
|
||||
|
||||
https://forum.level1techs.com/t/how-to-create-a-nas-using-zfs-and-proxmox-with-pictures/117375
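For reference, the "NFS directly on the Proxmox host" approach from the first link boils down to something like this (a sketch; the export path and subnet are placeholders along the lines of the made-up paths used later in these notes):

```shell
# Install the NFS server on the Proxmox (Debian) host
apt install -y nfs-kernel-server
# Export a directory to the LAN (path and subnet are placeholders)
echo '/resources 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)' >> /etc/exports
exportfs -ra
```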
|
||||
|
||||
## Reorganize the local Network distribution/update the DHCP server
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Update the DHCP server with the new arrangement
|
||||
|
||||
- [x] Ready
|
||||
- [x] Done
|
||||
|
||||
## Update the DNS server with the new arrangement
|
||||
|
||||
- [x] Ready
|
||||
- [x] Done
|
||||
|
||||
## Delete External service points for the Klussy deployments
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Install Proxmox
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Install NFS service on the Proxmox host
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Configure NFS mount vols on the NFS server
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Move directory from old NFS to new NFS server
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Configure NFS mount vols on the klussy cluster to match the new NFS server
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Deploy "old" external services (if possible) + their NFS mounts
|
||||
|
||||
- [x] Gitea
|
||||
- [x] Tube (older version)
|
||||
- [x] Registry # Maybe replace Registry with Harbor in the future
|
||||
|
||||
https://ruzickap.github.io/k8s-harbor/part-04/#install-harbor-using-helm
|
||||
|
||||
## Deploy new slave node on the Proxmox server
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Update the cluster to the latest version, because it's about time.
|
||||
|
||||
Made this Ansible script:
|
||||
- https://gitea.filterhome.xyz/ofilter/ansible_update_cluster
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Deploy remaining services + their NFS mounts
|
||||
|
||||
- [x] Jellyfin
|
||||
- [x] QBitTorrent
|
||||
- [x] Filebrowser
|
||||
|
||||
|
||||
## [EXTRA] Deploy new slave node on the Proxmox server (slave04)
|
||||
|
||||
Decided to add ANOTHER VM as a slave to allow some flexibility between x64 nodes.
|
||||
|
||||
- [x] Created the VM and installed the OS
|
||||
- [x] Set up GPU pass through for the newly created VM
|
||||
- [x] Created a Kubernetes Node
|
||||
- [x] Done
|
||||
|
||||
|
||||
## Set up the GPU available in the Kubernetes Node
|
||||
|
||||
Very much what the title says. Steps below.
|
||||
|
||||
- [x] Done
|
||||
|
||||
|
||||
### Install nvidia drivers
|
||||
|
||||
> **Note:**
|
||||
> - Steps were performed in the VM Instance (Slave04). \
|
||||
> - Snapshots were performed on the Proxmox node, taking a snapshot of the affected VM. \
|
||||
> - `Kubectl` command(s) were performed on a computer of mine external to the Kubernetes Cluster/Nodes to interact with the Kubernetes Cluster.
|
||||
|
||||
#### Take snapshot
|
||||
|
||||
- [x] Done
|
||||
|
||||
#### Repo thingies
|
||||
|
||||
Enable the `non-free` repo for Debian.

(aka, idk, you know how to do that.)

`non-free` and `non-free-firmware` are different things, so if `non-free-firmware` is already listed but `non-free` is not, slap it in there, along with `contrib`.
|
||||
|
||||
```md
|
||||
FROM:
|
||||
deb http://ftp.au.debian.org/debian/ buster main
|
||||
TO:
|
||||
deb http://ftp.au.debian.org/debian/ buster main non-free contrib
|
||||
```
|
||||
|
||||
In my case that was enabled during the installation.
|
||||
|
||||
Once the repos are set up, run:
|
||||
|
||||
```shell
|
||||
apt update && apt install nvidia-detect -y
|
||||
```
|
||||
|
||||
##### [Error] Unable to locate package nvidia-detect
|
||||
|
||||
Ensure both `non-free` and `contrib` are in the repo file.
|
||||
|
||||
(File /etc/apt/sources.list)
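A quick way to check:

```shell
# Confirm both components appear in the sources file mentioned above
grep -E 'non-free|contrib' /etc/apt/sources.list
```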
|
||||
|
||||
#### Run nvidia-detect
|
||||
```shell
|
||||
nvidia-detect
|
||||
```
|
||||
```text
|
||||
Detected NVIDIA GPUs:
|
||||
00:10.0 VGA compatible controller [0300]: NVIDIA Corporation GM206 [GeForce GTX 960] [10de:1401] (rev a1)
|
||||
|
||||
Checking card: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
|
||||
Your card is supported by all driver versions.
|
||||
Your card is also supported by the Tesla drivers series.
|
||||
Your card is also supported by the Tesla 470 drivers series.
|
||||
It is recommended to install the
|
||||
nvidia-driver
|
||||
package.
|
||||
```
|
||||
|
||||
### Install nvidia driver
|
||||
|
||||
```shell
|
||||
apt install nvidia-driver
|
||||
```
|
||||
|
||||
We might receive a complaint regarding "conflicting modules".
|
||||
|
||||
Just restart the VM.
|
||||
|
||||
#### Reboot VM
|
||||
|
||||
```shell
|
||||
reboot
|
||||
```
|
||||
|
||||
#### nvidia-smi
|
||||
|
||||
The VM now has access to the Nvidia drivers/GPU:
|
||||
|
||||
```shell
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
```text
|
||||
Fri Dec 15 00:00:36 2023
|
||||
+-----------------------------------------------------------------------------+
|
||||
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|
||||
|-------------------------------+----------------------+----------------------+
|
||||
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
|
||||
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|
||||
| | | MIG M. |
|
||||
|===============================+======================+======================|
|
||||
| 0 NVIDIA GeForce ... On | 00000000:00:10.0 Off | N/A |
|
||||
| 0% 38C P8 11W / 160W | 1MiB / 4096MiB | 0% Default |
|
||||
| | | N/A |
|
||||
+-------------------------------+----------------------+----------------------+
|
||||
|
||||
+-----------------------------------------------------------------------------+
|
||||
| Processes: |
|
||||
| GPU GI CI PID Type Process name GPU Memory |
|
||||
| ID ID Usage |
|
||||
|=============================================================================|
|
||||
| No running processes found |
|
||||
+-----------------------------------------------------------------------------+
|
||||
```
|
||||
|
||||
### Install Nvidia Container Runtime
|
||||
|
||||
#### Take snapshot
|
||||
|
||||
- [x] Done
|
||||
|
||||
#### Install curl
|
||||
|
||||
```shell
|
||||
apt-get install curl
|
||||
```
|
||||
|
||||
#### Add repo
|
||||
|
||||
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt
|
||||
|
||||
```shell
|
||||
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
|
||||
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
|
||||
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
|
||||
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
|
||||
```
|
||||
|
||||
```shell
|
||||
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
|
||||
```
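Note: the toolkit also ships an `nvidia-ctk` helper that can generate the containerd runtime config (per NVIDIA's install guide). I did the edits below manually instead, but for reference:

```shell
# Alternative to the manual config edits below (from the NVIDIA container toolkit docs)
sudo nvidia-ctk runtime configure --runtime=containerd
sudo systemctl restart containerd
```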
|
||||
|
||||
### Update Containerd config
|
||||
|
||||
#### Select nvidia-container-runtime as new runtime for Containerd
|
||||
|
||||
> No clue if this is a requirement, as afterward I also made more changes to the configuration.
|
||||
|
||||
```shell
|
||||
sudo sed -i 's/runtime = "runc"/runtime = "nvidia-container-runtime"/g' /etc/containerd/config.toml
|
||||
```
|
||||
|
||||
#### Reboot Containerd service
|
||||
|
||||
```shell
|
||||
sudo systemctl restart containerd
|
||||
```
|
||||
|
||||
#### Check status from Containerd
|
||||
|
||||
Check if Containerd has initialized correctly after restarting the service.
|
||||
|
||||
```shell
|
||||
sudo systemctl status containerd
|
||||
```
|
||||
|
||||
### Test nvidia runtime
|
||||
|
||||
#### Pull nvidia cuda image
|
||||
|
||||
I used the Ubuntu-based container since I didn't find one specific to Debian.
|
||||
|
||||
```shell
|
||||
sudo ctr images pull docker.io/nvidia/cuda:12.3.1-base-ubuntu20.04
|
||||
```
|
||||
|
||||
```text
|
||||
docker.io/nvidia/cuda:12.3.1-base-ubuntu20.04: resolved |++++++++++++++++++++++++++++++++++++++|
|
||||
index-sha256:0654b44e2515f03b811496d0e2d67e9e2b81ca1f6ed225361bb3e3bb67d22e18: done |++++++++++++++++++++++++++++++++++++++|
|
||||
manifest-sha256:7d8fdd2a5e96ec57bc511cda1fc749f63a70e207614b3485197fd734359937e7: done |++++++++++++++++++++++++++++++++++++++|
|
||||
layer-sha256:25ad149ed3cff49ddb57ceb4418377f63c897198de1f9de7a24506397822de3e: done |++++++++++++++++++++++++++++++++++++++|
|
||||
layer-sha256:1698c67699a3eee2a8fc185093664034bb69ab67c545ab6d976399d5500b2f44: done |++++++++++++++++++++++++++++++++++++++|
|
||||
config-sha256:d13839a3c4fbd332f324c135a279e14c432e90c8a03a9cedc43ddf3858f882a7: done |++++++++++++++++++++++++++++++++++++++|
|
||||
layer-sha256:ba7b66a9df40b8a1c1a41d58d7c3beaf33a50dc842190cd6a2b66e6f44c3b57b: done |++++++++++++++++++++++++++++++++++++++|
|
||||
layer-sha256:c5f2ffd06d8b1667c198d4f9a780b55c86065341328ab4f59d60dc996ccd5817: done |++++++++++++++++++++++++++++++++++++++|
|
||||
layer-sha256:520797292d9250932259d95f471bef1f97712030c1d364f3f297260e5fee1de8: done |++++++++++++++++++++++++++++++++++++++|
|
||||
elapsed: 4.2 s
|
||||
```
|
||||
|
||||
#### Start container
|
||||
|
||||
Containerd now has access to the Nvidia GPU/drivers:
|
||||
|
||||
```shell
|
||||
sudo ctr run --rm --gpus 0 docker.io/nvidia/cuda:12.3.1-base-ubuntu20.04 nvidia-smi nvidia-smi
|
||||
```
|
||||
|
||||
```text
|
||||
Thu Dec 14 23:18:55 2023
|
||||
+-----------------------------------------------------------------------------+
|
||||
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.3 |
|
||||
|-------------------------------+----------------------+----------------------+
|
||||
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
|
||||
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|
||||
| | | MIG M. |
|
||||
|===============================+======================+======================|
|
||||
| 0 NVIDIA GeForce ... On | 00000000:00:10.0 Off | N/A |
|
||||
| 0% 41C P8 11W / 160W | 1MiB / 4096MiB | 0% Default |
|
||||
| | | N/A |
|
||||
+-------------------------------+----------------------+----------------------+
|
||||
|
||||
+-----------------------------------------------------------------------------+
|
||||
| Processes: |
|
||||
| GPU GI CI PID Type Process name GPU Memory |
|
||||
| ID ID Usage |
|
||||
|=============================================================================|
|
||||
| No running processes found |
|
||||
+-----------------------------------------------------------------------------+
|
||||
```
|
||||
|
||||
### Set the GPU available in the Kubernetes Node
|
||||
|
||||
We **still** don't have the GPU available in the node:
|
||||
|
||||
```shell
|
||||
kubectl describe nodes | tr -d '\000' | sed -n -e '/^Name/,/Roles/p' -e '/^Capacity/,/Allocatable/p' -e '/^Allocated resources/,/Events/p' | grep -e Name -e nvidia.com | perl -pe 's/\n//' | perl -pe 's/Name:/\n/g' | sed 's/nvidia.com\/gpu:\?//g' | sed '1s/^/Node Available(GPUs) Used(GPUs)/' | sed 's/$/ 0 0 0/' | awk '{print $1, $2, $3}' | column -t
|
||||
```
|
||||
|
||||
```text
|
||||
Node Available(GPUs) Used(GPUs)
|
||||
pi4.filter.home 0 0
|
||||
slave01.filter.home 0 0
|
||||
slave02.filter.home 0 0
|
||||
slave03.filter.home 0 0
|
||||
slave04.filter.home 0 0
|
||||
```
|
||||
|
||||
#### Update
|
||||
|
||||
Update the Containerd config with the following settings.

Obviously, back up the config before modifying the file.
|
||||
|
||||
```toml
|
||||
# /etc/containerd/config.toml
|
||||
version = 2
|
||||
[plugins]
|
||||
[plugins."io.containerd.grpc.v1.cri"]
|
||||
[plugins."io.containerd.grpc.v1.cri".containerd]
|
||||
default_runtime_name = "nvidia"
|
||||
|
||||
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
|
||||
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
|
||||
privileged_without_host_devices = false
|
||||
runtime_engine = ""
|
||||
runtime_root = ""
|
||||
runtime_type = "io.containerd.runc.v2"
|
||||
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
|
||||
BinaryName = "/usr/bin/nvidia-container-runtime"
|
||||
```
|
||||
#### Restart containerd (again)
|
||||
|
||||
```shell
|
||||
sudo systemctl restart containerd
|
||||
```
|
||||
|
||||
#### Check status from Containerd
|
||||
|
||||
Check if Containerd has initialized correctly after restarting the service.
|
||||
|
||||
```shell
|
||||
sudo systemctl status containerd
|
||||
```
|
||||
|
||||
#### Set some labels to avoid spread
|
||||
|
||||
We will deploy the Nvidia CRDs, so we tag the Kubernetes nodes that **won't** have a GPU available, to avoid scheduling GPU-related workloads on them.
|
||||
|
||||
```shell
|
||||
kubectl label nodes slave0{1..3}.filter.home nvidia.com/gpu.deploy.operands=false
|
||||
```
|
||||
|
||||
#### Deploy nvidia operators
|
||||
|
||||
"Why this `--set` flags?"
|
||||
|
||||
- Cause that's what worked out for me. Don't like it? Want to explore? Just try which combination works for you idk.
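If the `nvidia` Helm repo isn't registered yet, it can be added first (the URL is NVIDIA's public chart repo):

```shell
# Assumes the repo hasn't been added before
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
```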
|
||||
|
||||
```shell
|
||||
helm install --wait --generate-name \
|
||||
nvidia/gpu-operator \
|
||||
--set operator.defaultRuntime="containerd"\
|
||||
-n gpu-operator \
|
||||
--set driver.enabled=false \
|
||||
--set toolkit.enabled=false
|
||||
```
|
||||
|
||||
### Check running pods
|
||||
|
||||
Check that all the pods are running (or have completed):
|
||||
|
||||
```shell
|
||||
kubectl get pods -n gpu-operator -owide
|
||||
```
|
||||
```text
|
||||
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
|
||||
gpu-feature-discovery-4nctr 1/1 Running 0 9m34s 172.16.241.67 slave04.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-gc-79d6bb94h6fht 1/1 Running 0 9m57s 172.16.176.63 slave03.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-master-64c5nwww4 1/1 Running 0 9m57s 172.16.86.110 pi4.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-worker-72wqk 1/1 Running 0 9m57s 172.16.106.5 slave02.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-worker-7snt4 1/1 Running 0 9m57s 172.16.86.111 pi4.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-worker-9ngnw 1/1 Running 0 9m56s 172.16.176.5 slave03.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-worker-csnfq 1/1 Running 0 9m56s 172.16.241.123 slave04.filter.home <none> <none>
|
||||
gpu-operator-1702608759-node-feature-discovery-worker-k6dxf 1/1 Running 0 9m57s 172.16.247.8 slave01.filter.home <none> <none>
|
||||
gpu-operator-fcbd9bbd7-fv5kb 1/1 Running 0 9m57s 172.16.86.116 pi4.filter.home <none> <none>
|
||||
nvidia-cuda-validator-xjfkr 0/1 Completed 0 5m37s 172.16.241.126 slave04.filter.home <none> <none>
|
||||
nvidia-dcgm-exporter-q8kk4 1/1 Running 0 9m35s 172.16.241.125 slave04.filter.home <none> <none>
|
||||
nvidia-device-plugin-daemonset-vvz4c 1/1 Running 0 9m35s 172.16.241.127 slave04.filter.home <none> <none>
|
||||
nvidia-operator-validator-8899m 1/1 Running 0 9m35s 172.16.241.124 slave04.filter.home <none> <none>
|
||||
```
|
||||
|
||||
### Done!
|
||||
|
||||
```shell
|
||||
kubectl describe nodes | tr -d '\000' | sed -n -e '/^Name/,/Roles/p' -e '/^Capacity/,/Allocatable/p' -e '/^Allocated resources/,/Events/p' | grep -e Name -e nvidia.com | perl -pe 's/\n//' | perl -pe 's/Name:/\n/g' | sed 's/nvidia.com\/gpu:\?//g' | sed '1s/^/Node Available(GPUs) Used(GPUs)/' | sed 's/$/ 0 0 0/' | awk '{print $1, $2, $3}' | column -t
|
||||
```
|
||||
|
||||
```text
|
||||
Node Available(GPUs) Used(GPUs)
|
||||
pi4.filter.home 0 0
|
||||
slave01.filter.home 0 0
|
||||
slave02.filter.home 0 0
|
||||
slave03.filter.home 0 0
|
||||
slave04.filter.home 1 0
|
||||
```
|
||||
|
||||
### vGPU
|
||||
|
||||
I could use vGPU and split my GPU among multiple VMs, but it would also mean that the GPU no longer outputs to the physical monitor attached to the Proxmox PC/server, which I would like to avoid.
|
||||
|
||||
While it's certainly not a requirement (I only use the monitor in emergencies, or whenever I need to touch the BIOS or install a new OS), I **still** don't own a serial connector, so I will consider switching to vGPU **in the future** (whenever I receive the package from AliExpress and confirm it works).
|
||||
|
||||
|
||||
|
||||
[//]: # (```shell)
|
||||
|
||||
[//]: # (kubectl events pods --field-selector status.phase!=Running -n gpu-operator)
|
||||
|
||||
[//]: # (```)
|
||||
|
||||
[//]: # ()
|
||||
[//]: # (```shell)
|
||||
|
||||
[//]: # (kubectl get pods --field-selector status.phase!=Running -n gpu-operator | awk '{print $1}' | tail -n +2 | xargs kubectl events -n gpu-operator pods)
|
||||
[//]: # (```)
|
||||
|
||||
|
||||
## Jellyfin GPU Acceleration
|
||||
|
||||
- [x] Configured Jellyfin with GPU acceleration
|
||||
|
||||
## Make Cluster HA
|
||||
|
||||
- [ ] Done
|
||||
- [x] Aborted
|
||||
|
||||
Since it would mostly require recreating the cluster, plus I would want the DNS/DHCP service externalized from the cluster, or a load balancer external to the cluster, etc. etc.
|
||||
|
||||
So I'd rather have a cluster with 2 points of failure:
|
||||
|
||||
- Single control plane
|
||||
- No HA NFS/NAS
|
||||
|
||||
than have an Ouroboros for a cluster.
|
||||
|
||||
I also just thought about having a DNS failover.

But that's not the current case.
|
||||
|
||||
## Update rest of the stuff/configs as required to match the new Network distribution
|
||||
|
||||
Which stuff?
|
||||
|
||||
IDK, it's here in case I'm forgetting something.
|
||||
|
||||
- [x] Done, aka everything seems to be running correctly.
|
||||
|
||||
## Migrate Jenkins
|
||||
|
||||
https://devopscube.com/jenkins-build-agents-kubernetes/
|
||||
|
||||
https://www.jenkins.io/doc/book/installing/kubernetes/
|
||||
|
||||
- [x] Done
|
||||
|
||||
## Skaffold
|
||||
|
||||
- Learned to use Skaffold, though it still requires manual execution.
|
||||
|
||||
- It's great tho
|
||||
|
||||
https://skaffold.dev/docs/references/yaml/
|
||||
|
||||
https://skaffold.dev/docs/builders/cross-platform/
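"Manual execution" means something along these lines (a sketch; it assumes a `skaffold.yaml` at the repo root, and the registry name is a placeholder):

```shell
# Build multi-arch images as described in the cross-platform docs linked above
skaffold build --platform=linux/amd64,linux/arm64 --default-repo=registry.example.com
```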
|
||||
|
||||
## CI/CD Container creation
|
||||
|
||||
I have decided to dump my old Jenkins architecture and rely on Skaffold; it's great.
|
||||
|
||||
I will work on integrating it with Jenkins.
|
||||
|
||||
# EXTRA EXTRA
|
||||
|
||||
## Secondary NFS provisioner
|
||||
|
||||
I will add a **secondary NFS Provisioner** as a new storage class.
|
||||
|
||||
This storage class will be targeting a **"slow"/HDD** directory/drive.
|
||||
|
||||
Mainly intended for storing a bunch of logs, files, videos, or whatever.
|
||||
|
||||
Looking at you Prometheus 👀👀.
|
||||
|
||||
NFS server: nfs.filter.home
|
||||
|
||||
Target directory: **/resources/slow_nfs_provisioner** (this is made up, I don't want to share it.)
|
||||
|
||||
## NFS Server
|
||||
|
||||
### Create the directory
|
||||
|
||||
- [x] Done
|
||||
|
||||
### Update NFS service config to allow such directory to be used.
|
||||
|
||||
- [x] Done
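For reference, "allowing the directory to be used" is just an extra export entry (the path is the made-up one from above, and the subnet is a placeholder):

```shell
# Add the new "slow" directory to the NFS exports and reload them
echo '/resources/slow_nfs_provisioner 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)' >> /etc/exports
exportfs -ra
```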
|
||||
|
||||
## Deploy new NFS provisioner
|
||||
|
||||
```shell
|
||||
NFS_SERVER=nfs.filter.home
|
||||
NFS_EXPORT_PATH=/resources/slow_nfs_provisioner
|
||||
```
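If the chart repo isn't added yet (URL from the chart's upstream project):

```shell
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm repo update
```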
|
||||
|
||||
```shell
|
||||
helm -n nfs-provisioner install slow-nfs-01 nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
|
||||
--set nfs.server=${NFS_SERVER} \
|
||||
--set nfs.path=${NFS_EXPORT_PATH} \
|
||||
--set storageClass.defaultClass=true \
|
||||
--set replicaCount=2 \
|
||||
--set storageClass.name=slow-nfs-01 \
|
||||
--set storageClass.provisionerName=slow-nfs-01
|
||||
```
|
||||
```text
|
||||
NAME: slow-nfs-provisioner-01
|
||||
LAST DEPLOYED: Fri Jan 12 23:32:25 2024
|
||||
NAMESPACE: nfs-provisioner
|
||||
STATUS: deployed
|
||||
REVISION: 1
|
||||
TEST SUITE: None
|
||||
```
|
||||
|
||||
## Migrate some volumes to new dir
|
||||
|
||||
### Prometheus
|
||||
|
||||
(because he's the one filling my SSD.)
|
||||
|
||||
Copy files from (maintaining permissions):
|
||||
|
||||
**/resources/slow_nfs_provisioner/prometheus_generated_vol** to **/resources/slow_nfs_provisioner/prometheus_tmp**
|
||||
|
||||
This is mainly to "have them" already on the destination drive; the folder name can be whatever.
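Something like this keeps ownership and permissions intact (paths are the made-up ones from above):

```shell
# -a preserves ownership, permissions and timestamps; the trailing /. copies the directory contents
cp -a /resources/slow_nfs_provisioner/prometheus_generated_vol/. /resources/slow_nfs_provisioner/prometheus_tmp/
```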
|
||||
|
||||
### Create/Provision new PV
|
||||
|
||||
Since the `path` value is immutable after creation, this would require creating a new volume, moving the contents to the new volume, updating the configs to match the new volume, recreating the workloads, and then deleting the old one.
|
||||
|
||||
Since this is my homelab and I'm not bothered by a few minutes of lost logs, I will instead delete the old volume, delete the deployment using it, create a new volume, and then rename the `prometheus_tmp` folder I created in the previous step to replace the created volume (since the new volume is empty).
|
||||
|
||||
Then restart the Kubernetes deployment.
|
||||
|
||||
## Delete PVC
|
||||
|
||||
```shell
|
||||
kubectl delete pvc -n observability prometheus-storage --force
|
||||
```
|
||||
|
||||
This can take a bit, since there are like 40GB of logs and the volume is still being used by the deployment.
|
||||
|
||||
```shell
|
||||
kubectl get pvc -n observability prometheus-storage
|
||||
```
|
||||
|
||||
```text
|
||||
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
|
||||
prometheus-storage Terminating pvc-698cf837-14a3-43ee-990a-5a34e1a396de 1Gi RWX nfs-01 94d
|
||||
```
|
||||
|
||||
### Delete Deployment
|
||||
|
||||
```shell
|
||||
kubectl delete deployment -n observability prometheus
|
||||
```
|
||||
|
||||
```text
|
||||
deployment.apps "prometheus" deleted
|
||||
```
|
||||
|
||||
### Delete PV
|
||||
|
||||
```shell
|
||||
kubectl delete pv pvc-698cf837-14a3-43ee-990a-5a34e1a396de
|
||||
```
|
||||
|
||||
```text
|
||||
persistentvolume "pvc-698cf837-14a3-43ee-990a-5a34e1a396de" deleted
|
||||
```
|
||||
|
||||
### Create new volume.
|
||||
|
||||
```shell
|
||||
kubectl create -f PrometheusVolume.yaml
|
||||
```
|
||||
|
||||
```text
|
||||
persistentvolumeclaim/prometheus-storage created
|
||||
```
|
||||
|
||||
I later did some cleanup of the existing data, because 41GB was kind of too much for my usage (aka I noticed that the `prometheus-server` container was taking forever to parse all the data).
|
||||
|
||||
Later I will change the configuration to reduce the retention and the amount of data stored.
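The knobs I mean are Prometheus' standard retention flags; a sketch with placeholder values (not my actual config):

```shell
# Bound retention by time and by size (values are placeholders)
prometheus --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.retention.time=15d \
  --storage.tsdb.retention.size=10GB
```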
|
||||
|
||||
### Redeployed Prometheus
|
||||
|
||||
It's been a while since I did the deployment.
|
||||
|
||||
```bash
|
||||
kubectl get deployment -n observability prometheus
|
||||
```
|
||||
|
||||
```text
|
||||
NAME READY UP-TO-DATE AVAILABLE AGE
|
||||
prometheus 1/1 1 1 3h24m
|
||||
```
|
||||
|
||||
# Interesting
|
||||
|
||||
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#cross-namespace-data-sources
|
Migrations/02-Say_HI_to_Proxmox/dhcp_notes.md (new file, +25 lines)
|
||||
# Initial notes

```
.1 Gateway

.2/3 DHCP-DNS

9-6 Kubernetes masters.
10-15 Kubernetes slaves.

20 Public Ingress
21 Local Ingress
22-38 Kubernetes LBs/Deployments/Services
39 Egress gateway

50-60 Standalone Hosts
61-70 Proxmox

100-120 VMs

140-149 Handpicked client hosts

150-200 DHCP range

250-255 Wifi and stuff
```
|
Migrations/03-Virtualize_MasterK/README.md (new file, +392 lines)
|
||||
# Description
|
||||
|
||||
Very much what the title says.
|
||||
|
||||
0. Search.
|
||||
1. Create Proxmox VM and install OS on it.
|
||||
2. Install cluster thingies to the VM.
|
||||
3. Backup Cluster/Master Node
|
||||
4. Stop Old Master Node
|
||||
5. Restore Cluster on New Master Node
|
||||
6. Update New Master Node IP to Use the Old Master Node IP
|
||||
7. Rejoin All Nodes to the "New Cluster"
|
||||
|
||||
|
||||
# Notes
|
||||
|
||||
## Possible issues?
|
||||
|
||||
- The master node name might present some discrepancies; will need to test.
|
||||
|
||||
- When the cluster is restored on the new master node, grant that client access on the NFS server.
|
||||
|
||||
## Virtual Master Hardware
|
||||
|
||||
- 2 CPU Cores
|
||||
|
||||
- 8 GB of RAM
|
||||
|
||||
# Procedure
|
||||
|
||||
- [x] VM created
- [x] OS (Debian) installed
- [x] Edit the cluster-setup Ansible installer script to allow stopping right after installing the necessary packages/stuff
- [x] Install the guest agent in all the VMs (I kinda forgot about that)
- [x] Backup VM
- [x] Follow the guide from below
- [ ] Perform another backup of the control plane VM
|
||||
|
||||
# Links
|
||||
|
||||
I'm going to be following this:
|
||||
|
||||
https://serverfault.com/questions/1031093/migration-of-kubernetes-master-node-from-1-server-to-another-server
|
||||
|
||||
[//]: # ()
|
||||
[//]: # (# Backup ETCD Kubernetes)
|
||||
|
||||
[//]: # ()
|
||||
[//]: # (https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/)
|
||||
|
||||
[//]: # ()
|
||||
[//]: # ()
|
||||
|
||||
# Verify your etcd data directory
|
||||
|
||||
SSH into the masterk node.
|
||||
|
||||
```shell
|
||||
kubectl get pods -n kube-system etcd-pi4.filter.home -oyaml | less
|
||||
```
|
||||
|
||||
```yaml
|
||||
...
  volumeMounts:
  - mountPath: /var/lib/etcd
    name: etcd-data
  - mountPath: /etc/kubernetes/pki/etcd
    name: etcd-certs
...
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
```
|
||||
|
||||
# Copy from old_master to new_master
|
||||
|
||||
> Why **bakup** instead of ba**ck**up? Because I want to use the K as Kubernetes.
|
||||
|
||||
## On new_master
|
||||
|
||||
```shell
|
||||
mkdir bakup
|
||||
```
|
||||
|
||||
## on OLD_master
|
||||
|
||||
```shell
|
||||
sudo scp -r /etc/kubernetes/pki master2@192.168.1.173:~/bakup/
|
||||
```
|
||||
|
||||
```console
|
||||
healthcheck-client.key 100% 1679 577.0KB/s 00:00
|
||||
server.crt 100% 1216 1.1MB/s 00:00
|
||||
server.key 100% 1679 1.1MB/s 00:00
|
||||
peer.crt 100% 1216 440.5KB/s 00:00
|
||||
ca.crt 100% 1094 461.5KB/s 00:00
|
||||
healthcheck-client.crt 100% 1159 417.8KB/s 00:00
|
||||
ca.key 100% 1679 630.8KB/s 00:00
|
||||
peer.key 100% 1679 576.4KB/s 00:00
|
||||
front-proxy-client.crt 100% 1119 859.7KB/s 00:00
|
||||
front-proxy-ca.key 100% 1679 672.4KB/s 00:00
|
||||
ca.crt 100% 1107 386.8KB/s 00:00
|
||||
sa.pub 100% 451 180.7KB/s 00:00
|
||||
front-proxy-client.key 100% 1679 1.4MB/s 00:00
|
||||
apiserver-etcd-client.key 100% 1675 1.3MB/s 00:00
|
||||
apiserver.crt 100% 1294 819.1KB/s 00:00
|
||||
ca.key 100% 1679 1.3MB/s 00:00
|
||||
sa.key 100% 1679 1.5MB/s 00:00
|
||||
apiserver-kubelet-client.crt 100% 1164 908.2KB/s 00:00
|
||||
apiserver-kubelet-client.key 100% 1679 1.2MB/s 00:00
|
||||
apiserver-etcd-client.crt 100% 1155 927.9KB/s 00:00
|
||||
apiserver.key 100% 1675 1.4MB/s 00:00
|
||||
front-proxy-ca.crt 100% 1123 939.7KB/s 00:00
|
||||
```
|
||||
|
||||
## Remove "OLD" certs from the backup created
|
||||
|
||||
### on new_master
|
||||
|
||||
```shell
|
||||
rm ~/bakup/pki/{apiserver.*,etcd/peer.*}
|
||||
```
|
||||
|
||||
```console
|
||||
removed '~/bakup/pki/apiserver.crt'
|
||||
removed '~/bakup/pki/apiserver.key'
|
||||
removed '~/bakup/pki/etcd/peer.crt'
|
||||
removed '~/bakup/pki/etcd/peer.key'
|
||||
```
|
||||
|
||||
## Move backup Kubernetes to the kubernetes directory (new_master)
|
||||
|
||||
```shell
|
||||
cp -r ~/bakup/pki /etc/kubernetes/
|
||||
```
|
||||
|
||||
```console
|
||||
'~/bakup/pki' -> '/etc/kubernetes/pki'
|
||||
'~/bakup/pki/etcd' -> '/etc/kubernetes/pki/etcd'
|
||||
'~/bakup/pki/etcd/healthcheck-client.key' -> '/etc/kubernetes/pki/etcd/healthcheck-client.key'
|
||||
'~/bakup/pki/etcd/server.crt' -> '/etc/kubernetes/pki/etcd/server.crt'
|
||||
'~/bakup/pki/etcd/server.key' -> '/etc/kubernetes/pki/etcd/server.key'
|
||||
'~/bakup/pki/etcd/ca.crt' -> '/etc/kubernetes/pki/etcd/ca.crt'
|
||||
'~/bakup/pki/etcd/healthcheck-client.crt' -> '/etc/kubernetes/pki/etcd/healthcheck-client.crt'
|
||||
'~/bakup/pki/etcd/ca.key' -> '/etc/kubernetes/pki/etcd/ca.key'
|
||||
'~/bakup/pki/front-proxy-client.crt' -> '/etc/kubernetes/pki/front-proxy-client.crt'
|
||||
'~/bakup/pki/front-proxy-ca.key' -> '/etc/kubernetes/pki/front-proxy-ca.key'
|
||||
'~/bakup/pki/ca.crt' -> '/etc/kubernetes/pki/ca.crt'
|
||||
'~/bakup/pki/sa.pub' -> '/etc/kubernetes/pki/sa.pub'
|
||||
'~/bakup/pki/front-proxy-client.key' -> '/etc/kubernetes/pki/front-proxy-client.key'
|
||||
'~/bakup/pki/apiserver-etcd-client.key' -> '/etc/kubernetes/pki/apiserver-etcd-client.key'
|
||||
'~/bakup/pki/ca.key' -> '/etc/kubernetes/pki/ca.key'
|
||||
'~/bakup/pki/sa.key' -> '/etc/kubernetes/pki/sa.key'
|
||||
'~/bakup/pki/apiserver-kubelet-client.crt' -> '/etc/kubernetes/pki/apiserver-kubelet-client.crt'
|
||||
'~/bakup/pki/apiserver-kubelet-client.key' -> '/etc/kubernetes/pki/apiserver-kubelet-client.key'
|
||||
'~/bakup/pki/apiserver-etcd-client.crt' -> '/etc/kubernetes/pki/apiserver-etcd-client.crt'
|
||||
'~/bakup/pki/front-proxy-ca.crt' -> '/etc/kubernetes/pki/front-proxy-ca.crt'
|
||||
```
|
||||
|
||||
## ETCD snapshot on OLD_master
|
||||
|
||||
|
||||
### from Kubectl
|
||||
|
||||
Check etcd api version.
|
||||
|
||||
```shell
|
||||
kubectl exec -it etcd-pi4.filter.home -n kube-system -- etcdctl version
|
||||
```
|
||||
|
||||
```console
|
||||
etcdctl version: 3.5.10
|
||||
API version: 3.5
|
||||
```
|
||||
|
||||
### Create snapshot through etcd pod
|
||||
|
||||
```shell
|
||||
kubectl exec -it etcd-pi4.filter.home -n kube-system -- etcdctl --endpoints https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key snapshot save /var/lib/etcd/snapshot1.db
|
||||
```
|
||||
|
||||
```console
|
||||
{"level":"info","ts":"2024-03-10T04:38:23.909625Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/lib/etcd/snapshot1.db.part"}
|
||||
{"level":"info","ts":"2024-03-10T04:38:23.942816Z","logger":"client","caller":"v3@v3.5.10/maintenance.go:212","msg":"opened snapshot stream; downloading"}
|
||||
{"level":"info","ts":"2024-03-10T04:38:23.942946Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
|
||||
{"level":"info","ts":"2024-03-10T04:38:24.830242Z","logger":"client","caller":"v3@v3.5.10/maintenance.go:220","msg":"completed snapshot read; closing"}
|
||||
{"level":"info","ts":"2024-03-10T04:38:25.395294Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"19 MB","took":"1 second ago"}
|
||||
{"level":"info","ts":"2024-03-10T04:38:25.395687Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/etcd/snapshot1.db"}
|
||||
Snapshot saved at /var/lib/etcd/snapshot1.db
|
||||
```
|
||||
|
||||
### Transfer snapshot to the new_master node
|
||||
|
||||
### on the OLD_master
|
||||
|
||||
```shell
|
||||
scp /var/lib/etcd/snapshot1.db master2@192.168.1.173:~/bakup
|
||||
```
|
||||
|
||||
```text
|
||||
snapshot1.db 100% 19MB 44.0MB/s 00:00
|
||||
```
|
||||
|
||||
### Update kubeadm.config
|
||||
|
||||
### on the OLD_master
|
||||
|
||||
```shell
|
||||
kubectl get cm -n kube-system kubeadm-config -oyaml
|
||||
```
|
||||
|
||||
```text
|
||||
apiVersion: v1
|
||||
data:
|
||||
ClusterConfiguration: |
|
||||
apiServer:
|
||||
extraArgs:
|
||||
authorization-mode: Node,RBAC
|
||||
timeoutForControlPlane: 4m0s
|
||||
apiVersion: kubeadm.k8s.io/v1beta3
|
||||
certificatesDir: /etc/kubernetes/pki
|
||||
clusterName: kubernetes
|
||||
controllerManager: {}
|
||||
dns: {}
|
||||
etcd:
|
||||
local:
|
||||
dataDir: /var/lib/etcd
|
||||
imageRepository: registry.k8s.io
|
||||
kind: ClusterConfiguration
|
||||
kubernetesVersion: v1.28.7
|
||||
networking:
|
||||
dnsDomain: cluster.local
|
||||
serviceSubnet: 10.96.0.0/12
|
||||
scheduler: {}
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
creationTimestamp: "2024-02-22T21:45:42Z"
|
||||
name: kubeadm-config
|
||||
namespace: kube-system
|
||||
resourceVersion: "234"
|
||||
uid: c56b87b1-691d-4277-b66c-ab6035cead6a
|
||||
```
|
||||
|
||||
### on the new_master
|
||||
|
||||
#### Create kubeadm-config.yaml
|
||||
|
||||
```shell
|
||||
touch kubeadm-config.yaml
|
||||
```
|
||||
|
||||
I used the information from the previously displayed ConfigMap to create the following file (basically filling in the default kubeadm-config file):
|
||||
|
||||
Note that the token used differs.
|
||||
|
||||
```yaml
|
||||
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.abcdef0123456789
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.9
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: masterk
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.29.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
```
|
||||
|
||||
### Install etcdctl
|
||||
|
||||
https://github.com/etcd-io/etcd/releases/tag/v3.5.12
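A minimal sketch of grabbing the binary from that release (the target directory matches the `/tmp/etcd-download-test` path used below; the exact version and paths are otherwise assumptions):

```shell
ETCD_VER=v3.5.12
curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
  -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
mkdir -p /tmp/etcd-download-test
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1
/tmp/etcd-download-test/etcdctl version
```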
|
||||
|
||||
### Restore from snapshot into new_master
|
||||
|
||||
This time I will be using the `etcdctl` cli tool.
|
||||
|
||||
```shell
|
||||
mkdir /var/lib/etcd
|
||||
```
|
||||
|
||||
```shell
|
||||
ETCDCTL_API=3 /tmp/etcd-download-test/etcdctl --endpoints https://127.0.0.1:2379 snapshot restore './bakup/snapshot1.db' && mv ./default.etcd/member/ /var/lib/etcd/
|
||||
```
|
||||
|
||||
```console
|
||||
Deprecated: Use `etcdutl snapshot restore` instead.
|
||||
|
||||
2024-03-10T06:09:17+01:00 info snapshot/v3_snapshot.go:260 restoring snapshot {"path": "./bakup/snapshot1.db", "wal-dir": "default.etcd/member/wal", "data-dir": "default.etcd", "snap-dir": "default.etcd/member/snap"}
|
||||
2024-03-10T06:09:17+01:00 info membership/store.go:141 Trimming membership information from the backend...
|
||||
2024-03-10T06:09:18+01:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
|
||||
2024-03-10T06:09:18+01:00 info snapshot/v3_snapshot.go:287 restored snapshot {"path": "./bakup/snapshot1.db", "wal-dir": "default.etcd/member/wal", "data-dir": "default.etcd", "snap-dir": "default.etcd/member/snap"}
|
||||
```
|
||||
|
||||
### Do shenanigans to replace the OLD_node by the new_node
|
||||
|
||||
Aka the "replace the IP" maneuvers.
|
||||
|
||||
### Start new node
|
||||
|
||||
```shell
|
||||
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd --config kubeadm-config.yaml
|
||||
```
|
||||
|
||||
```console
|
||||
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd --config kubeadm-config.yaml
|
||||
[init] Using Kubernetes version: v1.29.0
|
||||
[preflight] Running pre-flight checks
|
||||
[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
|
||||
[preflight] Pulling images required for setting up a Kubernetes cluster
|
||||
[preflight] This might take a minute or two, depending on the speed of your internet connection
|
||||
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
|
||||
W0310 06:42:10.268972 1600 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
|
||||
[certs] Using certificateDir folder "/etc/kubernetes/pki"
|
||||
```
|
||||
|
||||
|
||||
## Join "old nodes" into the "new masterk"
|
||||
|
||||
To my surprise, I didn't need to rejoin the nodes, only remove the old control plane.
|
||||
|
||||
```shell
|
||||
kubectl get nodes
|
||||
```
|
||||
```console
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
masterk.filter.home Ready control-plane 4m59s v1.29.2
|
||||
pi4.filter.home NotReady control-plane 16d v1.29.2
|
||||
slave01.filter.home Ready <none> 10d v1.29.2
|
||||
slave02.filter.home Ready <none> 16d v1.29.2
|
||||
slave03.filter.home Ready <none> 16d v1.29.2
|
||||
slave04.filter.home Ready <none> 16d v1.29.2
|
||||
```
|
||||
|
||||
```shell
|
||||
kubectl delete node pi4.filter.home
|
||||
```
|
||||
|
||||
```console
|
||||
node "pi4.filter.home" deleted
|
||||
```
|
||||
|
||||
|
||||
```shell
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
|
||||
```console
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
masterk.filter.home Ready control-plane 5m20s v1.29.2
|
||||
slave01.filter.home Ready <none> 10d v1.29.2
|
||||
slave02.filter.home Ready <none> 16d v1.29.2
|
||||
slave03.filter.home Ready <none> 16d v1.29.2
|
||||
slave04.filter.home Ready <none> 16d v1.29.2
|
||||
```
|
||||
|
||||
|
||||
So, pretty much done. Since I didn't need to rejoin, I will be paying extra attention to the nodes for a while.
|
||||
|
Migrations/04-Upgrade_Kubeadm_1-29/01-Istio/README.md (new file, +178 lines)
|
||||
# Istio supported versions
|
||||
|
||||
https://istio.io/latest/docs/releases/supported-releases/
|
||||
|
||||
1.24 -> supports up to 1.28, so the current version is included.
|
||||
|
||||
# Current info
|
||||
|
||||
```shell
|
||||
➜ bin kubectl get pod -n istio-system -oyaml | grep image | grep pilot
|
||||
image: docker.io/istio/pilot:1.20.3
|
||||
image: docker.io/istio/pilot:1.20.3
|
||||
imageID: docker.io/istio/pilot@sha256:aadac7d3a0ca402bcbc961a5419c786146aab5f335892c166223fa1c025dda6e
|
||||
```
|
||||
|
||||
# Changelogs
|
||||
|
||||
# Upgrade process
|
||||
|
||||
https://istio.io/latest/docs/setup/upgrade/in-place/
|
||||
|
||||
## 1.20
|
||||
|
||||
https://istio.io/latest/news/releases/1.20.x/announcing-1.20/upgrade-notes/#upcoming-externalname-support-changes
|
||||
|
||||
## 1.21
|
||||
|
||||
https://istio.io/latest/news/releases/1.21.x/announcing-1.21/upgrade-notes/#externalname-support-changes
|
||||
https://istio.io/latest/news/releases/1.21.x/announcing-1.21/upgrade-notes/#default-value-of-the-feature-flag-verify_cert_at_client-is-set-to-true
|
||||
|
||||
## 1.22
|
||||
|
||||
https://istio.io/latest/news/releases/1.22.x/announcing-1.22/upgrade-notes/#default-value-of-the-feature-flag-enhanced_resource_scoping-to-true
|
||||
|
||||
## 1.23
|
||||
|
||||
> If you do not use Istio APIs from Go (via istio.io/api or istio.io/client-go) or Protobuf (from istio.io/api), this change does not impact you.
|
||||
|
||||
## 1.24
|
||||
|
||||
https://istio.io/latest/news/releases/1.24.x/announcing-1.24/upgrade-notes/#updated-compatibility-profiles
|
||||
https://istio.io/latest/news/releases/1.24.x/announcing-1.24/upgrade-notes/#compatibility-with-cert-managers-istio-csr
|
||||
|
||||
Seems fine so far.
|
||||
|
||||
## Upgrade to 1.21
|
||||
|
||||
```shell
|
||||
export ISTIO_VERSION=1.21.0
|
||||
cd /tmp
|
||||
curl -L https://istio.io/downloadIstio | TARGET_ARCH=x86_64 sh -
|
||||
cd istio-${ISTIO_VERSION}/bin || exit
|
||||
./istioctl x precheck
|
||||
./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
```
|
||||
|
||||
https://cloud.ibm.com/docs/containers?topic=containers-istio&interface=ui#istio_minor
|
||||
|
||||
```shell
|
||||
...
|
||||
WARNING: Istio 1.21.0 may be out of support (EOL) already: see https://istio.io/latest/docs/releases/supported-releases/ for supported releases
|
||||
This will install the Istio 1.21.0 "minimal" profile (with components: Istio core and Istiod) into the cluster. Proceed? (y/N) This will install the Istio 1.21.0 "minimal" profile (with components: Istio core and Istiod) into the cluster. Proceed? (y/N) y
|
||||
Error: failed to install manifests: errors occurred during operation: creating default tag would conflict:
|
||||
Error [IST0139] (MutatingWebhookConfiguration istio-sidecar-injector) Webhook overlaps with others: [istio-revision-tag-default/namespace.sidecar-injector.istio.io]. This may cause injection to occur twice.
|
||||
Error [IST0139] (MutatingWebhookConfiguration istio-sidecar-injector) Webhook overlaps with others: [istio-revision-tag-default/object.sidecar-injector.istio.io]. This may cause injection to occur twice.
|
||||
Error [IST0139] (MutatingWebhookConfiguration istio-sidecar-injector) Webhook overlaps with others: [istio-revision-tag-default/rev.namespace.sidecar-injector.istio.io]. This may cause injection to occur twice.
|
||||
Error [IST0139] (MutatingWebhookConfiguration istio-sidecar-injector) Webhook overlaps with others: [istio-revision-tag-default/rev.object.sidecar-injector.istio.io]. This may cause injection to occur twice.
|
||||
```
|
||||
|
||||
|
||||
```shell
|
||||
➜ ~ kubectl delete mutatingwebhookconfigurations istio-sidecar-injector
|
||||
mutatingwebhookconfiguration.admissionregistration.k8s.io "istio-sidecar-injector" deleted
|
||||
➜ ~ kubectl delete mutatingwebhookconfigurations istio-revision-tag-default
|
||||
mutatingwebhookconfiguration.admissionregistration.k8s.io "istio-revision-tag-default" deleted
|
||||
```
|
||||
|
||||
```shell
|
||||
➜ ~ kubectl get mutatingwebhookconfigurations
|
||||
NAME WEBHOOKS AGE
|
||||
cert-manager-webhook 1 217d
|
||||
istio-revision-tag-default 4 19s
|
||||
istio-sidecar-injector 4 20s
|
||||
```
|
||||
|
||||
## Upgrade to 1.22
|
||||
|
||||
```shell
|
||||
export ISTIO_VERSION=1.22.0
|
||||
cd /tmp
|
||||
curl -L https://istio.io/downloadIstio | TARGET_ARCH=x86_64 sh -
|
||||
cd istio-${ISTIO_VERSION}/bin || exit
|
||||
./istioctl x precheck
|
||||
./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
```
|
||||
|
||||
```text
|
||||
WARNING: Istio 1.22.0 may be out of support (EOL) already: see https://istio.io/latest/docs/releases/supported-releases/ for supported releases
|
||||
WARNING: Istio is being upgraded from 1.21.0 to 1.22.0.
|
||||
Running this command will overwrite it; use revisions to upgrade alongside the existing version.
|
||||
Before upgrading, you may wish to use 'istioctl x precheck' to check for upgrade warnings.
|
||||
This will install the Istio 1.22.0 "minimal" profile (with components: Istio core and Istiod) into the cluster. Proceed? (y/N) y
|
||||
✔ Istio core installed
|
||||
✔ Istiod installed
|
||||
✔ Installation complete
|
||||
Made this installation the default for injection and validation.
|
||||
```
|
||||
|
||||
## Upgrade to 1.23
|
||||
|
||||
```shell
|
||||
export ISTIO_VERSION=1.23.0
|
||||
cd /tmp
|
||||
curl -L https://istio.io/downloadIstio | TARGET_ARCH=x86_64 sh -
|
||||
cd istio-${ISTIO_VERSION}/bin || exit
|
||||
./istioctl x precheck
|
||||
./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
```
|
||||
|
||||
```text
|
||||
✔ No issues found when checking the cluster. Istio is safe to install or upgrade!
|
||||
To get started, check out https://istio.io/latest/docs/setup/getting-started/
|
||||
➜ bin ./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
WARNING: Istio is being upgraded from 1.22.0 to 1.23.0.
|
||||
Running this command will overwrite it; use revisions to upgrade alongside the existing version.
|
||||
Before upgrading, you may wish to use 'istioctl x precheck' to check for upgrade warnings.
|
||||
This will install the Istio 1.23.0 "minimal" profile (with components: Istio core and Istiod) into the cluster. Proceed? (y/N) y
|
||||
✔ Istio core installed ⛵️
|
||||
✔ Istiod installed 🧠
|
||||
✔ Installation complete
|
||||
Made this installation the default for cluster-wide operations.
|
||||
```
|
||||
|
||||
## Upgrade to 1.24
|
||||
|
||||
```shell
|
||||
export ISTIO_VERSION=1.24.0
|
||||
cd /tmp
|
||||
curl -L https://istio.io/downloadIstio | TARGET_ARCH=x86_64 sh -
|
||||
cd istio-${ISTIO_VERSION}/bin || exit
|
||||
./istioctl x precheck
|
||||
./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
```
|
||||
|
||||
```text
|
||||
➜ bin ./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
WARNING: Istio is being upgraded from 1.23.0 to 1.24.0.
|
||||
Running this command will overwrite it; use revisions to upgrade alongside the existing version.
|
||||
Before upgrading, you may wish to use 'istioctl x precheck' to check for upgrade warnings.
|
||||
This will install the Istio 1.24.0 profile "minimal" into the cluster. Proceed? (y/N) y
|
||||
✔ Istio core installed ⛵️
|
||||
✔ Istiod installed 🧠
|
||||
✔ Installation complete
|
||||
```
|
||||
|
||||
## Upgrade to 1.24.3
|
||||
|
||||
```shell
|
||||
export ISTIO_VERSION=1.24.3
|
||||
cd /tmp
|
||||
curl -L https://istio.io/downloadIstio | TARGET_ARCH=x86_64 sh -
|
||||
cd istio-${ISTIO_VERSION}/bin || exit
|
||||
./istioctl x precheck
|
||||
./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
```
|
||||
|
||||
```text
|
||||
✔ No issues found when checking the cluster. Istio is safe to install or upgrade!
|
||||
To get started, check out https://istio.io/latest/docs/setup/getting-started/.
|
||||
➜ bin ./istioctl upgrade -f /home/goblin/Home_Config/Istio/Operators/IstioOperator_IstioConfig.yaml
|
||||
WARNING: Istio is being upgraded from 1.24.0 to 1.24.3.
|
||||
Running this command will overwrite it; use revisions to upgrade alongside the existing version.
|
||||
Before upgrading, you may wish to use 'istioctl x precheck' to check for upgrade warnings.
|
||||
This will install the Istio 1.24.3 profile "minimal" into the cluster. Proceed? (y/N) y
|
||||
✔ Istio core installed ⛵️
|
||||
✔ Istiod installed 🧠
|
||||
✔ Installation complete
|
||||
```
|
Migrations/04-Upgrade_Kubeadm_1-29/02-Calico/README.md (new file, +38 lines)
|
||||
# Current info
|
||||
|
||||
```shell
|
||||
➜ ~ kubectl get -n kube-system pod calico-kube-controllers-6d88486588-ghxg9 -oyaml | grep image | grep calico
|
||||
image: docker.io/calico/kube-controllers:v3.27.0
|
||||
image: docker.io/calico/kube-controllers:v3.27.0
|
||||
```
|
||||
|
||||
# Supported versions
|
||||
|
||||
3.29 is supported on Kubernetes 1.29.
|
||||
|
||||
https://docs.tigera.io/calico/latest/getting-started/kubernetes/requirements#supported-versions
|
||||
|
||||
# Docs
|
||||
|
||||
https://docs.tigera.io/calico/latest/operations/upgrading/kubernetes-upgrade#upgrading-an-installation-that-uses-manifests-and-the-kubernetes-api-datastore
|
||||
|
||||
## 3.28
|
||||
|
||||
```shell
|
||||
cd /tmp
|
||||
curl https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml -o upgrade.yaml && kubectl apply -f upgrade.yaml
|
||||
```
|
||||
|
||||
## 3.29
|
||||
|
||||
```shell
|
||||
cd /tmp
|
||||
curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/calico.yaml -o upgrade.yaml && kubectl apply -f upgrade.yaml
|
||||
```
|
||||
|
||||
## 3.29.2
|
||||
|
||||
```shell
|
||||
cd /tmp
|
||||
curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.2/manifests/calico.yaml -o upgrade.yaml && kubectl apply -f upgrade.yaml
|
||||
```
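After each apply I like to confirm the rollout finished (an optional sanity check, not a verbatim step from the Calico docs):

```shell
# Wait for the calico-node DaemonSet to finish rolling and check the pods
kubectl rollout status ds/calico-node -n kube-system
kubectl get pods -n kube-system -l k8s-app=calico-node
```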
|
Migrations/04-Upgrade_Kubeadm_1-29/03-Cert_Manager/README.md (new file, +256 lines)
|
||||
# Current info
|
||||
|
||||
```shell
|
||||
➜ /tmp helm list
|
||||
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
|
||||
cert-manager cert-manager 1 2024-07-20 02:42:30.280467317 +0200 CEST deployed cert-manager-v1.15.1 v1.15.1
|
||||
cert-manager-ovh cert-manager 1 2024-07-20 04:41:22.277311169 +0200 CEST deployed cert-manager-webhook-ovh-0.3.1 0.3.1
|
||||
cert-manager-porkbun cert-manager 1 2024-07-20 05:17:54.537102326 +0200 CEST deployed porkbun-webhook-0.1.4 1.0
|
||||
```
|
||||
|
||||
|
||||
# Supported versions
|
||||
|
||||
1.17 is supported on Kubernetes 1.29.
|
||||
|
||||
https://cert-manager.io/docs/releases/#currently-supported-releases
|
||||
|
||||
# Upgrade
|
||||
|
||||
## Cert manager
|
||||
### v1.16
|
||||
|
||||
```terraform
|
||||
# helm_release.cert-manager will be updated in-place
|
||||
~ resource "helm_release" "cert-manager" {
|
||||
id = "cert-manager"
|
||||
~ metadata = [
|
||||
- {
|
||||
- app_version = "v1.15.1"
|
||||
- chart = "cert-manager"
|
||||
- first_deployed = 1721436150
|
||||
- last_deployed = 1721436150
|
||||
- name = "cert-manager"
|
||||
- namespace = "cert-manager"
|
||||
- notes = <<-EOT
|
||||
cert-manager v1.15.1 has been deployed successfully!
|
||||
|
||||
In order to begin issuing certificates, you will need to set up a ClusterIssuer
|
||||
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
|
||||
|
||||
More information on the different types of issuers and how to configure them
|
||||
can be found in our documentation:
|
||||
|
||||
https://cert-manager.io/docs/configuration/
|
||||
|
||||
For information on how to configure cert-manager to automatically provision
|
||||
Certificates for Ingress resources, take a look at the `ingress-shim`
|
||||
documentation:
|
||||
|
||||
https://cert-manager.io/docs/usage/ingress/
|
||||
EOT
|
||||
- revision = 1
|
||||
- values = jsonencode(
|
||||
{
|
||||
- crds = {
|
||||
- enabled = true
|
||||
- keep = true
|
||||
}
|
||||
}
|
||||
)
|
||||
- version = "v1.15.1"
|
||||
},
|
||||
] -> (known after apply)
|
||||
name = "cert-manager"
|
||||
~ version = "v1.15.1" -> "v1.16.0"
|
||||
# (25 unchanged attributes hidden)
|
||||
|
||||
# (2 unchanged blocks hidden)
|
||||
}
|
||||
|
||||
Plan: 0 to add, 1 to change, 0 to destroy.
|
||||
```
|
||||
|
||||
```text
|
||||
➜ Cert_Manager helm list
|
||||
cert-manager cert-manager 2 2025-02-22 21:41:27.702757947 +0100 CET deployed cert-manager-v1.16.0 v1.16.0
|
||||
```
|
||||
|
||||
|
||||
### v1.17
|
||||
|
||||
```terraform
|
||||
Terraform will perform the following actions:
|
||||
|
||||
# helm_release.cert-manager will be updated in-place
|
||||
~ resource "helm_release" "cert-manager" {
|
||||
id = "cert-manager"
|
||||
~ metadata = [
|
||||
- {
|
||||
- app_version = "v1.16.0"
|
||||
- chart = "cert-manager"
|
||||
- first_deployed = 1721436150
|
||||
- last_deployed = 1740256887
|
||||
- name = "cert-manager"
|
||||
- namespace = "cert-manager"
|
||||
- notes = <<-EOT
|
||||
cert-manager v1.16.0 has been deployed successfully!
|
||||
|
||||
In order to begin issuing certificates, you will need to set up a ClusterIssuer
|
||||
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
|
||||
|
||||
More information on the different types of issuers and how to configure them
|
||||
can be found in our documentation:
|
||||
|
||||
https://cert-manager.io/docs/configuration/
|
||||
|
||||
For information on how to configure cert-manager to automatically provision
|
||||
Certificates for Ingress resources, take a look at the `ingress-shim`
|
||||
documentation:
|
||||
|
||||
https://cert-manager.io/docs/usage/ingress/
|
||||
EOT
|
||||
- revision = 2
|
||||
- values = jsonencode(
|
||||
{
|
||||
- crds = {
|
||||
- enabled = true
|
||||
- keep = true
|
||||
}
|
||||
}
|
||||
)
|
||||
- version = "v1.16.0"
|
||||
},
|
||||
] -> (known after apply)
|
||||
name = "cert-manager"
|
||||
~ version = "v1.16.0" -> "v1.17.0"
|
||||
# (25 unchanged attributes hidden)
|
||||
|
||||
# (2 unchanged blocks hidden)
|
||||
}
|
||||
|
||||
Plan: 0 to add, 1 to change, 0 to destroy.
|
||||
```
|
||||
|
||||
```text
|
||||
➜ /tmp helm list
|
||||
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
|
||||
cert-manager cert-manager 3 2025-02-22 21:44:31.291530476 +0100 CET deployed cert-manager-v1.17.0 v1.17.0
|
||||
```
|
||||
|
||||
|
||||
### v1.17.1

```terraform
Terraform will perform the following actions:

# helm_release.cert-manager will be updated in-place
~ resource "helm_release" "cert-manager" {
id = "cert-manager"
~ metadata = [
- {
- app_version = "v1.17.0"
- chart = "cert-manager"
- first_deployed = 1721436150
- last_deployed = 1740257071
- name = "cert-manager"
- namespace = "cert-manager"
- notes = <<-EOT
cert-manager v1.17.0 has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://cert-manager.io/docs/configuration/

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:

https://cert-manager.io/docs/usage/ingress/
EOT
- revision = 3
- values = jsonencode(
{
- crds = {
- enabled = true
- keep = true
}
}
)
- version = "v1.17.0"
},
] -> (known after apply)
name = "cert-manager"
~ version = "v1.17.0" -> "v1.17.1"
# (25 unchanged attributes hidden)

# (2 unchanged blocks hidden)
}

Plan: 0 to add, 1 to change, 0 to destroy.
```

```text
➜ /tmp helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cert-manager cert-manager 4 2025-02-22 21:46:06.835196123 +0100 CET deployed cert-manager-v1.17.1 v1.17.1
```

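After each bump it is worth sanity-checking the release beyond `helm list`. A minimal verification sketch using plain `kubectl` (the issuer/certificate names depend on what is actually deployed, so only list commands are used here):

```shell
# Check that the cert-manager pods restarted cleanly after the chart bump
kubectl -n cert-manager get pods

# Confirm the webhook and CRDs still answer by listing the existing issuers/certificates
kubectl get clusterissuers
kubectl get certificates -A
```
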
## Plugins

### Update local repos

```bash
➜ Cert_Manager git clone https://github.com/baarde/cert-manager-webhook-ovh.git ./tmp/cert-manager-webhook-ovh
Cloning into './tmp/cert-manager-webhook-ovh'...
remote: Enumerating objects: 435, done.
remote: Counting objects: 100% (137/137), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 435 (delta 113), reused 97 (delta 97), pack-reused 298 (from 2)
Receiving objects: 100% (435/435), 338.31 KiB | 2.94 MiB/s, done.
Resolving deltas: 100% (235/235), done.
➜ Cert_Manager git clone https://github.com/mdonoughe/porkbun-webhook ./tmp/cert-manager-webhook-porkbun
Cloning into './tmp/cert-manager-webhook-porkbun'...
remote: Enumerating objects: 308, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 308 (delta 2), reused 1 (delta 1), pack-reused 301 (from 2)
Receiving objects: 100% (308/308), 260.79 KiB | 1.76 MiB/s, done.
Resolving deltas: 100% (129/129), done.
```

### Apply terraform

```terraform
Terraform will perform the following actions:

# helm_release.cert-manager-porkbun will be updated in-place
~ resource "helm_release" "cert-manager-porkbun" {
id = "cert-manager-porkbun"
name = "cert-manager-porkbun"
~ version = "0.1.4" -> "0.1.5"
# (25 unchanged attributes hidden)

# (2 unchanged blocks hidden)
}

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value: yes
```

```text
➜ Cert_Manager helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cert-manager cert-manager 4 2025-02-22 21:46:06.835196123 +0100 CET deployed cert-manager-v1.17.1 v1.17.1
cert-manager-ovh cert-manager 1 2024-07-20 04:41:22.277311169 +0200 CEST deployed cert-manager-webhook-ovh-0.3.1 0.3.1
cert-manager-porkbun cert-manager 2 2025-02-22 21:50:59.096319059 +0100 CET deployed porkbun-webhook-0.1.5 1.0
```

@ -0,0 +1,37 @@

# Current info

```shell
➜ 05-Metrics-Server git:(Upgrade/Kubeadm-1.29) ✗ helm list -n metrics-server
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
metrics-server metrics-server 1 2024-02-29 03:13:29.569864255 +0100 CET deployed metrics-server-3.12.0 0.7.0
```

## Upgrade

```shell
➜ non-core-Operators helm upgrade --install metrics-server metrics-server/metrics-server -n metrics-server \
  --set replicas=3 \
  --set apiService.insecureSkipTLSVerify=true \
  --set "args={'--kubelet-insecure-tls'}" \
  --version 3.12.2
Release "metrics-server" has been upgraded. Happy Helming!
```

```text
NAME: metrics-server
LAST DEPLOYED: Sat Feb 22 22:32:33 2025
NAMESPACE: metrics-server
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
***********************************************************************
* Metrics Server *
***********************************************************************
Chart version: 3.12.2
App version: 0.7.2
Image tag: registry.k8s.io/metrics-server/metrics-server:v0.7.2
***********************************************************************
```

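A quick way to confirm the upgraded metrics-server actually serves metrics (a sketch, not tied to this chart version; the deployment name assumes the chart default):

```shell
# The rollout should complete and the APIService should report Available
kubectl -n metrics-server rollout status deployment/metrics-server
kubectl get apiservice v1beta1.metrics.k8s.io

# If metrics flow, these return data instead of an error
kubectl top nodes
kubectl top pods -A
```
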
187
Migrations/04-Upgrade_Kubeadm_1-29/06-Kubernetes/README.md
Normal file
@ -0,0 +1,187 @@

### Links

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

Used this ansible script.

https://gitea.fihome.xyz/ofilter/ansible_update_cluster

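For reference, a rough sketch of what the playbook automates on each pass, following the upstream kubeadm upgrade procedure (package pins and `<node>` are placeholders; the actual steps live in the playbook above):

```shell
# Control plane first: bump kubeadm, review the plan, apply it
apt-get install -y kubeadm=1.30.10-1.1
kubeadm upgrade plan
kubeadm upgrade apply v1.30.10

# Each worker: bump kubeadm, then refresh the local kubelet configuration
apt-get install -y kubeadm=1.30.10-1.1
kubeadm upgrade node

# On every node, one at a time: drain, bump kubelet/kubectl, restart, uncordon
kubectl drain <node> --ignore-daemonsets
apt-get install -y kubelet=1.30.10-1.1 kubectl=1.30.10-1.1   # run on the node itself
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon <node>
```
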
### 1.29.14 to 1.30.0

I didn't save the output from that.

### 1.30 to 1.30.10

```text
PLAY RECAP **********************************************************************************************************************************************************************************************************************
masterk.filter.home : ok=26 changed=6 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
slave01.filter.home : ok=24 changed=5 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave02.filter.home : ok=24 changed=6 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave03.filter.home : ok=24 changed=5 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
```

### 1.30.10 to 1.31.0

```shell
root@masterk:/# kubeadm upgrade plan
[preflight] Running pre-flight checks.
[preflight] Some fatal errors occurred:
[ERROR CoreDNSUnsupportedPlugins]: start version '1.11.3' not supported
[ERROR CoreDNSMigration]: CoreDNS will not be upgraded: start version '1.11.3' not supported
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
```

Kubeadm 1.30.10 updated CoreDNS to 1.11.3, while kubeadm 1.31 expects 1.11.1 as the starting version.

https://github.com/kubernetes/kubernetes/pull/126796/files#diff-b84c5a65e31001a0bf998f9b29f7fbf4e2353c86ada30d39f070bfe8fd23b8e7L136

Downgrading CoreDNS to 1.11.1 made this error go away, allowing `kubeadm upgrade plan` and `kubeadm upgrade apply v1.31.0` to work (see the sketch below).

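The downgrade itself boils down to pointing the kubeadm-managed CoreDNS deployment back at the 1.11.1 image. A minimal sketch, assuming the default deployment and container names that kubeadm uses:

```shell
# Roll CoreDNS back to the image version kubeadm 1.31 expects as a starting point
kubectl -n kube-system set image deployment/coredns coredns=registry.k8s.io/coredns/coredns:v1.11.1
kubectl -n kube-system rollout status deployment/coredns
```
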
```text
[upgrade/versions] Target version: v1.31.6
[upgrade/versions] Latest version in the v1.30 series: v1.30.10

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT NODE CURRENT TARGET
kubelet masterk.filter.home v1.30.10 v1.31.6
kubelet slave01.filter.home v1.30.10 v1.31.6
kubelet slave02.filter.home v1.30.10 v1.31.6
kubelet slave03.filter.home v1.30.10 v1.31.6

Upgrade to the latest stable version:

COMPONENT NODE CURRENT TARGET
kube-apiserver masterk.filter.home v1.30.10 v1.31.6
kube-controller-manager masterk.filter.home v1.30.10 v1.31.6
kube-scheduler masterk.filter.home v1.30.10 v1.31.6
kube-proxy 1.30.10 v1.31.6
CoreDNS v1.11.1 v1.11.1
etcd masterk.filter.home 3.5.16-0 3.5.15-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.31.6

Note: Before you can perform this upgrade, you have to update kubeadm to v1.31.6.
```

Notice how etcd is "ahead" of the target (current 3.5.16-0, target 3.5.15-0).

Also note that, while the kubeadm package installed is 1.31.0, the target version displayed for the upgrade is v1.31.6.

```text
root@masterk:/home/klussy# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"31", GitVersion:"v1.31.0", GitCommit:"9edcffcde5595e8a5b1a35f88c421764e575afce", GitTreeState:"clean", BuildDate:"2024-08-13T07:35:57Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
```

```text
➜ ~ kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
masterk.filter.home Ready control-plane 349d v1.31.6 192.168.1.9 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
```

#### Result

```text
PLAY RECAP **********************************************************************************************************************************************************************************************************************
masterk.filter.home : ok=26 changed=9 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
slave01.filter.home : ok=24 changed=9 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave02.filter.home : ok=24 changed=9 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave03.filter.home : ok=24 changed=8 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
```

```text
➜ ~ kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
masterk.filter.home Ready control-plane 349d v1.31.6 192.168.1.9 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
slave01.filter.home Ready <none> 359d v1.31.6 192.168.1.10 <none> Armbian 25.2.1 bookworm 5.10.160-legacy-rk35xx containerd://1.7.25
slave02.filter.home Ready <none> 365d v1.31.6 192.168.1.11 <none> Armbian 24.11.1 jammy 5.10.160-legacy-rk35xx containerd://1.7.25
slave03.filter.home Ready <none> 365d v1.31.6 192.168.1.12 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
```

### "1.31.0" to 1.32.0
|
||||
|
||||
|
||||
```text
|
||||
PLAY RECAP **********************************************************************************************************************************************************************************************************************
|
||||
masterk.filter.home : ok=26 changed=12 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
|
||||
slave01.filter.home : ok=24 changed=12 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
|
||||
slave02.filter.home : ok=24 changed=12 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
|
||||
slave03.filter.home : ok=24 changed=12 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
|
||||
```
|
||||
|
||||
```text
|
||||
➜ ~ kubectl get nodes -owide
|
||||
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
|
||||
masterk.filter.home Ready control-plane 349d v1.32.2 192.168.1.9 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
|
||||
slave01.filter.home Ready <none> 359d v1.32.2 192.168.1.10 <none> Armbian 25.2.1 bookworm 5.10.160-legacy-rk35xx containerd://1.7.25
|
||||
slave02.filter.home Ready <none> 365d v1.32.2 192.168.1.11 <none> Armbian 24.11.1 jammy 5.10.160-legacy-rk35xx containerd://1.7.25
|
||||
slave03.filter.home Ready <none> 365d v1.32.2 192.168.1.12 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
|
||||
```
|
||||
|
||||
version 1.32.2
|
||||
|
||||
### "1.32.0" to 1.32.2
|
||||
|
||||
```text
|
||||
"[preflight] Running pre-flight checks.",
|
||||
"[upgrade/config] Reading configuration from the \"kubeadm-config\" ConfigMap in namespace \"kube-system\"...",
|
||||
"[upgrade/config] Use 'kubeadm init phase upload-config --config your-config.yaml' to re-upload it.",
|
||||
"[upgrade] Running cluster health checks",
|
||||
"[upgrade] Fetching available versions to upgrade to",
|
||||
"[upgrade/versions] Cluster version: 1.32.0",
|
||||
"[upgrade/versions] kubeadm version: v1.32.2",
|
||||
"[upgrade/versions] Target version: v1.32.2",
|
||||
"[upgrade/versions] Latest version in the v1.32 series: v1.32.2",
|
||||
"",
|
||||
"Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':",
|
||||
"COMPONENT NODE CURRENT TARGET",
|
||||
"",
|
||||
"Upgrade to the latest version in the v1.32 series:",
|
||||
"",
|
||||
"COMPONENT NODE CURRENT TARGET",
|
||||
"kube-apiserver masterk.filter.home v1.32.0 v1.32.2",
|
||||
"kube-controller-manager masterk.filter.home v1.32.0 v1.32.2",
|
||||
"kube-scheduler masterk.filter.home v1.32.0 v1.32.2",
|
||||
"kube-proxy 1.32.0 v1.32.2",
|
||||
"CoreDNS v1.11.3 v1.11.3",
|
||||
"etcd masterk.filter.home 3.5.16-0 3.5.16-0",
|
||||
"",
|
||||
"You can now apply the upgrade by executing the following command:",
|
||||
"",
|
||||
"\tkubeadm upgrade apply v1.32.2",
|
||||
"",
|
||||
"_____________________________________________________________________",
|
||||
"",
|
||||
"",
|
||||
"The table below shows the current state of component configs as understood by this version of kubeadm.",
|
||||
"Configs that have a \"yes\" mark in the \"MANUAL UPGRADE REQUIRED\" column require manual config upgrade or",
|
||||
"resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually",
|
||||
"upgrade to is denoted in the \"PREFERRED VERSION\" column.",
|
||||
"",
|
||||
"API GROUP CURRENT VERSION PREFERRED VERSION MANUAL UPGRADE REQUIRED",
|
||||
"kubeproxy.config.k8s.io v1alpha1 v1alpha1 no",
|
||||
"kubelet.config.k8s.io v1beta1 v1beta1 no",
|
||||
"_____________________________________________________________________"
|
||||
```
|
||||
|
||||
```text
PLAY RECAP **********************************************************************************************************************************************************************************************************************
masterk.filter.home : ok=26 changed=9 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
slave01.filter.home : ok=24 changed=8 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave02.filter.home : ok=24 changed=8 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
slave03.filter.home : ok=24 changed=8 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
```

```text
➜ ~ kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
masterk.filter.home Ready control-plane 349d v1.32.2 192.168.1.9 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
slave01.filter.home Ready <none> 359d v1.32.2 192.168.1.10 <none> Armbian 25.2.1 bookworm 5.10.160-legacy-rk35xx containerd://1.7.25
slave02.filter.home Ready <none> 365d v1.32.2 192.168.1.11 <none> Armbian 24.11.1 jammy 5.10.160-legacy-rk35xx containerd://1.7.25
slave03.filter.home Ready <none> 365d v1.32.2 192.168.1.12 <none> Debian GNU/Linux 12 (bookworm) 6.1.0-31-amd64 containerd://1.7.25
```

12
Migrations/04-Upgrade_Kubeadm_1-29/GPU-Operator/README.md
Normal file
@ -0,0 +1,12 @@

# Current info

```text
root@masterk:/# containerd -v
containerd containerd.io 1.7.25
```

The GPU Operator supports containerd 1.7:

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/platform-support.html#supported-container-runtimes

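The operator upgrade itself is still pending; it would presumably be another chart bump along these lines (a hypothetical sketch: the release name, namespace and unpinned version are assumptions, and the platform-support matrix above should be checked first):

```shell
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
# Release/namespace names are assumptions; --reuse-values keeps the current overrides
helm upgrade gpu-operator nvidia/gpu-operator -n gpu-operator --reuse-values
```
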
17
Migrations/04-Upgrade_Kubeadm_1-29/NFS/README.md
Normal file
@ -0,0 +1,17 @@

# Current info

```text
➜ Cert_Manager helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
fast-nfs-provisioner-01 nfs-provisioner 7 2024-02-23 02:58:53.916523899 +0100 CET deployed nfs-subdir-external-provisioner-4.0.18 4.0.2
slow-nfs-provisioner-01 nfs-provisioner 1 2024-02-29 02:18:28.753512876 +0100 CET deployed nfs-subdir-external-provisioner-4.0.18 4.0.2
```

The docs don't say much about Kubernetes version support, so it should be treated like a normal app.

https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/blob/master/CHANGELOG.md

```text
# v4.0.3
- Upgrade k8s client to v1.23.4
```
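Since it behaves like a normal app, the upgrade would presumably be a plain chart bump per release (a sketch: the release names come from the `helm list` above, and the existing values are assumed to be reused):

```shell
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm repo update
# One upgrade per release; --reuse-values keeps the existing NFS server/path settings
helm upgrade fast-nfs-provisioner-01 nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -n nfs-provisioner --reuse-values
helm upgrade slow-nfs-provisioner-01 nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -n nfs-provisioner --reuse-values
```
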
48
Migrations/04-Upgrade_Kubeadm_1-29/README.md
Normal file
@ -0,0 +1,48 @@

# Current info

```shell
➜ bin kubectl version
Client Version: v1.32.1
Kustomize Version: v5.5.0
Server Version: v1.29.0
WARNING: version difference between client (1.32) and server (1.29) exceeds the supported minor version skew of +/-1
```

# Things to upgrade

- Kubernetes from 1.29 to 1.29.14 ✅
- Istio ✅
- Calico ✅
- Cert Manager ✅
- Metrics Server ✅
- Kubernetes ✅
- GPU Operator
- NFS provisioner

## Kubernetes from 1.29 to 1.29.14

✅

https://gitea.fihome.xyz/ofilter/ansible_update_cluster

## Istio

Upgraded from 1.20 to 1.24.3 ✅

## Calico

Upgraded from 3.27 to 3.29 ✅

## Cert Manager

Upgraded from 1.15.1 to 1.17 ✅

## Metrics Server

✅

## Kubernetes

Upgraded from 1.29.14 to 1.32.2 ✅

177
README.md
@ -3,7 +3,21 @@ gitea: none
include_toc: true
---

## Older patch notes/version

Select different tags.

## TLDR Changelog

- Replaced the old standalone Docker/NFS server with a Proxmox/NFS instance.

- Added 2 VMs as worker nodes to the cluster; they are intended for x86_64 images.

- One of the newly added worker VMs receives a GPU through Proxmox PCI passthrough.

- Some services might have been removed or added.

# Devices

## List of current devices:

@ -11,122 +25,91 @@ include_toc: true

```yaml
Gateway: 192.168.1.1
Pi4: 192.168.1.2
Srv: 192.168.1.3
Proxmox/NFS: somewhere.
```

### Kluster

> Kubernetes Cluster

A set of Orange PI 5; so far all of them are the 4GB of RAM version.
- Pi 4 with 4GB running as a Master. (Masterk/Pi4)

- A pair of Orange PI 5, both the 8GB of RAM version. (Slave01-2)

- Proxmox VMs, both with 3 CPU cores and 8GB of RAM (Slave03-4)

- `Slave04` receives a GPU through Proxmox PCI passthrough.

```yaml
Masterk: 192.168.1.10
Slave01: 192.168.1.11
Masterk: 192.168.1.9
Slave01: 192.168.1.10
Slave02: 192.168.1.11
Slave03: 192.168.1.12
Slave04: 192.168.1.13
```

## Which services are running where.
```yaml
Node Available(GPUs) Used(GPUs)
pi4.filter.home 0 0
slave01.filter.home 0 0
slave02.filter.home 0 0
slave03.filter.home 0 0
slave04.filter.home 1 0
```

> **Note**:
> `Deprecated` doesn't mean that the service has been obliterated, but that the service is no longer run on that specific node/instance.
## Which services I'm hosting

### Pi4 (main reverse proxy)
### Home Network

> Initially the Pi4 would only contain lightweight services, performing "core" functions on the network, as well as providing access to some very specific web services that wouldn't incur much load (such as DNS, DHCP, Gitea, DuckDNS IP updater and `Tube` + `Traefik` as a main reverse proxy for the network).
- CoreDNS
- DHCPd

Services run on `docker` / `docker-compose`.
### Discord Bots

#### Containers
- Traefik
- Gitea
- Portainer
- Registry
- containrrr/watchtower
- https://gitea.filterhome.xyz/ofilter/Steam_Invite_Discord (both Master and Dev branches)
- Shlink + ShlinkUI (deployed since it integrates with the Steam Discord Bot above)

##### Monitoring
### Public DNS

- grafana
- prometheus
- alert manager
- zcube/cadvisor

##### Home Network
- Coredns
- dhcpd
- Godaddy
- Duckdns

##### Misc
### CRDs

- DuckDNS
- emulatorjs
- [Steam_Invite_Discord](https://gitea.filterhome.xyz/ofilter/Steam_Invite_Discord)

##### Deprecated

- bind9 DNS
- [Internet speedtest metrics](https://github.com/nickmaccarthy/internet-speed-test-metrics)
- kanboard
- mantis
- minecraft server + [Minecraft Discord Bot](https://gitea.filterhome.xyz/ofilter/Minecraft_Discord_Bot)
- [FGO Tools](https://github.com/OriolFilter/FGO_tools)
- muximix
- openvpn
- Plex
- Portainer
- mantis
- [speedtest_container](https://gitea.filterhome.xyz/ofilter/speedtest_contiainer)
- splunk
- vaultwarden

### Srv (main media server)

> Initially the server would contain media services and some with higher load, like Minecraft and Factorio servers. Right now this server is the designated media server, and also contains other more generalized services, as a migration to reorganize the infrastructure is currently being planned.

Services run on `docker` / `docker-compose`.

#### Containers

- Traefik
- Portainer
- Jenkins
- containrrr/watchtower
- zcube/cadvisor

##### Media

- kizaing/kavita
- prologic/tube
- gotson/komga
- lscr.io/linuxserver/qbittorrent
- grafana
- lscr.io/linuxserver/jellyfin
- difegue/lanraragi
- filebrowser/filebrowser

##### Misc

- chesscorp/chess-club

##### Deprecated

##### Notes

Traefik generates public certificates automatically.

> https://doc.traefik.io/traefik/https/acme/

#### Kluster

> Idk I can run whatever I want.\
> So far it has been an Istio playground for me to create [an Istio documentation](https://gitea.filterhome.xyz/ofilter/Istio_Examples).

- Cilium
- Istio Service Mesh
- Cert Manager
- Istio
- Nvidia Gpu Operator
- NFS Volume Provisioner
- MetalLB

### Observability

- Grafana
- Prometheus
- Kiali
- Jaeger

### CI/CD

- Jenkins master + dynamic agent(s)
- Docker Registry
- Skaffold (Client/User side, not running on the Kubernetes cluster, yet relies on it to create multiarch docker images)

### Git servers

- Gitea

### Media related

- Tube
- Fireshare
- Filebrowser
- Jellyfin
- qBitTorrent

## Downsides of my current setup

- Only 1 Kubernetes master node, therefore no full High Availability
- Only 1 NFS server / no HA NFS server, therefore if the NFS server is down most of the services on the Kubernetes cluster will also be down, as they depend on that NFS server

##### Services

-

@ -1,42 +0,0 @@

https://github.com/mikeroyal/Self-Hosting-Guide#backups

https://github.com/mikeroyal/Self-Hosting-Guide#snapshots-managementsystem-recovery

https://github.com/mikeroyal/Self-Hosting-Guide#file-systems

https://github.com/mikeroyal/Self-Hosting-Guide#storage

https://goteleport.com/

---

Volumes

https://github.com/seaweedfs/seaweedfs

---

DNS

https://github.com/awesome-selfhosted/awesome-selfhosted#dns

https://github.com/awesome-foss/awesome-sysadmin#dns---control-panels--domain-management

---

#3dp

https://github.com/Floppy/van_dam

---

? https://goteleport.com/

---

Gitea thingies

https://docs.gitea.com/awesome?_highlight=content#sdk