Journal


Rebooting my MkDocs setup to start hosting my blog from here as well. I want to redirect a subdomain of stansyfert.com there, but had some issues setting the TXT DNS records. I ran this command to check the TXT records myself, and saw that I still had Cloudflare set up in between, which I'd forgotten about.

dig _github-pages-challenge-sguldemond.stansyfert.com
...
stansyfert.com.         1800    IN      SOA     brianna.ns.cloudflare.com. dns.cloudflare.com. 2402276066 10000 2400 604800 1800

Lost connection to the Homelab; the Mac mini and MBP keep disconnecting after some time. Need to figure out why, otherwise the Homelab is not reliable for OVN+MetalLB debugging. The DHCP lease on the NIC interface is expiring: OVN took over the NIC with its bridge and inherits its IP for a while, but when the DHCP lease expires the router does not match the new bridge MAC address to the lease, and therefore drops it. The NIC interface loses its IP and the OVN bridge gets a link-local IP assigned, 169.254.0.2.

A solution would be to set a static IP, either on the bridge, or perhaps to assign a static IP from VyOS to the bridge MAC.

The best solution seems to be setting a static IP on the machines themselves, avoiding DHCP on the interfaces entirely. Added static IPs to the Butane files as well, but for now did:

sudo nmcli con modify "Wired connection 1" ipv4.method manual ipv4.addresses 192.168.2.60/24 ipv4.gateway 192.168.2.1 ipv4.dns 192.168.2.1
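For the Butane side, a minimal sketch of what that could look like, assuming the usual Fedora CoreOS pattern of dropping a NetworkManager keyfile via storage.files (values taken from the nmcli command above; untested here):

```yaml
variant: fcos
version: 1.5.0
storage:
  files:
    # keyfile NetworkManager picks up on boot; replaces the DHCP profile
    - path: /etc/NetworkManager/system-connections/enp1s0f0.nmconnection
      mode: 0600
      contents:
        inline: |
          [connection]
          id=enp1s0f0
          type=ethernet
          interface-name=enp1s0f0

          [ipv4]
          method=manual
          addresses=192.168.2.60/24
          gateway=192.168.2.1
          dns=192.168.2.1
```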

Rebooting the node brings the NIC and its OVN bridge up more stably, it seems. Only the bridge has the static IP, although on startup it is first on the NIC, then OVN takes over. Interestingly, if you restart NetworkManager via systemctl, the NIC interface gets its IP back. I think this won't be a problem, since there is no lease issue anymore; it's just NetworkManager and OVN fighting a bit over managing the interfaces.


Mac mini lost connection in the morning.

alpine/socat deployment, lb service and egressservice running:

⬢ [stan@toolbx stan]$ nc -zv 192.168.2.120 2701
Connection to 192.168.2.120 2701 port [tcp/sms-rcinfo] succeeded!
⬢ [stan@toolbx stan]$ echo "hello" | nc -uzv 192.168.2.120 2701
Connection to 192.168.2.120 2701 port [udp/sms-rcinfo] succeeded!

After changing the TCP port to 1025 and UDP to 1026, TCP stopped working. After reverting both ports to 2701, TCP still didn't work, while UDP did. Is OVN caching some routes?

TCP and UDP have to be exposed...


Looks like OVS is interfering with the machine DNS settings, interesting!

OVN in shared gateway mode has taken over my default eth NIC and bridged it:

- original: enp1s0f0
- bridge: brenp1s0f0

The fix is to set up global DNS on the machine via systemd-resolved:

sudo mkdir -p /etc/systemd/resolved.conf.d/
sudo vi /etc/systemd/resolved.conf.d/global-dns.conf

[Resolve]
DNS=192.168.2.1
FallbackDNS=1.1.1.1

sudo systemctl restart systemd-resolved

DNS works now! Creating a toolbox on the macmini CoreOS to install the OVS CLI tools; not working, getting stuck somewhere...

Also disabled NetworkManager from managing the original NIC (enp1s0f0): since both it and the bridge (brenp1s0f0) had the same Lab LAN IP assigned, they were fighting over it.

Going to add the MBP as a node! Forgot to connect the MBP to the Lab LAN, it was still on the Home LAN; rebooting. Probably need some k3s config to fix the node's IP settings. Being naive, I just re-applied the k3s install script...

k3s-agent service:

Apr 20 08:37:05 mbp k3s[1258]: time="2026-04-20T08:37:05Z" level=error msg="Failed to validate connection to cluster at https://192.168.2.60:6443: token CA hash does not match the Cluster CA certificate hash: cfa9469b9905033c4c83bf18552d02ea64e699161e60ca4a09000f039275590b != eaf527c32542105452c6abf0c917a028813d6ee5b3ddc46d1c1b9ec776d19a9a"

Used the node-token of my Fedora laptop, not the macmini... With all systems Fedora-based, same k3s, same username, it's easy to confuse them.

Node added and Ready! Want to make the control-plane node unschedulable for regular workloads and make the second node a proper worker node.

kubectl taint nodes macmini1 node-role.kubernetes.io/control-plane:NoSchedule
kubectl label node mbp node-role.kubernetes.io/worker=worker

Same issue with CNI binaries on mbp!

sudo mkdir -p /opt/cni/bin
sudo ln -sf /var/lib/rancher/k3s/data/current/bin/* /opt/cni/bin/

And after this the DNS issue pops up! Redid the systemd-resolved > global-dns > restart dance, and added it to the Butane file of the mbp as well.

mm1 node got pod subnet: 10.42.55.0/24
mbp node got pod subnet: 10.42.0.0/24

The values.yaml annotation for the OVN podNetwork, 10.42.0.0/16/24, means the whole cluster pod network is 10.42.0.0/16, and each node gets its own /24 slice carved out of it (e.g. 10.42.55.0/24).
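As a config fragment that's simply (exact key path may differ between chart versions; this is a sketch of the values.yaml entry):

```yaml
# cluster pod CIDR / per-node prefix length:
# 10.42.0.0/16 is the whole pod network; each node is carved a /24 out of it,
# e.g. 10.42.55.0/24 for mm1 and 10.42.0.0/24 for mbp
podNetwork: "10.42.0.0/16/24"
```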

Moving Kong Deployment and Service from DaedalusPlatform/k3s_collection here, to first test access without MetalLB. Starting with the kong/go-echo deployment, with a Port Forward (via k9s) to test TCP and UDP using netcat:

nc localhost 2701
nc -u localhost 2701

The TCP response looks good; no response over UDP, since port-forward does not support UDP. Exposing the deployment manually:

kubectl expose deployment tcp-echo-deployment --type=NodePort --port=2701 --protocol=UDP --name=go-echo-udp
kubectl get svc go-echo-udp
echo "hello" | nc -u 192.168.2.60 <nodeport>
Welcome, you are connected to node mbp.
Running on Pod tcp-echo-deployment-6d977d7788-qzt5j.
In namespace default.
With IP address 10.42.0.6.
Service account default.
hello

Need to send data over UDP first!

Installing MetalLB:

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.15.3/config/manifests/metallb-native.yaml

Added IPPool and L2 advertisement, created kong-echo-service, which gets IP from MetalLB: 192.168.2.111. Now trying netcat again:

nc 192.168.2.111 2701
echo "hello" | nc -u 192.168.2.111 2701

Cannot reach 192.168.2.111 from my machine. Let's check from VyOS: also nothing. So the IP is assigned by MetalLB, but not properly advertised to the router?

Is the MetalLB speaker sending ARP over the wrong interface, the default NIC instead of the bridge? Added the bridge interfaces to the L2Advertisement definition, so ARP will be sent over the bridges.
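Roughly what the MetalLB objects look like now; pool name and address range are my placeholders, but the `interfaces` field on L2Advertisement is the real knob that restricts which NICs the speaker answers ARP on:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool               # placeholder name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.2.110-192.168.2.130   # assumed range; .111 and .120 fall inside it
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2                 # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool
  interfaces:
    - brenp1s0f0               # advertise over the OVN bridge, not the bare NIC
```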

Something is happening, a redirect, but ping (ICMP) is not working, since it is not exposed by the service:

stan@vyos:~$ ping 192.168.2.111
PING 192.168.2.111 (192.168.2.111) 56(84) bytes of data.
From 192.168.2.11: icmp_seq=2 Redirect Host(New nexthop: 192.168.2.111)

traceroute:

stan@vyos:~$ traceroute 192.168.2.111
traceroute to 192.168.2.111 (192.168.2.111), 30 hops max, 60 byte packets
 1  192.168.2.11 (192.168.2.11)  2.822 ms  2.705 ms  2.675 ms
 2  192.168.2.11 (192.168.2.11)  3057.553 ms !H  3057.578 ms !H  3057.547 ms !H

The second hop means host unreachable, so we seem to have reached the bug. Meanwhile UDP is working!

⬢ [stan@toolbx ovn-kubernetes]$ echo "hello" | nc -u 192.168.2.111 2701
Welcome, you are connected to node mbp.
Running on Pod tcp-echo-deployment-6d977d7788-qzt5j.
In namespace default.
With IP address 10.42.0.6.
Service account default.
hello

Yeah, setting externalTrafficPolicy: Cluster on kong-echo-service works:

⬢ [stan@toolbx ovn-kubernetes]$ nc 192.168.2.111 2701
Welcome, you are connected to node mbp.
Running on Pod tcp-echo-deployment-6d977d7788-qzt5j.
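For reference, a sketch of the working Service (selector label is an assumption, ports taken from the tests above; a single LoadBalancer Service may carry both protocols on the same port):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kong-echo-service
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster   # Local was part of the bugged v1 setup
  selector:
    app: tcp-echo                  # assumed pod label
  ports:
    - name: echo-tcp
      port: 2701
      protocol: TCP
    - name: echo-udp
      port: 2701
      protocol: UDP
```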

Moved the "bugged" setup to v1 dir, starting EgressService fix in v2 dir.

Claude Code filled the v2 dir with new setup, where MetalLB + OVN + Kong Service + EgressService are all in relationship. It works! But it is now tightly coupled.
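The EgressService itself is a small CRD that OVN-Kubernetes matches by name/namespace to the LoadBalancer Service; a sketch of its shape, with field names taken from the OVN-Kubernetes docs but untested here:

```yaml
apiVersion: k8s.ovn.org/v1
kind: EgressService
metadata:
  name: kong-echo-service    # must match the Service it applies to
  namespace: default
spec:
  sourceIPBy: LoadBalancerIP # egress traffic leaves SNATed to the LB IP
```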

The question is why externalTrafficPolicy: Local was placed on the Service. Is there a specific reason why ingress and egress have to go through the same node? If not, a wider L2Advertisement can be implemented, where egress might leave from a different node than the ingress. Otherwise BGP can be implemented, where the speaker on a node only advertises routes it owns. So an LB Service on worker1 with its Deployment on worker1 will expose an IP from there, promising ingress and egress from that node.

Claude did not add externalTrafficPolicy: Local to the v2 service, since the setup de facto implements this behavior. I wanted to set it anyway to see what happens. Manually changed it via k9s: still works!

Creating v3 where we use BGP mode, let's keep going! Claude created v3, but there are issues with the BGP setup...

Restarted the VyOS BGP setup without an IP range, but with hard-coded neighbours (the nodes). Neighbours set up, netcat works!
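The VyOS side of a hard-coded-neighbour BGP setup looks roughly like this; the ASNs are placeholders and the syntax is the VyOS 1.4-era form, so treat it as a sketch:

```
configure
set protocols bgp system-as 64512
set protocols bgp neighbor 192.168.2.60 remote-as 64500
set protocols bgp neighbor 192.168.2.60 address-family ipv4-unicast
# repeat the neighbor lines for the second node's IP
commit
save
```

On the MetalLB side the matching pieces are a BGPPeer (pointing at the VyOS IP and ASNs) and a BGPAdvertisement instead of the L2Advertisement.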

I want to learn to trace the packet getting stuck in v1. I just got a crash course in tracing packets.

Start a tcpdump on vyos:

stan@vyos:~$ tcpdump -i eth1 -n host 192.168.2.120

Add a debug container to Kong Pod:

kubectl debug -it -n default tcp-echo-deployment-6d977d7788-qzt5j --image=nicolaka/netshoot

Send a TCP packet using netcat from the Kong pod:

$ nc 192.168.1.108 2701

Deployed OVN using Helm, Pods are not starting!

One ovnkube-node pod is requesting more access:

ovnkube-controller E0419 14:57:15.198150   21349 reflector.go:204] "Failed to watch" err="failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:ovn-kubernetes:ovnkube-node\" cannot list resource \"pods\" in API gro

Running helm install from ovn-kubernetes repo:

cd /home/stan/Documents/ovn-kubernetes/helm/ovn-kubernetes
helm upgrade --install ovn-kubernetes . -f ~/Documents/Homelab/homelab/projects/ovn-kubernetes/values.yaml

Test pod was not being created, error on events:

Warning  FailedCreatePodSandBox  6s    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8781ec838472431a04d303c02131ff01f5e27506367dd53b3aee4e6b1452e1f7": plugin type="loopback" failed (add): failed to find plugin "loopback" in path [/opt/cni/bin]

k3s manages the CNI binaries, but with Flannel disabled these binaries are not present. Quick fix:

sudo mkdir -p /opt/cni/bin
sudo ln -sf /var/lib/rancher/k3s/data/current/bin/* /opt/cni/bin/

I have no DNS resolution on the macmini1... DNS works after reboot, but then stops working...


Want to install CoreOS on the MBP as well, to set up a two-node cluster with k3s. The motivation is to experiment with the combination of MetalLB and OVN-Kubernetes. I want to recreate the egress bug that occurs when combining these; an OVN EgressService might be the answer. I also want to experiment with tracing packets and analyzing the topology of OVN/OVS. Tools I want to use: nc (netcat), tcpdump, ovs-<>cli, ovn-<>cli.

Extending this repo (e.g. SOPS settings, SSH keys) so I can also develop on my Devoteam laptop. Hello world, from the Devoteam laptop (devo-hp-fedora). Got my age key file on here, so I can decrypt my SOPS-encrypted files with it! Added the sops binary here: ~/.local/bin.

Copied the macmini1 Butane file, since it's 99% the same; only the hostname is different...

sops -d macmini1-butane.yaml > ../mbp/mbp-butane.yaml

Running Butane on Atomic Fedora

alias butane='podman --remote run --rm --interactive         \
              --security-opt label=disable          \
              --volume "${PWD}:/pwd" --workdir /pwd \
              quay.io/coreos/butane:release'
butane --pretty --strict mbp-butane.yaml > mbp.ign

Serving the mbp.ign Ignition file on the network; added the MBP to the Home LAN for now:

python3 -m http.server

Had to run sudo wipefs -af /dev/sda after an error, similar to the mm1 install. Importantly, I had to reboot afterwards; then the install worked. Also important: do not serve the encrypted Ignition file!

Installed CoreOS on the MBP, now need to SSH into it; first get its IP. Then init Tailscale, which should already be installed. After first boot, CoreOS auto-reboots with Tailscale added to rpm-ostree!

sudo systemctl enable --now tailscaled
sudo tailscale up

Disabled key expiry in the Tailscale UI.

Still need to disable sleep on lid close and such! Need to edit /usr/lib/systemd/logind.conf and set ignore on the HandleLidSwitch settings, but that file can't be edited on Atomic CoreOS.

Added a drop-in file instead:

sudo mkdir -p /etc/systemd/logind.conf.d/
sudo vi /etc/systemd/logind.conf.d/lid.conf

[Login]
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore

Restart and check config:

sudo systemctl restart systemd-logind
systemd-analyze cat-config systemd/logind.conf

I added this to the Butane file, but untested!
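The Butane fragment for that drop-in would be roughly the following (same content as the file above, standard storage.files schema; untested, as noted):

```yaml
variant: fcos
version: 1.5.0
storage:
  files:
    # systemd-logind drop-in: keep running when the lid closes
    - path: /etc/systemd/logind.conf.d/lid.conf
      mode: 0644
      contents:
        inline: |
          [Login]
          HandleLidSwitch=ignore
          HandleLidSwitchExternalPower=ignore
          HandleLidSwitchDocked=ignore
```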

The MBP is up with CoreOS and the lid fix, and I can access the k3s cluster on macmini1.

Need to set up k3s on mm1 without Flannel and prepare for OVN-Kubernetes, then add the mbp to the cluster, then add MetalLB.
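The "without Flannel" part is just two server options; a sketch of what I'd expect in /etc/rancher/k3s/config.yaml (the flags are real k3s options, the minimal set here is my assumption):

```yaml
flannel-backend: "none"        # hand pod networking over to OVN-Kubernetes
disable-network-policy: true   # the bundled network policy controller assumes Flannel
```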


Installing k3s on the CoreOS Mac mini now... with no issues at all. I can access the k3s node from my laptop; also added some extra tls-san entries:

tls-san:
  - "macmini1"
  - "macmini1.<tailscale-domain>"
  - "100.69.168.103"
  - "192.168.2.60"

Been talking with ChatGPT for a while! The result is that I want to install Fedora CoreOS on macmini1, with k3s (probably). Also, instead of a static IP, VyOS should make a DHCP reservation for the macmini1 MAC address:

vyos@vyos# set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 static-mapping macmini1 mac '0c:4d:e9:9a:70:aa'
vyos@vyos# set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 static-mapping macmini1 ip '192.168.2.20'
vyos@vyos# set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 static-mapping macmini1 description 'macmini1-coreos'

For CoreOS I'm using the Ignition feature to bootstrap the server. This JSON file can be generated from a Butane YAML file, docs: https://docs.fedoraproject.org/en-US/fedora-coreos/producing-ign/

Call this in the same directory as the Butane file:

alias butane='podman run --rm --interactive         \
              --security-opt label=disable          \
              --volume "${PWD}:/pwd" --workdir /pwd \
              quay.io/coreos/butane:release'
butane --pretty --strict macmini1-butane.yaml > macmini1.ign

Installing CoreOS on the Mac mini; had to unset some LVM stuff (not sure what, an LLM told me what to do) after /dev/sda3 was busy.

After that I ran:

sudo wipefs -a /dev/sda

because someone on this forum needed to do that: https://discussion.fedoraproject.org/t/installing-bare-metal-on-mac-mini-late-2012-fails-with-fsconfig-system-call-failed-dev-disk-by-label-root-cant-lookup-blockdev/127241/7

Install worked, but my user had no password, so reinstalling. This time serving the Ignition file (macmini1.ign) from my laptop:

butane --pretty --strict macmini1-butane.yaml > macmini1.ign
python3 -m http.server 8080

From the coreos installer:

sudo coreos-installer install /dev/sda --insecure-ignition --ignition-url http://192.168.1.179:8080/macmini1.ign

This way I don't have to add the .ign file to the USB again, which I did the first time by mounting the separate USB partition locally:

sudo mkdir -p /mnt/usb
sudo mount /dev/sda3 /mnt/usb

So far no issues with installing CoreOS on the Mac mini 2012; it seems to run fine, which is exciting. This type of bootstrap booting, serving the config over HTTP, was what I was trying way back at the beginning of my Homelab journey with Ubuntu. That didn't work so smoothly, or at all really, though I learned a bit about cloud-init and cloud images, still useful.

Clearing the lease on VyOS is still a bit of a puzzle. This did work:

vyos@vyos:~$ clear dhcp-server lease 192.168.2.24
Lease "192.168.2.24" has been cleared

Mac mini still has this IP though.

The static IP I chose was already taken by another machine. Changed it to 192.168.2.60, removed the lease on VyOS, and reconnected the eth device on CoreOS:

sudo nmcli device disconnect enp1s0f0 && sudo nmcli device connect enp1s0f0

Set up the second Mac mini with Proxmox running an Ubuntu cloud image VM with RKE2 installed, following the docs here: https://docs.rke2.io/install/quickstart The docs are simple and clear; added the .../bin to PATH and copied the kubeconfig. A little confusion when systemctl start rke2-server got stuck, printing issues connecting to 127.0.0.1:2379 (etcd), but after cancelling and starting it again it worked immediately, maybe some race condition.

Right off the bat, memory usage of the RKE2 master node is quite significant: close to 4GB all together with the node OS (Ubuntu).

Accessing the cluster from my machine is possible using the LAB IP, via the subnet forwarding on the VyOS VM. Want to install and try out OpenBao, but not going to bother with Terraform and such, just imperatively.

helm install openbao openbao/openbao -n openbao --create-namespace

OpenBao needs PersistentVolume or default storage class to start a pod:

  Normal  FailedBinding  11s (x8 over 107s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

Installed https://github.com/rancher/local-path-provisioner. Pod is now running.

k exec -it -n openbao openbao-0 -- sh
$ bao secrets enable kv
Success! Enabled the kv secrets engine at: kv/

Docs: https://openbao.org/docs/commands/secrets/enable/

/ $ bao kv put secret/my-first-secret name=stansyfert
======= Secret Path =======
secret/data/my-first-secret

======= Metadata =======
Key                Value
---                -----
created_time       2026-02-06T13:20:07.449345348Z
custom_metadata    <nil>
deletion_time      n/a
destroyed          false
version            1

Want to load secret into pod using CSI driver: https://openbao.org/docs/platform/k8s/csi/examples/

Installed CSI Secret Store Driver: https://secrets-store-csi-driver.sigs.k8s.io/getting-started/installation, and created SecretProviderClass pointing to my-first-secret.
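A sketch of that SecretProviderClass, assuming the Vault-compatible provider shape that OpenBao's CSI docs follow (the object name, in-cluster address, and role here are my placeholders):

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: openbao-first-secret
  namespace: openbao
spec:
  provider: vault                  # provider name assumed from the Vault CSI docs pattern
  parameters:
    vaultAddress: "http://openbao.openbao:8200"   # assumed in-cluster address
    roleName: "demo-app"
    objects: |
      - objectName: "name"
        secretPath: "secret/my-first-secret"
        secretKey: "name"
```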

Had to enable the Openbao CSI provider in Helm install:

...
csi:
  enabled: true

upgrade release:

helm upgrade openbao openbao/openbao -n openbao -f ~/Development/DevOps/homelab/projects/openbao/values.yaml

Now getting:

  Warning  FailedMount  3s (x5 over 11s)  kubelet            MountVolume.SetUp failed for volume "openbao-first-secret" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod openbao/demo-app-5574d78dc4-rhw9v, err: rpc error: code = Unknown desc = error making mount request: couldn't read secret "my-first-secret": failed to login: Error making API request.

I think the secret path should be secretPath: "secret/my-first-secret", based on this:

/ $ bao kv get secret/data/my-first-secret
No value found at secret/data/data/my-first-secret
/ $ bao kv get secret/my-first-secret
======= Secret Path =======
secret/data/my-first-secret
...

Still not working after recreating the SecretProviderClass; maybe it has to do with the "failed to login" part.

Creating the SA and the Kubernetes auth role:

k create sa -n openbao demo-app
bao auth enable kubernetes
bao write auth/kubernetes/role/demo-app \
    bound_service_account_names=demo-app \
    bound_service_account_namespaces=openbao \
    policies=default \
    ttl=1h

New error:

Warning  FailedMount  26s   kubelet            MountVolume.SetUp failed for volume "openbao-first-secret" : rpc error: code = DeadlineExceeded desc = failed to mount secrets store objects for pod openbao/demo-app-6f8bf7fb98-245b5, err: rpc error: code = DeadlineExceeded desc = error making mount request: couldn't read secret "my-first-secret": failed to login: context deadline exceeded

Seeing this in the openbao-0 logs, might be related:


WARNING! dev mode is enabled! In this mode, OpenBao runs entirely in-memory
and starts unsealed with a single unseal key. The root token is already
authenticated to the CLI, so you can immediately begin using OpenBao.

You may need to set the following environment variables:

    $ export BAO_ADDR='http://[::]:8200'

The unseal key and root token are displayed below in case you want to
seal/unseal the Vault or re-authenticate.

Unseal Key: ma7kWLthWXy/4XOCCJcLHo4GTtFDnp2t8jn9L+E9jvQ=
Root Token: root

Development mode should NOT be used in production installations!

Decided to step away from Talos already; although cool, I now want to try RKE2, an alternative to k3s.

Talos is working on my MBP, so I'll leave that for now. I can use the other Mac mini to try out RKE2.

Just quickly set stuff up:

- Gave MBP a static IP: 192.168.2.2
- Advertising the 192.168.2.0/24 subnet via VyOS using Tailscale (tailscale set --advertise-routes=192.168.2.0/24)
- Accepted routes on the Thinkpad (tailscale set --accept-routes), no need for a physical connection to the Lab LAN, should be able to access everything remotely
- Tested if I can reach my whoami service on Talos, no problems, needed to update the MetalLB routes to match the Lab LAN network
- Installing Proxmox OS on macmini1, with static IP: 192.168.2.3
- Mount USB and copy SSH pub key to Proxmox:
    - mkdir /mnt/usb
    - mount /dev/sdb5 /mnt/usb
    - cat /mnt/usb/key.txt >> ~/.ssh/authorized_keys
- Set up Proxmox cluster on MBP: pvecm add lab
- Added macmini1 to cluster: pvecm add 192.168.2.2 --use_ssh 1

Cluster setup failed, I broke both Proxmox machines, ChatGPT helped me recover them! Not trying that again...


Need to set up VyOS so LAB devices can reach the internet. I think I need to set up (S)NAT as well as DNS.

- https://docs.vyos.io/en/latest/quick-start.html#nat
- https://docs.vyos.io/en/latest/quick-start.html#dhcp-dns-quick-start

I think I've set up NAT correctly; the MBP can reach the internet now:

set nat source rule 100 outbound-interface name 'eth0'
set nat source rule 100 source address '192.168.2.0/24'
set nat source rule 100 translation address masquerade

Some docs: https://docs.vyos.io/en/latest/configuration/nat/nat44.html#source-nat

Now need to setup DNS!

set service dns forwarding cache-size '0'
set service dns forwarding listen-address '192.168.2.1'
set service dns forwarding allow-from '192.168.2.0/24'

This worked as well, domain names now resolve to IPs on the MBP.


I didn't change VyOS (I think). On the Ubuntu VM (same machine, which also gets both NICs), I added a netplan config:

$ sudo vim /etc/netplan/60-cloud-init.yaml
network:
  version: 2
  ethernets:
    enp6s19:
      match:
        macaddress: "bc:24:11:f9:b9:dc"
      dhcp4: true
      set-name: "enp6s19"

Applying it with sudo netplan apply gives me an IP address now, and I can reach VyOS and vice versa! So DHCP is working on VyOS! A small win there; got me back into understanding netplan a bit.

Looks like the Thunderbolt NIC is not being loaded on the Proxmox OS. My Ubuntu VM is getting an IP since VyOS is serving DHCP on vmbr1, but the NIC is not actually connected to vmbr1, which I thought it was. Is this the badly supported Thunderbolt Ethernet NIC chasing me again...?

Success! Got an IP on my laptop. Doesn't bode well for my Thunderbolt Ethernet though... When it happens again, I should look into the logs.

vyos@vyos:~$ show dhcp server leases
IP Address    MAC address        State    Lease start                Lease expiration           Remaining    Pool    Hostname       Origin
------------  -----------------  -------  -------------------------  -------------------------  -----------  ------  -------------  --------
192.168.2.10  bc:24:11:f9:b9:dc  active   2026-02-03 10:38:51+00:00  2026-02-04 10:38:51+00:00  23:41:08     LAB     ubuntu-server  local
192.168.2.11  bc:24:11:af:6d:02  active   2026-02-03 10:56:00+00:00  2026-02-04 10:56:00+00:00  23:58:17     LAB     talos-v1v-p8m  local
192.168.2.12  bc:24:11:7c:fb:b6  active   2026-02-03 10:56:19+00:00  2026-02-04 10:56:19+00:00  23:58:36     LAB     talos-j2r-sow  local
192.168.2.13  00:e0:4c:4d:22:a0  active   2026-02-03 10:56:25+00:00  2026-02-04 10:56:25+00:00  23:58:42     LAB     tp1-ubuntu     local

Looks like the MBP VMs are getting an IP now from VyOS; now I need to get into the Proxmox UI on the MBP. This works; need to install Tailscale on this Proxmox instance as well.


Back at it (02-02-2026). Want to get an IP on the MBP from the DHCP server running on the VyOS VM on the Mac mini. After a system reboot and ifreload -a, no luck. Is VyOS handing out IPs correctly at all?

Connecting my laptop to the LAB switch is also not giving me an IP on the 192.168.2.0/24 subnet, so it seems like VyOS indeed.


The DHCP refresh didn't take place, so I don't have access to my MacBook Pro, which is connected only to the Lab LAN.

So I'm doing a little side project where I expose this file via a simple MkDocs setup. Hosting it using GitHub Pages, setting up a redirect of homelab.stansyfert.com to the GH Pages of the Homelab repo.

This file will be reverted and the README will be slightly adjusted to render nicely. Cursor created some scripts for this which worked and which I haven't really looked at; let's see how long those hold up.

For the domain to GH Pages redirect I'm following: https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site/managing-a-custom-domain-for-your-github-pages-site

They recommend adding my domain to my GH account; did that, so now waiting for the DNS records to update: https://github.com/settings/pages_verified_domains/stansyfert.com

When that is setup I should follow this: https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site/managing-a-custom-domain-for-your-github-pages-site#configuring-a-subdomain


Installed Tailscale on VyOS by downloading the binary and supporting systemd files from https://pkgs.tailscale.com/stable/#static:

- curl'd the tgz & unpacked it
- moved the files inside to the right place, based on how it is on my laptop

Adding my SSH public key uses this command (https://docs.vyos.io/en/latest/configuration/service/ssh.html):

generate public-key-command user vyos path id_rsa_t480s.pub

I can now access VyOS using my SSH key, so I disabled password auth:

set service ssh disable-password-authentication

I've set up a DHCP server, serving IPs on the 192.168.2.0/24 subnet (https://docs.vyos.io/en/latest/configuration/service/dhcp-server.html):

- Set subnet id: 1
- Set start and stop range: .10 --> .100
- DNS (name-server): 192.168.2.1
- Default route: 192.168.2.1
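As config commands that amounts to roughly the following; VyOS syntax shifts a bit between versions (the `range`/`option` keywords here are the 1.4/1.5-era form), so take it as a sketch:

```
set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 subnet-id '1'
set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 range 0 start '192.168.2.10'
set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 range 0 stop '192.168.2.100'
set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 option name-server '192.168.2.1'
set service dhcp-server shared-network-name LAB subnet 192.168.2.0/24 option default-router '192.168.2.1'
```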

I would want to access my MacBook now, which is only connected to the Lab LAN, but it might be stuck with an old DHCP lease from the old OPNsense setup. Could check in 12-24h to see if the lease expired and VyOS has served it a new IP by then, would be cool.

show dhcp server leases

Btw, I have a backup of the VyOS config using:

scp vyos:/config/config.boot /my/path/config.boot

Setting up an Ubuntu Server VM as an exit node via Tailscale. This way I can route traffic via the Netherlands while I'm in Poland.

Pasting into a Proxmox VM seems to be a common issue.

Adding a serial port via Hardware and running this in the VM:

sudo systemctl enable serial-getty@ttyS0.service

That works; then I can get a shell via the Proxmox OS:

qm terminal <VMID>

Now I can install Tailscale on the VM and SSH to it from my machine. After install I ran:

sudo tailscale up --advertise-exit-node

In Tailscale I have to allow the machine to be an exit node. Also I have to enable IP forwarding on the VM as explained here: https://tailscale.com/kb/1019/subnets?tab=linux#enable-ip-forwarding

echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf

Playing around with VyOS on Proxmox now. Installed it. Setup for WAN IP:

configure
set interfaces ethernet eth0 description 'WAN'
set interfaces ethernet eth0 address dhcp
commit
ip a
save

Moving to VyOS, a Linux-kernel-based router, all CLI based; no GUI at all, which seems cool. But after install I got no screen output.

Let's play it "safe" and install Proxmox on the Mac mini and run OPNsense on there. Have that set up now; in order to install Tailscale on the Proxmox OS I had to disable the Enterprise repos in the settings.

Also I have to do some cert stuff: https://tailscale.com/kb/1133/proxmox#enable-https-access-to-the-proxmox-web-ui Adding the certs enables SSL over the Tailscale URL of the Mac mini: https://macmini.:8006/

Curious if I can just add the subnet to the Tailscale settings in OPNsense and then reach the MBP again.


Thunderbolt Ethernet on Mac mini with OPNSense (FreeBSD) failed again. I now have the logs:

bge1: firmware handshake timed out, found 0xffffffff
brgphy1: detached
miibus1: detached
bge1: detached
pci9: detached
pcib9: detached
pci8: detached
pcib8: detached
pci7: detached

"This is a hard device failure from the OS point of view." The FreeBSD forum responses on the bge driver don't look hopeful.


The MBP with Proxmox is connected to the router and the Talos VMs have their new IPs. The traefik service still has a 192.168.1.xxx IP, which needs to be updated.


I want to know how Tailscale routes requests for the Lab LAN via the Tailscale interface to the router. It is not showing up in my ip route show results.

It does show up when checking all tables, not just the main (default) one:

-> % ip route show table all | grep tailscale -n
1:100.83.153.29 dev tailscale0 table 52 
2:100.83.234.73 dev tailscale0 table 52 
3:100.93.235.14 dev tailscale0 table 52 
4:100.100.100.100 dev tailscale0 table 52 
5:100.111.248.12 dev tailscale0 table 52 
6:100.111.255.7 dev tailscale0 table 52 
7:192.168.2.0/24 dev tailscale0 table 52 

Useful:

-> % ip route get 192.168.2.1                   
192.168.2.1 dev tailscale0 table 52 src 100.109.194.44 uid 1000 
    cache 

This also involves the ip rule "policy decision" settings, which I don't fully understand yet.


Haven't worked on my lab for a few days; can't reach the OPNsense router via the LAN IP. Expected to be able to via Tailscale, but ip route show is not showing the Lab LAN subnet:

-> % ip route show
default via 192.168.1.1 dev wlp61s0 proto dhcp src 192.168.1.179 metric 600 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
172.18.0.0/16 dev br-9fee8da751a1 proto kernel scope link src 172.18.0.1 linkdown 
172.19.0.0/16 dev br-f24dc19acc19 proto kernel scope link src 172.19.0.1 linkdown 
172.20.0.0/16 dev br-0b98cec484ef proto kernel scope link src 172.20.0.1 
172.21.0.0/16 dev br-444ba1ec82d0 proto kernel scope link src 172.21.0.1 linkdown 
172.22.0.0/16 dev br-bbad8fd4fad4 proto kernel scope link src 172.22.0.1 linkdown 
192.168.1.0/24 dev wlp61s0 proto kernel scope link src 192.168.1.179 metric 600 

Not getting an IP after physically connecting my laptop to the Lab LAN via Ethernet either. Seems like the router is not responding; the Home LAN IP is not responding to ping either. Attached a monitor and keyboard to the router. When I disable the firewall I can ping it. It can ping the ISP router as well, and I can get into the web GUI via the WAN IP, using http://.

I see that the LAN interface has the device bge1, but it says "missing". I also don't see it using ifconfig -a. The switch light is green though, so I don't think the Thunderbolt adapter is dead. Not ideal, but rebooting the system to see if it shows up again. I see it already present in the startup logs. It's back after the reboot; it could be that OPNsense/FreeBSD disconnected the NIC and didn't reconnect it.


Reattaching the MacBook to the new Lab LAN. Edited /etc/network/interfaces and ran ifreload -a, but got an error that no IP was found for vmbr0. After a reboot it worked again.


I added the os-tailscale community plugin to OPNsense; can't find proper documentation on its use. Had to restart OPNsense to see the options, which are available via VPN > Tailscale. Under Settings you need to enable Tailscale first; then under Status you can get the auth link. I enabled Accept Subnet Router, with an Advertised Route of 192.168.2.0/24 (LAN). In the Tailscale dashboard, under opnsense, I still needed to confirm that this subnet route is legit.

To set up firewall rules I first need to assign the Tailscale interface tailscale0 in OPNsense via Interfaces > Assignment. From the OPNsense shell (FreeBSD) you can see the interfaces as well, using ifconfig -a. Also, don't forget to enable the interface after creating it. Now I can set firewall rules to allow access to LAN net via Tailscale.

Firewall → Rules → Tailscale

Action: Pass
Interface: Tailscale
Source: Tailscale net
Destination: LAN net
Description: Allow Tailscale access to Lab LAN

Action: Pass
Interface: Tailscale
Source: Tailscale net
Destination: This Firewall
Description: Allow Tailscale access to OPNsense

Finally on my laptop I have to run the following command:

sudo tailscale set --accept-routes

As explained here: https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-devices

Also useful: disable key expiry of the opnsense machine in the Tailscale dashboard via "Machine settings".

I might have to run this every time I start Tailscale on my machine:

sudo tailscale up --accept-routes

But it works, I can ping the LAN address of the OPNsense Mac mini.


I have my Thunderbolt-Ethernet cable and I set up my new hardware arrangement. Now going to install OPNsense directly on the 2014 Mac mini, which now has two NICs.

I dd'd the ISO onto my USB, but it's not showing up in the post-Alt screen on the Mac mini (yet).

sudo dd if=/home/stan/Downloads/OPNsense-25.7-dvd-amd64.iso of=/dev/sdb bs=4M status=progress oflag=sync

Downloading the "vga" version, which is a .img file, as instructed in the docs (https://docs.opnsense.org/manual/install.html#installation-media):

sudo dd if=/home/stan/Downloads/OPNsense-25.7-vga-amd64.img of=/dev/sdb bs=16k status=progress oflag=sync

I can already check the MAC addresses of the two NICs on the current Debian install:

- enp3s0f0: 0c:4d:e9:c6:85:30, should become WAN
- ens9 (enp9s0): 0c:4d:e9:d1:54:11, should become LAN

I need to give the WAN side a static IP on my ISP router.

OPNsense is set up and working on the Mac mini. There is an option for auto config; it would be good to document the settings. A quick overview:

- Set WAN and LAN to the correct NICs
- WAN gets a static IP from the ISP router (192.168.1.100)
- LAN gets a static IP: 192.168.2.1/24
- Enable DHCP on LAN with range .10 > .100


I can connect via Tailscale as well using the os-tailscale plugin. This requires me to update OPNSense though:

***GOT REQUEST TO INSTALL***
Currently running OPNsense 25.7 (amd64) at Tue Dec  9 20:27:27 UTC 2025
Installation out of date. The update to opnsense-25.7.9 is required.
***DONE***

I'm going to clean up my repo so I can commit my changes. Need to encrypt some secrets, trying SOPS in combination with age.

Created age key:

mkdir -p ~/.config/sops/age
age-keygen -o ~/.config/sops/age/keys.txt
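
SOPS needs the age public key (the recipient) for the creation rules. age-keygen prints it during generation, and it can be recovered from the key file later:

```shell
# Re-derive the public key (recipient) from an existing identity file.
age-keygen -y ~/.config/sops/age/keys.txt
```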

Added .sops.yaml in repo main:

keys:
  - &me <age-public-key>

creation_rules:
  - path_regex: projects/proxmox/talos/.*\.ya?ml$
    key_groups:
      - age:
          - *me

Encrypted my YAML files:

sops -e -i controlplane.yaml
sops -e -i worker.yaml
sops -e -i talosconfig.yaml

In order to use the encrypted file I have two options.

Option 1, decrypt to stdin:

sops -d talos/controlplane.yaml | talosctl apply-config \
  --nodes 192.168.2.59 \
  --file /dev/stdin

Option 2, decrypt to a temporary file:

sops -d talos/controlplane.yaml > cp.dec.yaml
talosctl apply-config --nodes 192.168.2.59 --file cp.dec.yaml
rm cp.dec.yaml
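
A variation on the second option that is a bit safer: use mktemp and a shell trap so the decrypted file is removed even if apply-config fails (a sketch, same node and file as above):

```shell
tmp="$(mktemp)"
trap 'rm -f "$tmp"' EXIT   # cleanup runs on any exit, success or failure
sops -d talos/controlplane.yaml > "$tmp"
talosctl apply-config --nodes 192.168.2.59 --file "$tmp"
```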

I want to port forward calls to the WAN IP on 6443 to the Talos VIP so I don't need the Ubuntu VM. It is not working as expected, so I'm trying to set up WireGuard instead.

I have WireGuard working with OPNSense and Ubuntu. This guide has most information: https://docs.opnsense.org/manual/how-tos/wireguard-client.html

I'm using this config on Ubuntu:

[Interface]
PrivateKey = < private key from client! >
Address = 10.0.0.2/32
DNS = 192.168.2.1

[Peer]
PublicKey = < public key of wireguard instance in opnsense! >
Endpoint = 192.168.1.51:51820
AllowedIPs = 192.168.2.0/24, 10.0.0.0/24
PersistentKeepalive = 25

Firewall → Rules → WireGuard

Add rule:

- Interface: WireGuard
- Source: 10.0.0.0/24
- Destination: LAN net
- Description: Allow WG clients to LAN

Also add:

- Interface: WireGuard
- Source: 10.0.0.0/24
- Destination: This firewall
- Description: Allow WG to reach OPNsense itself

Without the second rule you cannot reach 192.168.2.1.


I have successfully updated the IP and VIP related to the control plane node. I changed the references to both in the controlplane.yaml, transferred this file to my Ubuntu VM and applied the changes there. On my machine:

scp controlplane.yaml proxmox-ubuntu:/home/stan/talos-cluster

On the Ubuntu VM:

talosctl apply-config --nodes 192.168.2.59 --file talos-cluster/controlplane.yaml

Talos seems quite reactive to these kinds of changes, unlike k3s.


Setting up an Ubuntu Server VM in order to access everything inside the OPNsense LAN, like Talos (talosctl, kubectl). Using the Cloud-Init feature, I needed to download a cloud image of Ubuntu from https://cloud-images.ubuntu.com/noble/current/. A .img is not like a .iso: you cannot boot from it by attaching it to a CD-ROM drive. I resized the img, following https://github.com/UntouchedWagons/Ubuntu-CloudInit-Docs:

qemu-img resize noble-server-cloudimg-amd64.img 16G

And now going to mount that as the main drive.

qm importdisk 104 noble-server-cloudimg-amd64.img local-lvm

This is working. Using a cloud image differs from an ISO install in that the image is already pre-configured for use in a cloud setting. No installer is needed: the cloud-init vars are used to create a user, add an SSH key, etc., and you can start using the VM right away. Pretty cool.
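
For context, the cloud-init vars are a small YAML document; a minimal user-data sketch (username and key are illustrative):

```yaml
#cloud-config
users:
  - name: stan                            # illustrative username
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... stan@laptop   # your public key
```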


Installing OPNsense on Proxmox. I added two network interfaces to the VM:

- vmbr0: default bridge created by Proxmox, connected to the physical NIC of the machine (MacBook)
- vmbr1: newly created bridge without any NIC or IP (yet)

From the OPNsense console I had to configure both interfaces. vmbr0 would be the WAN (Wide Area Network) side, connected to my ISP router. vmbr1 would be the LAN side, which gets its own subnet (192.168.2.1/24 in this case). The Talos VMs will then connect to vmbr1 instead of vmbr0 and get their IP from the OPNsense DHCP server.
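
For reference, a NIC-less bridge like vmbr1 is only a few lines in the Proxmox host's /etc/network/interfaces (a sketch; Proxmox normally writes this when you create the bridge in the GUI):

```
auto vmbr1
iface vmbr1 inet manual
    bridge-ports none
    bridge-stp off
    bridge-fd 0
```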

Haven't been able to connect to the web GUI yet. The WAN IP was not responding, and the LAN IP is not accessible from my machine.

Also Traefik has stopped working again...

I can access the web GUI via the WAN IP when I disable the firewall in the shell (pfctl -d; to re-enable: pfctl -e). There should be a way to set up access via WAN securely via the web interface, but I haven't found it yet.

Trying the steps described here: https://forum.opnsense.org/index.php?topic=36950.0

1. Go to Interfaces > [WAN] deselect "Block private networks"
2. Go to Firewall > Rules > WAN and create a new rule using the parameters below, then save and apply.

  Action : Pass
  Interface : WAN
  Direction : In
  TCP/IP Version: IPv4
  Protocol: any
  Source: WAN net
  Destination: any
  Destination port range: any
  Gateway: default
  Repeat this for IPv6

3. Go to Firewall > Settings > Advanced and tick "Disable reply-to (Disable reply-to on WAN rules)"
4. Reboot (Very Important)

This worked! I can now access the web GUI from the WAN IP.

Of course now I'm in for some fun. I have switched the network interface of the two Talos VMs to vmbr1, which first of all makes them not directly accessible from my machine. I can do some port forwarding in OPNsense to forward traffic from the WAN IP on the k8s API port (6443) to the new internal LAN IP of the control plane node. But that doesn't just work, like I expected. I can either completely re-install Talos on both VMs, starting off with the now internal LAN IPs, or try to figure out how to get it working without a re-install. A reinstall takes a while, but so does figuring out a fix.


Time to get something running, let's start with whoami. For this I need Traefik. I'm installing the Helm chart using Terraform so I have everything documented.

The LoadBalancer service is not getting an external IP yet, so it stays pending, which made Terraform wait.

Added MetalLB as well via Terraform, giving the Traefik LoadBalancer an IP from my router's DHCP range.

Applied the whoami manifests, almost there, but got errors on the Traefik pods:

│ traefik-9bfb99fc6-qx97t 2025-12-04T17:38:47Z INF Updated ingress status ingress=whoami-ingress namespace=whoami                                                                            │
│ traefik-9bfb99fc6-qx97t 2025-12-04T17:38:47Z ERR Cannot create service error="service not found" ingress=whoami-ingress namespace=whoami providerName=kubernetes serviceName=whoami-svc se │
│ traefik-9bfb99fc6-qx97t 2025-12-04T17:38:47Z ERR Cannot create service error="service not found" ingress=whoami-ingress namespace=whoami providerName=kubernetes serviceName=whoami-svc se │

Changing to an IngressRoute instead of an Ingress gives me these errors:

│ 2025-12-04T20:21:14Z ERR error="kubernetes service not found: default/whoami-svc" ingress=whoami namespace=default providerName=kubernetescrd                                              │
│ 2025-12-04T20:21:37Z ERR error="kubernetes service not found: default/whoami-svc" ingress=whoami namespace=default providerName=kubernetescrd                                              │

Seems like a namespace issue, where Traefik is looking in default.

In order to re-apply Traefik via Terraform I had to run:

terraform import helm_release.traefik traefik/traefik
terraform import helm_release.metallb metallb/metallb
terraform plan

Issue around re-applying MetalLB now, interesting resource to check in k8s:

kubectl -n metallb get events --sort-by=.metadata.creationTimestamp | tail -n 30

Or just check k9s ==> :events

This issue on GitHub helped me fix deploying MetalLB: https://github.com/siderolabs/talos/issues/10291. I had to add some labels to the namespace:

pod-security.kubernetes.io/audit: privileged
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/enforce-version: latest
pod-security.kubernetes.io/warn: privileged
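
The same labels can be applied in one command (assuming the namespace is named metallb, as in the events command above):

```shell
kubectl label namespace metallb \
  pod-security.kubernetes.io/audit=privileged \
  pod-security.kubernetes.io/enforce=privileged \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/warn=privileged
```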

It seemed like initially installing MetalLB went okay, but I might be wrong here.

Continuing with getting Traefik working. Installing via Terraform leaves it in the pending-install state. Doing it directly with the helm command works:

helm upgrade --install traefik oci://ghcr.io/traefik/helm/traefik -f values/traefik.yaml -n traefik --create-namespace --debug

But! The External-IP keeps the state <pending>. Which is odd, since it did get one the first time I tried. What changed is that I am now using a Virtual IP for Talos, which might influence things. I still have to apply the MetalLB manifests! Doing that immediately gives the Traefik LoadBalancer an external IP.

Apparently the Service had to be defined before the Ingress(Route) for it to work. Although I had this working on k3s (I thought...).


I need a single IP for an HA control plane cluster. I thought of using kube-vip, but Talos has its own solution. Setting up a Virtual IP for the CP node, for when I have multiple CP nodes: https://docs.siderolabs.com/talos/v1.9/networking/vip

I added this to the controlplane.yaml:

machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: true
        vip:
          ip: 192.168.1.70

This seems to be working.


I can't get Talos to boot properly. I think it doesn't have to do with Talos itself yet, but with booting the ISO in Proxmox correctly.

The Talos docs on Proxmox say to use the "ovmf" BIOS; this did not work for me. I selected SeaBIOS (the default), which did work.

Following the rest of the instructions got me to init a control plane node and kubectl to it! Success!


macmini0 started spinning up its fans every 30 seconds or so, and I see CPU spikes as well. Looks like Loki is using the most CPU at those times; restarted a statefulset but it's still happening. kube-system CoreDNS is also spiking.

Removed all the Loki and Fluent Bit stuff, still spiking. Am I getting that much traffic? Cloudflare is not really showing crazy traffic.

At the end of the day I turned off both Mac minis. This morning on startup, no more fan spikes. I do see more traffic than usual on Cloudflare, perhaps bots visiting the URL.


Switching gears. Installing Proxmox on top of Ubuntu is not recommended at all, so I'm installing the Proxmox VE OS from USB on the MacBook now.

Also planning to get Kubernetes running via Talos Linux instead of K3s, which will hopefully help me with my CKA certification. I will start with a single control plane node VM with kube-vip to provide access to the API. Together with one worker node VM that will be a minimal cluster. When the Mac minis are Proxmox'd I will add a control plane node VM per machine to run a HA k8s cluster, including some worker nodes.

Getting the Proxmox OS running is really easy; it serves a web GUI from which you can control the node. Added my pub key manually; would be nice to do an auto install: https://pve.proxmox.com/wiki/Automated_Installation

Downloading the Talos ISO from https://factory.talos.dev/. Add the siderolabs/qemu-guest-agent extension as instructed!

And following instructions for Proxmox: https://docs.siderolabs.com/talos/v1.11/platform-specific-installations/virtualized-platforms/proxmox

I need to add an EFI Disk of 4MB; figuring out how to create one in Proxmox.
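
If I read the Proxmox docs correctly, this can be done in the GUI (Hardware > Add > EFI Disk) or from the Proxmox shell. A sketch, where the VM ID (100) and storage name (local-lvm) are assumptions:

```shell
# VM ID and storage name are assumptions; adjust to your setup.
qm set 100 --efidisk0 local-lvm:1,efitype=4m,pre-enrolled-keys=0
```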


Installing Tailscale on the MBP so I can set up the bridge and still connect to it.

Added a static IP for the MBP on the router (MAC address based). Renewed the DHCP lease:

sudo systemctl restart systemd-networkd

Pausing in Jenkins setup, adding my MacBook Pro to the cluster. The plan is to setup a Proxmox cluster on all machines, which will provide me with a Hypervisor layer. From there I can setup VMs that will run Kubernetes nodes and a virtual router for MetalLB BGP mode.

Not touching the Mac minis for now, first getting the MacBook set up. I want to learn more about cloud-init, so I'll be using that in combination with Ubuntu Server, since I've read it supports cloud-init better than Debian.

With cloud-init I'll be setting up a virtual bridge as well, which I'll connect my NIC to. The MacBook doesn't have built-in Ethernet, but I have an official Thunderbolt to Ethernet adapter.

- Mac: ac:87:a3:13:08:10 (Ethernet adapter)
- Mac: 60:f8:1d:b1:a0:74 (WiFi)

Documentation on using the autoinstall feature via cloud-init:

- https://canonical-subiquity.readthedocs-hosted.com/en/latest/tutorial/providing-autoinstall.html#providing-autoinstall
- https://canonical-subiquity.readthedocs-hosted.com/en/latest/reference/autoinstall-reference.html

I removed the networking part, since it was throwing an error on install on the MBP. The user password I created using openssl passwd.

The default network behavior lets the Ethernet NIC retrieve an IP address over DHCP dynamically. That is enough for now; I will set up the virtual bridge using Ansible later. With cloud-init + autoinstall I can SSH into the machine directly. I just need to check the IP address on the machine (or on the router, using the MAC address). This also makes the cloud-init YAML more generic, since I don't have to know the MAC address of the NIC or set a static IP.

I have to disable the lid close behavior on the MBP. I should integrate this into the cloud-init setup. Edited /etc/systemd/logind.conf:

HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
systemctl restart systemd-logind

Seems to work fine.

Change brightness:

sudo tee /sys/class/backlight/acpi_video0/brightness <<< 50

Getting the Jenkins pipeline working, starting off with manually adding it via the Web UI. Setting up a way to build and push Docker containers from the pipeline. Trying out the docker-workflow plugin: https://docs.cloudbees.com/docs/cloudbees-ci/latest/pipelines/docker-workflow Have to add credentials to the pipeline, but admin has no rights.


Installing Jenkins via Terraform. Created new providers for Kubernetes and Helm. Ran terraform init and terraform apply. Jenkins is running, but the Tailscale domain is not working yet. Maybe something to do with the port. Should add an Ingress instead; I like that approach because it shows the URL in kubectl/k9s.


Persisted the Tailscale install to a HelmChart manifest. Figuring out how to see the logs I usually get from Helm. A pod was created and completed, in this case helm-install-tailscale-operator-mhwzn, which has the logs.


Starting slow with setting up an Ansible playbook to configure a bridge between the two machines. I have a simple setup where I can ping both machines via Ansible:

USE_TAILNET=true ansible -K -i inventory.ini all -m ping

I'm not at home so I have to be careful with changing the networking settings. I want to set up a bridge and connect both machines to it. The current IP on the NIC will be disabled and the NIC will be attached to the bridge. The bridge will provide an IP address for the machines.

USE_TAILNET=true ansible-playbook -K -i inventory.ini playbooks/configure-bridge.yaml

For now Ansible cannot find the vars/interfaces.j2 file. Moving it to the playbooks folder.

Everything works as expected: the bridge is set up and the machines are connected to it. I can ping each machine from the other.


Side quest: installing the Tailscale Kubernetes operator to expose some services to the VPN with MagicDNS domain names.

helm upgrade \
  --install \
  tailscale-operator \
  tailscale/tailscale-operator \
  --namespace=tailscale \
  --create-namespace \
  --set-string oauth.clientId="<OAuth client ID>" \
  --set-string oauth.clientSecret="<OAuth client secret>" \
  --wait

Want to expose ArgoCD over Tailscale via Ingress to give it a MagicDNS domain name and TLS.

Had to configure some stuff in the Tailscale Web UI:

- Create tags: k8s-operator and k8s
- Create an OAuth client to add to the operator with the correct scopes: auth_keys, devices:core
- Enable HTTPS

Then I created an Ingress which assigns a domain name automatically; see the Address of the Ingress.
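
Such an Ingress is roughly this shape, with the Tailscale operator's ingress class; the service name and port are assumptions based on a standard ArgoCD install:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd
  namespace: argocd
spec:
  ingressClassName: tailscale      # handled by the Tailscale operator
  defaultBackend:
    service:
      name: argocd-server          # assumed service name from the standard install
      port:
        number: 80
  tls:
    - hosts:
        - argocd                   # becomes the MagicDNS hostname
```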


Bridging vs. Routing vs. NAT

| Mode | Layer | What it does | Typical use |
|---|---|---|---|
| Bridge | 2 (Ethernet) | Forwards frames between interfaces | VMs in same LAN |
| Router | 3 (IP) | Moves packets between networks/subnets | Connecting LANs |
| NAT | 3+4 | Rewrites IPs to share one address | Internet access sharing |

Subnet mask

The subnet mask tells the system which portion of the IP address refers to the network and which portion refers to the host.

- /24 → 24 bits of the address (out of 32) are for the network
- The remaining 8 bits are for hosts

| CIDR | Mask | Network | Host Range | # Hosts |
|---|---|---|---|---|
| /24 | 255.255.255.0 | 192.168.1.0 | 192.168.1.1–192.168.1.254 | 254 |
| /25 | 255.255.255.128 | 192.168.1.0 | 192.168.1.1–192.168.1.126 | 126 |
| /16 | 255.255.0.0 | 192.168.0.0 | 192.168.0.1–192.168.255.254 | 65,534 |
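
The host counts follow directly from the prefix length; a quick shell sanity check:

```shell
# Usable hosts for a /P prefix: 2^(32-P) minus the network and
# broadcast addresses.
hosts() { echo $(( (1 << (32 - $1)) - 2 )); }
hosts 24   # 254
hosts 25   # 126
hosts 16   # 65534
```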

Starting with the implementation of MetalLB. MetalLB requires a network add-on; researching options. K3s comes with Flannel as default, but I might have disabled it, not sure.

Interesting setup now where I have a network interface from Tailscale, tailscale0. But my setup will require IPs from the router at home.

Disabling ServiceLB on k3s. The docs say to disable it on all nodes, but when adding:

disable:
  - servicelb

to the worker node (macmini1) and restarting the k3s-agent service, it won't restart.

Starting off with Layer 2 mode, which uses the ARP protocol: one node claims the service IP and answers ARP requests for it on the network.

Installing MetalLB:

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.15.2/config/manifests/metallb-native.yaml

After setting up the IPAddressPool and L2Advertisement my load balancers immediately got an IP address from the range.
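
The two resources are short; roughly this shape (the address range is illustrative and should sit outside the router's DHCP pool; the namespace matches the metallb-native manifests):

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.200-192.168.1.220   # illustrative range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```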


I setup Tailscale VPN, and got access to the cluster from outside. I updated the k3s-config.yaml to include the Tailscale IP addresses:

tls-san:
  - 192.168.1.100
  - 100.66.64.12

And updated the ingress.yaml to include the Tailscale IP addresses.

Would be nice to setup deployment via Ansible, but not a priority. Requires maintenance of the playbook, not yet needed at this stage.


Also want to export Application definition to my repo. Done, see projects/gitops/argocd-portal-app.yaml.


Setting up Ingress to reach ArgoCD from domain name: https://argo-cd.readthedocs.io/en/stable/operator-manual/ingress/#traefik-v30

Have to set argocd-server as --insecure.
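
One way to set that without editing the Deployment args, if I read the ArgoCD docs right, is via the argocd-cmd-params-cm ConfigMap, then restarting the server:

```shell
kubectl -n argocd patch configmap argocd-cmd-params-cm \
  --type merge -p '{"data":{"server.insecure":"true"}}'
kubectl -n argocd rollout restart deployment argocd-server
```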


Installing ArgoCD:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

ArgoCD is working! Just not sure how ArgoCD knows about Kustomize. The Deployment image is not updated, only the kustomization.yaml is, and yet the correct image is deployed. Want to know how this works.

Docs about Kustomize (https://argo-cd.readthedocs.io/en/stable/user-guide/kustomize/):

If the kustomization.yaml file exists at the location pointed to by repoURL and path, Argo CD will render the manifests using Kustomize.


Continuing the pipeline setup by adding the Kustomize override. Testing it locally:

kustomize edit set image sguldemond/my-portal=sguldemond/my-portal:dev
kubectl kustomize base

Set "Actions permissions" in my repo, with "Workflow permissions" set to "Read and write permissions", to allow the pipeline to commit to the repo and update the image tag. Also had to set this for it to work:

    permissions:
      contents: write

Shifting to GH Actions and publishing my portal image in Docker Hub. Widely accepted stack and flow. GitHub Actions is getting stuck on building the image:

 > [build 8/8] RUN npm run build:
0.142 
0.142 > my-portal@0.0.1 build
0.142 > vite build
0.142 
0.146 sh: 1: vite: not found

Trying to simulate what the pipeline does:

docker buildx build -f Dockerfile .

This works. Now running the pipeline using act, from repo root:

act --secret-file act.secrets workflow_dispatch

Act needs these secrets:

DOCKERHUB_USERNAME
DOCKERHUB_TOKEN
GITHUB_TOKEN

I can run it using act and get the same error, so no closer to a solution. Some odd behavior around building the portal on GH Actions. For now I added RUN npm install -g @sveltejs/kit vite so the runner can access svelte-kit and vite. New error:

Error [ERR_MODULE_NOT_FOUND]: Cannot find package '@tailwindcss/vite' imported from /app/node_modules/.vite-temp/vite.config.ts.timestamp-1759655627539-9542043a22014.mjs

Maybe since I install vite globally I need to install @tailwindcss/vite globally as well? Nope. Apparently I had to include dev dependencies specifically:

RUN npm ci --include=dev
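
That makes sense: when npm runs in production mode (e.g. NODE_ENV=production in the build environment), npm ci skips devDependencies, and vite lives there. A build-stage sketch; the base image and paths are illustrative, not the actual Dockerfile:

```dockerfile
# Illustrative multi-stage build; adjust image/paths to the real Dockerfile.
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
# --include=dev forces devDependencies (vite, @sveltejs/kit) to be installed
# even when npm is in production mode.
RUN npm ci --include=dev
COPY . .
RUN npm run build
```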

Created a pipeline, got some errors in Woodpecker. Couldn't easily run a linter on the pipeline, so I have to push fixes until it works. Seems like I cannot run images in privileged mode until the project is set as trusted.

Insufficient trust level to use privileged mode

For this you need to be an admin, so I updated WOODPECKER_ADMIN to "sguldemond" in the Helm values. After setting this up I didn't see the option to make the project "trusted". Works now; I needed to log out and log back in to refresh the access token! The CLI is available here: https://github.com/woodpecker-ci/woodpecker/releases From the UI you can grab a token and locally do:

export WOODPECKER_SERVER="http://woodpecker.macmini.home"
export WOODPECKER_TOKEN="xxx"

Then you can run the cli locally, e.g.:

woodpecker-cli admin user ls
woodpecker-cli lint .woodpecker/build-push-portal.yaml

Getting stuck on this:

-> % woodpecker-cli exec .woodpecker/build-push-portal.yaml
🔥 .woodpecker/build-push-portal.yaml has 2 errors:
   ❌ steps.build-and-push   Insufficient trust level to use `privileged` mode
   ❌ services.buildkitd Insufficient trust level to use `privileged` mode
6:22PM FTL error running cli error="config has errors"

Even though I gave the project all the trusted checks. It could be that it's just not working very well...


Trying Woodpecker. Nice command to get the values file for a Helm chart, to see the options and adjust if needed:

helm show values oci://ghcr.io/woodpecker-ci/helm/woodpecker > values.yaml

If values are updated, you can do:

helm install woodpecker \
  oci://ghcr.io/woodpecker-ci/helm/woodpecker \
  -f values.yaml

Woodpecker requires a forge to be setup, getting an error from the server pod:

can't setup globals: could not setup service manager: forge not configured

Setting up GitHub via OAuth client app. Added these env vars:

WOODPECKER_GITHUB: "true"
WOODPECKER_GITHUB_CLIENT: <github-oauth-client-id>
WOODPECKER_GITHUB_SECRET: xxx (in BitWarden)

Update Woodpecker via Helm:

helm upgrade --install woodpecker oci://ghcr.io/woodpecker-ci/helm/woodpecker -f values.yaml

By default there is no Ingress, so I'll add my own. The Ingress works; navigating to the URL shows the Woodpecker UI. Login with github.com fails with an error on the server pod:

cannot register sguldemond. registration closed

Had to set WOODPECKER_OPEN to true. And remember to re-set the client token. Now I can log in and add my homelab repo.


Added the Gitea Helm chart to the cluster. Pods fail to start; they cannot pull images from docker.io. Bitnami has deprecated its Debian-based images and removed the tags, and Gitea has not updated their Helm chart to reflect this change.