Network Flow
How traffic flows from the public internet through Bifrost VPS into the Talos cluster — and how LAN clients bypass the VPS entirely.
Overview
Traffic takes one of four paths depending on where the client is and which service it is accessing:
| Path | Client | Route |
|---|---|---|
| Public → Protected | Internet browser | Cloudflare → Bifrost Traefik → Authentik ForwardAuth → WireGuard → Cilium Gateway → Pod |
| Public VPS-native | Internet browser | Cloudflare → Bifrost Traefik → container on VPS (Authentik, NetBird) |
| LAN direct | Home network device | DNS → Cilium L2 LB → Gateway API → Pod |
| VPN remote | NetBird client (laptop/phone) | WireGuard mesh → routing peer → cluster subnet |
DNS Split
The split happens at DNS. Specific hostnames point to Bifrost (Hetzner VPS); everything else resolves to the Cilium L2 LoadBalancer IP on the home network.
| Hostname | DNS A record | Where traffic lands |
|---|---|---|
| auth.madhan.app | 178.156.199.250 (Hetzner) | Bifrost: Authentik container |
| netbird.madhan.app | 178.156.199.250 (Hetzner) | Bifrost: NetBird dashboard |
| proxy.madhan.app | 178.156.199.250 (Hetzner) | Bifrost: NetBird reverse proxy |
| grafana.madhan.app | 178.156.199.250 (Hetzner) | Bifrost → WireGuard → Cluster |
| *.madhan.app (all others) | 192.168.1.220 (LAN) | Cilium Gateway: LAN only |
Opting in a service to internet exposure: edit the publicServices slice in core/cloud/cloudflare.go and run just core cloudflare up. Traefik routes are auto-generated by generateTraefikPublicServices() in hetzner.go.
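The split can be read as a simple lookup: names on the public list get the Hetzner address, and every other name under the wildcard gets the LAN address. The following is a minimal Go sketch of that rule, not the project's code. The `publicHosts` map and the `aRecordFor` function are hypothetical names that mirror the table above; the actual shape of `publicServices` in core/cloud/cloudflare.go is not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	bifrostIP = "178.156.199.250" // Hetzner VPS, from the table above
	lanLBIP   = "192.168.1.220"   // Cilium L2 LoadBalancer IP
	zone      = "madhan.app"
)

// publicHosts is a hypothetical stand-in for the hostnames that have an
// explicit A record pointing at Bifrost.
var publicHosts = map[string]bool{
	"auth.madhan.app":    true,
	"netbird.madhan.app": true,
	"proxy.madhan.app":   true,
	"grafana.madhan.app": true,
}

// aRecordFor returns the address a hostname resolves to under the split.
// The second result is false when the name is outside the zone.
func aRecordFor(host string) (string, bool) {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	if publicHosts[host] {
		return bifrostIP, true
	}
	if strings.HasSuffix(host, "."+zone) {
		return lanLBIP, true // wildcard record: reachable on the LAN only
	}
	return "", false
}

func main() {
	// someapp.madhan.app is a made-up name used to exercise the wildcard.
	for _, h := range []string{"grafana.madhan.app", "someapp.madhan.app", "example.com"} {
		ip, ok := aRecordFor(h)
		fmt.Printf("%-22s -> %q (in zone: %v)\n", h, ip, ok)
	}
}
```

An exact-match check before the wildcard keeps the public list authoritative, which matches how a specific DNS record takes precedence over a wildcard record.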
High-Level Traffic Diagram
flowchart TB
subgraph INET["Internet"]
BROWSER["Browser / Client"]
CF["Cloudflare DNS<br/>grafana → 178.156.199.250<br/>*.madhan.app → 192.168.1.220"]
end
subgraph BIF["Bifrost VPS · 178.156.199.250"]
TR["Traefik :443<br/>TLS termination<br/>ForwardAuth middleware"]
AUTH["Authentik<br/>SSO · GitHub OAuth"]
NBS["NetBird Server<br/>management + signal + relay + STUN"]
NBAGENT["netbird-agent<br/>network_mode: host<br/>wt0: 100.109.47.211"]
end
subgraph WG["WireGuard Mesh · 100.109.0.0/16"]
direction LR
RELAY["NetBird Relay<br/>TURN/STUN via Bifrost"]
end
subgraph K8S["Talos Cluster · worker1 · 192.168.1.221"]
NBPEER["netbird-peer pod<br/>hostNetwork · wt0: 100.109.244.71<br/>routes 192.168.1.0/24"]
MASQ["iptables MASQUERADE<br/>src 100.109.x → 192.168.1.221"]
CILIUMETH["Cilium BPF · eth0<br/>L7LB DNAT → Envoy :13507"]
end
subgraph GW["Cilium Gateway API · 192.168.1.220"]
ENV["Cilium Envoy proxy<br/>HTTPRoute matching"]
POD["Grafana pod<br/>grafana.madhan.app"]
end
subgraph LAN["Home LAN · 192.168.1.0/24"]
LANCLI["LAN browser<br/>DNS → 192.168.1.220<br/>direct · no VPS hop"]
end
BROWSER -->|"DNS grafana.madhan.app<br/>→ 178.156.199.250"| CF
CF --> TR
TR -->|"ForwardAuth /outpost.goauthentik.io/auth/nginx"| AUTH
AUTH -->|"200 OK (authenticated)"| TR
TR -->|"proxy to 192.168.1.220:80<br/>via host network"| NBAGENT
NBAGENT -->|"WireGuard · 100.109.x.x"| RELAY
RELAY -->|"decapsulated on wt0"| NBPEER
NBPEER -->|"kernel IP forward<br/>wt0 → eth0"| MASQ
MASQ -->|"src=192.168.1.221<br/>dst=192.168.1.220"| CILIUMETH
CILIUMETH --> ENV
ENV --> POD
LANCLI -->|"DNS → 192.168.1.220<br/>direct"| CILIUMETH
CILIUMETH --> ENV
Public Request — Packet-Level Detail
When a user opens https://grafana.madhan.app, the request traverses seven distinct hops before reaching the Grafana pod.
sequenceDiagram
actor Browser
participant CF as Cloudflare DNS
participant TR as Traefik<br/>178.156.199.250:443
participant AU as Authentik<br/>auth.madhan.app
participant NA as netbird-agent<br/>wt0:100.109.47.211
participant RE as NetBird Relay<br/>rels://netbird.madhan.app:443
participant NP as netbird-peer<br/>wt0:100.109.244.71<br/>worker1 eth0:192.168.1.221
participant GW as Cilium Gateway<br/>192.168.1.220
participant GR as Grafana pod
Browser->>CF: DNS? grafana.madhan.app
CF-->>Browser: 178.156.199.250
Browser->>TR: TLS ClientHello → HTTPS GET /
Note over TR: Terminates TLS<br/>wildcard cert (Cloudflare ACME)
TR->>AU: ForwardAuth sub-request<br/>GET /outpost.goauthentik.io/auth/nginx
alt not authenticated
AU-->>TR: 401
TR-->>Browser: redirect → auth.madhan.app/login
Browser->>AU: GitHub OAuth flow
AU-->>Browser: session cookie
Browser->>TR: retry GET / with cookie
TR->>AU: ForwardAuth sub-request
end
AU-->>TR: 200 OK
Note over TR,NA: Traefik proxies to 192.168.1.220:80<br/>via Docker bridge → host wt0
TR->>NA: HTTP GET / Host:grafana.madhan.app<br/>src:172.30.0.10 → dst:192.168.1.220:80
Note over NA: wt0 AllowedIPs: 192.168.1.0/24<br/>WireGuard encapsulates packet
NA->>RE: WireGuard UDP (encrypted)<br/>src:100.109.47.211 → dst:100.109.244.71
RE->>NP: relay → decapsulated on wt0<br/>src:100.109.47.211 dst:192.168.1.220:80
Note over NP: No Cilium BPF on wt0 (NOARP/P2P)<br/>kernel IP forwards: wt0 → eth0<br/>CILIUM_POST_nat MASQUERADE:<br/>src 100.109.47.211 → 192.168.1.221
NP->>GW: src:192.168.1.221 dst:192.168.1.220:80<br/>arrives on L2-announced node eth0
Note over GW: Cilium cil_from_netdev (eth0)<br/>L7LB DNAT: dst → Envoy :13507<br/>Envoy matches HTTPRoute for grafana.madhan.app
GW->>GR: HTTP GET / Host:grafana.madhan.app
GR-->>GW: 302 Found → /login
GW-->>NP: response (conntrack reverses DNAT)
NP-->>NA: MASQUERADE conntrack reverses src NAT<br/>WireGuard re-encrypts
NA-->>TR: HTTP 302
TR-->>Browser: HTTPS 302 (TLS wrapped)
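The `alt not authenticated` branch in the sequence above follows the general ForwardAuth contract: a 2xx answer from the auth server lets the request through, and any other answer is relayed back to the browser. Below is a simplified Go model of that decision, assuming a stub auth server that treats a `session` cookie as proof of login. It is a sketch for intuition only, not Traefik's or Authentik's implementation; `forwardAuth`, `allowed`, and `authClient` are illustrative names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// allowed reports whether the auth server's status grants access.
// ForwardAuth treats any 2xx response as success.
func allowed(status int) bool { return status >= 200 && status < 300 }

// authClient never follows redirects, so the auth server's own answer
// is what gets relayed to the browser.
var authClient = &http.Client{
	Timeout: 5 * time.Second,
	CheckRedirect: func(*http.Request, []*http.Request) error {
		return http.ErrUseLastResponse
	},
}

// forwardAuth asks authURL about every request before calling next.
func forwardAuth(authURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, authURL, nil)
		if err != nil {
			http.Error(w, "bad auth request", http.StatusInternalServerError)
			return
		}
		if c := r.Header.Get("Cookie"); c != "" {
			req.Header.Set("Cookie", c) // pass the session cookie along
		}
		req.Header.Set("X-Forwarded-Host", r.Host)

		resp, err := authClient.Do(req)
		if err != nil {
			http.Error(w, "auth server unreachable", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()

		if !allowed(resp.StatusCode) {
			if loc := resp.Header.Get("Location"); loc != "" {
				w.Header().Set("Location", loc)
			}
			w.WriteHeader(resp.StatusCode) // relay the auth server's answer
			return
		}
		next.ServeHTTP(w, r) // authenticated: hand over to the backend
	})
}

func main() {
	// Stub auth server: a "session" cookie means authenticated.
	auth := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, err := r.Cookie("session"); err != nil {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer auth.Close()

	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "backend reached")
	})
	h := forwardAuth(auth.URL, backend)

	anon := httptest.NewRequest(http.MethodGet, "http://grafana.madhan.app/", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, anon)
	fmt.Println("without cookie:", rec.Code)

	authed := httptest.NewRequest(http.MethodGet, "http://grafana.madhan.app/", nil)
	authed.AddCookie(&http.Cookie{Name: "session", Value: "demo"})
	rec = httptest.NewRecorder()
	h.ServeHTTP(rec, authed)
	fmt.Println("with cookie:", rec.Code)
}
```

Keeping the decision in a tiny `allowed` helper makes the rule easy to see: the tunnel hop only ever happens after the auth server has said yes.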
IP Address Transformation at Each Hop
| Hop | Interface | Source IP | Destination IP | What changes |
|---|---|---|---|---|
| Browser → Traefik | public internet | client IP | 178.156.199.250 | TLS terminated |
| Traefik → netbird-agent | Docker bridge bifrost_net | 172.30.0.10 (Traefik) | 192.168.1.220 | Nothing — Docker routes to host |
| netbird-agent wt0 → relay | WireGuard encapsulated | 100.109.47.211 | 100.109.244.71 | Original IP hidden inside WireGuard |
| relay → netbird-peer wt0 | decapsulated WireGuard | 100.109.47.211 | 192.168.1.220 | Original packet restored |
| wt0 → eth0 (kernel fwd) | worker1 | 100.109.47.211 | 192.168.1.220 | IP forwarding only |
| MASQUERADE (CILIUM_POST_nat) | worker1 eth0 | 192.168.1.221 | 192.168.1.220 | Source NAT — cluster can reply |
| eth0 → Cilium LB | another worker's eth0 | 192.168.1.221 | Envoy :13507 | L7LB DNAT by Cilium BPF |
| Envoy → Grafana pod | pod overlay | pod IP | Grafana pod IP | L7 routing by HTTPRoute |
Why wt0 is NOT in Cilium Devices
During debugging of the 504 Gateway Timeout, the wt0 WireGuard interface was added to Cilium's devices list under the assumption that Cilium's TC BPF programs needed to be attached to it for LB DNAT. This was incorrect.
The root cause: wt0 is a NOARP/POINTOPOINT WireGuard interface — it has no Ethernet header. Cilium's cil_from_netdev TC BPF program expects Ethernet frames (IEEE 802.3) and silently misparsed all packets arriving on wt0, dropping them without any monitor events. The cilium-dbg monitor showed zero traces for wt0 traffic even when packets were confirmed arriving (verified by /proc/net/dev byte counters).
The fix: Remove wt0 from devices in core/platform/cilium.go. With no BPF on wt0, the kernel handles the forwarded packets normally:
- Packet arrives on wt0 (decapsulated by WireGuard)
- Kernel IP forwarding routes it toward eth0 (192.168.1.0/24 is directly connected)
- The CILIUM_POST_nat iptables chain applies MASQUERADE (source → 192.168.1.221)
- Packet goes on the LAN wire to 192.168.1.220
- The node that owns ARP for 192.168.1.220 receives it on its eth0
- Cilium's cil_from_netdev on that eth0, a real Ethernet interface, does the L7LB DNAT correctly
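The rule behind the fix (only Ethernet-like interfaces belong in Cilium's devices list) can be checked mechanically, because the kernel exposes the point-to-point flag on each interface. The Go sketch below filters on that flag using the standard library's `net.FlagPointToPoint`. The `link` type and both function names are hypothetical and are not taken from core/platform/cilium.go. On a real node the list would also contain veth and Cilium-managed interfaces, which this sketch makes no attempt to exclude.

```go
package main

import (
	"fmt"
	"net"
)

// link is a minimal view of a network interface for this sketch.
type link struct {
	Name         string
	PointToPoint bool // set on WireGuard interfaces such as wt0
	Loopback     bool
}

// ethernetLike keeps interfaces that carry an Ethernet header, which is
// what cil_from_netdev expects. Point-to-point and loopback links are dropped.
func ethernetLike(links []link) []string {
	var out []string
	for _, l := range links {
		if l.PointToPoint || l.Loopback {
			continue
		}
		out = append(out, l.Name)
	}
	return out
}

// hostLinks reads the interfaces of the machine this program runs on.
func hostLinks() ([]link, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	links := make([]link, 0, len(ifaces))
	for _, i := range ifaces {
		links = append(links, link{
			Name:         i.Name,
			PointToPoint: i.Flags&net.FlagPointToPoint != 0,
			Loopback:     i.Flags&net.FlagLoopback != 0,
		})
	}
	return links, nil
}

func main() {
	// Interfaces as described for worker1 in this section.
	worker1 := []link{{Name: "eth0"}, {Name: "wt0", PointToPoint: true}}
	fmt.Println(ethernetLike(worker1)) // [eth0]

	// The same filter applied to the local machine.
	if links, err := hostLinks(); err == nil {
		fmt.Println(ethernetLike(links))
	}
}
```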
LAN Direct Access
LAN clients resolve *.madhan.app (except the explicitly listed public ones) to 192.168.1.220 — the Cilium L2 LoadBalancer IP. Traffic never leaves the home network.
Bypassing SSO on LAN: For any public service, you can override DNS locally to hit the cluster directly and skip the Authentik ForwardAuth redirect:
# /etc/hosts: bypasses Hetzner + ForwardAuth
192.168.1.220 grafana.madhan.app
VPN Remote Access
From anywhere in the world, a connected NetBird client (laptop or phone) can reach cluster services directly:
- NetBird client connects to the WireGuard mesh via netbird.madhan.app:443 (Traefik → NetBird server)
- Routing peer (K8s pod) advertises 192.168.1.0/24 into the mesh
- Cluster IPs (192.168.1.220–230) are routable from the client: no browser, no SSO redirect
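Whether a given destination travels through the mesh comes down to a prefix match against what the routing peer advertises. A minimal Go sketch of that check, using the standard `net/netip` package, is shown below. The function name `routedViaMesh` is hypothetical and the advertised prefix is the one named in this section; NetBird's actual route selection involves more than this single test.

```go
package main

import (
	"fmt"
	"net/netip"
)

// routedViaMesh reports whether dst falls inside any advertised prefix.
func routedViaMesh(dst string, advertised []string) (bool, error) {
	addr, err := netip.ParseAddr(dst)
	if err != nil {
		return false, err
	}
	for _, p := range advertised {
		prefix, err := netip.ParsePrefix(p)
		if err != nil {
			return false, err
		}
		if prefix.Contains(addr) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	advertised := []string{"192.168.1.0/24"} // advertised by the routing peer

	for _, dst := range []string{"192.168.1.220", "192.168.1.230", "10.0.0.5"} {
		ok, err := routedViaMesh(dst, advertised)
		if err != nil {
			fmt.Println(dst, "error:", err)
			continue
		}
		fmt.Println(dst, "via mesh:", ok)
	}
}
```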
Traefik Routing in Detail
Traefik on Bifrost uses the file provider only (no Docker provider). Routes are defined in core/cloud/bifrost/traefik/dynamic/:
| File | Contents |
|---|---|
| services.yml | Static routes: Authentik, NetBird GRPC, NetBird REST, NetBird dashboard, TCP wildcard for *.proxy.madhan.app |
| public-services.yml | Auto-generated by hetzner.go: public homelab routes with ForwardAuth |
public-services.yml is gitignored and regenerated on every just core hetzner up. To expose a new service, add its name to publicServices in core/cloud/cloudflare.go.
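To make the generated file less abstract, here is a hedged Go sketch of what rendering one router and one service per public name could look like in Traefik's file-provider format. It is not the body of `generateTraefikPublicServices()`, which this document does not show. The entry point name `websecure`, the middleware name `authentik`, and the function names are assumptions; the backend address `http://192.168.1.220:80` is the Cilium Gateway target named earlier in this page.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// routerRule builds the Traefik Host matcher for one service name.
func routerRule(name, zone string) string {
	return fmt.Sprintf("Host(`%s.%s`)", name, zone)
}

// renderPublicServices emits a file-provider document with one router and
// one service per name. Every service points at the same backend because
// the Cilium Gateway picks the final pod from the Host header.
func renderPublicServices(names []string, zone, backendURL string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted) // stable output regardless of slice order

	var b strings.Builder
	b.WriteString("http:\n  routers:\n")
	for _, n := range sorted {
		fmt.Fprintf(&b, "    %s:\n", n)
		fmt.Fprintf(&b, "      rule: \"%s\"\n", routerRule(n, zone))
		b.WriteString("      entryPoints: [websecure] # assumed entry point name\n")
		b.WriteString("      middlewares: [authentik] # assumed ForwardAuth middleware name\n")
		fmt.Fprintf(&b, "      service: %s\n", n)
		b.WriteString("      tls: {}\n")
	}
	b.WriteString("  services:\n")
	for _, n := range sorted {
		fmt.Fprintf(&b, "    %s:\n", n)
		b.WriteString("      loadBalancer:\n        servers:\n")
		fmt.Fprintf(&b, "          - url: \"%s\"\n", backendURL)
	}
	return b.String()
}

func main() {
	// A hypothetical one-element list standing in for publicServices.
	publicServices := []string{"grafana"}
	fmt.Print(renderPublicServices(publicServices, "madhan.app", "http://192.168.1.220:80"))
}
```

Sorting the names before rendering keeps the output deterministic, which is convenient for a file that is regenerated on every run.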
Troubleshooting the Public Traffic Path
504 Gateway Timeout from public URL
Traefik is reachable but cannot proxy to the cluster backend.
# 1. Check netbird-agent is running and connected on Bifrost
ssh root@178.156.199.250 'docker exec netbird-agent netbird status'
# Must show: Management: Connected, Peers count: 1/1 Connected
# 2. Check the route is selected
ssh root@178.156.199.250 'docker exec netbird-agent netbird routes list'
# Must show: 192.168.1.0/24 Status: Selected
# 3. Verify kernel route table has the tunnel entry
ssh root@178.156.199.250 'ip route show table 7120'
# Must show: 192.168.1.0/24 dev wt0
# 4. Test direct HTTP to the Cilium LB from Bifrost
ssh root@178.156.199.250 'curl -sv -H "Host: grafana.madhan.app" --connect-timeout 5 http://192.168.1.220/'
# Expect: HTTP/1.1 302 Found
If step 4 times out, the WireGuard → cluster path is broken. Continue to the next section.
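Step 4 can also be expressed as a small Go probe, which is handy when the check should run from a program rather than an interactive shell. The sketch below sends one request with an explicit Host header, does not follow redirects (so Grafana's 302 is reported as-is), and maps the result to a short verdict. The function names and verdict strings are illustrative assumptions; the address, hostname, and 5 second timeout come from the curl command above.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// probe sends one GET to url with the given Host header and returns the
// status code. Redirects are not followed.
func probe(url, host string, timeout time.Duration) (int, error) {
	client := &http.Client{
		Timeout: timeout,
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	req.Host = host // same effect as curl -H "Host: ..."
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// verdict turns a probe result into a one-line diagnosis.
func verdict(status int, err error) string {
	switch {
	case err != nil:
		return "tunnel or cluster path broken: " + err.Error()
	case status >= 200 && status < 400:
		return "path OK"
	default:
		return fmt.Sprintf("reached a backend, but got HTTP %d", status)
	}
}

func main() {
	// Meant to run on Bifrost, where 192.168.1.0/24 is routed over wt0.
	status, err := probe("http://192.168.1.220/", "grafana.madhan.app", 5*time.Second)
	fmt.Println(verdict(status, err))
}
```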
WireGuard tunnel up but traffic not flowing to cluster
The netbird-peer pod is connected but 192.168.1.220 is unreachable from Bifrost.
# Check netbird-peer pod status
kubectl exec -n netbird statefulset/netbird-peer -- netbird status
# Must show: Networks: 192.168.1.0/24, Peers count: 1/1 Connected
# Confirm WireGuard bytes are flowing during a test
kubectl exec -n netbird statefulset/netbird-peer -- cat /proc/net/dev | grep wt0
# RX bytes should increase while making requests
# From Bifrost, send the same request through the tunnel to another worker's NodePort (skips the L2-announced 192.168.1.220)
ssh root@178.156.199.250 'curl -sv -H "Host: grafana.madhan.app" --connect-timeout 3 http://192.168.1.222:32601/'
# If this works but 192.168.1.220 times out, the issue is on worker1 — check Cilium pod health
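
The byte-counter check above relies on the layout of /proc/net/dev, where each interface line is `name:` followed by the receive counters, starting with RX bytes. The Go sketch below extracts that first counter for a named interface so two readings can be compared during a test request. `rxBytes` is a hypothetical helper, not part of the project.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// rxBytes extracts the receive byte counter for iface from the text of
// /proc/net/dev. The second result is false if the interface is absent
// or the line cannot be parsed.
func rxBytes(procNetDev, iface string) (uint64, bool) {
	for _, line := range strings.Split(procNetDev, "\n") {
		name, rest, found := strings.Cut(line, ":")
		if !found || strings.TrimSpace(name) != iface {
			continue // header lines have no colon; other interfaces are skipped
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			return 0, false
		}
		n, err := strconv.ParseUint(fields[0], 10, 64)
		if err != nil {
			return 0, false
		}
		return n, true
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/net/dev")
	if err != nil {
		fmt.Println("cannot read /proc/net/dev:", err)
		return
	}
	if n, ok := rxBytes(string(data), "wt0"); ok {
		fmt.Println("wt0 RX bytes:", n)
	} else {
		fmt.Println("wt0 not found on this host")
	}
}
```

Reading the counter twice, before and after a request from Bifrost, shows whether packets reach wt0 at all, which separates a tunnel problem from a forwarding problem on the node.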