Network Flow

How traffic flows from the public internet through the Bifrost VPS into the Talos cluster, and how LAN clients bypass the VPS entirely.

Overview

Traffic takes one of four paths depending on where the client is and which service it is accessing:

| Path | Client | Route |
| --- | --- | --- |
| Public → Protected | Internet browser | Cloudflare → Bifrost Traefik → Authentik ForwardAuth → WireGuard → Cilium Gateway → Pod |
| Public VPS-native | Internet browser | Cloudflare → Bifrost Traefik → container on VPS (Authentik, NetBird) |
| LAN direct | Home network device | DNS → Cilium L2 LB → Gateway API → Pod |
| VPN remote | NetBird client (laptop/phone) | WireGuard mesh → routing peer → cluster subnet |

DNS Split

The split happens at DNS. Specific hostnames point to Bifrost (Hetzner VPS); everything else resolves to the Cilium L2 LoadBalancer IP on the home network.

| Hostname | DNS A record | Where traffic lands |
| --- | --- | --- |
| auth.madhan.app | 178.156.199.250 (Hetzner) | Bifrost (Authentik container) |
| netbird.madhan.app | 178.156.199.250 (Hetzner) | Bifrost (NetBird dashboard) |
| proxy.madhan.app | 178.156.199.250 (Hetzner) | Bifrost (NetBird reverse proxy) |
| grafana.madhan.app | 178.156.199.250 (Hetzner) | Bifrost → WireGuard → Cluster |
| *.madhan.app (all others) | 192.168.1.220 (LAN) | Cilium Gateway (LAN only) |

To expose a service to the internet, add it to the publicServices slice in core/cloud/cloudflare.go and run just core cloudflare up. The matching Traefik routes are generated automatically by generateTraefikPublicServices() in hetzner.go.
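The split can be pictured as a lookup with an explicit allow-list in front of a wildcard. The sketch below is only a model of the table above, written in Python for brevity; the `PUBLIC_HOSTS` set and the `resolve` helper are hypothetical names and are not taken from the repository's Go code.

```python
from typing import Optional

BIFROST_IP = "178.156.199.250"  # Hetzner VPS, from the table above
LAN_LB_IP = "192.168.1.220"     # Cilium L2 LoadBalancer IP, from the table above

# Hostnames with an explicit A record pointing at Bifrost (from the table above).
# Hypothetical stand-in for whatever the real publicServices slice produces.
PUBLIC_HOSTS = {
    "auth.madhan.app",
    "netbird.madhan.app",
    "proxy.madhan.app",
    "grafana.madhan.app",
}


def resolve(hostname: str) -> Optional[str]:
    """Return the A record a client would receive under the DNS split."""
    if hostname in PUBLIC_HOSTS:
        # Explicit records win over the wildcard.
        return BIFROST_IP
    if hostname.endswith(".madhan.app"):
        # Everything else under the zone falls through to the wildcard.
        return LAN_LB_IP
    return None  # not part of this zone


print(resolve("grafana.madhan.app"))
print(resolve("anything-else.madhan.app"))
```

Under this model, opting a service in amounts to moving its hostname from the wildcard branch into the explicit set.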


High-Level Traffic Diagram

 flowchart TB
    subgraph INET["Internet"]
        BROWSER["Browser / Client"]
        CF["Cloudflare DNS<br/>grafana → 178.156.199.250<br/>*.madhan.app → 192.168.1.220"]
    end

    subgraph BIF["Bifrost VPS · 178.156.199.250"]
        TR["Traefik :443<br/>TLS termination<br/>ForwardAuth middleware"]
        AUTH["Authentik<br/>SSO · GitHub OAuth"]
        NBS["NetBird Server<br/>management + signal + relay + STUN"]
        NBAGENT["netbird-agent<br/>network_mode: host<br/>wt0: 100.109.47.211"]
    end

    subgraph WG["WireGuard Mesh · 100.109.0.0/16"]
        direction LR
        RELAY["NetBird Relay<br/>TURN/STUN via Bifrost"]
    end

    subgraph K8S["Talos Cluster · worker1 · 192.168.1.221"]
        NBPEER["netbird-peer pod<br/>hostNetwork · wt0: 100.109.244.71<br/>routes 192.168.1.0/24"]
        MASQ["iptables MASQUERADE<br/>src 100.109.x → 192.168.1.221"]
        CILIUMETH["Cilium BPF · eth0<br/>L7LB DNAT → Envoy :13507"]
    end

    subgraph GW["Cilium Gateway API · 192.168.1.220"]
        ENV["Cilium Envoy proxy<br/>HTTPRoute matching"]
        POD["Grafana pod<br/>grafana.madhan.app"]
    end

    subgraph LAN["Home LAN · 192.168.1.0/24"]
        LANCLI["LAN browser<br/>DNS → 192.168.1.220<br/>direct · no VPS hop"]
    end

    BROWSER -->|"DNS grafana.madhan.app<br/>→ 178.156.199.250"| CF
    CF --> TR
    TR -->|"ForwardAuth /outpost.goauthentik.io/auth/nginx"| AUTH
    AUTH -->|"200 OK (authenticated)"| TR
    TR -->|"proxy to 192.168.1.220:80<br/>via host network"| NBAGENT
    NBAGENT -->|"WireGuard · 100.109.x.x"| RELAY
    RELAY -->|"decapsulated on wt0"| NBPEER
    NBPEER -->|"kernel IP forward<br/>wt0 → eth0"| MASQ
    MASQ -->|"src=192.168.1.221<br/>dst=192.168.1.220"| CILIUMETH
    CILIUMETH --> ENV
    ENV --> POD

    LANCLI -->|"DNS → 192.168.1.220<br/>direct"| CILIUMETH
    CILIUMETH --> ENV

Public Request — Packet-Level Detail

When a user opens https://grafana.madhan.app, the request traverses seven distinct hops before reaching the Grafana pod.

 sequenceDiagram
    actor Browser
    participant CF as Cloudflare DNS
    participant TR as Traefik<br/>178.156.199.250:443
    participant AU as Authentik<br/>auth.madhan.app
    participant NA as netbird-agent<br/>wt0:100.109.47.211
    participant RE as NetBird Relay<br/>rels://netbird.madhan.app:443
    participant NP as netbird-peer<br/>wt0:100.109.244.71<br/>worker1 eth0:192.168.1.221
    participant GW as Cilium Gateway<br/>192.168.1.220
    participant GR as Grafana pod

    Browser->>CF: DNS? grafana.madhan.app
    CF-->>Browser: 178.156.199.250

    Browser->>TR: TLS ClientHello → HTTPS GET /
    Note over TR: Terminates TLS<br/>wildcard cert (Cloudflare ACME)

    TR->>AU: ForwardAuth sub-request<br/>GET /outpost.goauthentik.io/auth/nginx
    alt not authenticated
        AU-->>TR: 401
        TR-->>Browser: redirect → auth.madhan.app/login
        Browser->>AU: GitHub OAuth flow
        AU-->>Browser: session cookie
        Browser->>TR: retry GET / with cookie
        TR->>AU: ForwardAuth sub-request
    end
    AU-->>TR: 200 OK

    Note over TR,NA: Traefik proxies to 192.168.1.220:80<br/>via Docker bridge → host wt0

    TR->>NA: HTTP GET / Host:grafana.madhan.app<br/>src:172.30.0.10 → dst:192.168.1.220:80
    Note over NA: wt0 AllowedIPs: 192.168.1.0/24<br/>WireGuard encapsulates packet
    NA->>RE: WireGuard UDP (encrypted)<br/>src:100.109.47.211 → dst:100.109.244.71
    RE->>NP: relay → decapsulated on wt0<br/>src:100.109.47.211 dst:192.168.1.220:80

    Note over NP: No Cilium BPF on wt0 (NOARP/P2P)<br/>kernel IP forwards: wt0 → eth0<br/>CILIUM_POST_nat MASQUERADE:<br/>src 100.109.47.211 → 192.168.1.221

    NP->>GW: src:192.168.1.221 dst:192.168.1.220:80<br/>arrives on L2-announced node eth0
    Note over GW: Cilium cil_from_netdev (eth0)<br/>L7LB DNAT: dst → Envoy :13507<br/>Envoy matches HTTPRoute for grafana.madhan.app
    GW->>GR: HTTP GET / Host:grafana.madhan.app
    GR-->>GW: 302 Found → /login
    GW-->>NP: response (conntrack reverses DNAT)
    NP-->>NA: MASQUERADE conntrack reverses src NAT<br/>WireGuard re-encrypts
    NA-->>TR: HTTP 302
    TR-->>Browser: HTTPS 302 (TLS wrapped)

IP Address Transformation at Each Hop

| Hop | Interface | Source IP | Destination IP | What changes |
| --- | --- | --- | --- | --- |
| Browser → Traefik | public internet | client IP | 178.156.199.250 | TLS terminated |
| Traefik → netbird-agent | Docker bridge bifrost_net | 172.30.0.10 (Traefik) | 192.168.1.220 | Nothing (Docker routes to host) |
| netbird-agent wt0 → relay | WireGuard encapsulated | 100.109.47.211 | 100.109.244.71 | Original IP hidden inside WireGuard |
| relay → netbird-peer wt0 | decapsulated WireGuard | 100.109.47.211 | 192.168.1.220 | Original packet restored |
| wt0 → eth0 (kernel fwd) | worker1 | 100.109.47.211 | 192.168.1.220 | IP forwarding only |
| MASQUERADE (CILIUM_POST_nat) | worker1 eth0 | 192.168.1.221 | 192.168.1.220 | Source NAT (cluster can reply) |
| eth0 → Cilium LB | another worker's eth0 | 192.168.1.221 | Envoy :13507 | L7LB DNAT by Cilium BPF |
| Envoy → Grafana pod | pod overlay | pod IP | Grafana pod IP | L7 routing by HTTPRoute |
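To see at a glance where the addresses of the inner packet are rewritten, the rows above can be diffed pairwise. The following Python sketch copies only the rows that describe the inner (unencapsulated) packet up to the MASQUERADE step; the `Hop` class and the `rewrites` helper are illustrative names, and the outer WireGuard header row is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hop:
    name: str
    src: str
    dst: str


# Inner-packet rows copied from the table above (outer WireGuard header omitted).
HOPS = [
    Hop("Traefik -> netbird-agent", "172.30.0.10", "192.168.1.220"),
    Hop("relay -> netbird-peer wt0", "100.109.47.211", "192.168.1.220"),
    Hop("wt0 -> eth0 (kernel fwd)", "100.109.47.211", "192.168.1.220"),
    Hop("MASQUERADE (CILIUM_POST_nat)", "192.168.1.221", "192.168.1.220"),
]


def rewrites(hops: List[Hop]) -> List[Tuple[str, str, str, str]]:
    """List (hop name, field, old, new) wherever an address differs from the previous hop."""
    changes = []
    for prev, cur in zip(hops, hops[1:]):
        for field in ("src", "dst"):
            old, new = getattr(prev, field), getattr(cur, field)
            if old != new:
                changes.append((cur.name, field, old, new))
    return changes


for change in rewrites(HOPS):
    print(change)
```

With these rows the source address changes twice (once by the time the packet leaves the tunnel, once at MASQUERADE) while the destination stays 192.168.1.220 until Cilium's DNAT on the receiving node. The table does not name the mechanism behind the first rewrite, so the sketch only records that it happens.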

Why wt0 is NOT in Cilium Devices

While debugging a 504 Gateway Timeout on the public path, the wt0 WireGuard interface was added to Cilium's devices list on the assumption that Cilium's TC BPF programs had to be attached to it for LB DNAT. That assumption was wrong.

The root cause: wt0 is a NOARP/POINTOPOINT WireGuard interface, so packets on it carry no Ethernet header. Cilium's cil_from_netdev TC BPF program expects Ethernet frames (IEEE 802.3); it silently misparsed every packet arriving on wt0 and dropped it without emitting any monitor event. cilium-dbg monitor showed zero traces for wt0 traffic even though packets were confirmed to be arriving (verified with the /proc/net/dev byte counters).

The fix: Remove wt0 from devices in core/platform/cilium.go. With no BPF on wt0, the kernel handles the forwarded packets normally:

  1. Packet arrives on wt0 (decapsulated by WireGuard)
  2. Kernel IP forwarding routes it toward eth0 (192.168.1.0/24 is directly connected)
  3. The CILIUM_POST_nat iptables chain applies MASQUERADE (source → 192.168.1.221)
  4. Packet goes on the LAN wire to 192.168.1.220
  5. The node that owns ARP for 192.168.1.220 receives it on its eth0
  6. Cilium's cil_from_netdev on that eth0 — a real Ethernet interface — does the L7LB DNAT correctly
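A small experiment shows why a parser that assumes an Ethernet header goes wrong on a raw IP packet. The Python sketch below builds a bare IPv4 header with the addresses used in this document and then reads bytes 12 and 13, which is where a 14-byte Ethernet header keeps its EtherType. It illustrates the general failure mode only; whether cil_from_netdev fails in exactly this way is an assumption, not something the debugging notes confirm.

```python
import socket
import struct

ETH_P_IP = 0x0800  # EtherType that marks an IPv4 payload in a real Ethernet frame


def ipv4_header(src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at zero for brevity)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,  # version 4, header length 5 words
        0,     # DSCP/ECN
        20,    # total length (header only in this sketch)
        0,     # identification
        0,     # flags + fragment offset
        64,    # TTL
        6,     # protocol: TCP
        0,     # header checksum (not computed here)
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


def ethertype_if_parsed_as_ethernet(frame: bytes) -> int:
    """Read bytes 12-13, the EtherType position of a 14-byte Ethernet header."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype


packet = ipv4_header("100.109.47.211", "192.168.1.220")
print(hex(ethertype_if_parsed_as_ethernet(packet)))  # 0x646d, not 0x800
```

On a headerless packet those two bytes are the first half of the source address (100 and 109, i.e. 0x64 and 0x6d), so the "EtherType" is garbage and an IPv4 code path would never be taken.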

LAN Direct Access

LAN clients resolve *.madhan.app (except the explicitly listed public ones) to 192.168.1.220 — the Cilium L2 LoadBalancer IP. Traffic never leaves the home network.

Bypassing SSO on LAN: For any public service, you can override DNS locally to hit the cluster directly and skip the Authentik ForwardAuth redirect:

# /etc/hosts — bypasses Hetzner + ForwardAuth
192.168.1.220  grafana.madhan.app

VPN Remote Access

From anywhere in the world, a connected NetBird client (laptop or phone) can reach cluster services directly:

  1. NetBird client connects to the WireGuard mesh via netbird.madhan.app:443 (Traefik → NetBird server)
  2. Routing peer (K8s pod) advertises 192.168.1.0/24 into the mesh
  3. Cluster IPs (192.168.1.220–230) are routable from the client — no browser, no SSO redirect
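The effect of the advertised route can be sketched as a longest-prefix match on the client. The Python snippet below is a simplified model, assuming the client holds just two routes (the mesh range and the advertised LAN subnet, both taken from this document); the `ROUTES` table, its labels, and `pick_route` are hypothetical and do not reflect how the NetBird client stores routes internally.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical view of a connected client's mesh-related routes.
ROUTES: Dict[str, str] = {
    "100.109.0.0/16": "wt0 (mesh peers)",
    "192.168.1.0/24": "wt0 via routing peer",
}


def pick_route(dst: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the label of the most specific route containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # falls back to the client's normal default route
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[str(best)]


print(pick_route("192.168.1.220"))
print(pick_route("8.8.8.8"))
```

Any address in 192.168.1.0/24, including the Cilium LoadBalancer IP, is sent into the mesh, while unrelated destinations keep using the client's ordinary internet path.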

Traefik Routing in Detail

Traefik on Bifrost uses the file provider only (no Docker provider). Routes are defined in core/cloud/bifrost/traefik/dynamic/:

| File | Contents |
| --- | --- |
| services.yml | Static routes: Authentik, NetBird gRPC, NetBird REST, NetBird dashboard, TCP wildcard for *.proxy.madhan.app |
| public-services.yml | Auto-generated by hetzner.go: public homelab routes with ForwardAuth |

public-services.yml is gitignored and regenerated on every just core hetzner up. To expose a new service, add its name to publicServices in core/cloud/cloudflare.go.
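For orientation, a generated file of this kind might look roughly like the output of the sketch below. It is written in Python rather than the repository's Go, and it is not the real generateTraefikPublicServices(): the middleware name `authentik-forwardauth`, the entry point name `websecure`, and the `render_public_services` helper are assumptions. Only the backend URL and the hostname pattern come from this document; the YAML keys follow Traefik's file-provider schema for HTTP routers and services.

```python
from typing import Iterable

BACKEND_URL = "http://192.168.1.220:80"  # Cilium Gateway, reached over the tunnel
DOMAIN = "madhan.app"


def render_public_services(
    names: Iterable[str],
    middleware: str = "authentik-forwardauth",  # hypothetical middleware name
    entry_point: str = "websecure",             # hypothetical entry point name
) -> str:
    """Render a Traefik dynamic-config document with one router and service per name."""
    names = list(names)
    lines = ["http:", "  routers:"]
    for name in names:
        lines += [
            f"    {name}:",
            f'      rule: "Host(`{name}.{DOMAIN}`)"',
            "      entryPoints:",
            f"        - {entry_point}",
            "      middlewares:",
            f"        - {middleware}",
            f"      service: {name}",
            "      tls: {}",
        ]
    lines.append("  services:")
    for name in names:
        lines += [
            f"    {name}:",
            "      loadBalancer:",
            "        servers:",
            f'          - url: "{BACKEND_URL}"',
        ]
    return "\n".join(lines) + "\n"


print(render_public_services(["grafana"]))
```

Every public service points at the same backend URL; the Cilium Gateway then picks the HTTPRoute from the Host header that Traefik passes through.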


Troubleshooting the Public Traffic Path

504 Gateway Timeout from public URL

Traefik is reachable but cannot proxy to the cluster backend.

# 1. Check netbird-agent is running and connected on Bifrost
ssh root@178.156.199.250 'docker exec netbird-agent netbird status'
# Must show: Management: Connected, Peers count: 1/1 Connected

# 2. Check the route is selected
ssh root@178.156.199.250 'docker exec netbird-agent netbird routes list'
# Must show: 192.168.1.0/24 Status: Selected

# 3. Verify kernel route table has the tunnel entry
ssh root@178.156.199.250 'ip route show table 7120'
# Must show: 192.168.1.0/24 dev wt0

# 4. Test direct HTTP to the Cilium LB from Bifrost
ssh root@178.156.199.250 'curl -sv -H "Host: grafana.madhan.app" --connect-timeout 5 http://192.168.1.220/'
# Expect: HTTP/1.1 302 Found

If step 4 times out, the WireGuard → cluster path is broken. Continue to the next section.

WireGuard tunnel up but traffic not flowing to cluster

The netbird-peer pod is connected but 192.168.1.220 is unreachable from Bifrost.

# Check netbird-peer pod status
kubectl exec -n netbird statefulset/netbird-peer -- netbird status
# Must show: Networks: 192.168.1.0/24, Peers count: 1/1 Connected

# Confirm WireGuard bytes are flowing during a test
kubectl exec -n netbird statefulset/netbird-peer -- cat /proc/net/dev | grep wt0
# RX bytes should increase while making requests

# From Bifrost (forwarded through worker1), reach the Cilium LB via another worker's NodePort
ssh root@178.156.199.250 'curl -sv -H "Host: grafana.madhan.app" --connect-timeout 3 http://192.168.1.222:32601/'
# If this works but 192.168.1.220 times out, the issue is on worker1; check Cilium pod health
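The byte-counter check above can be automated by comparing two snapshots of /proc/net/dev. The Python helper below is a minimal sketch: it assumes the standard /proc/net/dev layout, in which the first number after the interface name is the RX byte count, and the function names `rx_bytes` and `is_flowing` are illustrative.

```python
from typing import Optional


def rx_bytes(proc_net_dev: str, iface: str) -> Optional[int]:
    """Return the RX byte counter for iface from /proc/net/dev text, or None if absent."""
    for line in proc_net_dev.splitlines():
        name, sep, counters = line.partition(":")
        if sep and name.strip() == iface:
            # First column after the colon is received bytes.
            return int(counters.split()[0])
    return None


def is_flowing(before: str, after: str, iface: str = "wt0") -> bool:
    """True if the RX byte counter for iface grew between two snapshots."""
    b, a = rx_bytes(before, iface), rx_bytes(after, iface)
    return b is not None and a is not None and a > b
```

To use it, capture the output of the cat /proc/net/dev command once before and once after sending a test request through the public URL, then pass both strings to is_flowing.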