Multicloud Kubernetes at Scale: OCM + Argo CD + Cloudflare

Running Kubernetes across AWS, GCP, and Azure is great for resilience, but managing dozens of clusters and routing traffic globally gets messy fast. Here’s a battle-tested stack that unifies cluster control, GitOps delivery, and global load balancing.

The Stack

· Open Cluster Management (OCM) – hub-spoke model to register and manage any Kubernetes cluster, regardless of cloud.
· Argo CD – continuous delivery via Git, integrated with OCM’s Placement API.
· Cloudflare – global anycast network, DNS, and load balancer with health checks.

How It Works

  1. OCM Hub & Spoke Clusters

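Deploy an OCM hub (a lightweight control plane) with clusteradm init, then run clusteradm join on each spoke cluster. The join token comes from the hub (clusteradm init prints it, and clusteradm get token retrieves it again later).

clusteradm join --hub-token <token> --hub-apiserver <url> --cluster-name <cluster-name>

A joined cluster stays pending until the hub approves it:

clusteradm accept --clusters <cluster-name>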

  2. GitOps with Argo CD + OCM

Install the OCM Argo CD add-on on the hub. Then define a Placement to decide which clusters get which apps.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: prod-placement
spec:
  clusterSets:
    - prod-clusters
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchLabels:
            environment: production
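
A Placement only considers cluster sets that are bound to its own namespace, so the prod-clusters set referenced above has to exist and be bound there. A minimal sketch of those two objects follows, assuming the Placement lives in a namespace called argocd (the original does not say which namespace it uses):

```yaml
# The cluster set named in the Placement's spec.clusterSets.
apiVersion: cluster.open-cluster-management.io/v1beta2
kind: ManagedClusterSet
metadata:
  name: prod-clusters
---
# Binds the set to the Placement's namespace.
# The binding's name must match the cluster set's name.
apiVersion: cluster.open-cluster-management.io/v1beta2
kind: ManagedClusterSetBinding
metadata:
  name: prod-clusters
  namespace: argocd          # assumption: the namespace that holds prod-placement
spec:
  clusterSet: prod-clusters
```

For a cluster to match, its ManagedCluster object needs both the label cluster.open-cluster-management.io/clusterset: prod-clusters (which puts it in the set) and the environment: production label that the predicate selects on.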

An Argo CD ApplicationSet uses the clusterDecisionResource generator to automatically deploy to all clusters selected by that placement.
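
As a rough sketch of that wiring: the generator reads a ConfigMap that tells it which resource holds the decisions, then looks up the PlacementDecision objects OCM creates for prod-placement. The ConfigMap name, the argocd namespace, the repository URL, and the guestbook path and namespace below are placeholders, not values from this post.

```yaml
# Tells the generator how to read OCM PlacementDecisions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ocm-placement-generator   # placeholder name
  namespace: argocd
data:
  apiVersion: cluster.open-cluster-management.io/v1beta1
  kind: placementdecisions
  statusListKey: decisions        # status.decisions lists the selected clusters
  matchKey: clusterName           # field matched against Argo CD cluster names
---
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook
  namespace: argocd
spec:
  generators:
    - clusterDecisionResource:
        configMapRef: ocm-placement-generator
        labelSelector:
          matchLabels:
            # OCM labels each PlacementDecision with its Placement's name.
            cluster.open-cluster-management.io/placement: prod-placement
        requeueAfterSeconds: 30
  template:
    metadata:
      name: '{{clusterName}}-guestbook'
    spec:
      project: default
      source:
        repoURL: https://github.com/example-org/platform-apps.git  # placeholder repo
        targetRevision: HEAD
        path: guestbook
      destination:
        server: '{{server}}'
        namespace: guestbook
```

The generator can only target clusters that Argo CD already knows about, so each spoke has to be registered in Argo CD under the same name OCM uses for it.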

  3. Cloudflare for DNS & Load Balancing

· Install external-dns in each cluster with Cloudflare provider.
· Annotate Ingresses to register DNS and enable proxy:

annotations:
  external-dns.alpha.kubernetes.io/hostname: app.example.com
  external-dns.alpha.kubernetes.io/cloudflare-proxied: "true"

· Create a Cloudflare Load Balancer with origin pools pointing to each cluster’s ingress IP. Enable active health checks and geo‑routing (e.g., route US users to AWS, EU to GCP).
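
The first two bullets above might look like this inside one cluster. It is a sketch rather than a complete install: the image tag, the external-dns namespace, the Secret name cloudflare-api-token, the owner ID aws-us-east-1, the nginx ingress class, and the app Service are all placeholders, and the ServiceAccount and RBAC that external-dns needs are left out.

```yaml
# external-dns configured for the Cloudflare provider (fragment).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.0  # placeholder tag; pin a current release
          args:
            - --source=ingress
            - --provider=cloudflare
            - --domain-filter=example.com
            - --txt-owner-id=aws-us-east-1   # should be unique per cluster
          env:
            - name: CF_API_TOKEN             # Cloudflare API token read by the provider
              valueFrom:
                secretKeyRef:
                  name: cloudflare-api-token
                  key: token
---
# An Ingress carrying the two annotations shown above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.example.com
    external-dns.alpha.kubernetes.io/cloudflare-proxied: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```

One thing to check in your own setup: when several clusters publish the same hostname, their external-dns instances contend for one record. Giving each cluster its own hostname and using those hostnames as the load balancer's origins may be a better fit.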

Active-Passive vs Active-Active

| Pattern | Cloudflare config | Use case |
|---|---|---|
| Active-Passive | One primary pool; failover pool with low priority | Cost-sensitive, infrequent failover |
| Active-Active | All pools active; Cloudflare steers by latency / region | Low latency, maximum resilience |

Result

· Cluster lifecycle – OCM handles registration, heartbeats, and ManifestWork delivery.
· Deployment – Argo CD syncs from Git; OCM Placement decides which clusters receive each app.
· Traffic – Cloudflare routes users to the healthiest, closest cluster.

No more per‑cloud scripts. No manual DNS updates. Just Git, OCM, and Cloudflare.

