Added Argo fix explanation
107
argocd-traefik-fix.md
Normal file
@@ -0,0 +1,107 @@
# ArgoCD Ingress Fix - Traefik Bad Gateway

## Environment

- **Cluster**: RKE2 managed by Rancher
- **Ingress Controller**: Traefik (kube-system namespace)
- **ArgoCD Version**: v3.4.2 (Helm chart argo-cd-9.5.14)
- **Namespace**: infrastructure
- **Hostname**: argo.snarfnet.net

## Problem

After deploying ArgoCD, accessing `https://argo.snarfnet.net` returned a **502 Bad Gateway** from Traefik.

## Root Cause

Two issues were identified:

### 1. Service TargetPort Mismatch

The ArgoCD server was listening on port **8080**, but the Kubernetes service had `targetPort: 8081`. This was corrected by patching the service to point both ports (80 and 443) to `targetPort: 8080`.
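
The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not the procedure used at the time. The function name `find_target_port_mismatches` and the sample port lists are illustrative assumptions; the source only states that the service had `targetPort: 8081` while the server listened on 8080.

```python
from typing import Dict, List, Union

PortSpec = Dict[str, Union[str, int]]


def find_target_port_mismatches(service_ports: List[PortSpec], listening_port: int) -> List[str]:
    """Return the names of service ports whose targetPort differs from the listening port."""
    mismatched = []
    for port in service_ports:
        # targetPort may also be a named container port (a string); this sketch
        # compares numeric values only and reports anything else as a mismatch.
        if port.get("targetPort") != listening_port:
            mismatched.append(str(port.get("name", "<unnamed>")))
    return mismatched


# Hypothetical service ports before and after the patch.
ports_before = [
    {"name": "http", "port": 80, "targetPort": 8081},
    {"name": "https", "port": 443, "targetPort": 8081},
]
ports_after = [
    {"name": "http", "port": 80, "targetPort": 8080},
    {"name": "https", "port": 443, "targetPort": 8080},
]

print(find_target_port_mismatches(ports_before, 8080))  # ['http', 'https']
print(find_target_port_mismatches(ports_after, 8080))   # []
```

An empty result only shows that the port numbers line up. It says nothing about the protocol Traefik uses to reach the backend, which is the second issue.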

### 2. Traefik Protocol Mismatch (Primary Issue)

The ArgoCD service defined two ports:

```yaml
ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: https
    port: 443
    targetPort: 8080
```

The Ingress resource routed traffic to port 80, but Traefik's Kubernetes provider saw the port named `https` (443) on the service and automatically selected it, connecting to the backend using **HTTPS**:

```
"servers":[{"url":"https://10.42.1.76:8080"}]
```

However, ArgoCD was configured to run in insecure mode (`server.insecure: true`), meaning it only served plain **HTTP** on port 8080. Traefik's HTTPS connection to an HTTP backend resulted in the Bad Gateway.

Working services (Gitea, Jenkins, etc.) did not have this problem because they only exposed a single HTTP port with no `https` named port to confuse Traefik.
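
The port-name rule described above can be modeled in a few lines of Python. This is a simplified sketch of the observed behavior, not Traefik's actual code. It assumes the scheme is `https` when the service port's name starts with `https` or its number is 443, and `http` otherwise, unless an explicit scheme override is present. It evaluates a single service port and does not model how Traefik decides which port to use. The function names are hypothetical.

```python
from typing import Optional


def infer_backend_scheme(port_name: str, port_number: int, override: Optional[str] = None) -> str:
    """Assumed rule: an explicit override wins, otherwise the port name/number decides."""
    if override:
        return override
    if port_number == 443 or port_name.startswith("https"):
        return "https"
    return "http"


def backend_url(scheme: str, pod_ip: str, target_port: int) -> str:
    """Build the server URL in the same shape as the Traefik output shown above."""
    return f"{scheme}://{pod_ip}:{target_port}"


# The `https` service port yields an HTTPS URL even though the pod speaks plain HTTP.
print(backend_url(infer_backend_scheme("https", 443), "10.42.1.76", 8080))
# The `http` service port yields the URL that matches ArgoCD's insecure mode.
print(backend_url(infer_backend_scheme("http", 80), "10.42.1.76", 8080))
```

Under this model, both service ports resolve to the same pod address and `targetPort`, so the only difference between a working and a failing backend is the scheme.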

## Fix

Removed the `https` (port 443) entry from the `argocd-server` service, leaving only the HTTP port:

```yaml
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
```

This forced Traefik to use `http://` when connecting to the backend, which matched ArgoCD's insecure mode.

After the change, Traefik's internal service config showed:

```
"servers":[{"url":"http://10.42.1.76:8080"}]
```
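
To confirm that no backend is still reached over HTTPS, the JSON returned by Traefik's services endpoint can be scanned for `https://` server URLs. The sketch below is a rough illustration only. It assumes the response is a JSON array of service objects, each with a `name` and an optional `loadBalancer.servers[].url` field. The service names in the sample are hypothetical, and the function name `https_backends` is not part of any Traefik tooling.

```python
import json
from typing import List, Tuple
from urllib.parse import urlparse


def https_backends(services_json: str) -> List[Tuple[str, str]]:
    """Return (service name, server URL) pairs whose backend URL uses the https scheme."""
    flagged = []
    for service in json.loads(services_json):
        # Some services may have no loadBalancer section; treat them as having no servers.
        servers = (service.get("loadBalancer") or {}).get("servers") or []
        for server in servers:
            url = server.get("url", "")
            if urlparse(url).scheme == "https":
                flagged.append((service.get("name", "<unnamed>"), url))
    return flagged


# Hypothetical response body in the assumed shape; names are placeholders.
sample = json.dumps([
    {"name": "argocd-example@kubernetes",
     "loadBalancer": {"servers": [{"url": "https://10.42.1.76:8080"}]}},
    {"name": "gitea-example@kubernetes",
     "loadBalancer": {"servers": [{"url": "http://10.42.1.80:3000"}]}},
    {"name": "internal-example@internal"},
])

for name, url in https_backends(sample):
    print(f"{name} -> {url}")
```

After the fix, the same scan should report nothing for the ArgoCD service.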

## Permanent Fix for Helm Upgrades

To prevent the Helm chart from recreating the 443 port on future upgrades, use one of these approaches:

### Option A: Annotate the Service

Add this annotation to the `argocd-server` Service so Traefik always uses HTTP regardless of service port names. Traefik reads `service.serversscheme` from the Service's annotations, so placing it on the `argo-ing` Ingress resource has no effect:

```yaml
metadata:
  annotations:
    traefik.ingress.kubernetes.io/service.serversscheme: http
```

### Option B: Helm Values

Configure the chart so it does not expose the HTTPS service port. The exact key varies by chart version, so check the chart documentation. The values below only keep ArgoCD in insecure mode and set the service type; they do not remove the port by themselves:

```yaml
configs:
  params:
    server.insecure: true

server:
  service:
    type: ClusterIP
```

## Debugging Steps That Led to the Fix

1. Verified the pod was running and healthy (`1/1 Ready`)
2. Confirmed the pod was listening on port 8080 via `/proc/net/tcp6`
3. Tested direct pod connectivity from another pod in the cluster, which returned HTTP 200
4. Queried Traefik's internal API at `http://127.0.0.1:9000/api/http/services`
5. Discovered Traefik was using `https://` to connect to the backend
6. Compared with working services (Gitea, Jenkins) which all used `http://`
7. Identified the `https` named port on the service as the cause
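
Step 2 relies on reading `/proc/net/tcp6`, where ports are hexadecimal and the state code `0A` means LISTEN. The following Python sketch shows one way to decode such a table. The sample text is fabricated for illustration and is not output captured from the ArgoCD pod; the function name `listening_ports` is hypothetical.

```python
from typing import Set


def listening_ports(proc_net_tcp: str) -> Set[int]:
    """Return the local ports in LISTEN state from /proc/net/tcp or /proc/net/tcp6 text."""
    ports = set()
    for line in proc_net_tcp.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == "0A":  # 0A is the kernel's code for TCP_LISTEN
            hex_port = local_address.rsplit(":", 1)[1]
            ports.add(int(hex_port, 16))
    return ports


# Illustrative sample: one listener on 0x1F90 and one established connection (state 01).
SAMPLE = """\
  sl  local_address                         remote_address                        st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000000000000000000000000000:1F90 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000   999        0 11111 1 0000000000000000 100 0 0 10 0
   1: 0000000000000000FFFF00004C012A0A:C350 0000000000000000FFFF000005012A0A:01BB 01 00000000:00000000 00:00000000 00000000   999        0 22222 1 0000000000000000 20 4 30 10 -1
"""

print(listening_ports(SAMPLE))  # {8080}
```

Hex `1F90` is decimal 8080, which is how the listening port was confirmed without needing `netstat` or `ss` inside the container.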

## Key Takeaway

Traefik's Kubernetes Ingress provider infers the backend protocol from the service port name. A port named `https` causes Traefik to connect using HTTPS, regardless of what port number the Ingress backend specifies. When running ArgoCD in insecure mode behind a TLS-terminating reverse proxy, ensure the service does not expose an `https` named port, or use the `traefik.ingress.kubernetes.io/service.serversscheme` annotation to override the behavior.