TL;DR
HAProxy works in Kubernetes, but it behaves like a stateful edge component forced into a dynamic scheduler. This mismatch creates operational challenges that teams don't expect when they first deploy HAProxy pods. Pod restarts can drop in-flight connections under default Kubernetes settings. ConfigMap updates propagate asynchronously across pods. Resource limits can trigger OOMKills during traffic spikes when sized incorrectly. Health checks may pass while traffic fails. These aren't bugs; they're fundamental tensions between HAProxy's stateful design and Kubernetes' stateless assumptions.
- HAProxy ≠ an ingress controller: expect manual config or custom controllers
- Pod restarts drop connections unless you configure termination grace periods
- ConfigMap updates propagate asynchronously, causing routing inconsistencies
- Resource limits based on idle usage cause OOMKills under load
- Readiness probes can pass while traffic fails
- iptables and CNI add latency and connection limits
The same HAProxy that runs reliably on bare metal or VMs behaves differently under Kubernetes' orchestration. The sections below walk through each of these failure modes, from pod lifecycle and config propagation to resource limits, health checks, and cluster networking, and how to plan for them.
Why HAProxy ≠ an Ingress Controller
Teams often assume HAProxy works like an ingress controller in Kubernetes. It doesn't. This conceptual mismatch causes operational problems that compound over time.
Conceptual Mismatch
Ingress controllers integrate with Kubernetes APIs to discover services and update routing automatically. When you create a Service or Ingress resource, ingress controllers watch the API and update their routing configuration. HAProxy, by contrast, requires manual configuration or custom controllers that generate HAProxy configs from Kubernetes resources.
This distinction matters operationally. Ingress controllers update routing automatically when services scale. HAProxy requires config regeneration and reloads. This creates delays between service scaling and traffic routing. We've seen teams experience traffic routing failures during autoscaling events because HAProxy configs weren't updated in time.
Service discovery is automatic for ingress controllers. They query the Kubernetes API for Service endpoints and update routing accordingly. HAProxy requires custom controllers that watch Kubernetes APIs and generate HAProxy configs. Some teams use existing operators, but many build custom controllers. This adds operational complexity in exchange for the fine-grained control HAProxy gives you over load balancing behavior.
Operational Consequences
The operational consequences of this mismatch appear in several ways. Config generation from Kubernetes resources requires custom tooling. Reload coordination across pods becomes necessary. Service scaling delays create routing gaps. Custom controller maintenance adds overhead.
We've seen teams build custom controllers that watch Kubernetes Services and generate HAProxy configs. These controllers must handle edge cases: service deletion, endpoint changes, and config validation. When controllers fail, routing breaks. When controllers lag, routing becomes stale. This operational burden is the price of using HAProxy in Kubernetes.
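To make the pattern concrete, here's a minimal sketch of that architecture, assuming a hypothetical config-sync sidecar; the sidecar name, image, and reload mechanism are placeholders, not a specific product.

```yaml
# Sketch: one pod runs HAProxy plus a hypothetical "config-sync" sidecar that
# watches Services/Endpoints and renders haproxy.cfg into a shared volume.
# Image names are placeholders; the sidecar must also trigger HAProxy reloads,
# and a real setup needs an initial config render before HAProxy starts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-edge
spec:
  replicas: 3
  selector:
    matchLabels:
      app: haproxy-edge
  template:
    metadata:
      labels:
        app: haproxy-edge
    spec:
      containers:
        - name: haproxy
          image: haproxy:2.8          # official image reads /usr/local/etc/haproxy/haproxy.cfg
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/haproxy
        - name: config-sync           # hypothetical controller sidecar
          image: registry.example.com/haproxy-config-sync:latest
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/haproxy
      volumes:
        - name: config
          emptyDir: {}
```

The hard parts live in the sidecar: service deletion, endpoint churn, config validation, and reload signalling are exactly the edge cases described above.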
For more on HAProxy architecture patterns, see our HAProxy in Production guide.
HAProxy is a load balancer, not an ingress controller. Expect manual configuration or custom controllers. Plan for config generation, reload coordination, and service discovery delays.
Reload Behavior Inside Pods
When Kubernetes restarts a pod under default settings, HAProxy can lose in-flight connections. Kubernetes doesn't drain connections on HAProxy's behalf, and with the default pod lifecycle, active connections are simply dropped when the pod terminates. We've seen teams lose traffic during rolling updates because they didn't configure connection draining.
In-Flight Connections
HAProxy's reload mechanism drains connections, but Kubernetes pod termination doesn't wait for HAProxy to finish draining. You need to configure pod termination grace periods and readiness probes that account for connection draining time. This requires understanding both HAProxy behavior and Kubernetes lifecycle.
The default termination grace period is 30 seconds. For HAProxy, this is often too short. Long-lived connections can prevent draining from completing. When the grace period expires, Kubernetes sends SIGKILL, dropping all remaining connections. We've seen teams experience connection loss during rolling updates because grace periods were too short.
Connection draining requires coordination between HAProxy's reload mechanism and Kubernetes' pod lifecycle. HAProxy starts a new process, transfers listening sockets, and drains connections from the old process. Kubernetes must wait for this process to complete. If Kubernetes terminates the pod before draining finishes, connections drop.
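A minimal sketch of the Kubernetes side of that coordination, assuming a 120-second drain budget and a short preStop pause; the exact numbers depend on your connection lifetimes.

```yaml
# Sketch: give HAProxy room to drain before SIGKILL. Values are illustrative.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 120   # must exceed worst-case drain time
      containers:
        - name: haproxy
          image: haproxy:2.8
          lifecycle:
            preStop:
              exec:
                # Pause so endpoint updates propagate and new connections stop
                # arriving before the stop signal reaches HAProxy.
                command: ["/bin/sh", "-c", "sleep 10"]
```

Two details are easy to miss: HAProxy treats SIGTERM as a hard stop and SIGUSR1 as a soft stop, so verify which signal your container image is configured to receive, and HAProxy's hard-stop-after directive can bound how long draining runs so it fits inside the grace period.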
Restart Amplification
Multiple pods reloading simultaneously amplify connection loss. When a Deployment updates, Kubernetes restarts pods in a rolling update. If multiple HAProxy pods reload at the same time, connection draining can't keep up with new connection rates. Queues build, connections time out, and users experience errors.
Rolling update strategies can prevent amplification. Configure maxSurge and maxUnavailable to control how many pods restart simultaneously. Stagger pod restarts to prevent all pods from reloading at once. We've seen teams use pod disruption budgets to limit concurrent restarts during updates.
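As a sketch, assuming a six-replica Deployment named haproxy-edge, the update strategy and a PodDisruptionBudget might look like this:

```yaml
# Sketch: restart at most one HAProxy pod at a time, and protect capacity
# during voluntary disruptions such as node drains. Values are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-edge
spec:
  replicas: 6
  selector:
    matchLabels:
      app: haproxy-edge
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # bring up one replacement pod at a time
      maxUnavailable: 0      # never dip below the desired replica count
  # ...pod template as in the earlier sketch
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: haproxy-edge
spec:
  minAvailable: 5            # node drains can evict at most one pod at a time
  selector:
    matchLabels:
      app: haproxy-edge
```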
For more on safe reload patterns, see our HAProxy automation guide.
Pod restarts drop in-flight connections unless you configure termination grace periods and readiness probes for connection draining. Plan for connection loss during rolling updates.
ConfigMap and Propagation Risks
ConfigMaps update asynchronously. When you update a ConfigMap, Kubernetes propagates changes to pods over time. During this propagation, some pods have new configs and some pods have old configs. This inconsistency causes routing problems.
Stale Configs
We've seen teams update ConfigMaps and assume all pods have the new configuration immediately. They don't. ConfigMap propagation can take minutes, depending on cluster size and API server load. During this time, traffic routes inconsistently.
Stale configs persist when pods don't restart. ConfigMaps are mounted as volumes, but HAProxy typically doesn't reload automatically when volumes change. You need to restart pods or trigger reloads manually. This creates a gap between config updates and config activation.
We've seen incidents where ConfigMap updates took 5 minutes to propagate across a 100-node cluster. During this time, some HAProxy pods had new routing rules and some had old rules. Traffic routed inconsistently, causing user-facing errors. Teams that don't account for this propagation delay experience routing failures.
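One common way to close the gap between config updates and config activation, sketched below, is to hash the rendered config into a pod-template annotation so every config change becomes an ordinary rolling update rather than an in-place volume refresh. The annotation key and the hashing step are conventions you implement in CI or templating, not Kubernetes features.

```yaml
# Sketch: tie pod identity to the config contents so changes roll pods deliberately.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-edge
spec:
  template:
    metadata:
      annotations:
        # Recompute this in CI or your templating tool (e.g. sha256 of haproxy.cfg);
        # any change forces a rolling update of the HAProxy pods.
        checksum/haproxy-config: "<sha256-of-haproxy.cfg>"
    spec:
      containers:
        - name: haproxy
          image: haproxy:2.8
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/haproxy
      volumes:
        - name: config
          configMap:
            name: haproxy-config
```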
Race Conditions
Race conditions occur when pods reload at different times. Some pods reload with new configs while others still have old configs. This creates routing inconsistencies that persist until all pods have reloaded.
Strategies to ensure consistent config across pods include: waiting for ConfigMap propagation before triggering reloads, using init containers to verify config consistency, and implementing staged rollouts that update pods gradually. We've seen teams use custom controllers that coordinate config updates and pod restarts to prevent race conditions.
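For the init-container approach mentioned above, a minimal sketch, assuming the rendered config is mounted from a ConfigMap named haproxy-config; note that haproxy -c only validates syntax locally, it doesn't prove all pods are running the same revision.

```yaml
# Sketch: refuse to start the pod if the mounted config does not parse.
spec:
  template:
    spec:
      initContainers:
        - name: validate-config
          image: haproxy:2.8
          command: ["haproxy", "-c", "-f", "/usr/local/etc/haproxy/haproxy.cfg"]
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/haproxy
      # ...the haproxy container mounts the same volume
```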
Wait for ConfigMap propagation before triggering pod restarts. Use staged rollouts to update pods gradually. Monitor config consistency across pods to detect race conditions.
Resource Limits and OOM Kill Patterns
HAProxy memory usage grows with connection counts and configuration complexity. When memory limits are too low, Kubernetes OOMKills HAProxy pods. When memory limits are too high, you waste resources. Finding the right balance requires understanding HAProxy memory behavior under load.
Memory Spikes
We've seen teams set memory limits based on idle usage, then experience OOMKills under load. HAProxy memory usage increases with active connections, backend counts, and configuration size. You need to size limits based on peak usage, not average usage.
Memory usage patterns vary with traffic. During traffic spikes, connection counts increase, causing memory usage to spike. If limits are set based on average usage, spikes trigger OOMKills. We've seen teams experience OOMKills during traffic spikes because limits were too conservative.
Configuration complexity affects memory usage. Large configs with many backends, ACLs, and routing rules consume more memory. Teams that add complexity without adjusting limits experience OOMKills. Monitor memory usage under different load patterns to size limits correctly.
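A hedged starting point, with placeholder numbers you should replace with your own peak measurements:

```yaml
# Sketch: requests near steady-state usage, memory limit at measured peak plus
# headroom. All numbers are placeholders; measure under your own traffic.
resources:
  requests:
    cpu: 500m
    memory: 256Mi
  limits:
    memory: 1Gi        # observed peak plus headroom for connection spikes
    # Some teams omit the CPU limit entirely (or set it far above peak)
    # to avoid throttling during bursts; see the next section.
```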
CPU Throttling
CPU limits that are set too low cause connection queuing. When Kubernetes throttles HAProxy, it can't process connections fast enough. Requests queue, response times increase, and eventually connections time out.
Burst traffic triggers throttling. When traffic spikes, CPU usage increases. If limits are too low, Kubernetes throttles HAProxy, causing processing delays. We've seen teams experience latency spikes during traffic bursts because CPU limits were too conservative.
Sizing CPU limits for traffic patterns requires understanding peak CPU usage. Monitor CPU usage under different load patterns. Set limits based on peak usage with headroom for bursts. We've seen teams use HPA (Horizontal Pod Autoscaler) to scale pods based on CPU usage, but this requires careful tuning to avoid scaling delays.
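A sketch of such an autoscaler, assuming the haproxy-edge Deployment from the earlier sketches and illustrative thresholds:

```yaml
# Sketch: scale HAProxy pods on CPU utilization. Thresholds are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: haproxy-edge
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: haproxy-edge
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60    # scale before throttling kicks in
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0    # react quickly to bursts
    scaleDown:
      stabilizationWindowSeconds: 300  # avoid flapping after spikes
```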
Resource limits based on idle usage cause OOMKills under load. Size limits based on peak usage with headroom for traffic spikes. Monitor memory and CPU usage under different load patterns.
Health Checks: Readiness vs Reality
Readiness probes typically check the HAProxy process, not traffic flow. Backends can be down while readiness passes. Queue saturation typically doesn't fail readiness under default probe configurations. Reloads in progress often don't fail readiness. This gap between readiness and reality causes operational problems.
Why Readiness Passes While Traffic Fails
Readiness probes typically check if HAProxy is listening on its port. This confirms the process is running, but it doesn't confirm HAProxy is serving traffic correctly. We've seen incidents where readiness probes passed, but all backends were down. Traffic routed to pods that couldn't serve requests.
Queue saturation typically doesn't fail readiness under default probe configurations. When queues are full, HAProxy stops accepting new connections, but the process is still running. Readiness probes pass, but traffic fails. We've seen teams experience user-facing errors while readiness probes indicated healthy pods.
Reloads in progress often don't fail readiness. During reloads, HAProxy is transitioning between old and new processes. Readiness probes can pass during this transition, but traffic may be queued or dropped. We've seen teams experience connection loss during reloads even though readiness probes passed.
Operational Implications
Traffic routed to unhealthy pods causes user-facing errors. When readiness passes but traffic fails, Kubernetes continues routing traffic to pods that can't serve requests. This amplifies failures and delays recovery.
This is why application-level health checks are needed. Readiness probes that only check the HAProxy process aren't sufficient. You need probes that check whether HAProxy is actually serving traffic. Some teams use custom health check endpoints that verify backend connectivity and queue depth.
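One way to build such an endpoint, sketched below, is HAProxy's own monitor-uri tied to backend health; the port, path, and backend name (be_app) are assumptions.

```yaml
# Sketch: a health frontend that starts failing when no backends are up,
# plus a readiness probe pointed at it. Port, path, and names are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-config
data:
  health.cfg: |
    frontend health
        bind *:8404
        mode http
        acl backends_down nbsrv(be_app) lt 1
        monitor-uri /healthz
        monitor fail if backends_down
---
# Fragment of the HAProxy container spec in the Deployment:
readinessProbe:
  httpGet:
    path: /healthz
    port: 8404
  periodSeconds: 5
  failureThreshold: 2
```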
Combining readiness and liveness probes effectively requires understanding their different purposes. Readiness probes determine if a pod can receive traffic. Liveness probes determine if a pod should be restarted. Configure both to match your operational requirements.
Readiness probes that check process status don't confirm traffic flow. Use application-level health checks that verify HAProxy is serving traffic correctly. Monitor queue depth and backend health separately from readiness probes.
Service Discovery and Networking Pitfalls
Kubernetes networking adds layers that affect HAProxy performance and reliability. iptables rules, CNI plugins, and service discovery mechanisms create operational challenges that don't exist in traditional deployments.
iptables and Connection Limits
kube-proxy uses iptables rules to route traffic to pods. These rules add latency, and the kernel's connection tracking (conntrack) table that backs them is finite. When the table fills up, new connections fail. We've seen teams experience connection failures during traffic spikes because conntrack tables were exhausted.
IPVS mode reduces iptables overhead. IPVS uses kernel-level load balancing instead of long chains of iptables rules, which reduces latency and increases connection capacity. Teams that switch from iptables to IPVS typically see performance improvements at scale, but IPVS requires additional configuration and monitoring.
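Where you control kube-proxy's configuration (managed clusters may not expose it), the switch and the conntrack headroom look roughly like the sketch below; values are illustrative and should match the KubeProxyConfiguration version your cluster ships.

```yaml
# Sketch: kube-proxy in IPVS mode with extra conntrack headroom.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"            # round-robin; other IPVS schedulers are available
conntrack:
  maxPerCore: 131072         # raise if conntrack exhaustion appears under load
  tcpEstablishedTimeout: 24h0m0s
```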
Service mesh integration has performance implications of its own. Service meshes add network hops and latency. When HAProxy runs behind a service mesh, traffic flows through multiple layers, and each layer adds latency and connection overhead. We've seen teams experience performance degradation when adding service meshes to HAProxy deployments.
CNI and Node Pressure
CNI plugins add network hops. Pod networking requires CNI plugins to create network interfaces and routes. These plugins add overhead that affects HAProxy performance. Network policy enforcement adds additional overhead.
Node network saturation affects HAProxy. When nodes are network-saturated, HAProxy performance degrades. Network saturation can occur from other pods on the same node, node-level network issues, or cluster-wide network problems. We've seen teams experience HAProxy performance degradation during node network saturation.
Pod networking overhead compounds with multiple layers. HAProxy pods that route through service meshes, CNI plugins, and iptables experience cumulative overhead. Monitor network latency and throughput at each layer to identify bottlenecks.
Internal Service Discovery
Kubernetes DNS delays affect service discovery. When services scale or endpoints change, DNS updates lag. HAProxy configs that rely on DNS can become stale. We've seen teams experience routing failures because DNS hadn't updated when services scaled.
Service endpoint updates lag. Kubernetes Services update endpoints asynchronously. When pods start or stop, endpoint updates can take seconds. During this lag, HAProxy configs can route to non-existent pods. We've seen teams experience connection failures during pod restarts because endpoints were stale.
Custom controllers enable near-real-time discovery. Some teams build custom controllers that watch Kubernetes APIs and update HAProxy configs in real time. These controllers reduce discovery delays but add operational complexity. Alternatives to Kubernetes service discovery include direct pod IP routing or external service discovery mechanisms.
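An alternative that keeps discovery inside HAProxy, sketched below, is runtime DNS resolution against a headless Service using a resolvers section and server-template; the service name (app), namespace (prod), port, and DNS address are assumptions for your cluster.

```yaml
# Sketch: HAProxy resolves backend pod IPs itself via cluster DNS and refreshes
# them at runtime. Names, namespace, port, and DNS address are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-backend-config
data:
  backend.cfg: |
    resolvers kubedns
        nameserver dns1 10.96.0.10:53    # or use parse-resolv-conf
        resolve_retries 3
        timeout resolve 1s
        timeout retry   1s
        hold valid      10s

    backend be_app
        balance roundrobin
        # Pre-allocate 10 server slots; DNS fills and empties them as pods churn.
        server-template app- 10 app.prod.svc.cluster.local:8080 check resolvers kubedns init-addr none
```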
Monitor iptables connection tracking, CNI plugin overhead, and DNS delays. Consider IPVS mode for better performance. Use custom controllers for real-time service discovery if delays are problematic.
Need Help Running HAProxy in Kubernetes?
If you'd like guidance on pod lifecycle management, ConfigMap propagation, resource sizing, and Kubernetes-specific HAProxy patterns, we can help review your setup and suggest improvements. Our site reliability engineers have implemented HAProxy solutions in Kubernetes for multiple production environments.
Frequently Asked Questions
Everything you need to know about running HAProxy in Kubernetes
Why do HAProxy pods drop connections when they restart?
Kubernetes pod termination doesn't wait for HAProxy to drain connections by default. When a pod restarts, Kubernetes sends SIGTERM and waits for the termination grace period (default 30 seconds), then sends SIGKILL. HAProxy's reload mechanism drains connections, but if the grace period is too short or if long-lived connections prevent draining, connections drop. Configure termination grace periods and readiness probes that account for connection draining time to minimize connection loss.
Why do ConfigMap updates cause routing inconsistencies, and how do I avoid them?
ConfigMap updates propagate asynchronously across pods. Some pods get new configs immediately while others lag behind, causing routing inconsistencies. Wait for ConfigMap propagation before triggering pod restarts. Use staged rollouts that update pods gradually. Monitor config consistency across pods to detect race conditions. Some teams use custom controllers that coordinate ConfigMap updates and pod restarts to ensure consistency.
What should teams expect when running HAProxy in Kubernetes?
HAProxy in Kubernetes requires understanding stateful edge component behavior within a dynamic scheduler. Pod restarts can drop in-flight connections unless you configure termination grace periods and readiness probes for connection draining. ConfigMap updates propagate asynchronously, causing routing inconsistencies during updates. HAProxy requires manual configuration or custom controllers; it's not an ingress controller and doesn't auto-discover services. Expect config propagation delays and plan for pod lifecycle management.
How should I size resource limits for HAProxy pods?
Size resource limits based on peak usage, not idle usage. HAProxy memory usage increases with active connections, backend counts, and configuration complexity. CPU limits that are too low cause connection queuing and throttling. Monitor memory and CPU usage under different load patterns to size limits correctly. We've seen teams experience OOMKills during traffic spikes because limits were set based on average usage. Set limits with headroom for traffic bursts.
Why do readiness probes pass while traffic fails?
Readiness probes that check if HAProxy is listening on its port only confirm the process is running, not that HAProxy is serving traffic correctly. Backends can be down, queues can be saturated, or reloads can be in progress while readiness passes. Use application-level health checks that verify HAProxy is actually serving traffic. Some teams use custom health check endpoints that verify backend connectivity and queue depth.
Is HAProxy an ingress controller?
HAProxy is a load balancer, not an ingress controller. Ingress controllers integrate with Kubernetes APIs to discover services and update routing automatically. HAProxy requires manual configuration or custom controllers that generate HAProxy configs from Kubernetes resources. Some teams use existing operators, but many build custom controllers that watch Kubernetes Services and generate HAProxy configs. This adds operational complexity but provides the control that HAProxy offers.
Conclusion
The fundamental tension between HAProxy's stateful design and Kubernetes' dynamic scheduler creates operational challenges that teams don't expect. Pod restarts can drop connections under default settings. ConfigMap updates propagate asynchronously. Resource limits can trigger OOMKills when misconfigured. Health checks may pass while traffic fails.
The key to successful HAProxy operations in Kubernetes is understanding these tensions and planning for them. Configure termination grace periods for connection draining. Account for ConfigMap propagation delays. Size resource limits based on peak usage. Use application-level health checks. Monitor networking overhead from iptables and CNI plugins.
This guide covers failure modes we've seen in production Kubernetes deployments. Use it as a starting point, but adapt it to your specific cluster configuration and traffic patterns. For broader HAProxy patterns beyond Kubernetes, see our HAProxy in Production guide.