Keycloak session persistence not working across pods in high-availability mode

Hi, I’m trying to run Keycloak v20 in high-availability mode with 3 replicas using the Bitnami Helm chart.

I have the cluster working.

At first the admin console did not work, with a CSS/JS resource issue — the requests seemed to include a piece of the session in the URL path. I resolved this with:

      nginx.ingress.kubernetes.io/affinity: "cookie"
      nginx.ingress.kubernetes.io/affinity-mode: "persistent"
      nginx.ingress.kubernetes.io/session-cookie-name: "kc_stky"
      nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
      nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"

But if I then log in as admin and kill each of the three instances one at a time (expecting the session to move across the pods so that I stay logged in), the session is lost and I have to log in again.

I’m thinking this means I don’t have true high availability? I don’t see much discussion of this, so I’m wondering if I’ve missed something.
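One sanity check I’ve been running (assuming standard Infinispan logging — the pod name below is an example, substitute your own) is to confirm the pods actually formed a single Infinispan cluster by grepping for the JGroups view-change message:

```shell
# Check whether the Keycloak pods joined one Infinispan cluster.
# "keycloak-0" is a placeholder pod name; use a real pod from `kubectl get pods`.
kubectl logs keycloak-0 | grep "ISPN000094"
# A healthy 3-node cluster should log a view listing all three members, something like:
#   ISPN000094: Received new cluster view for channel ISPN: [keycloak-0|2] (3) [...]
```

If each pod only ever sees a view of size (1), discovery via DNS_PING isn’t working and sessions can’t replicate at all.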

Here are my Bitnami Helm chart values:

  args:
    - start
    - --cache-config-file=cache-ispn.xml
    #- --log-level=DEBUG
  extraEnvVars:
    - name: PROXY_ADDRESS_FORWARDING
      value: "true"
    - name: KC_HEALTH_ENABLED
      value: "true"
    - name: KC_METRICS_ENABLED
      value: "true"
    - name: KC_HTTP_ENABLED
      value: "true"
    - name: KC_HOSTNAME_STRICT
      value: "false"
    - name: KC_HOSTNAME_STRICT_HTTPS
      value: "false"
    - name: KC_CACHE
      value: "ispn"
    # - name: KC_CACHE_STACK
    #   value: "kubernetes"
    - name: jgroups.dns.query
      value: name-keycloak-headless.default.svc.cluster.local
    - name: KEYCLOAK_PRODUCTION
      value: "true"
  cache:
    enabled: true
    stackName: kubernetes
  auth:
    adminUser: "admin"
    adminPassword: "admin"
  image:
    registry: ####
    repository: ####
    tag: #####
  imagePullSecrets:
    - name: docker-registry
  ingress:
    enabled: true
    pathType: Prefix
    hostname: localhost
    servicePort: 8080
    path: /
    annotations:
      nginx.ingress.kubernetes.io/proxy-buffer-size: 128k
      #kubernetes.io/ingress.class: bitnami-internal
      kubernetes.io/ingress.class: nginx
      nginx.ingress.kubernetes.io/rewrite-target: /
      nginx.org/hsts-behind-proxy: "True"
      nginx.org/hsts: "True"
      nginx.ingress.kubernetes.io/affinity: "cookie"
      nginx.ingress.kubernetes.io/affinity-mode: "persistent"
      nginx.ingress.kubernetes.io/session-cookie-name: "kc_stky"
      nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
      nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
      # nginx.org/server-snippets: |
      #   location / {
      #     proxy_set_header X-Forwarded-For $host;
      #     proxy_set_header X-Forwarded-Proto $scheme;
      #   }
  livenessProbe: 
    enabled: true
    initialDelaySeconds: 5
    timeoutSeconds: 5
  postgresql:
    enabled: true
  readinessProbe: 
    enabled: true
    initialDelaySeconds: 5
    timeoutSeconds: 1
  replicaCount: 3
  service:
    type: ClusterIP
    ports:
      http: 8080
    sessionAffinity: ClientIP
  startupProbe:
    enabled: true
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 60
    periodSeconds: 5
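For context, the `--cache-config-file=cache-ispn.xml` I pass is based on the default `conf/cache-ispn.xml` shipped with Keycloak — the sketch below is not my exact file, and the cache list is trimmed. My understanding is that the distributed caches need `owners="2"` (or more) so that each session lives on a second pod and killing one pod doesn’t log users out:

```xml
<infinispan xmlns="urn:infinispan:config:13.0">
    <cache-container name="keycloak">
        <!-- The built-in "kubernetes" JGroups stack discovers peers via
             DNS_PING against the headless service named by the
             jgroups.dns.query property set in extraEnvVars above. -->
        <transport stack="kubernetes" lock-timeout="60000"/>
        <!-- owners="2": every session entry is held by two pods, so losing
             a single pod should not invalidate logins. -->
        <distributed-cache name="sessions" owners="2"/>
        <distributed-cache name="authenticationSessions" owners="2"/>
        <distributed-cache name="clientSessions" owners="2"/>
        <distributed-cache name="loginFailures" owners="2"/>
        <distributed-cache name="actionTokens" owners="2"/>
    </cache-container>
</infinispan>
```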

I could really use some input on this.

+1 On this one.

@stianst, @mposolda – Sorry to ping you directly, but we’ve been trying to figure this problem out; the Keycloak documentation spans so many versions, and the community has been unresponsive.

The v20 docs suggest this should “just work”, but I’m finding threads suggesting that clustered (HA) mode isn’t functioning in v20 for Kubernetes rolling deploys, and so far our local tests concur with this. How/where should we go to get reliable help on these kinds of topics? Thanks!

Threads I’m referencing: