Here are my observations after some testing.
TLDR: The cache-ispn.xml values present at image build time are the ones that take effect; values updated at deployment time are ignored unless an explicit rebuild is performed during deployment.
From logs
Initially I enabled Infinispan logging by setting the Keycloak log level to
INFO,org.keycloak:DEBUG,org.keycloak.connections:TRACE,org.keycloak.connections.infinispan:TRACE,org.infinispan:TRACE
in order to understand how replication works by tracking a session ID through the logs and following the RPC commands. But these logs were overwhelming, and it was impossible to tell exactly what was happening, i.e. which node owns an object and where it gets replicated to.
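For reference, the log levels above can be passed to the server at startup. A minimal sketch, assuming the Quarkus distribution and the default install path (the path is an assumption about the deployment):

```shell
# Enable TRACE logging for the Infinispan-related categories
# (same log-level value as quoted above).
/opt/keycloak/bin/kc.sh start \
  --log-level="INFO,org.keycloak:DEBUG,org.keycloak.connections:TRACE,org.keycloak.connections.infinispan:TRACE,org.infinispan:TRACE"
```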
Using Infinispan statistics
I gave up on this approach and enabled statistics as described in Configuring distributed caches - Keycloak. I had to enable statistics both globally and per cache to get the full metrics. The corresponding metrics endpoint is /auth/metrics. The metrics of interest for my use case follow this naming convention:
vendor_cache_manager_keycloak_cache_<cache-name>_cluster_cache_stats_required_minimum_number_of_nodes
For example, for my deployment with a 2-node Keycloak cluster:
Cache configuration during Docker image build
...
<distributed-cache name="sessions" owners="1" statistics="true">
<expiration lifespan="-1"/>
</distributed-cache>
...
Cache configuration during deployment (by mounting a custom cache-ispn.xml file)
...
<distributed-cache name="sessions" owners="2" statistics="true">
<expiration lifespan="-1"/>
</distributed-cache>
...
I got the following metric for the sessions cache:
# HELP vendor_cache_manager_keycloak_cache_sessions_cluster_cache_stats_required_minimum_number_of_nodes Minimum number of nodes to avoid losing data
# TYPE vendor_cache_manager_keycloak_cache_sessions_cluster_cache_stats_required_minimum_number_of_nodes gauge
vendor_cache_manager_keycloak_cache_sessions_cluster_cache_stats_required_minimum_number_of_nodes{cache="sessions",node="keycloak-0-14766",} 2.0
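The gauge value can be pulled out of a metrics scrape with standard tools. A minimal sketch that parses a sample line like the one above (the curl URL in the comment is an assumption about the deployment):

```shell
# In a live cluster you would fetch the scrape with something like
# (hypothetical host/port):
#   metrics=$(curl -s http://localhost:8080/auth/metrics)
metrics='vendor_cache_manager_keycloak_cache_sessions_cluster_cache_stats_required_minimum_number_of_nodes{cache="sessions",node="keycloak-0-14766",} 2.0'

# The gauge value is the last whitespace-separated field of the metric line.
min_nodes=$(printf '%s\n' "$metrics" | awk '/required_minimum_number_of_nodes\{/ {print $NF}')
echo "$min_nodes"
```

With the sample line above this prints `2.0`.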
The value indicates the minimum number of nodes that must remain available to avoid data loss. Here it implies my setup cannot afford any node going down: the build-time configuration with only one owner per object is in effect, not the cache-ispn.xml updated during deployment, which has two replicas and could tolerate one node going down without any data loss.
I tried different build and deploy configurations to reach this conclusion. It would be great if there were more visibility into replication, and more Keycloak documentation for understanding replication issues.
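Based on the observations above, making a deployment-time cache-ispn.xml take effect requires an explicit rebuild against the mounted file. A sketch, assuming the Quarkus distribution and that the custom file is mounted into the conf/ directory (paths are assumptions, adjust to your deployment):

```shell
# cache-config-file is a build-time option; the file is resolved
# relative to the conf/ directory. Re-running the build step makes
# Keycloak pick up the mounted cache configuration.
/opt/keycloak/bin/kc.sh build --cache-config-file=cache-ispn.xml

# Then start without re-augmenting the build:
/opt/keycloak/bin/kc.sh start --optimized
```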