Keycloak 18.0.1 on Wildfly intermittently fails on startup with error Failed to find provider

I have an EAR with multiple custom providers for SPIs such as login and emailTemplate. This has been working without any issues on Keycloak v11.0.0 and v15.0.1. Now that I am upgrading to 18.0.1, still on WildFly, I intermittently see errors on startup such as:

18:41:33,726 FATAL [org.keycloak.services] (ServerService Thread Pool -- 64) Error during startup: java.lang.RuntimeException: Failed to find provider custom-freemarker for emailTemplate
	at org.keycloak.keycloak-services@18.0.1//org.keycloak.services.DefaultKeycloakSessionFactory.checkProvider(DefaultKeycloakSessionFactory.java:234)
	at org.keycloak.keycloak-services@18.0.1//org.keycloak.services.DefaultKeycloakSessionFactory.init(DefaultKeycloakSessionFactory.java:120)
	at org.keycloak.keycloak-services@18.0.1//org.keycloak.services.resources.KeycloakApplication.createSessionFactory(KeycloakApplication.java:235)
	at org.keycloak.keycloak-services@18.0.1//org.keycloak.services.resources.KeycloakApplication.startup(KeycloakApplication.java:126)
	at org.keycloak.keycloak-wildfly-extensions@18.0.1//org.keycloak.provider.wildfly.WildflyPlatform.onStartup(WildflyPlatform.java:36)
	at org.keycloak.keycloak-services@18.0.1//org.keycloak.services.resources.KeycloakApplication.<init>(KeycloakApplication.java:116)

It seems that when the above error occurs, WildFly has not yet had a chance to deploy the modules in the EAR, so the check at DefaultKeycloakSessionFactory.java:234 fails to find my “custom-freemarker” provider for the emailTemplate SPI.

At other times Keycloak starts successfully, and the logs show messages from WildFly indicating that the modules are deployed:

18:45:18,285 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0207: Starting subdeployment (runtime-name: ".....
18:45:18,499 WARN  [org.jboss.as.dependency.private] (MSC service thread 1-5) WFLYSRV0018: Deployment "deployment.custom-keycloak-providers-ear-18.0.0-SNAPSHOT.ear....

Update on this.

There seems to be a race condition between
org.keycloak.services.DefaultKeycloakSessionFactory.init()
and org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor.deploy(DeploymentPhaseContext)

For example, when you have a JAR, WAR, or EAR with custom Keycloak providers and you drop it into the WildFly “deployments” directory, then on server startup the KeycloakProviderDeploymentProcessor picks up all of those modules in addition to the actual keycloak-server.war. So if the Keycloak WAR gets deployed first, keycloak-server.war starts to initialize and attempts to load all available SPI providers. If the modules in the deployments directory have not yet been processed by the other KeycloakProviderDeploymentProcessor threads, the error above occurs whenever you have providers that override default Keycloak SPIs.
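The ordering dependency described above can be illustrated with a minimal sketch. This is not Keycloak code: the class StartupRaceSketch and its methods are hypothetical stand-ins for the deployment processor and the session factory check, and the two orderings are forced sequentially rather than raced on real threads so that the outcome is repeatable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the startup ordering problem; not Keycloak code.
class StartupRaceSketch {
    // SPI name -> provider id registered so far.
    private final Map<String, String> registered = new ConcurrentHashMap<>();

    // Plays the role of the deployment processor registering a provider from the EAR.
    void deployProvider(String spi, String providerId) {
        registered.put(spi, providerId);
    }

    // Plays the role of the session factory checking that a configured provider exists.
    void checkProvider(String spi, String providerId) {
        if (!providerId.equals(registered.get(spi))) {
            throw new RuntimeException(
                    "Failed to find provider " + providerId + " for " + spi);
        }
    }

    // Runs one simulated startup with the given ordering and reports whether it succeeded.
    static boolean startupSucceeds(boolean earDeployedFirst) {
        StartupRaceSketch server = new StartupRaceSketch();
        if (earDeployedFirst) {
            server.deployProvider("emailTemplate", "custom-freemarker");
        }
        try {
            server.checkProvider("emailTemplate", "custom-freemarker");
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("EAR processed first: " + startupSucceeds(true));
        System.out.println("WAR initialized first: " + startupSucceeds(false));
    }
}
```

In the real server the two steps run on different threads, so which ordering you get can vary from one startup to the next, which would explain why the failure is intermittent.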

I think this has always been an issue, and either WildFly or keycloak-server.war now processes the deployment faster, which makes the issue more likely to occur in the 18.x.x releases.

The solution for deploying providers that override Keycloak’s out-of-the-box default SPIs is to deploy them as WildFly modules (wildfly_home/modules directory) or filesystem modules (wildfly_home/providers directory). This is because, during initialization of the Keycloak WAR deployment, org.keycloak.services.DefaultKeycloakSessionFactory.init() looks for providers in three places:

org.keycloak.provider.DefaultProviderLoaderFactory
org.keycloak.provider.FileSystemProviderLoaderFactory
org.keycloak.provider.wildfly.ModuleProviderLoaderFactory

If you have an SPI provider in either the modules or providers directory, Keycloak will always pick it up and the race condition is a non-issue. The Keycloak docs provide details on how to create WildFly modules.

Finally, the deployments directory is still a good place to drop custom SPI providers that do not override any default Keycloak providers.

Since the WildFly distribution is now deprecated, I am not sure this will be much of an issue in the Quarkus distribution.