Migrating to LDAP PAM Pass Through Auth

The Research Computing authentication path is more complex than I'd like.

  • We start with pam_sss which, of course, authenticates against sssd.

  • Because we have users from multiple home institutions, both internal and external, sssd is configured with multiple domains.

  • Two of our configured domains authenticate against Duo and Active Directory. To support this we run two discrete instances of the Duo authentication proxy, one for each domain.

  • The Duo authentication proxy can present either an LDAP or RADIUS interface. We went with RADIUS, so sssd is configured with auth_provider = proxy and a discrete PAM stack for each domain. Each PAM stack uses pam_radius to authenticate against the correct Duo authentication proxy.

  • The relevant Duo authentication proxy then performs AD authentication to the relevant authoritative domain and, on success, performs Duo authentication for second factor.

All of this technically works, and has been working for some time. However, we've increasingly run into a bug in sssd's proxy authentication provider, which manifests as incorrect monitoring or management of its authentication child processes.

The problem

[sssd[be[rc.colorado.edu]]] [dp_attach_req] (0x0400): Number of active DP request: 32

sssd maintains a number of pre-forked children for performing this proxy authentication. The number defaults to 10 and is configurable per domain as proxy_max_children. Somewhere in sssd a bug exists that either prevents children from being closed properly or fails to decrement the active count when they are closed. When the "Number of active DP request" exceeds proxy_max_children, sssd will no longer perform authentication for the affected domain.
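
A quick way to watch for this condition is to pull the most recent count out of the domain's debug log and compare it against the limit. The following is a minimal sketch rather than anything from our production monitoring: the function names are made up for illustration, and the log path in the usage comment assumes sssd's usual /var/log/sssd/sssd_<domain>.log naming with a debug_level high enough to log the message.

```shell
# Print the most recent "Number of active DP request" value found in sssd
# debug output read on stdin. Prints nothing if no such line is present.
latest_active_dp_requests() {
    sed -n 's/.*Number of active DP request: \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Succeed (exit status 0) when the most recent count exceeds the limit
# given as $1, e.g. the domain's proxy_max_children.
dp_requests_exceed() {
    count=$(latest_active_dp_requests)
    [ -n "$count" ] && [ "$count" -gt "$1" ]
}

# Usage (log path is an assumption):
#   dp_requests_exceed 10 < /var/log/sssd/sssd_rc.colorado.edu.log \
#       && echo "proxy children exhausted"
```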

We have reported this issue to Red Hat, but eight months on we still don't have a fix. Meanwhile, I'm interested in simplifying our authentication path, hopefully removing the proxy authentication provider from our configuration in the process, and making sssd optional for authentication in our environment.

Our solution

We use 389 Directory Server as our local LDAP server. 389 includes the ability to proxy authentication via PAM. A previous generation of RC LDAP used this to perform authentication, but only in a way that supported a single authentication path. With some research and experimentation, however, we have managed to configure our instance with different proxy authentication paths for each of our child domains.

First we simply activate the PAM Pass Through Auth plugin by setting nsslapd-pluginEnabled: on in the existing LDAP entry.

dn: cn=PAM Pass Through Auth,cn=plugins,cn=config
objectClass: top
objectClass: nsSlapdPlugin
objectClass: extensibleObject
objectClass: pamConfig
cn: PAM Pass Through Auth
nsslapd-pluginPath: libpam-passthru-plugin
nsslapd-pluginInitfunc: pam_passthruauth_init
nsslapd-pluginType: betxnpreoperation
nsslapd-pluginEnabled: on
nsslapd-pluginloadglobal: true
nsslapd-plugin-depends-on-type: database
pamMissingSuffix: ALLOW
pamExcludeSuffix: cn=config
pamIDMapMethod: RDN
pamIDAttr: uid
pamFallback: FALSE
pamSecure: TRUE
pamService: ldapserver
nsslapd-pluginId: pam_passthruauth
nsslapd-pluginVendor: 389 Project
nsslapd-pluginDescription: PAM pass through authentication plugin
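
Since the entry already exists, the change amounts to replacing a single attribute. A minimal sketch of one way to do that with ldapmodify follows. The file name, the cn=Directory Manager bind DN, and the instance name in the restart comment are assumptions, not details from our deployment.

```shell
# Write the change to a file so it can be reviewed before it is applied.
# The file name is arbitrary.
cat > enable-pam-pta.ldif <<'EOF'
dn: cn=PAM Pass Through Auth,cn=plugins,cn=config
changetype: modify
replace: nsslapd-pluginEnabled
nsslapd-pluginEnabled: on
EOF

# Apply it as the directory administrator (bind DN is an assumption):
#   ldapmodify -x -ZZ -D "cn=Directory Manager" -W -f enable-pam-pta.ldif
#
# Plugin changes generally take effect after a restart of the instance
# (INSTANCE is a placeholder):
#   systemctl restart dirsrv@INSTANCE
```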

The authentication behavior can be configured at this level as well, if we're able to express our desired behavior in a single configuration. However, the plugin also supports multiple simultaneous configurations, expressed as nested LDAP entries.

dn: cn=colorado.edu PAM,cn=PAM Pass Through Auth,cn=plugins,cn=config
objectClass: pamConfig
objectClass: top
cn: colorado.edu PAM
pamMissingSuffix: ALLOW
pamExcludeSuffix: cn=config
pamIDMapMethod: RDN ENTRY
pamIDAttr: uid
pamFallback: FALSE
pamSecure: TRUE
pamService: curc-twofactor-duo
pamFilter: (&(objectClass=posixAccount)(!(homeDirectory=/home/*@*)))

dn: cn=colostate.edu PAM,cn=PAM Pass Through Auth,cn=plugins,cn=config
objectClass: pamConfig
objectClass: top
cn: colostate.edu PAM
pamMissingSuffix: ALLOW
pamExcludeSuffix: cn=config
pamIDMapMethod: RDN ENTRY
pamIDAttr: uid
pamFallback: FALSE
pamSecure: TRUE
pamService: csu
pamFilter: (&(objectClass=posixAccount)(homeDirectory=/home/*@colostate.edu))

Our two sets of users are authenticated using different PAM stacks, as before, only now the proxy authentication happens within the LDAP server rather than within sssd. This may seem like a small difference, but there are multiple benefits:

  • The proxy configuration exists, and need be maintained, only within the LDAP server. It does not require all login nodes to run sssd and a complex, multi-tiered PAM stack.

  • The LDAP "PAM Pass Through Auth" plugin does not have the same bug as the sssd proxy authentication provider, so it sidesteps our immediate problem.

  • Applications that do not support PAM authentication, such as XDMoD, Foreman, and Grafana, can now be configured with simple LDAP authentication, and need not know anything of the complexity of authenticating our multiple domains.

For now I'm distinguishing our user types by the name of their home directory, because it happens to include the relevant domain suffix. In the future we expect to update usernames in the directory to match, and would then likely update this configuration to filter on uid instead.
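
To make the effect of the two pamFilter values concrete, the following sketch mirrors their logic with shell patterns. It is only an illustration: the function name and the example home directories are hypothetical, it assumes the entry is a posixAccount, and it says nothing about how the plugin treats an entry that neither nested configuration claims.

```shell
# Map a home directory to the pamService that the two pamFilter values
# would select. In a shell "case" pattern, as in an LDAP substring filter,
# "*" matches any run of characters.
pam_service_for_home() {
    case "$1" in
        /home/*@colostate.edu) echo csu ;;                # colostate.edu PAM filter
        /home/*@*)             echo none ;;               # excluded by both filters
        *)                     echo curc-twofactor-duo ;; # colorado.edu PAM filter
    esac
}

# Hypothetical examples:
pam_service_for_home /home/alice
pam_service_for_home /home/alice@colostate.edu
```

The order of the patterns matters here only because the shell stops at the first match; the LDAP filters themselves are mutually exclusive, so no entry matches both configurations.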

Cleaning up a few remaining issues

When I first tied this back into sssd, however, I DOS'd our LDAP server.

debug_level = 3

description = CU Boulder Research Computing
id_provider = ldap
auth_provider = ldap
chpass_provider = none

enumerate = false
entry_cache_timeout = 300

ldap_id_use_start_tls = True
ldap_tls_reqcert = allow
ldap_uri = ldap://ldap.rc.int.colorado.edu
ldap_search_base = dc=rc,dc=int,dc=colorado,dc=edu
ldap_user_search_base = ou=UCB,ou=People,dc=rc,dc=int,dc=colorado,dc=edu
ldap_group_search_base = ou=UCB,ou=Groups,dc=rc,dc=int,dc=colorado,dc=edu

This seemed simple enough. When I tried to authenticate using this configuration, I entered my password as usual and then responded to a Duo "push." But the authentication never completed in sssd, and I kept receiving Duo pushes until I stopped sssd. This was despite the fact that I could authenticate with ldapsearch as expected.

$ ldapsearch -LLL -x -ZZ -D uid=[redacted],ou=UCB,ou=People,dc=rc,dc=int,dc=colorado,dc=edu -W '(uid=[redacted])' dn
Enter LDAP Password:
dn: uid=[redacted],ou=UCB,ou=People,dc=rc,dc=int,dc=colorado,dc=edu

I eventually discovered that sssd has a six-second timeout for "calls to synchronous LDAP APIs," including BIND. This timeout is entirely reasonable--even generous--for operations that do not have a manual intervention component. But when BIND includes time to send a notification to a phone, unlock the phone, and acknowledge the notification in an app, it is easy to exceed this timeout. sssd gives up and tries again, prompting a new push that won't be received until the first is addressed. Each retry therefore queues another push behind the ones still pending, and the user never catches up.

Thankfully, this timeout is also configurable as ldap_opt_timeout in the relevant sssd domain section. I went with ldap_opt_timeout = 90, which is likely longer than anyone will need.
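
For anyone who prefers not to edit the main sssd.conf, the same setting can be sketched as a configuration snippet. This assumes an sssd version that reads /etc/sssd/conf.d, and the domain section name and file name below are assumptions for illustration; the section must match the domain's name in the main configuration.

```shell
# Write the override to a local file first. Section and file names are
# assumptions; adjust them to the real domain name.
cat > 50-ldap-timeout.conf <<'EOF'
[domain/rc.colorado.edu]
ldap_opt_timeout = 90
EOF

# sssd expects its configuration files to be owned by root and not readable
# by others, so install the snippet with restrictive permissions and then
# restart sssd:
#   install -m 0600 -o root -g root 50-ldap-timeout.conf /etc/sssd/conf.d/
#   systemctl restart sssd
```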

There is still the matter that this DOS'd the LDAP server, however. I suspect I had exhausted the directory server's threads with pending, long-lived BIND requests, each waiting on manual intervention or a timeout.

The number of threads Directory Server uses to handle simultaneous connections affects the performance of the server. For example, if all threads are busy handling time-consuming tasks (such as add operations), new incoming connections are queued until a free thread can process the request.

Red Hat suggests that nsslapd-threadnumber should be 32 for an eight-CPU system like ours, so for now I simply increased it from 16 to that recommendation. If we continue to experience thread exhaustion in real-world use, we can always increase the number of threads again.
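
The thread count lives on the cn=config entry, so the change can be sketched as another single-attribute modification. As before, the file name and the cn=Directory Manager bind DN are assumptions, and whether a restart is needed for the new value to take effect may depend on the Directory Server version.

```shell
# Write the change to a file so it can be reviewed before it is applied.
cat > threadnumber.ldif <<'EOF'
dn: cn=config
changetype: modify
replace: nsslapd-threadnumber
nsslapd-threadnumber: 32
EOF

# Check the current value, then apply the change (bind DN is an assumption):
#   ldapsearch -LLL -x -ZZ -D "cn=Directory Manager" -W \
#       -b cn=config -s base nsslapd-threadnumber
#   ldapmodify -x -ZZ -D "cn=Directory Manager" -W -f threadnumber.ldif
```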