Issue:
Keystone's default configuration is not tuned for any particular use case. As with other OpenStack components, Keystone must be tuned for your use case and scale in order to perform well. When integrating Keystone with large LDAP environments (10,000+ users), an untuned deployment can suffer slow logons, API lag, and an extremely slow Horizon Dashboard.
Objective:
To improve OpenStack Keystone performance when integrating with LDAP / Active Directory.
Software and hardware used for testing:
- Red Hat OpenStack Platform Version 12
- Ceph 2.4
- 8 x HP DL360 G7 servers:
- 3 controllers (2 x Intel X5620, 24GB RAM, SSDs for OS)
- 4 hyper-converged nodes (2 x Intel X5620, 24GB RAM, 7 x 72GB 15K SAS)
- 1 utility server (2 x Intel X5620, 72GB RAM, 4 x SSD RAID-10)
- Red Hat OpenStack Platform Director version 12 running as a VM (8 vCPU, 24GB RAM, 100GB disk)
- Microsoft Windows Server 2008 Enterprise Active Directory
Environment tested
Windows Server 2008 Enterprise with 10,000+ users and nested groups. Red Hat OpenStack Platform Version 12 hyper-converged reference architecture with a 3 x active controller cluster and 4 x Ceph/Nova nodes, with full network isolation and SSL certificates for the OpenStack endpoints, Horizon GUI, and LDAPS. I am also using Fernet tokens for Keystone instead of UUID tokens. This is strongly recommended as, among other benefits, it removes the burden of persisting tokens in a database.
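If you are not sure which token provider a deployment is using, you can read it with the same crudini workflow used throughout this post (containerized path shown; on a non-containerized control plane, read /etc/keystone/keystone.conf instead):

crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token provider
# "fernet" means Fernet tokens are in use; "uuid" means UUID tokens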
NOTE: Red Hat OSP 12 uses a containerized control plane, so instead of editing conf files and restarting the services, I am editing the template files used to create the containers and then restarting the containers. If you are not running a containerized control plane, edit the conf files and restart the respective service.
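For example, here is the same change made both ways (the non-containerized service name is an assumption; Keystone typically runs under httpd/mod_wsgi, so the unit to restart may be httpd):

# Containerized (OSP 12): edit the generated template, then restart the container
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled true
docker restart keystone

# Non-containerized: edit the conf file, then restart the service
crudini --set /etc/keystone/keystone.conf cache enabled true
systemctl restart httpd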
Results
These results are from an OSP 12 Director-based deployment of an OSP 12 overcloud, starting with the default Active Directory configuration as documented. I then enable and tune user and auth pooling as well as implement user and group filtering. Lastly, I enable memcached for Keystone to cache tokens, catalogs, and roles.
Download the token_issue_script used for testing.
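If the download link is unavailable, here is a minimal sketch of what such a script might look like (a hypothetical reconstruction, assuming overcloudrc credentials for an LDAP-backed user are already sourced):

#!/bin/bash
# token_issue.sh N — issue N tokens in sequence; wrap with `time` to measure
COUNT=${1:-100}
echo "getting $COUNT tokens"
for i in $(seq 1 "$COUNT"); do
    # each call authenticates against Keystone (and therefore LDAP) and issues a token
    openstack token issue -f value -c id > /dev/null
done

Run it as time ./token_issue.sh 100 to reproduce the measurements below.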
With defaults (pooling disabled, no LDAP filtering, and no caching layer)
$ time ./token_issue.sh 100
getting 100 tokens

real    2m43.052s
user    0m5.627s
sys     0m1.642s
With Keystone pooling configured and LDAP user & group filtering
$ time ./token_issue.sh 100
getting 100 tokens

real    0m27.178s
user    0m5.698s
sys     0m1.470s
With Keystone pooling configured, LDAP user & group filtering, and Keystone Caching
$ time ./token_issue.sh 100
getting 100 tokens

real    0m10.569s
user    0m5.559s
sys     0m1.465s
Custom Settings
Keystone Pooling
Check current Keystone Pooling configuration
# Check settings
cat << EOF > check_keystone_pooling.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime
EOF
chmod +x check_keystone_pooling.sh
./check_keystone_pooling.sh
Configure and enable Keystone Pooling # THIS WILL RESTART KEYSTONE
# Configure Keystone Pooling
cat << EOF > set_keystone_pooling.sh
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size 200
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max 20
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay 0.1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout -1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size 1000
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime 60
docker restart keystone
EOF
chmod +x set_keystone_pooling.sh
./set_keystone_pooling.sh
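After the container restarts, re-running the check script from above should echo back the values just set, one per line:

./check_keystone_pooling.sh
# Expected output:
# True
# 200
# 20
# 0.1
# -1
# 600
# True
# 1000
# 60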
Pagination
Pagination is important in large LDAP environments because the server will only return a limited number of records by default. This option defines the maximum number of results per page that Keystone should request from the LDAP server when listing objects. Add this to your keystone/domains/keystone.DOMAINNAME.conf file and restart Keystone:
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap page_size 2
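Then restart the Keystone container so the new page size takes effect (or restart the service on a non-containerized deployment):

docker restart keystone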
To test the page_size, you can use ldapsearch with the following syntax. Note that without specifying a page size, not all results will be returned.
$ ldapsearch -LLL -H ldap://ldapserver.domain.com -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD | grep sAMAccountName | wc -l
Size limit exceeded (4)
825

$ ldapsearch -LLL -H ldap://ldapserver.domain.com -E pr=1/noprompt -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD | grep sAMAccountName | wc -l
10052
User and Group Filtering
Here are examples of user and group filters that can be used in the [ldap] section of /etc/keystone/domains/keystone.YOUR_DOMAIN_NAME.conf.
For the user_filter example, I am filtering out all users except those who are a member of either OpenStack-Admins OR OpenStack-Users.
user_filter=(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))
For the group_filter example, I am filtering out all groups except those with an objectClass of Group AND a name of OpenStack, OpenStack-Admins, OR OpenStack-Users.
group_filter=(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))
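To apply both filters with the same crudini workflow used elsewhere in this post (domain file name as in the lab above), something like:

crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap user_filter '(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))'
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap group_filter '(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))'
docker restart keystone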
To test LDAP user and group filtering, use the following ldapsearch syntax:
Members of group1 AND group2
ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan))' | grep sAMAccountName
Members of group1 OR group2
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))' | grep sAMAccountName
sAMAccountName: user1
sAMAccountName: user2
sAMAccountName: user3
sAMAccountName: user4
sAMAccountName: user5
List groups with names
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))' | grep distinguishedName
distinguishedName: CN=OpenStack,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Admins,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Users,OU=People,DC=lab,DC=lan
Keystone Caching
Check if caching is enabled
# Run on each controller as root
cat << EOF > ~/check_keystone_cache.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time
EOF
chmod +x ~/check_keystone_cache.sh
~/check_keystone_cache.sh
Enable caching on each controller # THIS WILL RESTART KEYSTONE
# Run on each controller as root
# Note: the heredoc delimiter is quoted ('EOF') so that $HOSTNAME, $API_IP, and
# the command substitution are written into the script literally and expanded
# when the script runs, not when it is created.
cat << 'EOF' > ~/enable_keystone_cache.sh
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts | awk '{print $1}')
echo $API_IP
systemctl enable memcached
systemctl start memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend dogpile.cache.memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument url:$API_IP:11211
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server admin_workers 72
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server public_workers 72
docker restart keystone
EOF
chmod +x ~/enable_keystone_cache.sh
~/enable_keystone_cache.sh
Verify memcached hits
You can run the following command on your controllers to watch cache hits and misses. The hit count should increase, which indicates caching is working.
# Run on each controller
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts | awk '{print $1}')
watch "memcached-tool $API_IP:11211 stats | grep -A2 get_hits"
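If memcached-tool is not available, the same counters can be read straight from memcached's stats protocol (assuming nc is installed):

# get_hits / get_misses directly from the memcached stats command
echo stats | nc $API_IP 11211 | grep -E 'get_hits|get_misses'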
Great post, Ken! In the last bits where you configure caching, are you always backing to a single memcached instance? Only asking because we noticed some interesting behavior with the oslo.cache library when using backend_argument and pointing it at a cluster of memcached servers (if you wanted to shard cached data among them) [0].
Nice work putting this together!
[0] https://bugs.launchpad.net/oslo.cache/+bug/1743036
Thank you. I am just pointing each server at its local memcached instance, not a cluster VIP or multiple IPs. That seemed to have the best results, and [0] probably explains why 🙂
thank you thank you thank you thank you thank you
Now comes the painful chore of adding the caching settings into Director. It wouldn't be advisable to do it after overcloud deployment via Ansible, as I suspect Director/Puppet would overwrite it. Where do the keystone.conf caching settings go? extra-config.yaml? i.e.
keystone::cache::enabled::true
keystone::identity::caching::true
etc…
sort of thing?
This caching doc for the Pike release says that "The memory back end is not suitable for use in a production environment."
https://docs.openstack.org/keystone/pike/admin/identity-caching-layer.html
That is a bit concerning because without it, our enterprise-scale OpenStack deployment with LDAP is brutally slow. Any ideas why this document would suggest that a memory back end isn't suitable for production workloads?
Tks
In talking with the Keystone developers and from my own testing, you do run into issues if you point to a cluster / multiple memcached servers for the reason stated in that link; however, I haven't found issues if each Keystone controller points to itself, and only itself, for caching.
Using this on OSP 10, a page size of 2 for some reason isn't optimal. Using 512 was 75% faster.