Keystone Optimization

Issue:

The default configuration of Keystone is not tuned for any specific use case.  As with other OpenStack components, Keystone must be tuned for your use case and scale in order to perform well.  When integrating Keystone with large LDAP environments (10k+ users), an untuned deployment can suffer slow logons, API lag, and an extremely slow Horizon Dashboard.

Objective:

Improve OpenStack Keystone performance when integrating with LDAP / Active Directory.

Software and Hardware Used for testing:

  • Red Hat OpenStack Platform Version 12
  • Ceph 2.4
  • 8 HP DL360 G7
    • 3 controllers (2 x Intel x5620, 24GB RAM, SSDs for OS)
    • 4 Hyper-converged nodes (2 x Intel x5620, 24GB RAM, 7 x 72GB 15KSAS)
  • 1 Utility server (2 x Intel x5620, 72GB RAM, 4 x SSD RAID-10)
      • Red Hat OpenStack Platform Director version 12 running as a VM
        • 8 vCPU, 24GB RAM, 100GB disk
  • Microsoft Windows Server 2008 Enterprise Active Directory

Environment tested

Windows Server 2008 Enterprise with 10,000+ users and nested groups.  Red Hat OpenStack Platform 12 hyper-converged reference architecture with a 3-node active controller cluster and 4 Ceph/Nova nodes, with full network isolation and SSL certificates for the OpenStack endpoints, the Horizon GUI, and LDAPS.  I am also using Fernet tokens for Keystone instead of UUID tokens.  This is strongly recommended as, among other benefits, it eliminates the burden of persisting tokens in the database.

NOTE:  Red Hat OSP 12 uses a containerized control plane, so instead of editing conf files and restarting the services, I am editing the template files used to create the containers and then restarting the containers.   If you are not running a containerized control plane, edit the conf files and restart the respective service.

Results

These results are from an OSP 12 Director-based deployment of an OSP 12 overcloud, starting with the default Active Directory configuration as documented.  I then enable and tune user & auth pooling and implement user & group filtering.  Lastly, I enable memcached for Keystone to cache tokens, catalogs, and roles.

Download the token_issue.sh script used for testing.
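The downloadable script is not reproduced here, but a minimal sketch of what such a test script might look like (an assumption on my part, using the openstack CLI with admin credentials sourced, e.g. from overcloudrc) is:

```shell
#!/bin/bash
# token_issue.sh -- hypothetical sketch of the timing script.
# Issues N scoped tokens serially, so each iteration exercises a
# full LDAP-backed authentication round trip against Keystone.
get_tokens() {
  local count="${1:-10}"
  echo "getting $count tokens"
  for _ in $(seq 1 "$count"); do
    openstack token issue -f value -c id >/dev/null
  done
}

# Only run when OpenStack credentials are loaded (e.g. via overcloudrc)
if [ -n "$OS_AUTH_URL" ]; then
  get_tokens "$@"
fi
```

Run it under time as in the results below, e.g. `time ./token_issue.sh 100`.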

With defaults (pooling disabled, no LDAP filtering, and no caching layer)
time ./token_issue.sh 100
getting 100 tokens
real 2m43.052s
user 0m5.627s
sys 0m1.642s
With Keystone pooling configured and LDAP user & group filtering
time ./token_issue.sh 100
getting 100 tokens
real 0m27.178s
user 0m5.698s
sys 0m1.470s
With Keystone pooling configured, LDAP user & group filtering, and Keystone Caching
time ./token_issue.sh 100
getting 100 tokens
real 0m10.569s
user 0m5.559s
sys 0m1.465s

Custom Settings

Keystone Pooling

Check current Keystone Pooling configuration

# Check settings
cat << EOF > check_keystone_pooling.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime
EOF
chmod +x check_keystone_pooling.sh
./check_keystone_pooling.sh

Configure and enable Keystone Pooling # THIS WILL RESTART KEYSTONE

# Configure Keystone Pooling
cat << EOF > set_keystone_pooling.sh
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size 200
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max 20
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay 0.1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout -1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size 1000
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime 60
docker restart keystone
EOF
chmod +x set_keystone_pooling.sh
./set_keystone_pooling.sh

Results Pagination

Pagination is important in large LDAP environments because the server caps the number of records returned per query (Active Directory's default limit is 1,000).   The page_size option defines the maximum number of results per page that Keystone requests from the LDAP server when listing objects.  Add this to your keystone/domains/keystone.DOMAINNAME.conf file and restart Keystone:

crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap page_size 2

To test the page_size, you can use ldapsearch with the following syntax.  Note that without specifying a page size, not all results will be returned.

$ ldapsearch -LLL -H ldap://ldapserver.domain.com -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD |grep sAMAccountName |wc -l
Size limit exceeded (4)
825

$ ldapsearch -LLL -H ldap://ldapserver.domain.com -E pr=1/noprompt -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD |grep sAMAccountName |wc -l
10052

User and Group Filtering

Here are examples of user and group filters that can be used in the [ldap] section of /etc/keystone/domains/keystone.YOUR_DOMAIN_NAME.conf.

For the user_filter example, I am filtering out all users except those who are a member of either OpenStack-Admins OR OpenStack-Users.

user_filter=(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))

For the group_filter example, I am filtering out all groups except those with an objectClass of Group AND a name (cn) of OpenStack, OpenStack-Admins, OR OpenStack-Users.

group_filter=(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))

To test LDAP user and group filtering, use the following ldapsearch syntax:

Members of group1 AND group2
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan))'|grep sAMAccountName
Members of group1 OR group2
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))'|grep sAMAccountName
sAMAccountName: user1
sAMAccountName: user2
sAMAccountName: user3
sAMAccountName: user4
sAMAccountName: user5
List groups with names
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))'|grep distinguishedName
distinguishedName: CN=OpenStack,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Admins,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Users,OU=People,DC=lab,DC=lan
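Once the filters return the expected results from ldapsearch, they can be applied in the same pattern as the pooling settings above (assuming the same containerized layout and keystone.LAB.conf domain file; the single quotes keep the shell from interpreting the parentheses):

```shell
# Apply the LDAP filters to the domain config and restart Keystone
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap user_filter '(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))'
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap group_filter '(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))'
docker restart keystone
```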

Keystone Caching

Check if caching is enabled

# Run on each controller as root
cat << EOF > ~/check_keystone_cache.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time
EOF
chmod +x ~/check_keystone_cache.sh
~/check_keystone_cache.sh

Set caching on each controller # THIS WILL RESTART KEYSTONE

# Run on each controller as root
cat << 'EOF' > ~/enable_keystone_cache.sh
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts|awk '{print $1}')
echo $API_IP
systemctl enable memcached
systemctl start memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend dogpile.cache.memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument url:$API_IP:11211
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server admin_workers 72
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server public_workers 72
docker restart keystone
EOF
chmod +x ~/enable_keystone_cache.sh
~/enable_keystone_cache.sh

Verify memcached hits

You can run the following command on each controller to watch cache hits and misses.  The hit count should keep increasing, which means caching is working.

# Run on each controller
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts|awk '{print $1}')
watch "memcached-tool $API_IP:11211 stats |grep -A2 get_hits"

7 thoughts on “Keystone Optimization”

  1. Great post, Ken! In the last bits where you configure caching, are you always backing to a single memcached instance? Only asking because we noticed some interesting behavior with the oslo.cache library when using backend_argument and pointing it to a cluster of memcached servers (if you wanted to shard cached data among them) [0].

    Nice work putting this together!

    [0] https://bugs.launchpad.net/oslo.cache/+bug/1743036

    • Thank you. I am just pointing each server to their local memcached instance and not a cluster VIP or multiple IPs. Seemed to have the best results and [0] probably explains why 🙂

  2. Now comes the painful chore of adding the caching settings into director. It wouldn’t be advisable to do it after overcloud deployment via ansible as I suspect director/puppet would overwrite it. Where do the keystone.conf caching settings go? extra-config.yaml? ie.

    keystone::cache::enabled::true
    keystone::identity::caching::true
    etc…
    sort of thing?

    • In talking with the keystone developers and with my own testing, you do run into issues if you point to a cluster / multiple memcache servers for the stated reason in that link however, I haven’t found issues if each keystone controller points to itself and only itself for caching.
