Deploying OpenStack using TripleO dynamic templates

Starting with the OpenStack Queens release, TripleO dynamically generates its Heat templates from Jinja2 templates. This blog post demonstrates how to deploy your overcloud using templates that are generated dynamically at deploy time.

The OpenStack version I will be using for this demo is Red Hat OpenStack Platform 14 (Rocky). Documentation can be found at docs.redhat.com as well as in the upstream documentation at docs.openstack.org. The templates used for this blog can be found on github.com. This blog assumes that you have already deployed a working undercloud (also known as Director).

For this deploy, I will be using the following systems:

  • Director – RHEL 7.6 VM with 24GB RAM and 8 vCPUs
  • 7 x HP DL360 G7 (3 Controllers and 4 Nova nodes running Ceph, aka HCI or Hyper-Converged Infrastructure)

Step 1 – Introspection of your hardware

The first step in any TripleO deployment is to tell TripleO about your hardware. This is achieved by having TripleO PXE-boot your hardware and inspect it. Using an instackenv.json file containing information about your overcloud hardware, run the following command:


openstack overcloud node import ~/instackenv.json --introspect --provide

If all runs correctly, your output should look like this:


(undercloud) [stack@director14 ~]$ openstack overcloud node import ~/instackenv.json --introspect --provide
Waiting for messages on queue 'tripleo' with no timeout.
8 node(s) successfully moved to the "manageable" state.
Successfully registered node UUID a0e0d152-21c6-40f3-b19d-fbd92aac59e6
Successfully registered node UUID 7887510b-0d09-45cb-a826-ad1748fc00fa
Successfully registered node UUID 16be5f5f-43b2-4e57-ae1a-aed529afe94c
Successfully registered node UUID 94cea84a-c4a8-4094-9553-96cb6ed86a6e
Successfully registered node UUID 7917c925-c887-4639-8a11-5c66739f7412
Successfully registered node UUID af8262fa-1dbd-4b20-964a-ea1c6ec8a958
Successfully registered node UUID ba2ca7f9-3787-4add-8404-b9588264a26c
Successfully registered node UUID 8d0068cd-dfb2-4d6e-8a3a-6931b42e8b3a
Waiting for introspection to finish...
Waiting for messages on queue 'tripleo' with no timeout.
Introspection of node 8d0068cd-dfb2-4d6e-8a3a-6931b42e8b3a completed. Status:SUCCESS. Errors:None
Introspection of node 7887510b-0d09-45cb-a826-ad1748fc00fa completed. Status:SUCCESS. Errors:None
Introspection of node ba2ca7f9-3787-4add-8404-b9588264a26c completed. Status:SUCCESS. Errors:None
Introspection of node 16be5f5f-43b2-4e57-ae1a-aed529afe94c completed. Status:SUCCESS. Errors:None
Introspection of node af8262fa-1dbd-4b20-964a-ea1c6ec8a958 completed. Status:SUCCESS. Errors:None
Introspection of node 94cea84a-c4a8-4094-9553-96cb6ed86a6e completed. Status:SUCCESS. Errors:None
Introspection of node 7917c925-c887-4639-8a11-5c66739f7412 completed. Status:SUCCESS. Errors:None
Introspection of node a0e0d152-21c6-40f3-b19d-fbd92aac59e6 completed. Status:SUCCESS. Errors:None
Successfully introspected 8 node(s).
Waiting for messages on queue 'tripleo' with no timeout.
8 node(s) successfully moved to the "available" state.
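
For reference, each entry in instackenv.json describes a node's power-management details and, optionally, its MAC addresses and capabilities. A minimal single-node sketch might look like the following; the name, addresses, and credentials are placeholders, and the exact fields depend on your power-management driver:

{
  "nodes": [
    {
      "name": "controller-0",
      "pm_type": "ipmi",
      "pm_user": "admin",
      "pm_password": "PASSWORD",
      "pm_addr": "192.168.24.101",
      "mac": ["aa:bb:cc:dd:ee:01"],
      "capabilities": "boot_option:local"
    }
  ]
}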

Step 2 – Define networks

Now that our hardware is introspected, we need to define the networking environment for our overcloud. To do this, we fill out the template file named “network_data.yaml” and place it in our ~/templates directory:


cp /usr/share/openstack-tripleo-heat-templates/network_data.yaml ~/templates
cat ~/templates/network_data.yaml
- name: Storage
  vip: true
  vlan: 30
  name_lower: storage
  ip_subnet: '172.16.1.0/24'
  allocation_pools: [{'start': '172.16.1.4', 'end': '172.16.1.250'}]
  ipv6_subnet: 'fd00:fd00:fd00:3000::/64'
  ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:3000::10', 'end': 'fd00:fd00:fd00:3000:ffff:ffff:ffff:fffe'}]
----> Truncated <----

I have modified my network_data.yaml file to suit my needs, reusing the existing sample’s naming convention but providing my specific network information.


- name: Storage
  vip: true
  vlan: 6
  name_lower: storage
  ip_subnet: '172.16.6.0/24'
  allocation_pools: [{'start': '172.16.6.10', 'end': '172.16.6.18'}]
----> Truncated <----
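
Before moving on, it is worth making sure the edited file is still valid YAML, since an indentation mistake here will only surface later in the deploy. A quick generic check (not a TripleO-specific tool) is to load the file with Python, which is already present on the undercloud:

python -c "import yaml; yaml.safe_load(open('/home/stack/templates/network_data.yaml'))" && echo "network_data.yaml parses cleanly"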

Step 3 – Define overcloud roles

In this next step, we define the roles that we will use to deploy our overcloud. We will use the most common roles (Controller and Compute), plus an additional role for our hyper-converged nodes. Fortunately, this process has been simplified with built-in commands for listing and generating roles. Use the following command to list all of the available roles in TripleO:


openstack overcloud roles list
+-----------------------------+
| Role Name                   |
+-----------------------------+
| BlockStorage                |
| CephAll                     |
| CephFile                    |
| CephObject                  |
----> Truncated <----

Once you have determined the roles you need, use another command to generate a roles_data.yaml file containing everything TripleO needs to define those roles for our overcloud.


openstack overcloud roles generate Controller ComputeHCI Compute -o ~/templates/roles_data.yaml

cat ~/templates/roles_data.yaml
###############################################################################
# File generated by TripleO
###############################################################################
###############################################################################
# Role: Controller                                                            #
###############################################################################
- name: Controller
  description: |
    Controller role that has all the controler services loaded and handles
    Database, Messaging and Network functions.
----> Truncated <----
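
The ComputeHCI role is what enables the hyper-converged nodes: it combines the standard Compute services with the Ceph OSD service. You can confirm this in the generated file with a quick grep (the exact service list depends on your TripleO version):

grep -A 60 'name: ComputeHCI' ~/templates/roles_data.yaml | grep -E 'NovaCompute|CephOSD'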

Step 4 – Using our network and role customizations to generate our templates

In this step, we run a Python script included with TripleO that reads our network_data.yaml and roles_data.yaml files and creates a temporary dump of the rendered TripleO templates. This is useful for seeing what will be dynamically generated at deploy time, and it lets us grab some of the templates to customize further for our overcloud. Note that this script must be run from the openstack-tripleo-heat-templates directory for it to work.


cd /usr/share/openstack-tripleo-heat-templates/
tools/process-templates.py -n ~/templates/network_data.yaml -r ~/templates/roles_data.yaml -o /tmp/templates
jinja2 rendering normal template net-config-bond.j2.yaml
rendering j2 template to file: /tmp/templates/./net-config-bond.yaml
jinja2 rendering normal template net-config-bridge.j2.yaml
rendering j2 template to file: /tmp/templates/./net-config-bridge.yaml
jinja2 rendering normal template net-config-linux-bridge.j2.yaml
rendering j2 template to file: /tmp/templates/./net-config-linux-bridge.yaml
----> Truncated <----

The “-o” switch instructs the script to render the templates into the specified output directory. If you examine the /tmp/templates directory, you will see a clone of the openstack-tripleo-heat-templates directory containing actual YAML files generated from the contents of network_data.yaml and roles_data.yaml. The content of this newly created directory is usable to deploy an overcloud; however, in my case I need to modify a few other items for my deployment to complete.
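
For example, the per-role nic-config templates that exist only as Jinja2 sources in the original tree are now rendered as plain Heat templates for every role defined in roles_data.yaml:

ls /tmp/templates/network/config/bond-with-vlans/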

Step 5 – NIC configs

TripleO does a great job of generating nic-config templates for commonly used deployments, but my setup is a bit different. Instead of a dedicated NIC (eth0) for provisioning plus an OVS bond of two NICs (eth1 and eth2) for the overcloud control-plane networks, I want to use 4 NICs in an LACP Linux bond carrying both the provisioning and control-plane networks. To achieve this, I modify the generated nic-config templates to suit my needs, starting from the generated “bond-with-vlans” versions. I copy the bond-with-vlans directory from the temporarily generated templates into my custom templates directory; you can diff the generated templates against my customized templates to see exactly what changed.


mkdir ~/templates/nic-configs
cp -R /tmp/templates/network/config/bond-with-vlans ~/templates/nic-configs/
vi ~/templates/nic-configs/controller.yaml
vi ~/templates/nic-configs/compute.yaml
vi ~/templates/nic-configs/computehci.yaml
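
To give a sense of the change without reproducing the full files, the relevant portion of each customized nic-config ends up looking roughly like the snippet below: one Linux bond over four interfaces, with VLANs for the isolated networks stacked on top. Interface names and the surrounding parameters are specific to my hardware, so treat this as a sketch rather than a drop-in template.

- type: linux_bond
  name: bond0
  use_dhcp: false
  bonding_options:
    get_param: BondInterfaceOvsOptions
  dns_servers:
    get_param: DnsServers
  members:
  - type: interface
    name: nic1
    primary: true
  - type: interface
    name: nic2
  - type: interface
    name: nic3
  - type: interface
    name: nic4
- type: vlan
  device: bond0
  vlan_id:
    get_param: StorageNetworkVlanID
  addresses:
  - ip_netmask:
      get_param: StorageIpSubnet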

Step 6 – Network Customizations

We need to instruct TripleO during the deploy to apply customizations such as custom nic-configs, bridge mappings, LACP configuration, MTU sizes, etc. This is done by creating a YAML file that defines those customizations and including it at deployment time. Settings in this file override the defaults in the generated templates. I have created a file named network.yaml which contains those customizations.


cat templates/network.yaml
resource_registry:
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
  OS::TripleO::ComputeHCI::Net::SoftwareConfig: /home/stack/templates/nic-configs/computehci.yaml
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml

parameter_defaults:
  NeutronBridgeMappings: 'datacentre:br-ex'
  NeutronFlatNetworks: 'datacentre'
  NeutronNetworkVLANRanges: 'datacentre:1:100'
  NeutronNetworkType: 'vxlan,vlan,flat'
  NeutronTunnelType: 'vxlan'
  NeutronExternalNetworkBridge: "''"

  # enable isolated metadata agent on controllers
  # https://access.redhat.com/solutions/2292841
  # Enable isolated Neutron metadata (allow metadata server in provider networks)
  NeutronEnableIsolatedMetadata: true

  # Set Jumbo MTU for tenant networks
  NeutronGlobalPhysnetMtu: 8896

  # DNS
  DnsServers: ['192.168.1.249', '192.168.0.250']
  CloudName: overcloud.lab.lan
  CloudDomain: lab.lan

  # Bonding options
  BondInterfaceOvsOptions: 'mode=802.3ad lacp_rate=1 updelay=1000 miimon=100'

  # Global DNS name for instances
  NeutronDnsDomain: lab.lan
  NeutronPluginExtensions: "qos,port_security,dns"
  ControllerExtraConfig:
    neutron::agents::dhcp::dnsmasq_local_resolv: true
    neutron::agents::dhcp::enable_isolated_metadata: true
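
Once the nodes are deployed, the LACP bond and bridge wiring can be verified directly on a node with standard tools (assuming the bond is named bond0, as in the sketch in Step 5):

cat /proc/net/bonding/bond0   # should show "802.3ad" mode and all four slaves up
sudo ovs-vsctl show           # confirms br-ex and its port/VLAN layout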

Step 7 – Ceph Customization

Next, we need to tell TripleO how to configure the Ceph nodes by defining which disks will be used for OSDs, the journal drive mappings, and the pool sizing. Following the Ceph documentation for TripleO-deployed Ceph, I created a file called ceph-custom-config.yaml and placed it in my ~/templates directory. This file is passed in at deployment time to override the defaults in the TripleO-generated templates.


parameter_defaults:
  CephConfigOverrides:
    mon_max_pg_per_osd: 3072
    journal_size: 5120
    osd_pool_default_size: 3
    osd_pool_default_min_size: 2
    osd_pool_default_pg_num: 128
    osd_pool_default_pgp_num: 128
  CephAnsibleDisksConfig:
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
      - /dev/sde
      - /dev/sdf
  CephPools:
    - {"name": .rgw.root, "pg_num": 16, "pgp_num": 16, "application": rados}
    - {"name": default.rgw.control, "pg_num": 16, "pgp_num": 16, "application": rados}
    - {"name": default.rgw.meta, "pg_num": 16, "pgp_num": 16, "application": rados}
    - {"name": default.rgw.log, "pg_num": 16, "pgp_num": 16, "application": rados}
    - {"name": images, "pg_num": 128, "pgp_num": 128, "application": rbd}
    - {"name": metrics, "pg_num": 16, "pgp_num": 16, "application":openstack_gnocchi}
    - {"name": backups, "pg_num": 16, "pgp_num": 16, "application": rbd}
    - {"name": vms, "pg_num": 512, "pgp_num": 512, "application": rbd}
    - {"name": volumes, "pg_num": 256, "pgp_num": 256, "application": rbd}
  CephPoolDefaultPgNum: 128
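
As a sanity check on the pool sizing: 4 HCI nodes with 5 devices each gives 20 OSDs. Summing pg_num across the pools above gives 992 placement groups, and with a replica size of 3 that works out to 992 x 3 / 20 ≈ 149 PG copies per OSD, comfortably inside the commonly recommended range of roughly 100–200 per OSD and far below the mon_max_pg_per_osd limit set above.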

Step 8 – Other overcloud customizations

I am statically assigning hardware nodes to each role by defining scheduler hints. This ensures that a specific physical server is chosen for a specific server name (i.e., the server in rack-U 1 becomes controller1). I also define static hostnames, domain names, IPs, VIPs, and SSL certificates for all nodes by copying their corresponding templates from /tmp/templates into my custom ~/templates directory and modifying them to suit my needs. Reference the corresponding files in my templates repository (linked above) for these customizations.
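
For illustration, the usual pattern for this kind of predictable node placement is to tag each Ironic node with a per-node capability and then map those capabilities to roles via scheduler hints. A sketch of the approach (the node UUID is a placeholder, and the exact contents of my scheduler_hints_env.yaml may differ slightly):

openstack baremetal node set <node-uuid> --property capabilities='node:controller-0,boot_option:local'

parameter_defaults:
  ControllerSchedulerHints:
    'capabilities:node': 'controller-%index%'
  ComputeSchedulerHints:
    'capabilities:node': 'compute-%index%'
  ComputeHCISchedulerHints:
    'capabilities:node': 'computehci-%index%'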

Step 9 – Deployment

Now we will use all of the customizations from the previous steps to deploy our overcloud. To do so, I create a deploy-overcloud.sh script that contains the deployment command and pulls in all of my custom templates.


openstack overcloud deploy --templates \
-r ~/templates/roles_data.yaml \
-n ~/templates/network_data.yaml \
-e ~/templates/containers-prepare-parameter.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/templates/network.yaml \
-e ~/templates/scheduler_hints_env.yaml \
-e ~/templates/node-info.yaml \
-e ~/templates/ips-from-pool-all.yaml \
-e ~/templates/fixed-ip-vips.yaml \
-e ~/templates/inject-trust-anchor-hiera.yaml \
-e ~/templates/enable-tls.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e ~/templates/ceph-custom-config.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml \
-e ~/templates/misc-settings.yaml

If all goes well, you should have a fully deployed overcloud. To modify your overcloud after the initial deployment, re-run the deploy-overcloud.sh script with any changes you need to apply.
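
A few quick checks after the deployment finishes: from the undercloud, confirm that the Heat stack completed and that all nodes are active, then source the generated overcloudrc and make sure the overcloud APIs respond.

source ~/stackrc
openstack stack list
openstack server list
source ~/overcloudrc
openstack service list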

Keystone Optimization

Issue:

The default configuration of Keystone is not tuned for any specific use case. As with other OpenStack components, Keystone must be tuned for your use case and scale in order to perform well. When integrating Keystone with large LDAP environments (10k+ users), a poorly tuned configuration can result in slow logons, API lag, and an incredibly slow Horizon dashboard.

Objective:

To improve OpenStack Keystone performance when integrating with LDAP / Active Directory.

Software and Hardware Used for testing:

  • Red Hat OpenStack Platform Version 12
  • Ceph 2.4
  • 8 x HP DL360 G7
    • 3 controllers (2 x Intel X5620, 24GB RAM, SSDs for OS)
    • 4 hyper-converged nodes (2 x Intel X5620, 24GB RAM, 7 x 72GB 15K SAS)
    • 1 utility server (2 x Intel X5620, 72GB RAM, 4 x SSD RAID-10)
      • Red Hat OpenStack Platform Director version 12 running as a VM (8 vCPUs, 24GB RAM, 100GB disk)
  • Microsoft Windows Server 2008 Enterprise Active Directory

Environment tested

Windows 2008 Server Enterprise with 10,000+ users and nested groups. Red Hat OpenStack Platform Version 12 hyper-converged reference architecture with a 3-node active controller cluster and 4 combined Ceph/Nova nodes, with full network isolation and SSL certificates for the OpenStack endpoints, the Horizon GUI, and LDAPS. I am also using Fernet tokens for Keystone instead of UUID tokens; this is strongly recommended as it alleviates the burden of persisting tokens in a database, among other benefits.

NOTE: Red Hat OSP 12 uses a containerized control plane, so instead of editing the conf files and restarting services directly, I edit the config files used to build the containers and then restart the containers. If you are not running a containerized control plane, edit the conf files and restart the respective services.

Results

These results demonstrate an OSP 12 Director-based deployment of an OSP 12 overcloud, first with the default Active Directory configuration as documented. I then enable and tune user and auth pooling and implement user and group filtering. Lastly, I enable memcached for Keystone to cache tokens, catalogs, and roles.

Download the token_issue_script used for testing.
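
The script itself does nothing more than request a batch of tokens in a loop and rely on time for the measurement; a minimal equivalent would look something like this (the overcloudrc path is an assumption):

#!/bin/bash
# usage: time ./token_issue.sh <number_of_tokens>
source ~/overcloudrc
echo "getting $1 tokens"
for i in $(seq 1 "$1"); do
  openstack token issue -f value -c id > /dev/null
done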

With defaults (pooling disabled, no LDAP filtering, and no caching layer):

time ./token_issue.sh 100
getting 100 tokens
real 2m43.052s
user 0m5.627s
sys 0m1.642s

With Keystone pooling configured and LDAP user and group filtering:

time ./token_issue.sh 100
getting 100 tokens
real 0m27.178s
user 0m5.698s
sys 0m1.470s

With Keystone pooling configured, LDAP user and group filtering, and Keystone caching:

time ./token_issue.sh 100
getting 100 tokens
real 0m10.569s
user 0m5.559s
sys 0m1.465s

Custom Settings

Keystone Pooling

Check current Keystone Pooling configuration

# Check settings
cat << EOF > check_keystone_pooling.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime
EOF
chmod +x check_keystone_pooling.sh
./check_keystone_pooling.sh

Configure and enable Keystone Pooling # THIS WILL RESTART KEYSTONE

# Configure Keystone Pooling
cat << EOF > set_keystone_pooling.sh
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_size 200
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_max 20
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_retry_delay 0.1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_timeout -1
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap pool_connection_lifetime 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap use_auth_pool True
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_size 1000
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap auth_pool_connection_lifetime 60
docker restart keystone
EOF
chmod +x set_keystone_pooling.sh
./set_keystone_pooling.sh

Results Pagination

Pagination is important in large LDAP environments, as only a limited number of records will be returned by default. This option defines the maximum number of results per page that Keystone should request from the LDAP server when listing objects. Add it to your keystone/domains/keystone.DOMAINNAME.conf file and restart Keystone:

crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/domains/keystone.LAB.conf ldap page_size 2

To test the page_size, you can use ldapsearch with the following syntax.  Note that without specifying a page size, not all results will be returned.

$ ldapsearch -LLL -H ldap://ldapserver.domain.com -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD |grep sAMAccountName |wc -l
Size limit exceeded (4)
825

$ ldapsearch -LLL -H ldap://ldapserver.domain.com -E pr=1/noprompt -b 'dc=domain,dc=com' -D 'DOMAIN\USERNAME' -w PASSWORD |grep sAMAccountName |wc -l
10052

User and Group Filtering

Here are examples of user and group filters that can be used in the [ldap] section of /etc/keystone/domains/keystone.YOUR_DOMAIN_NAME.conf.

For the user_filter example, I am filtering out all users except those who are a member of either OpenStack-Admins or OpenStack-Users.

user_filter=(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))

For the group_filter example, I am filtering out all groups except those with an objectClass of Group and a name of OpenStack, OpenStack-Admins, or OpenStack-Users.

group_filter=(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))

To test LDAP user and group filtering, use the following ldapsearch syntax:

Members of group1 AND group2
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan))'|grep sAMAccountName
Members of group1 OR group2
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(|(memberOf=CN=OpenStack-Admins,OU=People,DC=lab,DC=lan)(memberOf=CN=OpenStack-Users,OU=People,DC=lab,DC=lan)))'|grep sAMAccountName
sAMAccountName: user1
sAMAccountName: user2
sAMAccountName: user3
sAMAccountName: user4
sAMAccountName: user5
List groups matching the group_filter
$ ldapsearch -LLL -H ldap://192.168.1.249 -E pr=10000/noprompt -b 'dc=lab,dc=lan' -D 'LAB\USERNAME' -w PASSWORD '(&(objectClass=Group)(&(|(cn=OpenStack)(cn=OpenStack-Admins)(cn=OpenStack-Users))))'|grep distinguishedName
distinguishedName: CN=OpenStack,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Admins,OU=People,DC=lab,DC=lan
distinguishedName: CN=OpenStack-Users,OU=People,DC=lab,DC=lan

Keystone Caching

Check if caching is enabled

# Run on each controller as root
cat << EOF > ~/check_keystone_cache.sh
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching
crudini --get /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time
EOF
chmod +x ~/check_keystone_cache.sh
~/check_keystone_cache.sh

Enable caching on each controller # THIS WILL RESTART KEYSTONE

# Run on each controller as root
cat << 'EOF' > ~/enable_keystone_cache.sh
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts|awk '{print $1}')
echo $API_IP
systemctl enable memcached
systemctl start memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache enabled true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend dogpile.cache.memcached
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf cache backend_argument url:$API_IP:11211
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf catalog caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf domain_config caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf federation caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf revoke caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf role caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf token cache_on_issue true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity caching true
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf identity cache_time 600
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server admin_workers 72
crudini --set /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf eventlet_server public_workers 72
docker restart keystone
EOF
chmod +x ~/enable_keystone_cache.sh
~/enable_keystone_cache.sh

Verify memcached hits

You can run the following command on your controllers to watch cache hits and misses. The hit count should keep increasing, which means caching is working.

# Run on each controller
API_IP=$(grep $HOSTNAME.internalapi /etc/hosts|awk '{print $1}')
watch "memcached-tool $API_IP:11211 stats |grep -A2 get_hits"

Separate Ceph from existing Director-based deployment

Deploying Ceph with OpenStack Platform Director is very convenient, but there are times when its simplicity isn't enough for more advanced installations. Fortunately, it is possible to decouple the Ceph installation from the OpenStack control plane and Director management, which I will detail below.

  1. Deploy OSP with at least one controller, one compute, and one Ceph storage node
  2. Deploy at least one new server to take over the Ceph monitor role
  3. Enable the new Ceph monitor
  4. Disable the existing Ceph monitor role on the OSP controllers
  5. Set the Ceph storage nodes to maintenance in Ironic (see the sketch after this list)
  6. Delete the Ceph storage nodes from Ironic
  7. Re-run openstack overcloud deploy pointing at the external Ceph YAML and setting the Ceph storage scale to 0
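
Steps 5 and 6 are performed against Ironic on the undercloud; a sketch of the commands (node UUIDs are placeholders, and newer clients expose the same operations under openstack baremetal node):

ironic node-set-maintenance <ceph-node-uuid> true
ironic node-delete <ceph-node-uuid>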

Initial OSP Deploy script:

#!/bin/bash
openstack overcloud deploy --templates \
--ntp-server 192.168.1.250 \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /home/stack/templates/storage-environment.yaml \
--control-flavor control \
--compute-flavor compute \
--ceph-storage-flavor ceph-storage \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 3

Once deployed, I run openstack overcloud update stack to ensure the overcloud is updated to the latest RPMs within its major version (i.e., if deploying OSP 8, update to the latest RPMs available for OSP 8).

OSP Update Script
This will update the existing OSP deployment to the latest RPMs:

#!/bin/bash
openstack overcloud update stack overcloud -i \
--templates \
-e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /home/stack/templates/storage-environment.yaml

Deploy one or more new servers to take over the Ceph monitor role from the OSP controllers.
I used Director to deploy a baremetal server with the default baremetal Nova flavor using the following commands:

ctrlplane_net=$(neutron net-list | grep ctrl | awk '{print $2;}')
openstack server create cloudbox4 --flavor=baremetal --nic net-id=$ctrlplane_net --image=overcloud-full --key-name=default

New OSP Deploy script (removing storage-environment.yaml, including puppet-ceph-external.yaml, and setting ceph-storage-scale to 0):

#!/bin/bash
openstack overcloud deploy --templates \
--ntp-server 192.168.1.250 \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph-external.yaml \
-e /home/stack/templates/ceph-external.yaml \
--control-flavor control \
--compute-flavor compute \
--ceph-storage-flavor ceph-storage \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 0
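
The ceph-external.yaml referenced above carries the connection details for the now-external Ceph cluster. A minimal sketch of what it needs to contain (the values are placeholders taken from the external cluster's ceph.conf and the OpenStack client keyring):

parameter_defaults:
  CephClusterFSID: '<fsid from the external cluster>'
  CephClientKey: '<cephx key for the OpenStack client user>'
  CephExternalMonHost: '<comma-separated monitor IPs>'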

New OSP Update script

#!/bin/bash
openstack overcloud update stack overcloud -i \
--templates \
-e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph-external.yaml \
-e /home/stack/templates/ceph-external.yaml