Puppet and Foreman demarcation (Part II)

This post describes how we divide responsibility between Foreman and Puppet. For an overview, please see Part I.

Old Configuration

Our original configuration relied primarily on Foreman to define services and required classes and to supply their configuration parameters. This left Puppet to provide only a mix of modules (e.g., autofs) and profile-like classes, which Foreman glued together at the Host Group level. When we started down this path (~2013) we were on Ubuntu 12.04 and running Foreman 1.2 or 1.3. Config Groups were not yet an option, and the UI tended to force most configuration overrides to occur when configuring classes.

At first this configuration worked well. However, the list of classes in Foreman soon became unwieldy (400+), and the assignment per host started to get quite cluttered. For example, the configuration of a research VM running our standard R setup was three Host Groups deep and had 27 different classes whose configuration was keyed off a mix of host group and domain. Managing this, determining what got applied where, and ensuring configuration changes didn’t have unintended side effects became a burden. Adding new classes meant weeding through the 400+ already included to find what you needed. And because the groupings and configuration all lived in Foreman, creating a development environment was a fairly manual process of recreating the host groups and applying all the configuration overrides.

The configuration we were performing on classes fell into two categories: service-based config, where items like db names and who has access vary from service to service, and static config, for items like overall domain configuration and core apt repos that almost never change once set up.

In hindsight, setting up ignored_environments.yml would have saved us some heartache and led to a cleaner class list. It wouldn’t, however, have made it clear on the filesystem which modules were top-level modules (i.e., applied directly by Foreman) and which were installed only to fulfil dependencies.

New Configuration

In our new configuration, we realized we needed to draw a clear line around where configuration and where class application should occur. This can be a bit tricky, as there is substantial overlap between what Foreman provides and what Puppet provides.

[Figure: foremanpuppet]

In deciding whether Foreman or Puppet should be responsible for a particular item, we used the following guidelines:

  • Use Foreman to determine what a host is. Foreman should be the starting point for seeing what classes have been applied to a host, and at a quick glance it should give someone an idea of what services/processes should be running.
  • There should be a single point of connection between Foreman and Puppet.
  • Keep only service-level config in Foreman, not domain or global configs.

We started by looking at the Roles and Profiles pattern in Puppet and seeing how we could adapt it to Foreman. The first mapping, which was pretty obvious, is that a Foreman config group is a Puppet role. Neither allows parameters, and both are supposed to be composed only of classes. So config groups or roles? To allow an admin logged into Foreman to see what services are running on a host, we decided to use Foreman config groups instead of Puppet roles.

The next step was to reduce the surface area between Foreman and Puppet to clearly defined lines of control. Previously we had directly included any Puppet module in a config group and applied configuration in Foreman via smart parameters. This time, following the profile pattern, we define one profile per service and expose only these profiles to Foreman by filtering out everything else in ignored_environments.yml:

:filters:
 - !ruby/regexp '/^(?!role|profile).*$/'
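
To make the effect of that filter concrete, here is a minimal sketch of the matching logic. It is a Python translation of the Ruby pattern, not Foreman's own code, and the class names are made up for illustration. Foreman ignores any class whose name matches a filter, so the negative lookahead means everything is ignored unless its name starts with "role" or "profile".

```python
import re

# Python translation of the Ruby pattern from ignored_environments.yml.
# A class whose name matches the filter is ignored on import, so this
# pattern drops anything that does not start with "role" or "profile".
IGNORE_FILTER = re.compile(r"^(?!role|profile).*$")


def is_ignored(class_name: str) -> bool:
    """Return True if the filter would hide this class from Foreman."""
    return IGNORE_FILTER.match(class_name) is not None


# Hypothetical class names, for illustration only.
candidates = [
    "autofs",
    "apt::source",
    "profile::rstudio",
    "role::research_vm",
    "profiler",
]

# Classes that survive the filter and would show up in Foreman.
visible = [name for name in candidates if not is_ignored(name)]
print(visible)
```

Note that the pattern is a prefix match, so a hypothetical module named profiler would also slip through; anchoring on `profile::` would be stricter if that ever matters.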

These profiles expose service-specific configuration to Foreman as parameters. Where we do expose a parameter rather than setting the value inside the profile class, we provide a sane default for our environments where possible. The profiles are then combined using config groups and applied to Host Groups. The diagram below shows roughly what this looks like:

[Figure: foremanpuppetstructure]

What about Hiera?

We considered using Hiera to manage global configuration options, but after mocking up some workflows and seeing how little data we would actually have in it compared to Foreman, we decided to just put those configuration values in the various profiles. A second reason for not using Hiera was to reduce the number of places to look for configuration. While not too bad, using Hiera would have led to a second code repo, which would have required careful synchronization with the main Puppet code repo. We may revisit this in the future as the need arises.

A moment of clarity with Puppet and Foreman (Part I)

Over the past four years we’ve deployed a Puppet/Foreman environment to support Ubuntu 12.04 and 14.04 for our research and production Linux systems. As 12.04 is approaching end of life and Foreman updates are no longer available for it, we decided it was a good time to revisit our overall Puppet/Foreman integration. Over the years it had slowly grown to include a bit of cruft and needed a good haircut. In addition, Foreman had gained features during those four years, which made it a good time to revisit how the two communicate and where the hand-off in responsibility lies. With that introduction, the environment we deployed had the following goals and challenges in mind:

  • Clearly define the hand-off between Foreman and Puppet.
    • Profiles and roles vs smart parameters vs host groups vs config groups.
  • Create a clear development to production workflow.
    • Module updating, development, testing.
    • How do we support operational testing (e.g., patching) vs longer-term development?
  • Improve git integration (yeah, part of the above, but a major motivation in and of itself).
    • How can we support multiple development efforts?

Final Environment

Skipping ahead to the end, our final Foreman environment consists of the following. We ended up creating separate dev and production environments which are 100% independent, yet mirrors of each other. We developed a workflow that gives each developer their own environment and set up a clean separation between this development environment and the testing/production environment. This allows the production side to quickly test and apply security and other upstream updates without impacting longer-term development efforts.

Foreman and Puppet

  • Two Foreman instances, dev and production, each on its own subnet.
  • Only profile classes and their parameters are exposed to Foreman.
  • Config groups within Foreman are used to create roles.
  • Multiple base host groups, one for each subnet/authentication domain.
    • Base host group handles core auth/patching.
    • Second-level host groups define services (e.g., HPC Cluster Node), have config groups applied, and contain all hosts.
  • Hosts/services are created in pairs, a patch-testing host and a production host, both in the production environment.
    • Each production host has a corresponding test host which mirrors it and is used to test patches and other minor updates.
    • The testing host also serves as a recovery point: a nightly backup can quickly be applied to it and it can be moved into a production role.
  • Development VMs are attached to the dev Foreman server and are short-lived hosts for developing and debugging Puppet configurations.

Git and Environments

  • Single git repo for all Puppet modules and profiles.
  • Git branches mostly correspond to Puppet environments.
    • e.g.: development == /etc/puppetlabs/code/environments/development
    • production – production environment on the production Foreman server.
    • development – set as ‘production’ on the development Foreman server, and as development on the production Foreman server.
    • Each developer/sysadmin has their own environment on the dev server; within it they can switch to the appropriate branch or use their own developer forks.
  • Code workflow:
    1. Developer or sysadmin works in own branch.
    2. Pull request to development branch and import on dev server.
    3. Review and testing in development (if there are substantial changes, an additional step of verifying in the development environment on the production server).
    4. Pull request to production and set parameters.
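
Since the branch-to-environment mapping differs between the two Foreman servers, it may help to see it written out as a lookup. This is only an illustrative sketch: the "dev" and "prod" server labels and the function name are invented here, and the fallback for branches without a fixed mapping is an assumption rather than something our setup enforces.

```python
# (server, git branch) -> Puppet environment name, following the list above.
# The server labels "dev" and "prod" are shorthand invented for this sketch.
BRANCH_TO_ENVIRONMENT = {
    ("prod", "production"): "production",
    ("prod", "development"): "development",
    ("dev", "development"): "production",
}


def environment_for(server: str, branch: str) -> str:
    """Return the environment a branch is deployed as on a given server.

    Assumption: a branch with no fixed mapping (for example a developer's
    own branch on the dev server) maps to an environment of the same name.
    """
    return BRANCH_TO_ENVIRONMENT.get((server, branch), branch)


# The development branch plays the 'production' role on the dev server.
print(environment_for("dev", "development"))
print(environment_for("prod", "development"))
```

The point of the asymmetry is that the development branch is exercised as ‘production’ on the dev server before it ever reaches the real production environment.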

In parts two and three, I’ll cover a bit more about the motivation and detail behind this setup.

Some additional reading: