So I’ve been asked this a few times now ..
“So you have experience with ansible.. what are some things you recommend when using it?”
I thought I’d try to codify the answer here.. in no particular order:
- leverage inventory ..
- have 1 repo for production and 1 repo for development .. don’t mix them. or at least have sub dirs in the top level of the repo, “prod” and “dev” .. etc
- make sure someone “owns” inventory variables. Use a consistent naming scheme. I recommend:
- inventory vars are named as normal .. they start with a letter, no leading underscore
- role vars are named with a starting _
- Use your UNIX environment variables to constrain and reduce risk. I do things like:
- I have a shell command that lets me pick an environment; it sets all the ansible variables and constrains the commands I run to that environment. It sets env vars like:
- ANSIBLE_ENV .. my distinct inventory name
- ANSIBLE_INVENTORY
- ANSIBLE_RETRY_FILES_SAVE_PATH .. make this unique to ${ANSIBLE_ENV} .. so /tmp/${ANSIBLE_ENV} for instance
- ANSIBLE_VAULT_PASSWORD_FILE .. this is a script that reads ${ANSIBLE_ENV} and uses pass(1) to fetch the vault password for that inventory. Something like
pass show env/${ANSIBLE_ENV}/ansible_vault_password
- I have a build_mode command that sets an environment variable that ansible checks before irreversible commands .. say deleting and rebuilding a disk cluster. This is to prevent inadvertently running them on a production cluster.
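A minimal sketch of what such an environment picker could look like as shell functions. ANSIBLE_INVENTORY, ANSIBLE_RETRY_FILES_SAVE_PATH and ANSIBLE_VAULT_PASSWORD_FILE are real ansible settings; the function names, the ANSIBLE_ROOT directory layout, the vault-pass.sh path and the ANSIBLE_BUILD_MODE variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: paths, function names and ANSIBLE_BUILD_MODE are assumptions.

# Assumed layout: $ANSIBLE_ROOT/inventories/<env>/hosts
ANSIBLE_ROOT="${ANSIBLE_ROOT:-$HOME/ansible}"

# Pick an environment and pin the ansible settings to it.
ansible_env() {
    local env="${1:-}"
    if [ -z "$env" ]; then
        echo "usage: ansible_env <environment>" >&2
        return 1
    fi
    export ANSIBLE_ENV="$env"
    export ANSIBLE_INVENTORY="$ANSIBLE_ROOT/inventories/$env/hosts"
    # Keep retry files separate per environment.
    export ANSIBLE_RETRY_FILES_SAVE_PATH="/tmp/$env"
    # Hypothetical executable script that prints the vault password,
    # e.g. by running: pass show env/${ANSIBLE_ENV}/ansible_vault_password
    export ANSIBLE_VAULT_PASSWORD_FILE="$ANSIBLE_ROOT/bin/vault-pass.sh"
    # Switching environments always turns destructive mode back off.
    unset ANSIBLE_BUILD_MODE
}

# Opt in to destructive operations, for the current shell only.
build_mode() {
    export ANSIBLE_BUILD_MODE=1
}
```

With something like this sourced from your shell profile, `ansible_env staging` pins the session to one inventory, and a playbook could read the flag with `lookup('env', 'ANSIBLE_BUILD_MODE')` before doing anything destructive. Clearing the flag on every environment switch is a design choice, so build mode never carries over into production by accident.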
- use a directory hierarchy in /playbooks and /roles to organize .. don’t use long names. So for example
roles/openstack/hypervisor/evacuate
roles/network/n9k/webserver
- instead of creating huge monolithic playbooks.. use includes and subdirectories to call subparts.
- for operations .. make your roles VERY small if possible ( we call ’em nuggets ) .. It’s better to give operators more flexibility in what happens at the playbook level. So for instance I have:
playbooks/site/maintenance/set.yml
roles:
  - { role: site/maintenance/hostgroup, state: true }
  - { role: site/ticket/comment, comment: "placed {{ENV}} into maintenance for {{minutes}} minutes.", when: "maintenance.changed" }
- Use assert: at the beginning of roles to check and validate any variable not defined by role/defaults
- Require developers to make their roles “check” compatible. There’s a habit of using shell to get information for use in a later operation.. and unless “check_mode: no” is set on that task, it is skipped in check mode and the later operation has nothing to work with.
- use become where necessary .. I would NOT require the user to use -b on the command line all the time. Instead use become: to get privs when needed.
- do not use a single ansible account .. have your admins use distinct accounts so you can track who/what/when.
- Require pep8 for your python .. so it’s consistent.
- Require a standardized set of “tags” … make them reasonable and useful.
- use prompt: to check for build_mode and pause if it’s not set.
- Require that any plugins that are written are ATOMIC! Don’t stack functions if you can help it so it’s easier to find filters ( you can use grep .. but this takes Yet Another Step )
- put the “logic” of managing data in the yaml in the task by stacking filters, rather than doing it all at once in a single filter that can’t be used again. When the next person comes along behind you to read the code and figure it out .. they’ll have a better chance if they don’t have to go find and then read through a random filter.
- use ssh ControlMaster to speed up operations. This makes a HUGE difference.
- be careful in design about where you allow ansible to be run *from* .. it needs to be able to use ssh keys, but you don’t want to use agent-forwarding .. use proxying instead
- use venv for your ansible tools .. this allows easy change between different ansible implementations if you’re not able to follow a particular version
- avoid vars_files: declarations.. instead use role defaults and/or playbook/group_vars/all
- write facts when you can… it’s not always feasible .. but more cohesive if you do it.
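For the ControlMaster and proxying items above, here is a rough sketch of passing the ssh options to ansible through the ANSIBLE_SSH_ARGS environment variable. The helper name and the bastion hostname are hypothetical. Note that setting ANSIBLE_SSH_ARGS replaces ansible's default ssh arguments, and recent ansible releases already turn on ControlMaster with a 60 second ControlPersist, so the gain there is mostly the longer persist time.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and bastion host are assumptions.

# Build the ssh options for ansible: connection sharing via ControlMaster,
# no agent forwarding, and an optional jump host (proxying) instead.
ansible_ssh_args() {
    local bastion="${1:-}"
    local args="-o ControlMaster=auto -o ControlPersist=10m -o ForwardAgent=no"
    if [ -n "$bastion" ]; then
        args="$args -o ProxyJump=$bastion"
    fi
    printf '%s\n' "$args"
}

# Hypothetical bastion; replace with the jump host for your environment.
export ANSIBLE_SSH_ARGS="$(ansible_ssh_args bastion.example.com)"
```

ProxyJump needs OpenSSH 7.3 or newer on the machine ansible runs from; on older clients a ProxyCommand would be needed instead.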