I have quite a bit of experience with a (small) variety of config management tools: Ansible, Salt, and Terraform. I have also used adjacent tooling like Packer, Docker, and plenty of Kubernetes. They all have their place and are helpful for managing server environments. Earlier in my career, I developed a strong interest in these types of tools and went through several iterations using them to manage my personal infrastructure. But I find myself losing interest in spending time on such tools for my personal work, though I still enjoy using them professionally.

Specifying a System

To me, the biggest draw of good config management is having a more-or-less formal specification of a system. And that’s just a different way of saying I want reproducible infrastructure - provisioning a new machine should be a simple and thoughtless task. But anyone who works with these tools knows that building this specification comes with costs. You probably have to learn a new DSL and one or more command-line tools, and spend a lot of time reading documentation and doing some trial and error.

This tradeoff is well worth it when multiple teams of people are deploying to hundreds of machines and managing regular updates. I have a particular fondness for Terraform and its ability to allow intelligent incremental changes by tracking state, something few other tools do. Many others are designed for the pattern of immutable infrastructure, but I don’t find that pattern as good of a fit for multi-tenant deployments like the ones I mostly work with.

I’ve written before about using Terraform to manage the filesystem (not just cloud resources), which left me able to use a single tool to fully recreate my personal infrastructure. It was great in theory, but over time I opted to spend my free time on other projects like working with the ESP32 chip or learning Zig.

ESTALE

Many of the tools mentioned above are not feature-complete. Sure, they work well as they are, but they are still evolving at a fairly quick pace. Terraform went through a huge change to their plugin API. Docker is transitioning to BuildKit, and Compose is being rewritten from python to go. Even projects without major changes still make changes to recommended patterns.

Some changes are good, either fixing bugs or adding features I miss. Most are irrelevant to my use of the tool, refactoring components or adding features I will never use. And yet a few are disruptive, forcing me to adapt.

Consider this blog - I am not very consistent about writing here as I write a lot (internal to my day job) already. Hugo has probably made around 50 releases since I last used it, but I don’t think I care about any of them. I just need to rerun the build with a new markdown file added to the repo.

And I have to point the blame at myself too. I wrote my Terraform provider for Fedora but have since switched to Debian, which I am now much more familiar with. I just want to redeploy my site, but now I have to spend half a day updating some Go code first. Not to mention either finding an old enough version of Terraform that still supports the v1 providers or updating my code to support v2.

For a project like my personal infrastructure that should require little maintenance, I’m in a position of having to do quite a bit of work - enough that I keep putting it off out of annoyance.

Back to Basics

After decommissioning my Digital Ocean VMs, I have only one machine to manage: a RockPro64 from Pine. I do not need some fancy tool with tens or hundreds of thousands of lines of code to do that well. I am very comfortable (though far from an expert) with Bash and common Linux utilities. Instead of taking lots of time to get everything perfect, I can spend a very small amount of time to write a script like:

if ! dpkg -s prometheus > /dev/null 2>&1; then
  echo 'Installing package prometheus'
  sudo apt install -y --no-install-recommends prometheus
  sudo cp prometheus.yml /etc/prometheus/prometheus.yml
  sudo chown root:prometheus /etc/prometheus/prometheus.yml
  sudo adduser "${USER}" prometheus
fi

if [ ! -f /usr/share/keyrings/grafana.key ]; then
  echo 'Downloading grafana apt key'
  curl -sL https://apt.grafana.com/gpg.key | sudo tee /usr/share/keyrings/grafana.key > /dev/null
fi

if [ ! -f /etc/apt/sources.list.d/grafana.list ]; then
  echo 'Configuring apt list grafana'
  echo 'deb [signed-by=/usr/share/keyrings/grafana.key] https://apt.grafana.com stable main' | sudo tee /etc/apt/sources.list.d/grafana.list
  sudo apt update -o Dir::Etc::sourcelist='sources.list.d/grafana.list' -o Dir::Etc::sourceparts='-' -o APT::Get::List-Cleanup='0'
fi

if ! dpkg -s grafana > /dev/null 2>&1; then
  echo 'Installing package grafana'
  sudo apt install -y grafana
  sudo systemctl daemon-reload
  sudo systemctl enable grafana-server
  sudo systemctl start grafana-server
fi
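Each of those blocks is the same check-then-act shape, so the repetition could be factored into a tiny helper if the script ever grows. A sketch - the ensure function and the throwaway demo below are my own invention, not part of the actual install script:

```shell
#!/usr/bin/env bash
# ensure: run an action only when a check fails, making re-runs cheap no-ops.
# Usage: ensure <description> <check command> <action...>
ensure() {
  local desc=$1 check=$2
  shift 2
  if eval "$check" > /dev/null 2>&1; then
    echo "ok: $desc"
  else
    echo "applying: $desc"
    "$@"
  fi
}

# Demo with a throwaway file instead of a real package install.
marker=$(mktemp -u)
ensure 'marker file' "[ -f $marker ]" touch "$marker"   # applies
ensure 'marker file' "[ -f $marker ]" touch "$marker"   # second run is a no-op
```

In the real script the same shape would read something like `ensure 'package prometheus' 'dpkg -s prometheus' sudo apt install -y prometheus`, but for a single machine the explicit if-blocks above are arguably clearer.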

I don’t really need to template my systemd and nginx config files like I had been doing before. Sure, I may need to change them if I provision a new server, but simply cloning a repo gets me almost all of the way there.

I think it’s good enough. I will no longer toil excessively on simple infrastructure. Just git clone ... and ./install.sh.