In December, I did my first Advent of Code. I used Ansible, which is a terrible choice for programming challenges, but that’s part of the fun!
I gave myself these rules:
- No LLMs or search engines.
- No external commands (eg, `ansible.builtin.shell` etc).
- No custom plugins (eg, `my_filter.py`).
- No extra dependencies (eg, `json_query`).
- No Jinja blocks (eg, `{% … %}`).
People have used Ansible for Advent of Code before, but all of them used Jinja blocks. Jinja block loops are hundreds of times faster than Ansible loops, but I restricted myself to only Ansible loops for my solutions.
Ansible loops aren’t designed for iterating 10,000 times on a single host. So being forced to abuse them led me to discover two curious things:
1. Looping over `include_tasks` has a slow-down that scales quadratically with the number of iterations.
2. Appending to lists or dicts in a loop has O(N^2 × M) complexity for both time and memory (N = iterations, M = data size). This easily causes out-of-memory problems.
Explanations below!
## 1. Loops
Even looping just a single task is already slow. This takes 57 seconds:
```yaml
# Use ANSIBLE_STDOUT_CALLBACK=selective
# to avoid testing terminal print speed.
- hosts: localhost
  tasks:
    - name: 'set var to true 50k times'
      set_fact:
        foo: true
      loop: '{{ range(50000) | list }}'

# Executed in 57s.
```
To loop over several tasks, you’d use `include_tasks` to include another file:
```yaml
- hosts: localhost
  tasks:
    # Looping over an `include_tasks`.
    - include_tasks:
        file: other_tasks.yml
      loop: '{{ range(50000) | list }}'
```
If `other_tasks.yml` has the same single task as before, it runs 14x slower:
```yaml
# other_tasks.yml:
- set_fact:
    foo: true

# Executed in 14m13s.
```
There’s also a huge initialization overhead when looping `include_tasks`. In this case, Ansible spends 2 minutes just preparing to start the loop!
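You can see this initialization gap by printing a timestamp just before the loop and watching how long it takes for the first iteration to appear. A rough sketch using Ansible’s `now()` template function:

```yaml
- hosts: localhost
  tasks:
    - debug:
        msg: 'loop starting at {{ now() }}'

    # With 50k iterations, minutes pass here
    # before iteration 0 even begins.
    - include_tasks:
        file: other_tasks.yml
      loop: '{{ range(50000) | list }}'
```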
### Why?
Every loop iteration seems to copy and re-evaluate nearly everything (eg, config, vars, templates etc).
It’s even worse when looping over an `include_tasks`. Every iteration is treated as a separate include (ie, each with its own copies of everything), and even just starting the loop can take minutes.
The slow-down when looping `include_tasks` scales quadratically with the number of iterations:
- `include_tasks` gets deduplicated by searching a list for matches.
  - This is useful when running the same tasks on many hosts.
  - Permalink to upstream Ansible code.
- When looping `include_tasks`, every iteration is unique (due to loop vars) so can’t be deduped:
  - iteration 1: search list (0 items), no match so append
  - iteration 2: search list (1 item), no match so append
  - iteration N: search list (N-1 items), no match so append
  - total comparisons: 0 + 1 + ... + (N-1) = N(N-1)/2 = O(N^2). N=50k iterations means 1.25 billion failed comparisons!
### Mitigations
You can sometimes avoid looping over a single task by chaining filters. For example:
```yaml
# Fast
- set_fact:
    result: >-
      {{
        items
        | map('int')
        | select('gt', 0)
      }}

# Slow
- set_fact:
    result: >-
      {{
        result | default([]) + [item | int]
      }}
  loop: '{{ items }}'
  when: item | int > 0
```
When you need to loop over more than one task, it’s natural to reach for `include_tasks`. But in some cases you can avoid this by using one or more of the following (combined in the sketch after this list):

- `product()` or `combinations()`
- task-level `when:` and `vars:` (both are re-evaluated on each loop iteration)
- dynamic variable names (with the `vars` lookup)
- inline if expressions (eg, `{{ 0 if true else 1 }}`)
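For instance, here’s a minimal sketch (all the variable names are invented) that folds what would otherwise be a two-task `include_tasks` loop into a single task: `product()` generates the index pairs, a task-level `vars:` holds the intermediate values, and an inline if expression replaces the branch. (The append is fine here because the loop is tiny; section 2 explains why it doesn’t scale.)

```yaml
# Sketch: classify each (x, y) cell without include_tasks.
- set_fact:
    grid: '{{ grid | default([]) + [cell] }}'
  vars:
    x: '{{ item.0 }}'
    y: '{{ item.1 }}'
    # vars: are re-evaluated on every iteration.
    cell: "{{ 'wall' if ((x | int) + (y | int)) is odd else 'floor' }}"
  loop: '{{ range(3) | product(range(3)) | list }}'
```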
## 2. Appending to lists/dicts
Let’s say we want to append data to a list in a loop. For example, looping over every combination of inputs from an Advent of Code puzzle (which can sometimes mean 100k+ iterations) and storing the results of a calculation:
```yaml
- hosts: localhost
  tasks:
    - name: 'list of 100 integers'
      set_fact:
        my_list: '{{ range(100) | list }}'
        result: []

    - name: 'append 100 times'
      set_fact:
        result: '{{ result + [my_list] }}'
      loop: '{{ range(100) | list }}'
```
What happens if we increase the number of iterations?
| Iterations | Runtime | Peak RAM |
|---|---|---|
| 100 | 1 second | 10 MiB |
| 1,000 | 70 seconds | 1 GiB |
| 10,000 | 2 hours (killed) | >64 GiB |
Appending to create a list of only 1M integers (100 * 10000) exhausted all the RAM on my desktop! The same thing applies when appending to a dict in a loop with `combine()`.
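For reference, the dict version of the anti-pattern looks like this (a sketch; the `key_` prefix is arbitrary):

```yaml
# Grows with the same O(N^2 × M) cost as the list version.
- set_fact:
    result: "{{ result | default({}) | combine({'key_' ~ item: my_list}) }}"
  loop: '{{ range(100) | list }}'
```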
### Why?
Complexity is O(N^2 × M) (where N = iterations, M = data size) for both time and memory!
In Python, you can use `.append()` or `.extend()` instead, but not in Ansible. Using `+` or `combine()` in a loop is O(N^2), which is already bad, but the reality below is even worse.
This happens for two reasons:
1. Ansible’s finalization (ie, template result processing) recursively recreates nested lists/dicts.
   - Each inner `my_list` is recreated as a new list object.
   - This happens for all nested lists on every iteration, not just the new one!
2. Ansible stores every iteration’s result in a list until the loop finishes.
   - iteration 1: Ansible stores `result`, which has one copy of `my_list`
   - iteration 2: Ansible stores another `result`, which has two copies
     - that’s now 1 + 2 = 3 copies of `my_list`
   - after iteration N, Ansible’s internal store has 1 + 2 + ... + N = N(N+1)/2 copies of `my_list`
Relevant code from upstream Ansible.

From Ansible’s `_jinja_bits.py`:
```python
# L1024, recursively recreate nested lists:
def _finalize_list(o, mode):
    for v in o:
        if v is not Omit:
            yield _finalize_template_result(v, mode)
```
From Ansible’s `task_executor.py`:
```python
def _run_loop(self, items):
    results = []
    for item_index, item in enumerate(items):
        # L310, result includes ansible_facts:
        res = self._execute(variables=task_vars)
        # L366, append every iteration's result:
        results.append(res)
    return results
```
Another way to think about it is that the first reason prevents data sharing and the second reason prevents garbage collection.
The combined effect is catastrophic. After N iterations there are 1 + 2 + ... + N references to `my_list`, and each of those is a separate copy (rather than a shared reference):

- N=100 iterations = 5k copies of `my_list` in memory
- N=1,000 iterations = 500k copies
- N=10,000 iterations = 50 million copies!
Each iteration recreates the entire result so far as a new list object. So
not only does it take increasingly longer, but it also uses up increasingly
more RAM until you question your life choices.
### Mitigations
Don’t use Ansible for Advent of Code!
But if you must, a workaround is to append in batches to smaller lists and combine them at the end with a single expression:
- use `batch()`
- use dynamic variable names (with the `vars` lookup)
- combine in a single expression to avoid repeated list copying (shown below, followed by a sketch of the batching phase):
```yaml
# Combine batch_0, batch_1 ... batch_N
- set_fact:
    result: >-
      {{
        range(num_batches)
        | map('string')
        | map('regex_replace', '^', 'batch_')
        | map('extract', vars)
        | sum(start=[])
      }}
```
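The batching and appending phases aren’t shown above, so here’s a rough sketch of them (hypothetical names like `my_items`; it assumes `set_fact` accepts templated key names and uses the `vars` lookup’s `default` option):

```yaml
# Append each value to a small per-batch list (batch_0, batch_1, ...)
# chosen by floor-dividing the loop index by the batch size, so each
# `+` only ever copies a list of at most 100 items.
- set_fact:
    'batch_{{ i // 100 }}': >-
      {{ lookup('vars', 'batch_' ~ (i // 100), default=[]) + [item | int] }}
  loop: '{{ my_items }}'
  loop_control:
    index_var: i

- set_fact:
    num_batches: '{{ ((my_items | length) / 100) | round(0, "ceil") | int }}'
```

Each small list tops out at the batch size, so neither the appends nor Ansible’s per-iteration result store ever holds a full copy of the final result.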