Finding Relationships for Struggling IaC Templates

Does the thought of reliable, automated and repeatable infrastructure get you excited? A little bit flustered? Maybe even salivating at the mouth? We were all three after we found out about Azure Resource Manager (ARM) Templates, Microsoft’s implementation of Infrastructure as Code (IaC), which lets you do precisely that: treat infrastructure as code. You can use these templates to deploy anything from subnets to SQL Server 2014 Always On Availability Groups. We found that these templates didn’t do everything we were after, so we decided to start making our own.

The Situation

We were a few months into the development of ARM templates for our Azure Factory when we realised that some of our resources, such as VMs, VNETs, and Storage Accounts, in individual templates had more similarities than dissimilarities.  

The exact same section in two separate templates.

It felt silly defining the same resources multiple times across our library of templates, especially considering the very real possibility that a new feature could be introduced to a resource in the future, or an old feature could be made obsolete and removed. If either of these things happened, we’d have to scan through each template that implemented that resource and change the code.

Being good developers means we like to keep things DRY rather than WET (we leave the second one for the weekend), so we decided to take advantage of IaC and modularize some of these resources into their own templates. 

We were able to do this because a template can call other templates (stored in private Azure Storage or on GitHub) and deploy them. Microsoft calls these linked templates, but we just call them “child templates”.

Using a linked/child template.
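For anyone who hasn’t used linked templates before, here is a minimal sketch of the kind of resource a parent template uses to call a child, built as a PowerShell object and printed as JSON. The resource type `Microsoft.Resources/deployments` and the `templateLink` property are part of ARM; the deployment name, API version, variable name and parameter are illustrative assumptions rather than copies of our own templates.

```powershell
# A parent template calls a child through a nested deployment resource.
# Hypothetical values: 'vm-deployment', the '101-vm-md-url' variable and 'vmName'.
$linkedDeployment = [ordered]@{
    type       = 'Microsoft.Resources/deployments'
    apiVersion = '2017-05-10'   # assumed; use a version your subscription supports
    name       = 'vm-deployment'
    properties = [ordered]@{
        mode         = 'Incremental'
        templateLink = [ordered]@{
            # ARM expression, resolved at deployment time to the child's direct link
            uri            = "[variables('101-vm-md-url')]"
            contentVersion = '1.0.0.0'
        }
        parameters   = [ordered]@{
            vmName = @{ value = "[parameters('vmName')]" }
        }
    }
}

# -Depth must cover the nesting, otherwise ConvertTo-Json flattens the inner objects
$json = $linkedDeployment | ConvertTo-Json -Depth 5
$json
```

The only thing the parent knows about the child is the `uri`, which is why a template can see its direct children and nothing further down the chain.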

This was perfect for managing our library of templates since we only had to change a template in one location if it needed modification.

Pictured: Single member being used in three places

As an aside, the reality is that every infrastructure deployment is different, and we occasionally need to modify our base templates to accommodate the unique requirements of a project. However, these base templates work in 90% of cases or are easily adjusted to meet the requirements.

This linking means you can have a template that links to a template that links to a template that links to ANOTHER template. The problem with this is that a template only shows its direct child templates, not its grandchildren onwards, nor its parent templates (if it is called by any other templates at all). Therefore there’s no way to quickly and easily view its template hierarchy.

There are two problems caused by this. Firstly, you can’t analyze a template deployment unless you perform a manual search. With modularized templates (which were meant to be the “right thing to do”), complex deployments are very quickly going to make you feel like Alice in Wonderland chasing a rabbit named “templateLink”.

The second problem is that if you modify a template, especially one that’s close to the bottom of multiple complex infrastructure deployments, you’re going to have to propagate that change through all of its parents. This is similar to the previous problem, except you’re going to have to perform a manual search on all of your templates to find any that use the modified template at any point in their deployment. If you fail to propagate the change in a particularly lengthy deployment, it might not fail until an hour or so in, causing a big waste of time (but you were working on something else while waiting for those VMs to deploy, right?).

You could use a lot of post-it notes, strings, and thumb-tacks to manage your template relationships, but you’ll probably end up looking something like this guy:

If we’re being honest, we were looking a little bit like him until we found a fix for our relationship issues.

Designing the Solution

We needed a way to show all of our template relationships. While text would work fine for us developers, we needed to visualise it for the bosses – kind of like a picture book so they’d be able to understand.

One of our suspiciously fresh-faced managers (photo source)

As previously mentioned, each template only knows its direct children; not its parents (if any), nor its grandchildren onwards (if any). Thus, we were dealing with a collection of trees, known as a forest (strictly speaking a directed acyclic graph, since a template can be shared by several parents), and needed to implement a recursive solution.

On top of this, some of our templates were used by other templates multiple times. Since a template’s list of children never changes, we only needed to calculate its children once. Evaluating it again (or again and again) would give us the same results as the first time. Therefore, we could use that solution every time we encountered that template instead of redundantly recalculating it.

Can you see how this solution ends up being part of the solution for a larger template deployment? This is the perfect setup for dynamic programming, which often goes hand-in-hand with recursion.

We only need to evaluate “C” once when finding children for “A”. We can then use that solution when evaluating “Z” and “M” for children.

So we’ve got an idea of how we’re going to solve this problem, but how are we going to access and scan the templates in the first place?

We work on our templates using VSTS. Commits to our main branches trigger a build function that copies them to our Azure Storage Account. As well as demonstrating to our clients how we use Azure Storage, it also provides a secure way to access our templates.

Each file, or blob as they’re called in Azure, is accessible through a direct link, but only if a SAS token is appended to that link. This token is generated by us and kept private.


Example of a direct link.
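To make the shape of such a link concrete, here is a small sketch that builds one from its parts. The function name, storage account, container and token values are all placeholders we made up for illustration; the only real piece is the `blob.core.windows.net` endpoint format used by the public Azure cloud.

```powershell
# Builds a blob's direct link and appends a SAS token to it.
# Get-TemplateDirectLink is a hypothetical helper, not an Azure cmdlet.
function Get-TemplateDirectLink {
    param(
        [Parameter(Mandatory)] [string] $StorageAccountName,
        [Parameter(Mandatory)] [string] $ContainerName,
        [Parameter(Mandatory)] [string] $BlobName,
        [Parameter(Mandatory)] [string] $SasToken
    )

    # Blob endpoint format for the public Azure cloud
    $blobUrl = 'https://{0}.blob.core.windows.net/{1}/{2}' -f $StorageAccountName, $ContainerName, $BlobName

    # A SAS token is a query string; accept it with or without the leading '?'
    return $blobUrl + '?' + $SasToken.TrimStart('?')
}

# Placeholder values only
$linkParameters = @{
    StorageAccountName = 'examplestorage'
    ContainerName      = 'templates'
    BlobName           = '101-vm-md.json'
    SasToken           = '?sv=PLACEHOLDER&sig=PLACEHOLDER'
}
Get-TemplateDirectLink @linkParameters
```

Without the query string on the end, a request to the same URL is rejected, which is what keeps the templates private.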

These direct links are used by parent templates to call child templates. These links could also be used for API calls, such as in our Azure Factory.

We could create a list of direct links for all of the templates and iterate over that to find their children. But what if we added or removed a template, or changed the SAS token? We’d have to make the changes manually (lots of room for human error) or automate the process somehow.

As it turns out, we can avoid maintaining this list of links entirely by directly iterating over the Storage Account using the Azure PowerShell module. We don’t have to worry about changes to templates affecting our results since we’re always working with the most up-to-date files.

######################################################

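# Sets up access to blobs in the storage account
# ($storageAccountName, $storageAccountKey and $containerName are defined elsewhere)
$storageAccountContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey

# Lists the containers in the storage account
$storageAccountContainer = Get-AzureStorageContainer -Context $storageAccountContext

# Lists every blob (template file) in the chosen container
$blobs = Get-AzureStorageBlob -Container $containerName -Context $storageAccountContext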

######################################################

Although we’re always working with the most up-to-date files, how do we know if we’ve still got the most up-to-date results? Changes to our templates will get uploaded to our Storage Account, but the template relationships won’t be updated until we run our “find all relationships” method.

This is where Azure Functions steps in.

Put succinctly, these allow you to run scripts in the cloud that are triggered by a timer (run every five/ten/sixty minutes), a GET request to a URL, a blob being uploaded or updated in Azure Storage, and more.

Having our script triggered whenever we update our Storage Account means we’ve always got the latest template relationships.

There was one last thing to do to get our templates set up for hierarchical success: we created variables for any external resources a template called. These could be child templates (what we were interested in) or .zip files containing DSC configurations for VMs.

These variable names were standardized by following this pattern: <template-name>-url. For example, “101-vm-md-url” or “201-sql-server-cluster-url”. The aforementioned direct links populate them.

Following this naming pattern made it easy to find child template URL variables when parsing the templates.

Keeping all of our templates’ URL variables in one place.
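As a rough sketch of how that parsing can look, the snippet below reads a cut-down template and picks out the child templates from its variables. The function name and the URLs are placeholders, and it assumes the variables hold literal URLs; a real template may build them with ARM expressions, in which case the `.json` check would need adjusting.

```powershell
# A cut-down template whose variable names follow the "<template-name>-url" pattern.
# All URLs are placeholders.
$templateJson = @'
{
  "variables": {
    "vmSize": "Standard_DS2_v2",
    "101-vm-md-url": "https://examplestorage.blob.core.windows.net/templates/101-vm-md.json?sv=PLACEHOLDER",
    "101-vnet-url": "https://examplestorage.blob.core.windows.net/templates/101-vnet.json?sv=PLACEHOLDER",
    "dsc-config-url": "https://examplestorage.blob.core.windows.net/dsc/config.zip?sv=PLACEHOLDER"
  }
}
'@

# Hypothetical helper: returns the names of the child templates a template links to
function Get-ChildTemplateName {
    param([Parameter(Mandatory)] [string] $TemplateJson)

    $template = $TemplateJson | ConvertFrom-Json
    if (-not $template.variables) { return }

    foreach ($variable in $template.variables.PSObject.Properties) {
        # Only variables that follow the naming pattern point at external resources
        if ($variable.Name -notlike '*-url') { continue }

        # Drop the SAS token, then keep only links that point at a template (.json)
        $path = ([string]$variable.Value).Split('?')[0]
        if ($path -like '*.json') {
            # Emit the variable name without its "-url" suffix, i.e. the template name
            $variable.Name -replace '-url$', ''
        }
    }
}

Get-ChildTemplateName -TemplateJson $templateJson
```

Here the two template links are returned and the DSC .zip is skipped, even though all three variables follow the same naming pattern.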

Using the variables section to store the URLs, and any other data when possible, is a good practice anyway, so you’ve got a one-stop-shop for string management. Problems arise when you hardcode a value in two or more places in the template and then change one of them and not the others. It’s akin to using magic numbers.

The (Pseudo)Code

The code is written in PowerShell in an Azure Function. A hash table is created to store each template and its direct children.

The code scans through our entire Storage Account to find template files. If a template is found, it’s checked to see if it already exists in the hash table. If it doesn’t, it’s fed into our recursive function to see if it has any child templates. If it does exist in the hash table, it means that it has already been evaluated for children in our recursive function and doesn’t need to be rechecked (dynamic programming).

In the recursive function, the template is downloaded using its direct link and parsed to see whether there are any child templates. If a child template is found, it is added to the hash table as a value of the parent template, e.g., parent: [child_1]. The child template is then checked to see whether it has its own entry in the hash table, for the same reasons as the templates in the previous paragraph. If it already exists in the hash table, leave it (dynamic programming); but if it doesn’t, that means it’s yet to be evaluated and so is fed into the recursive function to check for its children.

The terminating condition of the recursive function is either having a template with children that have all already been evaluated (meaning we don’t run the recursive function on anything), or a template that has no children.

Our function is a depth-first search of a tree combined with dynamic programming.

As always, trying to describe code like this makes about as much sense as putting your shoes on before your socks, so let’s look at some pseudocode:

######################################################

create hash table

scan through directory of templates
    if (template found)
        if (template not already in hash table)
            run recursive function on template

// recursive function:
add template to hash table as a key
parse template for child templates
for each (child template found)
    add to parent template’s list of child templates in the hash table
    if (child template not already in hash table)
        run recursive function on child template

######################################################
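The pseudocode above can be turned into a short runnable sketch. This is not the script from our Azure Function: the download-and-parse step is replaced by a hypothetical in-memory lookup table (`$directChildren`), and the template names A, B, C, D, M and Z are made up, so that the recursion and the "evaluate once" behaviour can be seen on their own.

```powershell
# Stand-in for "download the template and parse it for child templates".
# Hypothetical data: A, Z and M all use C.
$directChildren = @{
    'A' = @('B', 'C')
    'Z' = @('C')
    'M' = @('C', 'D')
    'B' = @()
    'C' = @('D')
    'D' = @()
}

$relationships = @{}   # template name -> list of direct children
$parseCount    = @{}   # how often each template was parsed (for illustration only)

function Find-ChildTemplate {
    param([Parameter(Mandatory)] [string] $TemplateName)

    # Adding the key first marks the template as evaluated
    $relationships[$TemplateName] = @()
    $parseCount[$TemplateName] = 1 + [int]$parseCount[$TemplateName]

    foreach ($child in $directChildren[$TemplateName]) {
        $relationships[$TemplateName] += $child

        # Only recurse into children that have not been evaluated yet
        if (-not $relationships.ContainsKey($child)) {
            Find-ChildTemplate -TemplateName $child
        }
    }
}

# Scan every template; skip those already reached through a parent
foreach ($templateName in ($directChildren.Keys | Sort-Object)) {
    if (-not $relationships.ContainsKey($templateName)) {
        Find-ChildTemplate -TemplateName $templateName
    }
}

# Print each template with its direct children
$relationships.GetEnumerator() | Sort-Object Key | ForEach-Object {
    '{0}: [{1}]' -f $_.Key, ($_.Value -join ', ')
}
```

Every value in `$parseCount` ends up as 1, even though C is referenced by three parents. Because the key is added before recursing, an accidental circular link would also terminate instead of looping forever.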

When this script completes, we’ll have a hash table of all of our templates and their direct children. Here’s one template and its direct children:

 

Displaying the Data

From here, it’s straightforward to manipulate the hash table into whatever form we want. In particular, we wanted to visualise the direct parent-child relations. We decided to use Azure Tables (a key-value NoSQL database) for storing the data, in combination with Power BI for displaying it.

To show the relationships in Power BI, we needed to establish every parent-child relationship individually. This meant converting the hash table seen just above into one row per parent-child pair for the Azure Table:
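A minimal sketch of that conversion is below. Azure Table entities need a `PartitionKey` and a `RowKey`; using the parent as the former and the child as the latter is an assumption made for this example, and the relationships in the hash table are invented for illustration.

```powershell
# Hypothetical hash table in the shape produced by the relationship scan
$relationships = @{
    '201-sql-server-cluster' = @('101-vm-md', '101-vnet')
    '101-vm-md'              = @('101-storage-account')
    '101-vnet'               = @()
    '101-storage-account'    = @()
}

# One row per parent-child pair; templates without children produce no rows
$rows = foreach ($parent in ($relationships.Keys | Sort-Object)) {
    foreach ($child in $relationships[$parent]) {
        [pscustomobject]@{
            PartitionKey = $parent   # assumed key layout for the table
            RowKey       = $child
        }
    }
}

$rows | Format-Table -AutoSize
```

Each row is one edge of the graph, which is the shape a network-style visual expects as its source and target columns.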

We defined our Azure Table as a data source for Power BI, created a Network Navigator visualization, and voilà!

And if we want to show the children of children as well:


We finally did it!

It took a surprising amount of work to implement this, especially since the problem itself seems so innocuous: show us how the templates are related. We eventually managed to do it using a full Microsoft/Azure stack, which included:

  • ARM Templates – our IaC to deploy resources in the Azure cloud
  • Azure Storage Account – for saving our templates
  • Azure Functions – where we stored our script, triggered whenever we updated the Storage Account
  • Azure Tables – to store the direct parent-child relations as a data source for Power BI
  • Power BI – for visually displaying the relationships

However, the Power BI visualization definitely doesn’t make the hierarchy of this template deployment clear (it starts with the centre node “201-Palo-Alto-VM-new-Vnet”). It probably would’ve been more appropriate to use something like the DOT graph description language for this instead of Power BI, but we wanted to demonstrate an end-to-end Microsoft/Azure solution. It’s easy enough to put it into DOT format anyway since we’ve already got a hash table with all the direct parent-child relationships.
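As a sketch of how little is needed, the function below walks a hash table of direct children and emits a DOT digraph. `ConvertTo-Dot` is a hypothetical helper name, and the children listed under “201-Palo-Alto-VM-new-Vnet” are invented for the example rather than taken from the real deployment.

```powershell
# Hypothetical helper: turns the parent -> children hash table into DOT text
function ConvertTo-Dot {
    param([Parameter(Mandatory)] [hashtable] $Relationships)

    $lines = @('digraph templates {')
    foreach ($parent in ($Relationships.Keys | Sort-Object)) {
        # Declare every template as a node so childless ones still appear
        $lines += '    "{0}";' -f $parent
        foreach ($child in $Relationships[$parent]) {
            # One directed edge per direct parent-child relationship
            $lines += '    "{0}" -> "{1}";' -f $parent, $child
        }
    }
    $lines += '}'
    return $lines -join "`n"
}

# Invented relationships, for illustration only
$relationships = @{
    '201-Palo-Alto-VM-new-Vnet' = @('101-vnet', '101-vm-md')
    '101-vm-md'                 = @('101-storage-account')
    '101-vnet'                  = @()
    '101-storage-account'       = @()
}

ConvertTo-Dot -Relationships $relationships
```

Saving the output to a file and running it through Graphviz (for example `dot -Tpng templates.dot -o templates.png`) gives a layered diagram in which the direction of every link is visible.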

 

 

 

 

 
