An Open View

Practical JQ: creating Route53 hostnames from AWS tags

In some circumstances, you might want to create hostnames based on available machine meta‑data. While this is clearly not the ideal case, this article reviews an excellent example of available tooling.


The plan...

We will show you how to pull AWS tags, manipulate the value of one of the tags, and publish the result as a hostname. This assumes that you have an AWS account and credentials, and that you have configured the aws‑cli with aws configure.

JQ, what's that?

JQ is a command line JSON parsing and query engine. It might be the closest thing to a Swiss Army Knife for JSON format data manipulation. Since it seems that every API and interface created uses JSON as its data interchange format, knowledge of such a tool goes a long way. If you plan on using shell logic, text tools like sed, awk and grep tend to be awkward and error prone for JSON-formatted text; you need a tool like jq.

...and Route53?

Route53 is the AWS DNS service. It lets machines be located by name, and it takes input that would be familiar to anyone who has used BIND in the past. While it can do significantly more than just name resolution, we are simply adding basic records here.

A note on tool versions

We are using jq version 1.5 in this example. This is needed because the recipe relies on the regular expression functions (such as sub), which were added in jq 1.5. If you don't need to reformat the content of the AWS tag (string munging), then the rest of the recipe should work just fine with earlier versions. Use jq --version to check what you are running.

Our AWS cli version is aws-cli/1.8.8 Python/2.7.10 Darwin/17.3.0, similarly available via aws --version.

Getting the AWS data

In order to fetch the tags and instance metadata, we use the following command:

aws ec2 describe-instances --filters Name=tag:Name,Values=ANSLAB\* Name=instance-state-name,Values=running

Notice that we filter the instances both by their state (running) and by the associated tag prefix. We could filter both of those in the upcoming jq expression, but this seems like a reasonable way to scope the problem and make the jq expression a bit easier to write.
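As a rough sketch of the alternative mentioned above, the same scoping could be done inside jq instead of with --filters. The sample document here is made up for illustration and carries only the fields the filter touches; real describe-instances output has many more.

```shell
# Hypothetical, heavily trimmed describe-instances output (illustration only).
sample='{"Reservations":[{"Instances":[
  {"State":{"Name":"running"},"Tags":[{"Key":"Name","Value":"ANSLAB-student02-node2"}]},
  {"State":{"Name":"stopped"},"Tags":[{"Key":"Name","Value":"ANSLAB-student03-node1"}]},
  {"State":{"Name":"running"},"Tags":[{"Key":"Name","Value":"webserver"}]}
]}]}'

# Keep only running instances, then only Name tags that start with ANSLAB.
names=$(printf '%s' "$sample" | jq -r '
  .Reservations[].Instances[]
  | select(.State.Name == "running")
  | .Tags[]
  | select(.Key == "Name" and (.Value | startswith("ANSLAB")))
  | .Value')

echo "$names"
```

Filtering on the aws‑cli side keeps the transfer small, while filtering in jq keeps all of the logic in one place; which is preferable likely depends on how many instances the account holds.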

Input for processing

This is a significantly truncated version of the records returned. This shows the basic schema of the records we are trying to process:

{
    "Reservations": [
        {
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    },
                    "PublicDnsName": "",
                    "State": {
                        "Code": 16,
                        "Name": "running"
                    },
                    "EbsOptimized": false,
                    "Tags": [
                        {
                            "Value": "ANSLAB-student02-node02",
                            "Key": "Name"
                        }
                    ]
                }
                ... more instances ...
            ],
            "ReservationId": "r-00000000000000000",
            "Groups": [],
            "OwnerId": "555555555555"
        }
        ... more reservations ...
    ]
}


Expectations for output

Our goal is to transform this input into a description of the DNS resource records we want. We will then pass this on to another aws‑cli command in order to publish the names into our DNS zone. We are assuming that you have created a Route53 hosted zone already, and have a zone to publish to. Even if you don't have one, creating the records will show the capability.

In order to view the basic Route53 request schema, you can run aws route53 change-resource-record-sets --generate-cli-skeleton, though the skeleton includes more fields than basic records need. What we want to generate is similar to this:

{
  "Comment": "A comment for our job",
  "Changes": [
    {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "",
        "Type": "CNAME",
        "TTL": 600,
        "ResourceRecords": [
          {
            "Value": ""
          }
        ]
      }
    }
    ... as many records as we need to add ...
  ]
}

If you plan on updating existing records, you can change the action to "UPSERT". If you plan to clean up, you could set it to "DELETE". This is the basic boilerplate for adding our DNS entries.
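If you expect to switch between actions, one option is to pass the action in from the shell rather than editing the filter. The following is only a sketch: the record name and target are placeholder values, and the variable name action is our own choice.

```shell
# Choose the Route53 action outside of the jq program.
action="UPSERT"

# --arg exposes the shell value to jq as the string variable $action.
# The Name and Value below are placeholders, not real hosts.
change=$(jq -n -c --arg action "$action" '
  { Action: $action,
    ResourceRecordSet: {
      Name: "node1.s01.example.com.",
      Type: "CNAME",
      TTL: 600,
      ResourceRecords: [ { Value: "ec2-host.example.com." } ] } }')

echo "$change"
```

Setting action to CREATE or DELETE would produce the other two variants from the same template.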

The recipe

JQ query constructs used

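We'll need to make use of the following syntax elements: object construction ({ }), array construction ([ ]), the iterator (.[]), the pipe (|), the comma operator, select, sub with named capture groups, and string interpolation (\( )).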

Converting the tags

Now that we have our input data, the second part of the pipeline is the filter command. I'll show you the whole command and then we can break it into pieces:

jq -r ' { Comment: "add nice names for instances", Changes: [ .Reservations[].Instances[]| [ .PublicDnsName + ".", (.Tags[]|select(.Key=="Name").Value|sub("ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)";"\(.nodetype).s\(.studentnum)")) ] | { Action: "CREATE", ResourceRecordSet: { Name: .[1], Type: "CNAME", TTL: 600, ResourceRecords: [ { Value: .[0] } ] } } ] } '

Wow, that's a big bite, huh? The thing to realize is that query, iteration, and output construction are all wrapped into one, so there are really layers from the inside out. It's not as bad as it looks, so let's take this a piece at a time.

Extracting the tags

Let's start with the heart of the matter, getting the tags out of the AWS list of instances:

.Reservations[].Instances[] | [ .PublicDnsName + ".", (.Tags[] | select(.Key=="Name").Value | sub( ... )) ]

This statement looks into the structure and emits each instance in turn. Each instance is then passed through the first pipe symbol to the two-part expression in square brackets. The comma means that both parts are evaluated against the same instance record (a tee split), and their results are emitted one after the other.
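To see the comma at work in isolation, here is a small sketch run against a single made-up instance record. The substitution step is left out so that only the split and the collection into an array are visible.

```shell
# One hypothetical instance record with just the two fields of interest.
instance='{"PublicDnsName":"ec2-192-0-2-10.compute-1.amazonaws.com","Tags":[{"Key":"Name","Value":"ANSLAB-student02-node2"}]}'

# Both sides of the comma read the same input record;
# the surrounding [ ] gathers their outputs into one array.
pair=$(printf '%s' "$instance" | jq -c '[ .PublicDnsName + ".", (.Tags[] | select(.Key == "Name").Value) ]')

echo "$pair"
# prints ["ec2-192-0-2-10.compute-1.amazonaws.com.","ANSLAB-student02-node2"]
```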

Formatting the tags

Let's look at the next expression:

  1. .PublicDnsName + "."
  2. ,
  3. (.Tags[]|select(.Key=="Name").Value

The first expression is simple: it returns the concatenation of the existing DNS name with a period. This is how we create a fully qualified DNS name, meaning that it should never have anything added to it during a DNS search.

The second expression grabs the Tags section of the instance record, then uses select to pull out only the key/value pair whose key is "Name". This is the tag that is recognized by the AWS console, so it's rather common to use. In our case we don't need to worry about it not being set, since the filter on the aws‑cli command only returns records where this tag is present (there will never be a null value here).
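A short sketch of what select does with a tag list, using invented tags. It also shows what would happen without the aws‑cli filter: when no Name tag exists, select produces no output at all rather than a null.

```shell
# Hypothetical tag list with more than one tag.
tags='{"Tags":[{"Key":"Owner","Value":"ops"},{"Key":"Name","Value":"ANSLAB-student02-node2"}]}'

# Only the entry whose Key is "Name" survives the select.
name=$(printf '%s' "$tags" | jq -r '.Tags[] | select(.Key == "Name").Value')
echo "$name"

# Same query against a record that has no Name tag: nothing is emitted.
missing=$(printf '%s' '{"Tags":[{"Key":"Owner","Value":"ops"}]}' | jq -r '.Tags[] | select(.Key == "Name").Value')
echo "missing=[$missing]"
```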

Munging the name

The result of the Name tag is piped into the part that converts the string into a hostname:

|sub("ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)"; "\(.nodetype).s\(.studentnum)")) ]

The sub function takes two arguments: a regular expression to match and a replacement string. Only the part of the input that matches is replaced, which in this case allows us to grab sections of the tag and use them in the replacement result.

  1. sub(
  2. "ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)";
  3. "\(.nodetype).s\(.studentnum)"
  4. )

Happily, the regular expression library used for this supports descriptive names for the captured fields. These are known as named capture groups in the domain of regular expressions, and the names assigned here are only available inside of the sub expression. We are grabbing the student number and the machine type and placing them into the new DNS name.
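The substitution can be tried on its own, without any AWS data. This sketch feeds a single made-up tag value through the same pattern, and then uses jq's capture function to show the named groups as an object. The shell variable names tag and pattern are our own.

```shell
# A made-up tag value and the pattern from the recipe.
tag="ANSLAB-student02-node2"
pattern='ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)'

# sub: inside the replacement string, the named groups are available as fields.
host=$(jq -n -r --arg tag "$tag" --arg re "$pattern" \
  '$tag | sub($re; "\(.nodetype).s\(.studentnum)")')
echo "$host"
# prints node2.s02

# capture: returns the named groups as an object, handy for debugging a pattern.
fields=$(jq -n -c --arg tag "$tag" --arg re "$pattern" '$tag | capture($re)')
echo "$fields"
```

One thing to watch: node[0-9] matches a single digit and the pattern is not anchored, so a tag such as ANSLAB-student02-node02 would come out as node0.s022, with the unmatched final digit left in place. Widening the class to node[0-9]+ is one possible adjustment if your tags use two-digit node numbers.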

Package and pass

Everything is placed into a two-item list by the surrounding square brackets ([ ]), which we will then pass to the next stage. This is done to simplify the expression.

Building the output

Building output with jq is simply a matter of using curly brace and square bracket expressions, along with literal "Key": "value" pairs, to represent the output. As an example, the following jq expression creates the dictionary structure exactly as written:

{ "dict1": "one", "dict2": [ "a", "b" ] }

In our query, we use the following expression:

{ Comment: "add nice names for instances", Changes:
   [ <get tags> | <make two element array> |
     { Action: "CREATE",
       ResourceRecordSet:
        { Name: .[1],
          Type: "CNAME",
          TTL: 600,
          ResourceRecords: [
               { Value: .[0] }
          ]
        }
     }
   ]
}

This shows the "engine" in the center and how it is wrapped with the output logic. The .[0] and .[1] refer to the array we created in the previous steps, meaning the fully qualified name and the munged DNS name respectively.

The expanded spacing makes it easier to read and recognize, but all of the spacing is irrelevant. Like JSON, jq does not require any particular formatting.
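Putting the pieces together, the whole filter can be exercised offline against a small sample before any AWS call is involved. The two instance records below are invented, with single-digit node numbers so that they match the pattern fully; the filter itself is the one from this article, only spread across lines.

```shell
# Hypothetical describe-instances output reduced to the fields the filter reads.
sample='{"Reservations":[{"Instances":[
  {"PublicDnsName":"ec2-192-0-2-10.compute-1.amazonaws.com",
   "Tags":[{"Key":"Name","Value":"ANSLAB-student02-node2"}]},
  {"PublicDnsName":"ec2-192-0-2-11.compute-1.amazonaws.com",
   "Tags":[{"Key":"Name","Value":"ANSLAB-student02-main"}]}
]}]}'

# Build the change batch: one CREATE entry per instance.
batch=$(printf '%s' "$sample" | jq '
  { Comment: "add nice names for instances",
    Changes: [
      .Reservations[].Instances[]
      | [ .PublicDnsName + ".",
          (.Tags[] | select(.Key == "Name").Value
           | sub("ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)";
                 "\(.nodetype).s\(.studentnum)")) ]
      | { Action: "CREATE",
          ResourceRecordSet: {
            Name: .[1], Type: "CNAME", TTL: 600,
            ResourceRecords: [ { Value: .[0] } ] } }
    ] }')

# Summarize each record as "name -> target" for a quick visual check.
printf '%s\n' "$batch" | jq -r '.Changes[].ResourceRecordSet | "\(.Name) -> \(.ResourceRecords[0].Value)"'
```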

Publishing the hostnames

Now that we have our records in a workable format, all that is needed is to put them into our zone. For that we turn back to our aws‑cli:

aws route53 change-resource-record-sets --hosted-zone-id ZZZZZZZZZZZZZ --change-batch file://inputfile

Once I feel confident in the process, I like to combine the steps into one pipeline. I pass the filename on the DNS update command as --change-batch=file:///dev/stdin which allows me to run the entire process as one shell pipeline.

...and here it is:

aws ec2 describe-instances --filters Name=tag:Name,Values=ANSLAB\* Name=instance-state-name,Values=running |jq -r ' { Comment: "add nice names for instances", Changes: [ .Reservations[].Instances[]| [ .PublicDnsName + ".", (.Tags[]|select(.Key=="Name").Value|sub("ANSLAB-student(?<studentnum>[0-9][0-9])-(?<nodetype>node[0-9]|main)";"\(.nodetype).s\(.studentnum)")) ] | { Action: "CREATE", ResourceRecordSet: { Name: .[1], Type: "CNAME", TTL: 600, ResourceRecords: [ { Value: .[0] } ] } } ] } ' | aws route53  change-resource-record-sets  --hosted-zone-id ZZZZZZZZZZZZZ --change-batch=file:///dev/stdin
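Because a one-line pipeline publishes whatever jq produces, it may be worth a sanity check on the batch first. The sketch below is one possible guard, not part of the original recipe: it uses a stand-in batch, and the two conditions (at least one change, and no record pointing at a bare "." from an empty PublicDnsName) are assumptions about what counts as safe.

```shell
# Stand-in for the jq output; in practice this would come from the pipeline.
batch='{"Comment":"add nice names for instances","Changes":[{"Action":"CREATE","ResourceRecordSet":{"Name":"node2.s02","Type":"CNAME","TTL":600,"ResourceRecords":[{"Value":"ec2-192-0-2-10.compute-1.amazonaws.com."}]}}]}'

# jq -e sets the exit status from the last output: false or null gives non-zero.
if printf '%s' "$batch" | jq -e '
     (.Changes | length > 0)
     and all(.Changes[];
             .ResourceRecordSet.Name != ""
             and .ResourceRecordSet.ResourceRecords[0].Value != ".")' >/dev/null
then
  verdict="ok to publish"
else
  verdict="refusing to publish"
fi

echo "$verdict"
```

Only when the verdict is positive would the batch be handed to aws route53 change-resource-record-sets.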


Eric Noriega

Eric Noriega is a Senior consultant with Vizuri. With over 20 years of experience with Unix, Linux, and operations, he currently focuses on DevOps, Virtual and Cloud based automation and orchestration, and containerized workloads.