The core of any business is its underlying infrastructure. Over the years, the infrastructure landscape has evolved at a significant pace. Initially, there were traditional data centres. Then came data centre co-location services. These days, Infrastructure-as-a-Service and cloud platforms are more popular among businesses. Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP) have changed how we run infrastructure. Hence, most businesses are migrating their workloads from on-premises data centres to various cloud platforms. However, some businesses adopt a hybrid approach where applications are deployed across on-premises data centres and (multiple) cloud platforms.
To deploy and manage these complex infrastructure configurations, businesses use Infrastructure-as-Code (IaC) and automation. When you use IaC to create hundreds of resources in cloud platforms such as AWS or GCP, how do you know that the code deploys those resources as expected? Do these resources adhere to compliance and security requirements? In short, we don't know for sure, which is why a proper infrastructure testing procedure is required.
Purpose of Infrastructure testing and compliance
In the last 5-10 years, with the growth of cloud platforms such as AWS, Azure and GCP, engineers have adopted automation for their infrastructure deployments. Automation empowered engineers to deploy infrastructure to cloud platforms rapidly and consistently, reducing infrastructure deployment time from months to hours, but it did not by itself provide methods to ensure the quality, reliability, and compliance of those deployments.
If we look at some of the IaC tools available, they all come with some form of testing before deploying infrastructure into the platform.
Before jumping into infrastructure testing, let's take a few minutes to explore IaC.
What is Infrastructure as Code (IaC)?
In simple terms, Infrastructure-as-Code, as defined by ThoughtWorks:
“means writing code (which can be done using a high level language or any descriptive language) to manage configurations and automate provisioning of infrastructure in addition to deployments.”
IaC has transformed infrastructure provisioning from a point-and-click method to dynamic code, which enables infrastructure teams to manage their infrastructure the same way developers manage their source code, making provisioning efficient, cost-effective, and secure.
Below are some popular IaC tools, as well as their feature set:
|Type||Orchestration||Config Management||Config Management||Config Management||Config Management|
|Client Only||Client/Server||Client/Server||Client Only|
Lifecycle (state) Management
Most organisations use one or more IaC tools from the above list. If you analyse these tools, it is evident that each has strengths and weaknesses which suit each organisation differently. We have tried most of these tools and prefer Terraform due to its ease of use in multi-cloud deployments. We also heavily use Ansible for configuration management. As a result, the rest of this post will focus on Terraform, although the same concepts can be applied to the majority of IaC tools.
Here is a common scenario: you write and deploy a three-tier architecture into AWS using Terraform. Meanwhile, someone from another team changes a security group rule from the AWS console. When you run terraform plan again, it will detect the change to the security group rule. One purpose of the plan is to compare the current state of the infrastructure, which Terraform refreshes from the provider and records in its state file, with the desired state you have defined in your code. The plan then shows how each resource deviates from its desired state. During terraform apply, Terraform reverts the rogue change, aligning the AWS infrastructure with the Terraform configuration.
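The comparison in this scenario can be pictured with a small sketch. It is only an illustration written in Python, not how Terraform is implemented: the rule tuples are made-up sample data and diff_rules is a hypothetical helper.

```python
# Illustrative only: ingress rules are modelled as (protocol, port, cidr) tuples.
# The rule values and the diff_rules helper are hypothetical.

def diff_rules(desired, actual):
    """Return the rules to add and to remove so that actual matches desired."""
    desired, actual = set(desired), set(actual)
    return {
        "to_add": sorted(desired - actual),
        "to_remove": sorted(actual - desired),
    }

# Desired state, as it would be declared in code.
desired = [("tcp", 443, "0.0.0.0/0"), ("tcp", 80, "0.0.0.0/0")]
# Actual state after someone added a rule by hand in the console.
actual = desired + [("tcp", 22, "0.0.0.0/0")]

drift = diff_rules(desired, actual)
print(drift)
```

The hand-added rule ends up under to_remove, which mirrors the plan proposing to delete it on the next apply.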
Is it enough?
Terraform compares what is recorded in its state with the real infrastructure and reports any difference as drift. However, Terraform can only audit the resources it knows about. Enter InSpec.
Chef InSpec is a "free and open-source framework for testing and auditing your applications and infrastructure" according to its website. "Chef InSpec works by comparing the actual state of your system with the desired state that you express in easy-to-read and easy-to-write Chef InSpec code. Chef InSpec detects violations and displays findings in the form of a report, but puts you in control of remediation."
According to its developers, some of the crucial capabilities of Chef InSpec are:
- Make test language and profiles available to all teams dealing with infrastructure
- Easy to use, easy to see results
- Combine a testing language with native support for test collections (aka profiles)
- Built-in but optional control metadata
- Run any test locally and remotely
- Enable easy integration into any CI/CD pipeline tool
Terraform and InSpec
Terraform can enable Amazon Virtual Private Cloud (VPC) flow logs for a particular VPC if you have defined them in your desired state. However, Terraform cannot make the existence of flow logs mandatory. In comparison, InSpec can define controls which explicitly check that VPC flow logs are enabled for a given list of VPC IDs. InSpec can also perform other checks, e.g. ensure there are no security group rules which allow port 22 access from the public internet. Terraform, for its part, only creates security group rules as defined in the desired state, or removes/updates rules if the known state doesn’t match the desired state.
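To make the idea of a mandatory check concrete, here is a minimal sketch in Python of the logic such a control expresses. It is not InSpec code: the VPC IDs and flow-log records are invented sample data, and vpcs_missing_flow_logs is a hypothetical helper.

```python
# Illustrative only: the VPC IDs and flow-log records are made-up sample data,
# and vpcs_missing_flow_logs is a hypothetical helper.

def vpcs_missing_flow_logs(required_vpc_ids, flow_logs):
    """Return the required VPC IDs that have no active flow log attached."""
    covered = {
        log["resource_id"]
        for log in flow_logs
        if log["status"] == "ACTIVE"
    }
    return sorted(set(required_vpc_ids) - covered)

required = ["vpc-aaa111", "vpc-bbb222"]
flow_logs = [{"resource_id": "vpc-aaa111", "status": "ACTIVE"}]

missing = vpcs_missing_flow_logs(required, flow_logs)
print(missing)
```

The check fails for any VPC in the list that lacks a flow log, regardless of whether that VPC was created by Terraform.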
The table below compares a few Terraform and InSpec capabilities from a compliance perspective.
- Type of instance requested
- Verify VPC has flow log enabled
- Instance has tag ‘ENV = PROD’
- Security groups are correct
- No security group allows port 22 from ‘0.0.0.0/0’
- Root IAM has password policy and MFA enabled
- Ensure public Load balancer SG allows port 80 and 443 from ‘0.0.0.0/0’
InSpec/Terraform Usage Examples
The following section explains how to get InSpec and Terraform running, and how to perform some of the checks mentioned above.
A detailed description of the InSpec installation can be found here.
However, I prefer to use a Docker container to run InSpec commands, mainly because I then don't have to worry about installing various dependencies on my laptop.
I have created a simple Dockerfile based on the official chef/inspec image, which you can find here.
git clone https://github.com/akilada/inspec-docker.git
cd inspec-docker
docker build -t inspec:4.16.0 .
docker run -t -d -v ~/Documents/inspec:/share inspec:4.16.0
docker exec -it <container_id> /bin/sh
Note: the -v option mounts the local folder containing the InSpec profile into the container (at /share in the command above).
Recently I had to set up an AWS ECS cluster to deploy a SonarQube container. This environment consists of a VPC, subnets, security groups, a NAT gateway, an internet gateway, an ALB, RDS, ECS and EC2. Since it covers most of the commonly used services, I decided to use it to try out InSpec.
I created a repository which includes the Terraform code to provision and configure the aforementioned components. The steps below clone the repository, then initialise and apply the Terraform configuration. It uses Terragrunt, a thin wrapper for Terraform "that provides extra tools for keeping your Terraform configurations DRY, working with multiple Terraform modules, and managing remote state."
git clone https://github.com/akilada/terraform-aws-ecs.git
cd terraform-aws-ecs
export AWS_PROFILE=test
terragrunt init
terragrunt apply
InSpec is capable of using Terraform outputs as variables. In Terraform, many attributes are computed during terraform apply, for example the load balancer DNS address, RDS endpoint address, EC2 IP addresses, security group IDs and subnet IDs. Terraform outputs let you display these attributes after the resources are created, but you need to define the outputs in your configuration. I have defined a set of outputs in ./outputs.tf. To export these outputs as a JSON file, run the following command:
terragrunt output -json > <inspec_profile_path>
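The JSON produced by the output command wraps each output in an object that carries its value along with type and sensitivity metadata. As a rough sketch, the Python snippet below flattens that structure into plain name/value pairs. The output names and values are invented for illustration, and flatten_outputs is a hypothetical helper rather than part of the demo repository.

```python
import json

# Sample in the shape produced by `terraform output -json`; the output
# names and values here are invented for illustration.
raw = """
{
  "vpc_id": {"sensitive": false, "type": "string", "value": "vpc-0123456789abcdef0"},
  "subnet_ids": {"sensitive": false, "type": ["list", "string"], "value": ["subnet-aaa", "subnet-bbb"]}
}
"""

def flatten_outputs(document):
    """Map each output name to its value, dropping the type/sensitive metadata."""
    return {name: entry["value"] for name, entry in json.loads(document).items()}

inputs = flatten_outputs(raw)
print(inputs["vpc_id"])
print(inputs["subnet_ids"])
```

A flat mapping like this is easier to reference from tests than the nested structure, where every lookup has to go through the value key.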
InSpec organises tests into profiles for the environments in which you plan to run audit and compliance checks. Some of the advantages of InSpec profiles are organisation of controls, dependency management, and code reuse for a particular environment.
To create a profile, run the following command:
inspec init profile profiles/DemoECS
InSpec tests need to be defined under the controls folder in the new InSpec profile. These tests check whether AWS resources meet your compliance requirements.
Below are some of the tests created for the demo AWS environment:
VPC Check
The purpose of the VPC control is to:
- Check that a given VPC ID exists in the AWS account
- Check that the CIDR range of the VPC matches the CIDR defined in the desired state
- Ensure VPC flow logs are enabled for all VPCs in the AWS account
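The first two VPC checks boil down to an existence test and a CIDR comparison. The Python sketch below shows that logic on invented sample data. It is not the InSpec control itself, and check_vpc is a hypothetical helper.

```python
import ipaddress

# Sample data; the VPC IDs and CIDR ranges are invented for illustration.
deployed_vpcs = {"vpc-aaa111": "10.0.0.0/16"}

def check_vpc(vpc_id, expected_cidr, deployed):
    """Return a list of failure messages; an empty list means the VPC passes."""
    if vpc_id not in deployed:
        return [f"{vpc_id} does not exist"]
    # Compare as networks rather than raw strings.
    actual = ipaddress.ip_network(deployed[vpc_id])
    if actual != ipaddress.ip_network(expected_cidr):
        return [f"{vpc_id} has CIDR {actual}, expected {expected_cidr}"]
    return []

print(check_vpc("vpc-aaa111", "10.0.0.0/16", deployed_vpcs))
print(check_vpc("vpc-aaa111", "10.1.0.0/16", deployed_vpcs))
print(check_vpc("vpc-zzz999", "10.0.0.0/16", deployed_vpcs))
```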
Security Groups Check
The purpose of the security groups control is to:
- Check if the ELB security group exists
- Compare description/group_name with what’s defined in Terraform
- Validate the number of ingress/egress rules in a security group
- Check security groups belong to a particular VPC ID
- Ensure there are no security group rules that allow port 22 from the public internet (0.0.0.0/0)
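The port 22 check is worth spelling out, because a rule can expose SSH without naming port 22 directly, for example through a port range or an all-protocols rule. The following Python sketch illustrates the logic on invented data. The rule dictionaries only loosely mirror AWS ingress permissions, and both helper functions are hypothetical.

```python
# Illustrative only: rules loosely mirror AWS ingress permissions
# (protocol, from_port, to_port, cidr), but the data is invented.

PUBLIC_CIDRS = {"0.0.0.0/0", "::/0"}

def exposes_port(rule, port):
    """True if the rule lets the public internet reach the given TCP port."""
    if rule["cidr"] not in PUBLIC_CIDRS:
        return False
    if rule["protocol"] == "-1":  # "-1" means all protocols and all ports
        return True
    return rule["protocol"] == "tcp" and rule["from_port"] <= port <= rule["to_port"]

def groups_exposing_ssh(security_groups):
    """Return the IDs of security groups with a public rule covering port 22."""
    return sorted(
        group_id
        for group_id, rules in security_groups.items()
        if any(exposes_port(rule, 22) for rule in rules)
    )

security_groups = {
    "sg-web": [{"protocol": "tcp", "from_port": 443, "to_port": 443, "cidr": "0.0.0.0/0"}],
    "sg-admin": [{"protocol": "tcp", "from_port": 0, "to_port": 1024, "cidr": "0.0.0.0/0"}],
    "sg-internal": [{"protocol": "tcp", "from_port": 22, "to_port": 22, "cidr": "10.0.0.0/16"}],
}

offenders = groups_exposing_ssh(security_groups)
print(offenders)
```

Only sg-admin is reported: its range 0 to 1024 covers port 22, while sg-internal allows SSH from a private range only.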
Subnet Check
The purpose of the subnet control is to:
- Check the VPC ID, CIDR range and availability zone of each subnet in a particular VPC
IAM Check
The purpose of the IAM control is to:
- Check whether MFA is enabled for the root user
- Check IAM password policy
- Report IAM access keys older than 90 days
- Find IAM users who have console access but don't have MFA enabled
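The last two IAM checks are simple filters over user and key metadata. As a rough Python sketch of that logic (not InSpec code; the users, keys and dates are invented, and both helpers are hypothetical):

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)

def stale_access_keys(keys, now):
    """Return IDs of access keys created more than 90 days before `now`."""
    return [k["id"] for k in keys if now - k["created"] > MAX_KEY_AGE]

def console_users_without_mfa(users):
    """Return names of users who can sign in to the console but have no MFA."""
    return [u["name"] for u in users if u["console_access"] and not u["mfa_enabled"]]

# Invented sample data; a fixed "now" keeps the example reproducible.
now = datetime(2020, 6, 1, tzinfo=timezone.utc)
keys = [
    {"id": "key-old", "created": datetime(2020, 1, 1, tzinfo=timezone.utc)},
    {"id": "key-new", "created": datetime(2020, 5, 1, tzinfo=timezone.utc)},
]
users = [
    {"name": "alice", "console_access": True, "mfa_enabled": True},
    {"name": "bob", "console_access": True, "mfa_enabled": False},
    {"name": "ci-bot", "console_access": False, "mfa_enabled": False},
]

print(stale_access_keys(keys, now))
print(console_users_without_mfa(users))
```

Note that ci-bot is not reported: it has no MFA, but it also has no console access, so the check does not apply to it.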
RDS Check
The purpose of the RDS control is to:
- Ensure various attributes such as the DB name, engine type, engine version and storage type match what’s defined in the Terraform desired state
Execute InSpec Tests
Once you have written all the InSpec tests, validate the profile with inspec check, then run the tests against the environment with inspec exec:
inspec check profiles/DemoECS
inspec exec profiles/DemoECS -t aws://ap-southeast-2 --input-file profiles/DemoECS/attributes.yml
The following screens show the sort of output expected after running the tests.
Taking It Further
When it comes to security compliance, there are two sides: reactive compliance and proactive compliance. InSpec falls into the reactive compliance category, which means you have to deploy your infrastructure before running tests. This characteristic makes InSpec a good fit for automation pipelines, where the tests can run immediately after each deployment.
Consequently, we can use InSpec with CI/CD tools such as Jenkins to continuously test deployed infrastructure and ensure it complies with your policies. Furthermore, InSpec tests are easy to write and human-readable; you don’t have to be a software engineer to understand them.
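One way a pipeline step could act on test results is to read a machine-readable report and fail the build when any control fails. The Python sketch below assumes a report in the shape of InSpec's JSON reporter (profiles containing controls, each with results that carry a status). That structure is an assumption to verify against the output of your InSpec version, and the control IDs and the failed_controls helper are hypothetical.

```python
import json

# A trimmed sample in the assumed shape of an InSpec JSON report; the
# control IDs and results are invented for illustration.
report_text = """
{
  "profiles": [
    {
      "name": "DemoECS",
      "controls": [
        {"id": "vpc-check", "results": [{"status": "passed"}]},
        {"id": "sg-check", "results": [{"status": "passed"}, {"status": "failed"}]}
      ]
    }
  ]
}
"""

def failed_controls(report):
    """Return IDs of controls that have at least one failed result."""
    return [
        control["id"]
        for profile in report.get("profiles", [])
        for control in profile.get("controls", [])
        if any(r.get("status") == "failed" for r in control.get("results", []))
    ]

failures = failed_controls(json.loads(report_text))
exit_code = 1 if failures else 0  # a pipeline step would exit with this code
print(failures, exit_code)
```

Keeping the gate in a small script like this lets the pipeline list exactly which controls failed instead of only reporting a red build.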
InSpec works with all three major cloud platforms and is also capable of running tests on various operating systems. It has a growing ecosystem where developers have added various plugins and resource types, and it supports custom libraries. This, in turn, allows users to write custom libraries and import them into a profile if InSpec doesn't have out-of-the-box support for a particular resource type.
In this post I have introduced two tools for provisioning and testing infrastructure and shown how to use them. Terraform and InSpec have enormous potential and complement each other in quickly, securely and reliably deploying infrastructure, while also ensuring the deployments adhere to a variety of configurable compliance checks. While there are a number of similar tools on the market, I chose these two for this post as they are easy to use for the techniques demonstrated. Feel free to ask any questions regarding this post, or get in touch to see in more detail what we could unlock for your organisation.