Setup Terraform Remote State Using S3 & DynamoDB

In this step-by-step guide, we will learn Terraform remote state management using the S3 backend and DynamoDB for state locking.

Prerequisites

  1. AWS Account
  2. AWS CLI access with AWS admin permissions
  3. Terraform installed on your system.

All the code examples used in this guide are part of the Terraform AWS Git Repository. Clone/Fork the repository for your reference.

git clone https://github.com/techiescamp/terraform-s3-backend

Note: The AMI used in the examples in this guide is from the us-west-2 region. If you are using a different region for testing, replace the EC2 AMI accordingly.

What is Terraform State?

Terraform state is a JSON file that contains the complete information about the infrastructure managed by Terraform.

The state file keeps track of all the infrastructure resource details and their current state. It does this by mapping the real-world resources to your configuration. It also contains metadata about the resources, so Terraform knows the dependencies between them.

You can call it the source of truth for your infrastructure managed by Terraform.

When you make changes to Terraform configurations, Terraform uses the state file to decide what changes need to be made to the infrastructure.

One key thing to note is that the state file may contain sensitive information about your infrastructure, so it is essential to keep it secure.

The Terraform state file gets generated automatically when you run the terraform apply command.

By default, the state file is stored locally in the directory where your Terraform files are, and its default name is terraform.tfstate.

Let's do a hands-on exercise to get a better understanding of the Terraform state.

Terraform State File Example

I am going to deploy an EC2 instance using Terraform to understand the state file with an example.

Here is the main.tf file.

resource "aws_instance" "terraform-state-test" {
  ami           = "ami-0cf2b4e024cdb6960"
  instance_type = "t2.micro"
}
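
This example assumes the AWS provider is already configured (for example, through environment variables or a shared configuration). If it is not, a minimal provider block like the sketch below, matching the us-west-2 AMI used here, can be added to the same file.

provider "aws" {
  region = "us-west-2" # assumed region, matching the example AMI
}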

This will create a t2.micro Ubuntu instance on AWS in the us-west-2 region.

Initialize the Terraform code.

terraform init

To look at a preview of the infrastructure changes, run:

terraform plan

Let's apply the configuration and create the instance.

terraform apply

Once the infrastructure is deployed, we can see the generated state file in the current directory as shown below.

[Image: terraform.tfstate file generated in the current directory]

Let’s view what is inside the state file.

[Image: contents of the Terraform state file]

This is just a small portion of the state file; the actual state file is quite long and contains sensitive information.
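
If you prefer not to read the raw JSON, you can also inspect the state through Terraform's built-in state commands, for example:

# List all resources tracked in the state file
terraform state list

# Show the recorded attributes of the EC2 instance from this example
terraform state show aws_instance.terraform-state-test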

Storing state files on your local system is not a good practice because:

  1. You can't collaborate with other developers on infrastructure management unless you have a centralized state file.
  2. You may lose the state file due to accidental deletion, file system corruption, etc.
  3. You can't implement CI/CD using local state files.

Terraform Remote State & State Lock

Terraform remote state refers to storing the state file in a remote location, such as an S3 bucket, instead of your local workstation.

With remote state, you can collaborate with other developers. Also, CI/CD systems can make use of the centralized state file during provisioning and deployments.

In this example, we use S3 as the remote state backend and DynamoDB for the state locking mechanism.

So what is a state lock?

We need state locking to ensure that only one Terraform process modifies the state at a time. If multiple Terraform processes use the same state file, it could lead to conflicts and inconsistencies in the state file (race conditions).

To avoid conflicts when more than one team member deploys a change simultaneously, we use the locking mechanism of a DynamoDB table.

Here is how DynamoDB state locking works.

  1. When Terraform wants to modify a resource, it acquires a lock by creating an entry in the DynamoDB table with a specific lock ID (e.g., “lock-abc123”).
  2. If the lock is successful, Terraform gets access to the state file from S3.
  3. Once all the resource modifications are done, Terraform updates the state file and releases the DynamoDB lock.

For example, when developer X executes the Terraform code, DynamoDB locks the state, and developer Y has to wait until the execution is completed.

If Terraform terminates abnormally while holding a lock, the lock entry can remain in the DynamoDB table. In that case, you can release the stale lock manually with the terraform force-unlock command.
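
For example, using the lock ID reported in the lock error message (shown below as a placeholder value):

terraform force-unlock <LOCK_ID>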

[Image: Terraform remote backend with S3 and DynamoDB animated workflow]

Let's get started with the remote state hands-on setup.

Provision S3 Bucket

First, we need to create an S3 bucket.

The major advantage of using a remote backend for the state file is its native encryption and versioning mechanism.

Here is the Terraform code to provision an S3 bucket with versioning enabled so that the state file will not be overridden. Replace terraform-state-dcube with a unique bucket name.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_s3_bucket" "terraform-state" {
  bucket = "terraform-state-dcube"
}

resource "aws_s3_bucket_versioning" "terraform-state" {
  bucket = aws_s3_bucket.terraform-state.id
  versioning_configuration {
    status = "Enabled"
  }
}
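
Optionally, you can harden the bucket further. The sketch below is not part of the original example; it enforces default server-side encryption and blocks public access, and the resource names are illustrative.

# Enforce S3-managed encryption for all objects in the state bucket
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform-state" {
  bucket = aws_s3_bucket.terraform-state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Block all forms of public access to the state bucket
resource "aws_s3_bucket_public_access_block" "terraform-state" {
  bucket                  = aws_s3_bucket.terraform-state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}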

Now, to initialize the Terraform code, use the following command:

terraform init

To provision the S3 bucket, use the following command:

terraform apply

Now, the bucket will be created in S3 and we can see the bucket in the console.

[Image: S3 bucket for the Terraform state in the AWS console]

Locking Terraform State using DynamoDB

Now we need to create a DynamoDB table named terraform_state_lock to implement the state locking functionality.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_dynamodb_table" "state_lock_table" {
  name           = "terraform_state_lock"
  billing_mode   = "PAY_PER_REQUEST"
  hash_key       = "LockID"
  attribute {
    name = "LockID"
    type = "S"
  }
}

PROVISIONED and PAY_PER_REQUEST are the billing modes available for a DynamoDB table. PAY_PER_REQUEST means you only pay for the actual read and write requests made to the table, rather than provisioning a fixed capacity. The choice depends on your requirements.

hash_key = "LockID" defines the partition key (also known as the hash key) for the DynamoDB table. The partition key is used to uniquely identify each item in the table. Here, the partition key is set to LockID.

type = "S" means, data type string.

Now, initialize and apply the code.

terraform init

terraform apply

To see the table, open DynamoDB in the AWS console and choose the Tables tab.

[Image: DynamoDB table for the Terraform state lock]

Terraform Backend Configuration

Now that we have the S3 bucket and DynamoDB table ready, we can test the remote state using an EC2 provisioning example.

There is more than one way to implement remote state with your Terraform configuration. Let’s look at each method.

Note: For demonstration purposes, I have added EC2 instance creation in the us-west-2 region. Change the EC2 resource parameters as per your requirements.

Method 1: Add configuration block inside the Terraform code

In this method, you have to add the remote backend configuration in the main.tf file where you have the resource declarations.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

# Remote backend configuration (S3 for state storage, DynamoDB for locking)
terraform {
  backend "s3" {
    bucket         = "terraform-state-dcube"
    region         = "us-west-2"
    dynamodb_table = "terraform_state_lock"
    key            = "dev/ec2.tfstate"
    encrypt        = true
  }
}

resource "aws_instance" "terraform-state-test" {
  ami           = "ami-0cf2b4e024cdb6960"
  instance_type = "t2.micro"
}

Here, the terraform block containing backend "s3" is the remote backend configuration. It is followed by a resource block that creates an EC2 instance.

If you are managing multiple environments such as dev, prod, test, and qa, you should separate their state files by directory (key prefix) within the S3 bucket.

In the backend configuration, the value of key is dev/ec2.tfstate, where dev is the directory and ec2.tfstate is a custom state file name chosen for better readability.
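
For example, each environment could point to its own key in the same bucket; the prod path below is only illustrative:

# dev environment
key = "dev/ec2.tfstate"

# prod environment (illustrative)
key = "prod/ec2.tfstate"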

Let's deploy the EC2 instance with the S3 backend and state lock to validate the remote state.

terraform init
terraform plan
terraform apply

As shown in the image below, when you run the plan, you will see the message "Acquiring state lock".

[Image: terraform plan output showing the state lock being acquired]

Check the S3 bucket to ensure that the state file is stored in the bucket.

[Image: tfstate file stored in the S3 bucket]

We have enabled encryption so that the state file is secure. Also, you should grant IAM access to the S3 bucket only to the members who need it.

Once the state file is stored in the bucket, the next time you perform a plan or apply, Terraform will fetch the state from the bucket, and after the execution, the current state will be updated in the bucket as a new version.

You can also check the DynamoDB table's lock entry data using the Explore Table Items option.

[Image: Terraform state lock entry in the DynamoDB table]

Clean up the instance.

terraform destroy

You can also add the backend configuration to a separate backend.tf file in the same directory as main.tf. Terraform will automatically pick up the backend configuration from that file.

ec2
├── backend.tf
└── main.tf
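
In that layout, backend.tf would contain just the backend block from the example above, while main.tf holds the provider and resource definitions:

# backend.tf
terraform {
  backend "s3" {
    bucket         = "terraform-state-dcube"
    region         = "us-west-2"
    dynamodb_table = "terraform_state_lock"
    key            = "dev/ec2.tfstate"
    encrypt        = true
  }
}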

Method 2: Dynamically Pass Backend Parameters Using terraform init Command

When it comes to real-world project use cases, we cannot hard-code the backend parameters in the Terraform configuration file. The CI/CD system should be able to pass the backend parameters dynamically at run time.

This way, you can control the remote state file location, environment names, and other parameters at run time based on your requirements.

Here, you still add the backend block inside the Terraform configuration file, but you don't mention any backend details inside it except the backend type, as given below.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

terraform {
  backend "s3" {
  }
}

resource "aws_instance" "terraform-state-test" {
  ami           = "ami-0cf2b4e024cdb6960"
  instance_type = "t2.micro"
}

The remaining backend information is passed with the terraform init command as given below.

terraform init \
    -backend-config="key=dev/ec2.tfstate" \
    -backend-config="bucket=terraform-state-dcube" \
    -backend-config="region=us-west-2" \
    -backend-config="dynamodb_table=terraform_state_lock"

After the initialization, you can directly run the terraform apply or terraform destroy commands.

Note: The Terraform backend configuration does not support variables, locals, or data sources.
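
This is why, in a pipeline, the values are usually injected from the CI/CD system's own variables instead. A minimal sketch, with illustrative variable names:

# Variables typically provided by the CI/CD system (names are illustrative)
ENVIRONMENT="dev"
STATE_BUCKET="terraform-state-dcube"

terraform init \
    -backend-config="key=${ENVIRONMENT}/ec2.tfstate" \
    -backend-config="bucket=${STATE_BUCKET}" \
    -backend-config="region=us-west-2" \
    -backend-config="dynamodb_table=terraform_state_lock"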

Method 3: Use the Backend Configuration From File

In this method, you can store the backend configurations in a separate file, and use the path of the file with the initialization command.

You don't have to remember the backend configuration details every time you initialize Terraform, and keeping them in a separate file gives you more isolation.

The backend configuration block still needs to be present inside the Terraform main file for this method, too.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

terraform {
  backend "s3" {}
}

resource "aws_instance" "terraform-state-test" {
  ami           = "ami-0cf2b4e024cdb6960"
  instance_type = "t2.micro"
}

We have to create a file named backend.hcl to store the common backend configuration details.

bucket         = "terraform-state-dcube"
region         = "us-west-2"
dynamodb_table = "terraform_state_lock"
encrypt        = true

Now, we can initialize the Terraform code with the key value and the path of the backend configuration file.

terraform init -backend-config="key=dev/jenkins-agent.tfstate" \
-backend-config=backend.hcl

Terraform State Versioning Test

We know that the S3 bucket has a Terraform state file for the EC2 deployment. To test state versioning, we will modify the EC2 resource by adding an instance tag.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

terraform {
  backend "s3" {
    bucket         = "terraform-state-dcube"
    region         = "us-west-2"
    dynamodb_table = "terraform_state_lock"
    key            = "dev/ec2.tfstate"
    encrypt        = true
  }
}

resource "aws_instance" "terraform-state-test" {
  ami           = "ami-0cf2b4e024cdb6960"
  instance_type = "t2.micro"
  tags = {
     Name = "test-instance"
  }
}

Initialize and apply the Terraform code.

[Image: terraform apply output validating the state changes]

Now, the state file will be modified. We can check the state file versions from the S3 bucket as shown below.

[Image: Terraform state file versions in the S3 bucket]

Enabling versioning helps you go back to previous state versions if required.
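
If you ever need to inspect or restore an older state version, you can list the stored versions with the AWS CLI, using the bucket and key from this example:

aws s3api list-object-versions \
    --bucket terraform-state-dcube \
    --prefix dev/ec2.tfstate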

Overriding State Lock

There are situations where you might end up having lock issues.

You can override the Terraform state lock using the -lock=false flag while executing Terraform commands, as given below.

Note: This is not a recommended approach, and you have to be cautious while using this flag in actual project environments.

terraform init -lock=false
terraform plan -lock=false
terraform apply -lock=false

Conclusion

I believe this guide gave you an overall idea of how to manage your Terraform remote state file using S3 and DynamoDB. You can try all the backend configuration methods and choose the one that satisfies your requirements.

If you face any issues or have any suggestions, drop a comment below.
