Moving the HashiCorp Terraform state file
This isn't a blog about board games, but HashiCorp Terraform makes me think of the Terraforming Mars board game!
Terraforming Mars is a board game designed by Jacob Fryxelius and published by FryxGames in 2016. The game is set in the future, where players take on the role of corporations that work together to terraform Mars and make it habitable for human life. You can read more here: BoardGameGeek
Terraform state file
Infrastructure as Code (IaC) enables us to use code to manage infrastructure resources. This approach makes it easier to manage complex infrastructures, reduce manual errors, and increase efficiency.
These days, HashiCorp Terraform is one of the most popular IaC tools. It supports a wide range of cloud providers and services, including AWS, Azure, GCP, Kubernetes, and many others. This enables infrastructure engineers to manage their infrastructure resources in a consistent way, regardless of the cloud provider they are using.
Terraform provides a state management mechanism to track the state of infrastructure resources. This allows us to understand the current state of the infrastructure, identify changes that have been made, and easily make updates. Terraform stores the current state of the infrastructure in a state file (tfstate). This state is used by Terraform to map real-world resources to the configuration, keep track of metadata, and improve performance for large infrastructures.
This state file is stored locally by default in a file called terraform.tfstate. Terraform uses the state file to generate plans and carry out modifications to the infrastructure. Before carrying out any action, Terraform performs a refresh to update the state with the real infrastructure. That's why we see Refreshing state... in each Terraform plan output.
$ terraform plan
aws_dynamodb_table.dynamodb_locktable: Refreshing state... [id=terraforming-mars-locktable]
aws_s3_bucket.s3_tfstate: Refreshing state... [id=terraforming-mars-tfstate]
You can read more about Terraform state purpose here: Purpose of Terraform State
Terraform backend
Terraform enables us to collaborate with other members of our team by using version control systems such as Git. This makes it easier to share infrastructure code, review changes, and ensure that everyone is working on the same version of the infrastructure. However, using a local Terraform state file can be challenging because everyone must make sure to pull the latest tfstate file locally and ensure that nobody else is running Terraform at the same time.
To solve this issue, Terraform introduces remote state. Using remote state, the state file can be written to a remote data store. Now, teammates can collaborate on a project without any concern about the latest tfstate file version. Remote state is implemented by a backend or by Terraform Cloud.
Terraform supports various types of backends, including AWS S3, Azure Blob Storage, and HashiCorp Consul. These backends provide remote storage for Terraform state files, making it easier to manage infrastructure resources across teams and environments. When using a remote backend, Terraform can read the current state and apply changes to the infrastructure based on that state.
However, what if Terraform is executed concurrently? Terraform has the ability to lock the state file: whenever an operation could write state, locking happens automatically. Backends are responsible for providing an API for state locking, and state locking is optional.
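For example, if a run crashes and leaves a stale lock behind, two common remedies exist (a sketch; the lock ID is a placeholder copied from the error message Terraform prints):
$ terraform plan -lock-timeout=60s    # wait up to 60 seconds for a held lock instead of failing immediately
$ terraform force-unlock <LOCK_ID>    # manually release a stale lock; use with care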
Default backend
Terraform uses a backend called local as the default option, which stores the state data locally as a file on disk. This means that we do not need to add a backend block configuration. For example, the code block below shows a terraform block configured with the aws provider:
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}
The state file in Terraform is typically stored locally in the current project directory. However, you may wonder how to store the tfstate file in a different location. This can be accomplished by specifying a backend configuration in your Terraform code, which tells Terraform where to store the state file:
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
  backend "local" {
    path = "local_path/terraform.tfstate"
  }
}
After adding the backend configuration block and running terraform init, you will get an error message indicating a change in the backend configuration:
$ terraform init
Initializing the backend...
Error: Backend configuration changed
A change in the backend configuration has been detected, which may require migrating existing state.
If you wish to attempt automatic migration of the state, use "terraform init -migrate-state".
If you wish to store the current configuration with no changes to the state, use "terraform init -reconfigure".
The error message explains the root cause and the possible solutions. The -migrate-state option will attempt to copy existing state to the new backend and, depending on what changed, may result in interactive prompts to confirm migration of workspace states. The -reconfigure option, on the other hand, disregards any existing configuration, preventing migration of any existing state. If you are moving the state file from the default working directory to a custom directory, -migrate-state is the correct option.
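For example, after changing the path in the local backend block shown above, the move is a single command (Terraform prompts before copying the state):
$ terraform init -migrate-state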
Migrating to a remote backend
Now, how can we move the local state file of an existing project to a remote backend? As discussed, using a remote backend can improve collaboration, scalability, security, and ease of management when working with Terraform.
I would like to divide the supported backends into two categories: local and remote. In the local group, the state file is stored locally (the default, or via an explicit local backend configuration). The remote group includes options such as Terraform Cloud, AWS S3, azurerm, and others.
HashiCorp notes that the remote backend is unique among all other Terraform backends. Read more about it here: Terraform Remote Backend
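For reference, a minimal remote backend configuration looks roughly like this (the organization and workspace names are hypothetical):
terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "example-org"
    workspaces {
      name = "terraforming-mars"
    }
  }
}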
In this demonstration, I use the AWS S3 backend. The S3 backend supports state locking via an AWS DynamoDB table, which means it does not support state locking out of the box.
As an example of an out-of-the-box locking feature, the azurerm backend supports state locking and consistency checking natively with Azure Blob Storage capabilities.
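A minimal azurerm backend sketch, assuming the storage account and container already exist (all names below are hypothetical):
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"        # resource group holding the storage account
    storage_account_name = "tfstateaccount"    # must be globally unique
    container_name       = "tfstate"
    key                  = "terraform.tfstate" # blob name for the state file
  }
}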
Implementation
As the first step, let's create resources on AWS to store the Terraform project state file and lock status. Based on project experience, I have a project called iac-base that includes all the base infrastructure for other projects' deployments. The code block below shows the iac-base resources:
# S3 bucket
resource "aws_s3_bucket" "s3_tfstate" {
  bucket = "terraforming-mars-tfstate"
}

# S3 bucket ACL
resource "aws_s3_bucket_acl" "s3_acl" {
  bucket = aws_s3_bucket.s3_tfstate.id
  acl    = "private"
}

# S3 bucket encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "s3_encryption" {
  bucket = aws_s3_bucket.s3_tfstate.bucket
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# S3 bucket versioning
resource "aws_s3_bucket_versioning" "s3_bucket_versioning" {
  bucket = aws_s3_bucket.s3_tfstate.id
  versioning_configuration {
    status = "Enabled"
  }
}

# S3 bucket retention policy for old state versions
resource "aws_s3_bucket_lifecycle_configuration" "s3_bucket_retention_policy" {
  bucket     = aws_s3_bucket.s3_tfstate.id
  depends_on = [aws_s3_bucket_versioning.s3_bucket_versioning]
  rule {
    status = "Enabled"
    id     = "retention_policy"
    noncurrent_version_expiration {
      noncurrent_days = 180
    }
  }
}

# S3 bucket public access block
resource "aws_s3_bucket_public_access_block" "bucket_block_public" {
  bucket                  = aws_s3_bucket.s3_tfstate.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# DynamoDB table for state locking
resource "aws_dynamodb_table" "dynamodb_locktable" {
  name           = "terraforming-mars-locktable"
  hash_key       = "LockID"
  billing_mode   = "PROVISIONED"
  write_capacity = 1
  read_capacity  = 1
  attribute {
    name = "LockID"
    type = "S"
  }
}
The above code block creates an AWS S3 bucket based on best practices and a DynamoDB table for state locking. After applying the configuration, your base resources for storing project state files are ready. Now we are ready to migrate a project's state file from the local backend to the AWS S3 remote backend. Modify your terraform block to add the AWS remote backend configuration:
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
  backend "s3" {
    bucket         = "terraforming-mars-tfstate"
    key            = "terraform.state"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "terraforming-mars-locktable"
  }
}
I created the base resources in the eu-west-1 region; you should use the correct region based on your configuration. I also migrated the iac-base Terraform state file itself to this remote backend.
Migration from local to a remote backend is EASIER than moving resources from Earth to Mars. You only need to run terraform init. Terraform detects the new backend configuration and asks about migrating:
$ terraform init
Initializing the backend...
Do you want to copy existing state to the new backend?
Pre-existing state was found while migrating the previous "local" backend to the
newly configured "s3" backend. No existing state was found in the newly
configured "s3" backend. Do you want to copy this state to the new "s3"
backend? Enter "yes" to copy and "no" to start with an empty state.
Enter a value: yes
Releasing state lock. This may take a few moments...
Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/aws v4.57.0
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
One of the core concepts of IaC is writing once and using several times. For example, you can use the same resource implementation to deploy to several environments, such as Development, Test, Staging, and Production. This leads to a concept called multi-account and multi-backend architecture, which I will discuss in a separate blog.
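As a preview, one common pattern (a sketch; the file layout and names are hypothetical) is to keep the backend block empty and supply per-environment settings at init time via partial backend configuration:
# backend.tf: the backend block stays empty
terraform {
  backend "s3" {}
}

# env/dev.backend.hcl: per-environment backend settings
bucket         = "terraforming-mars-tfstate"
key            = "dev/terraform.state"
region         = "eu-west-1"
encrypt        = true
dynamodb_table = "terraforming-mars-locktable"

$ terraform init -backend-config=env/dev.backend.hcl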
Changing the S3 Bucket
I always say it is better to consider all details before implementation. A design document including all the project details can prevent most future issues; naming conventions, for instance, are one of my criteria. Still, it might happen that you have to change the state S3 bucket to another bucket. In this case, Terraform can move your state file from one bucket to another using terraform init -migrate-state. In the code block below, I move the state file from the terraforming-mars-tfstate bucket to terraforming-venus-next-tfstate:
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
  backend "s3" {
    bucket         = "terraforming-venus-next-tfstate"
    key            = "terraform.state"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "terraforming-venus-next-locktable"
  }
}
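Note that Terraform never creates its own backend storage: the new bucket (and the new lock table, since it changed here as well) must already exist before the move. The migration itself is again triggered by a single command, and Terraform prompts before copying the state between buckets:
$ terraform init -migrate-state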
Best practices
Enable encryption for the S3 bucket: state files can contain secrets, keys, etc. in plaintext, so it is important to keep them encrypted. The AWS S3 backend supports different encryption methods (a short sketch follows this list):
- encrypt - Enable server-side encryption of the state file.
- kms_key_id - Amazon Resource Name (ARN) of a Key Management Service (KMS) key to use for encrypting the state. Note that if this value is specified, Terraform will need kms:Encrypt, kms:Decrypt, and kms:GenerateDataKey permissions on this KMS key.
- sse_customer_key - The key to use for encrypting state with Server-Side Encryption with Customer-Provided Keys (SSE-C). This is the base64-encoded value of the key, which must decode to 256 bits. It can also be sourced from the AWS_SSE_CUSTOMER_KEY environment variable, which is recommended due to the sensitivity of the value. Setting it inside a Terraform file will cause it to be persisted to disk in terraform.tfstate.
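As a sketch, using a customer-managed KMS key only requires pointing the backend at the key (the key ARN below is a placeholder):
terraform {
  backend "s3" {
    bucket         = "terraforming-mars-tfstate"
    key            = "terraform.state"
    region         = "eu-west-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:eu-west-1:123456789012:key/11111111-2222-3333-4444-555555555555" # placeholder ARN
    dynamodb_table = "terraforming-mars-locktable"
  }
}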
Enable S3 bucket versioning: Enabling bucket versioning on the S3 bucket is strongly advised as it enables recovery of the state in case of unintended deletions and mistakes.
Enable a retention lifecycle policy: since S3 bucket versioning is enabled, it is wise to have a retention lifecycle policy that deletes old state file versions. You can add a noncurrent_version_expiration policy based on your project/organization requirements.
Suggested structure for single-environment projects: this is only a suggestion, based on my experience, for structuring your Terraform projects. As discussed, it is a good idea to have a project called iac-base that includes all your base configurations, for example the resources for your Terraform backends:
$ tree
.
├── iac-base
│ ├── search-planet-x-state.tf
│ ├── tf-mars-state.tf
│ └── tf-venus-state.tf
├── searching-for-planet-x
├── terraforming-mars
└── terraforming-venus