Chapter 6 - Data Sources
Data sources in detail
A data source in Terraform fetches data about a resource that is not managed by the current Terraform project, so that its values can be used in the current project. You can think of it as a read-only resource that already exists: the object exists, and you want to read some of its properties for use in your project.

Let's dive into an example. Create a new folder on disk, create a file called main.tf inside it and paste in the following code (or grab the code from the folder data_source_example_01 in the samples repository):

    provider "aws" {
      region  = "eu-west-1"
      version = "~> 2.27"
    }

    data "aws_s3_bucket" "bucket" {
      bucket = "kevholditch-already-exists"
    }

    resource "aws_iam_policy" "my_bucket_policy" {
      name = "my-bucket-policy"

      policy = <<POLICY
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "s3:ListBucket"
          ],
          "Effect": "Allow",
          "Resource": [
            "${data.aws_s3_bucket.bucket.arn}"
          ]
        }
      ]
    }
    POLICY
    }

As you can see from the above project, a data source block starts with the word data. The next word is the type of data source. We are using an aws_s3_bucket data source, which is used to look up an S3 bucket. After the data source type, we give the data source an identifier, in this case "bucket". The identifier is used to reference the data source inside the Terraform project. The data source block is then opened with a {. You then specify any properties you want Terraform to use to search for the resource. We are using the complete name of the S3 bucket we are looking for. You then close the data source block with }.

Rather than creating the bucket as we did before, this time we are referencing a bucket that already exists. So before you run the above project you will need to create an S3 bucket with the name that you specify inside the data block. In the example above the bucket would be called kevholditch-already-exists.
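
If you would rather create that prerequisite bucket with Terraform than by hand, one option is a small separate project with its own folder and state. This is only a sketch under that assumption: the resource identifier already_exists is made up for illustration, and because S3 bucket names are globally unique you will almost certainly need a different bucket name.

```hcl
# Hypothetical separate project: lives in its own folder with its own state,
# so the bucket stays outside the project that uses the data source.
provider "aws" {
  region = "eu-west-1"
}

resource "aws_s3_bucket" "already_exists" {
  # Assumed name taken from the example; replace it with your own.
  bucket = "kevholditch-already-exists"
}
```

Run terraform init and terraform apply in that folder first, then return to the data source project.
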
Name the bucket anything you want, but then paste the name into the bucket property in the data source.

At the bottom of this project we are creating an AWS IAM policy which gives permission to list the bucket that we looked up in the data source. There are a couple of new concepts in the aws_iam_policy resource that I want to introduce. The IAM policy itself is a multi-line string enclosed between <<POLICY and POLICY. This is how you define a multi-line string in Terraform. You open the multi-line string with <<, followed by any identifier you wish as a single word. I have used POLICY in the example above as I am defining an IAM policy, but you could have used anything, such as <<STATEMENT or <<IAM. You then start your multi-line string on the next line, and to finish it you use the opening identifier without the <<. Note that the closing marker must be at the start of a new line, otherwise it is a syntax error.

Inside the IAM policy we are using the S3 bucket data source. We are taking the ARN from the S3 bucket so that we can use it in our IAM policy. You will notice that to get the value we are using the interpolation syntax ${data.aws_s3_bucket.bucket.arn}. The opening ${ and closing } are needed because we are inside a string (here a multi-line one), so they tell Terraform to evaluate the expression rather than treat it as a string literal. The format of a data source expression is data.<data_type>.<data_identifier>.<attribute_name>. You can get a full list of the attributes that a data source provides from the documentation website of the provider.

To run this project, go to the terminal and cd into the folder where you created the main.tf file. Run terraform init to initialise Terraform and then terraform apply. When Terraform runs you will see that it only creates a single resource, the IAM policy. This is because the S3 bucket we are using was created outside of this Terraform project.

How are data sources useful?
As your Terraform project gets large, it can be sensible to break it up into smaller projects to make it easier to maintain. When this happens you can use data sources to reference resources created in other Terraform projects and still use them. In this case it is better to use a data source than to compute the ARN yourself (which would be possible with something like an S3 bucket), because you want Terraform to fail if for some reason the bucket no longer exists. By using a data source you get this behaviour.

Imagine you want to create a new AWS EC2 instance using an AMI from a private repository. You could hard code the name of the AMI when creating the instance and then manually update it when a new AMI is released. This would work, but it would be cumbersome and would require a code change every time you wanted to use the latest version of the AMI. By using a data source you could set it up so that it always reads the repository and gets the latest version of the AMI when you run Terraform. You could then reference that data source when creating the EC2 instance and ensure that you always have the latest version of the image.

Another reason you may want to use a data source is if you are migrating existing infrastructure to Terraform and you want to reference a resource that is not part of your Terraform project yet.

As previously stated, it is better to use a data source than to compute the value yourself, because you want Terraform to know that there is a dependency on the resource. You want terraform apply to fail if the resource cannot be found. And if an attribute that the data source returns changes, Terraform will notice the next time you run apply and update your project with the new value.

Chapter 7 - Outputs
Outputs explained
An output in your Terraform project shows a piece of data after Terraform successfully completes.
Outputs are useful as they allow you to echo values from the Terraform run to the command line. For example, if you are creating an environment and setting up a bastion jump box as part of that environment, then it's handy to be able to echo the public IP address of the newly created bastion to the command line. After the Terraform apply finishes, you are given the IP of the newly created bastion, ready for you to ssh straight onto it.

Let's start with an example of outputs. Create a new folder to put our new Terraform project into, create a single file called main.tf and paste in the following code (or grab the code from the outputs_example_01 folder inside the examples repository):

    output "message" {
      value = "Hello World"
    }

Try running this project by opening your terminal, changing directory into the folder where you created the main.tf file, and then running terraform init and terraform apply. You will see that Terraform runs and then prints the following:

    Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

    Outputs:

    message = Hello World

A couple of interesting things just happened. Firstly, did you notice that Terraform did not pause to ask you if you wanted to do the apply? The reason for this is that Terraform realised there was nothing to do, so there was nothing to ask you! You can see from the message above that Terraform states that nothing changed (0 added, 0 changed, 0 destroyed.). You then see Outputs: and under there Terraform prints out the values of all of the outputs you have defined. We defined a single output with the identifier message and gave it the value Hello World, so that is what Terraform printed.

To define an output you open an output block by using the output keyword, followed by the identifier of the output. You then start the output block with {. The only property you are required to set is value (optional arguments such as description and sensitive exist but are not used here).
Whatever value you give to the value property will be printed to the console after a successful Terraform apply. You then close the output block with }. Note that outputs are used in modules too, where they have slightly different semantics; this is covered in the chapter on modules.

Outputting resource properties
The first example is pretty basic and in the real world probably not very useful. Outputs are much more useful when used to output the values of resources that have been created as part of a Terraform run. Let's create another Terraform project and do that. Create a new folder and create a single file called main.tf, then paste in the following code (or copy the folder outputs_example_02 in the examples repository):

    provider "aws" {
      region = "eu-west-1"
    }

    resource "aws_s3_bucket" "first_bucket" {
      bucket = "kevholditch-bucket-outputs"
    }

    output "bucket_name" {
      value = aws_s3_bucket.first_bucket.id
    }

    output "bucket_arn" {
      value = aws_s3_bucket.first_bucket.arn
    }

    output "bucket_information" {
      value = "bucket name: ${aws_s3_bucket.first_bucket.id}, bucket arn: ${aws_s3_bucket.first_bucket.arn}"
    }

Let's walk through the above code. The provider and resource should be familiar to you. We are simply defining the AWS provider to be used with the eu-west-1 region and setting up an S3 bucket. Feel free to change the name of the bucket to whatever you wish. Next we define an output called bucket_name. In bucket_name we output the name of the bucket by using the id attribute of the S3 bucket resource that we create. We use the same technique to output the ARN of the bucket in the output bucket_arn. In both of those outputs, because we are using the attribute directly, we can just assign it to value without any quotes. The last output, bucket_information, prints an interpolated string which will give us the bucket name and bucket ARN.
As this value is a string with interpolated values, we have to surround it in quotes and wrap each expression in ${ }.

Open a terminal, go into the directory where you have defined that project, run terraform init, then terraform apply, and confirm by typing yes and pressing enter. Terraform runs, creates the S3 bucket and gives me the following output under the Outputs: heading:

    bucket_arn = arn:aws:s3:::kevholditch-bucket-outputs
    bucket_information = bucket name: kevholditch-bucket-outputs, bucket arn: arn:aws:s3:::kevholditch-bucket-outputs
    bucket_name = kevholditch-bucket-outputs

Terraform got the values from the S3 bucket that it created and printed them when the run completed. Terraform prints the outputs in alphabetical order, not the order in which you define them in your project. This is a good moment to point out that Terraform does not care in which order you define the blocks in your project. Try reordering them and running terraform apply again. You will notice that Terraform says there is nothing to do.

Exporting all attributes
As of Terraform 0.12 (which this book is based on), Terraform allows you to output an entire resource or data block. To do this, take the example that we just had and add the following output (in the examples repository it is outputs_example_03 if you want to just get the code):

    output "all" {
      value = aws_s3_bucket.first_bucket
    }

Run the project again (terraform apply) and you will see an output called all that contains all of the attributes exported by the aws_s3_bucket resource. It can be handy to output the whole resource to the console, typically when you are debugging something and want to see the value of one of its properties.
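
Since the text notes that an entire data block can be output as well, the same idea can be sketched for the bucket data source from Chapter 6. This is a minimal illustration rather than code from the examples repository: the output names existing_bucket_all and existing_bucket_region are made up, and it assumes a bucket with the given name already exists in your account.

```hcl
provider "aws" {
  region = "eu-west-1"
}

# Look up a bucket that already exists outside this project
# (assumed name; use the name of your own bucket).
data "aws_s3_bucket" "bucket" {
  bucket = "kevholditch-already-exists"
}

# Hypothetical output: prints every attribute the data source exports.
output "existing_bucket_all" {
  value = data.aws_s3_bucket.bucket
}

# Hypothetical output: once you know which attribute you need,
# narrow the output down to just that one.
output "existing_bucket_region" {
  value = data.aws_s3_bucket.bucket.region
}
```

Because this project only reads data and defines outputs, running terraform apply should not create any resources.
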