Get a value out of a tolist([...]) data object in Terraform

I have this terraform output:
output "cloud_connection" {
value = data.cloud_connection.current.connection[0]
}
$ terraform output
cloud_connection = tolist([
  {
    "id" = "123"
    "settings" = toset([
      {
        "vlan_id" = 100
      },
    ])
    "connection_type" = "cloud"
  },
])
I need to get the vlan_id to reuse it later on.
output "cloud_connection" {
value = data.cloud_connection.current.connection[0].settings[0].vlan_id
}
$ terraform output
cloud_connection = tolist([
  tolist([
    100,
  ]),
])
The problem is that I can't get the vlan_id out of the list.
When I try:
output "cloud_connection" {
value = data.connection.current.connection[0].settings[0].vlan_id[0]
}
I am getting this error:
│ Elements of a set are identified only by their value and don't have any separate index or key to select
│ with, so it's only possible to perform operations across all elements of the set.
How can I get the vlan_id alone?
Thanks

Assuming you know that there is at least one element in each of the nested collections, the idiomatic way would be to use a splat expression with some flattening:
output "cloud_connection" {
value = flatten(data.connection.current.connection[*].settings[*].vlan_id)[0]
}
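Since the goal is to reuse the value later on, the same expression can also be captured in a local instead of (or in addition to) an output. A minimal sketch, assuming the data source names from the question:
locals {
  # First vlan_id found across all connections and their settings.
  vlan_id = flatten(data.cloud_connection.current.connection[*].settings[*].vlan_id)[0]
}

output "cloud_connection" {
  value = local.vlan_id
}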

You can use the function join, assuming you're getting the following output:
$ terraform output
cloud_connection = tolist([
  tolist([
    100,
  ]),
])
For example:
output "cloud_connection" {
value = join(",",data.cloud_connection.current.connection[0].settings[0].vlan_id)
}
That returns:
cloud_connection = "100"

Related

How can we dynamically generate a list of maps in Terraform?

I have a list of rules which I want to generate at runtime, as it depends on availability_domains, where availability_domains is a list:
availability_domains = [XX,YY,ZZ]
locals {
  rules = [{
    ad = XX
    name = "service-XX",
    hostclass = "hostClassName",
    instance_shape = "VM.Standard2.1"
    ...
  }, {
    ad = YY
    name = "service-YY",
    hostclass = "hostClassName",
    instance_shape = "VM.Standard2.1"
    ...
  }, ...]
}
Here, all the values apart from ad and name are constant, and I need a rule for each of the availability_domains.
I read about null_resource, where triggers can be used to generate this, but I don't want to use a hack here.
Is there any other way to generate this list of maps?
Thanks for the help.
First, you need to fix the availability_domains list to be a list of strings.
availability_domains = ["XX","YY","ZZ"]
Assuming availability_domains is a local, you can just run a for expression over it.
locals {
  availability_domains = ["XX", "YY", "ZZ"]
  all_rules = {
    "rules" = [
      for val in local.availability_domains : {
        "ad" : val,
        "name" : "service-${val}",
        "hostclass" : "hostClassName",
        "instance_shape" : "VM.Standard2.1"
      }
    ]
  }
}
Or, if you don't want the top-level name for the array, then this should work as well:
locals {
  availability_domains = ["XX", "YY", "ZZ"]
  rules = [
    for val in local.availability_domains : {
      "ad" : val,
      "name" : "service-${val}",
      "hostclass" : "hostClassName",
      "instance_shape" : "VM.Standard2.1"
    }
  ]
}
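If those rules then need to drive resource creation, one option is to key a for_each off the generated list. This is only a sketch: example_instance is a placeholder resource type, and it assumes the ad values are unique.
resource "example_instance" "service" {
  # Convert the list of rule objects into a map keyed by availability domain.
  for_each = { for r in local.rules : r.ad => r }

  name  = each.value.name
  shape = each.value.instance_shape
}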

For loop with If condition in Terraform

I'm writing a module to create multiple S3 buckets with all the related resources. Currently, I'm a little bit stuck on the server-side encryption, as I need to parametrize the KMS key ID for a key that has not been created yet.
The variables passed to the module are:
A list of S3 buckets
A map with the created KMS keys
The structure of the S3 buckets variable is:
variable "s3_buckets" {
  type = list(object({
    bucket          = string
    acl             = string
    versioning      = string
    kms_description = string
    logging         = bool
    loggingBucket   = optional(string)
    logPath         = optional(string)
  }))
}
The structure of the KMS map is similar to
kms_resources = {
  0 = {
    kms_arn         = (known after apply)
    kms_description = "my-kms"
    kms_id          = (known after apply)
  }
}
This variable is an output from a previous module that creates all the KMS keys. The output is created this way:
output "kms_resources" {
value = {
for kms, details in aws_kms_key.my-kms :
kms => ({
"kms_description" = details.description
"kms_id" = details.key_id
"kms_arn" = details.arn
})
}
}
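For context, the S3 module presumably receives this map through a matching variable; a sketch of what that declaration could look like, with the attribute types assumed from the output above:
variable "kms_keys" {
  # Map of KMS key details produced by the KMS module's kms_resources output.
  type = map(object({
    kms_arn         = string
    kms_description = string
    kms_id          = string
  }))
}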
As you can see, the idea is that, on the S3 variable, the user can select their own KMS key, but I'm struggling to retrieve the value. At the moment the resource looks like this:
resource "aws_s3_bucket_server_side_encryption_configuration" "my-s3-buckets" {
count = length(var.s3_buckets)
bucket = var.s3_buckets[count.index].bucket
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = [
for k in var.kms_keys : k.kms_id
if k.kms_description == var.s3_buckets[count.index].kms_description
]
sse_algorithm = "aws:kms"
}
}
}
I thought it was going to work, but Terraform is giving me Inappropriate value for attribute "kms_master_key_id": string required. I would also like that, if no matching key exists, kms_master_key_id is set by default to aws/s3.
The problem seems to be that you are passing a list as kms_master_key_id instead of just a string. The fact that you alternatively want to use "aws/s3" actually makes the fix quite easy:
kms_master_key_id = concat([
  for k in var.kms_keys : k.kms_id
  if k.kms_description == var.s3_buckets[count.index].kms_description
], ["aws/s3"])[0]
That way you first get a list containing the key(s) matching the description, and then append the default S3 key. Afterwards you simply pick the first element of the list: either the first actual match or, if none matched, the default key.
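Put back into the resource from the question, a sketch of the full block could look like this:
resource "aws_s3_bucket_server_side_encryption_configuration" "my-s3-buckets" {
  count  = length(var.s3_buckets)
  bucket = var.s3_buckets[count.index].bucket

  rule {
    apply_server_side_encryption_by_default {
      # Falls back to the AWS-managed "aws/s3" key when no description matches.
      kms_master_key_id = concat([
        for k in var.kms_keys : k.kms_id
        if k.kms_description == var.s3_buckets[count.index].kms_description
      ], ["aws/s3"])[0]
      sse_algorithm = "aws:kms"
    }
  }
}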

Terraform - Iterate over a list generated from a for_each on a data block

UPDATE -> To all folks with this particular problem, I found the solution, it is at the end of this question.
UPDATE 2 -> The last solution that I presented here was WRONG, see the update at the end.
I am retrieving a list of cidr_blocks from a data block to use as a value on an aws_ec2_transit_gateway_route, but so far I have been unable to iterate through that list to get the individual values and set them in the appropriate place.
The important piece of my data_block.tf looks like this:
data "aws_vpc" "account_vpc" {
provider = aws.dev
count = "${length(data.aws_vpcs.account_vpcs.ids)}"
id = element(tolist(data.aws_vpcs.account_vpcs.ids), 0)
}
data "aws_subnet_ids" "account_subnets" {
provider = aws.dev
vpc_id = element(tolist(data.aws_vpcs.account_vpcs.ids), 0)
}
data "aws_subnet" "cidrblocks" {
provider = aws.dev
for_each = data.aws_subnet_ids.account_subnets.ids
id = each.value
}
And the part where I intend to use it is this one, tgw_rt.tf:
resource "aws_ec2_transit_gateway_route" "shared-routes" {
provider = aws.shared
#count = length(data.aws_subnet.cidrblocks.cidr_block)
#destination_cidr_block = lookup(data.aws_subnet.cidrblocks.cidr_block[count.index], element(keys(data.aws_subnet.cidrblocks.cidr_block[count.index]),0), "127.0.0.1/32")
#destination_cidr_block = data.aws_subnet.cidrblocks.cidr_block[count.index]
#destination_cidr_block = [data.aws_subnet.cidrblocks.*.cidr_block]
destination_cidr_block = [for s in data.aws_subnet.cidrblocks : s.cidr_block]
/* for_each = [for s in data.aws_subnet.cidrblocks: {
destination_cidr_block = s.cidr_block
}] */
#destination_cidr_block = [for s in data.aws_subnet.cidrblocks : s.cidr_block]
#destination_cidr_block = data.aws_subnet.cidrblocks.cidr_block[count.index]
transit_gateway_attachment_id = aws_ec2_transit_gateway_vpc_attachment.fromshared.id
transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.shared.id
}
The commented-out lines are what I have tried so far; nothing worked.
The error currently produced by the uncommented line is:
Error: Incorrect attribute value type
on modules/tgw/tgw_rt.tf line 20, in resource "aws_ec2_transit_gateway_route" "shared-routes":
20: destination_cidr_block = [for s in data.aws_subnet.cidrblocks : s.cidr_block]
|----------------
| data.aws_subnet.cidrblocks is object with 3 attributes
Inappropriate value for attribute "destination_cidr_block": string required.
I would really appreciate it if one of the terraform gods present here could shed some light on this problem.
SOLUTION - THIS IS WRONG: Since it was complaining about an object with 3 attributes (3 CIDR blocks), to iterate I had to use this:
destination_cidr_block = element([for s in data.aws_subnet.cidrblocks : s.cidr_block], 0)
CORRECT SOLUTION: The solution was to add a small part to @kyle's suggestion; I had to represent the data as an object and convert it to a map. You rock, @kyle:
for_each = {for s in data.aws_subnet.cidrblocks: s.cidr_block => s}
destination_cidr_block = each.value.cidr_block
Thank you all in advance.
I haven't used data.aws_subnet, but I think you were close with your for_each attempt:
resource "aws_ec2_transit_gateway_route" "shared-routes" {
...
for_each = [for s in data.aws_subnet.cidrblocks: s.cidr_block]
destination_cidr_block = each.value
...
}
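For completeness, folding the asker's final fix back into the full resource from the question gives a sketch like this:
resource "aws_ec2_transit_gateway_route" "shared-routes" {
  provider = aws.shared

  # One route per subnet CIDR block, keyed by the CIDR itself.
  for_each = { for s in data.aws_subnet.cidrblocks : s.cidr_block => s }

  destination_cidr_block         = each.value.cidr_block
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.fromshared.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.shared.id
}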

Modify field at any level with circe-optics

I am trying to transform the "model" field at any level with circe-optics, and I'm having trouble achieving this.
Input:
{
  "model": "ModelExample1",
  "test": {
    "model": "ModelExample2"
  }
}
Expected Output:
{
  "model": "AAAA-ModelExample1",
  "test": {
    "model": "AAAA-ModelExample2"
  }
}
circe-optics does not provide a recursive modification function out of the box. However, you can make one:
import io.circe.Json
import io.circe.optics.JsonPath._

val modifyModel: Json => Json = root.model.string.modify("AAAA-" + _)

def modifyAllModels(value: Json): Json =
  root.each.json.modify(modifyAllModels)(modifyModel(value))
The modification will be applied to all keys, not just test - if you don't want that, swap each for test in modifyAllModels.

BigQuery UDF memory exceeded error on multiple rows but works fine on single row

I'm writing a UDF to process Google Analytics data, and getting the "UDF out of memory" error message when I try to process multiple rows. I downloaded the raw data and found the largest record and tried running my UDF query on that, with success. Some of the rows have up to 500 nested hits, and the size of the hit record (by far the largest component of each row of the raw GA data) does seem to have an effect on how many rows I can process before getting the error.
For example, the query
select
  user.ga_user_id,
  ga_session_id,
  ...
from
  temp_ga_processing(
    select
      fullVisitorId,
      visitNumber,
      ...
    from [79689075.ga_sessions_20160201] limit 100)
returns the error, but
from [79689075.ga_sessions_20160201] where totals.hits = 500 limit 1)
does not.
I was under the impression that any memory limitations were per-row? I've tried several techniques, such as setting row = null; before emit(return_dict); (where return_dict is the processed data) but to no avail.
The UDF itself doesn't do anything fancy; I'd paste it here but it's ~45 kB in length. It essentially does a bunch of things along the lines of:
function temp_ga_processing(row, emit) {
topic_id = -1;
hit_numbers = [];
first_page_load_hits = [];
return_dict = {};
return_dict["user"] = {};
return_dict["user"]["ga_user_id"] = row.fullVisitorId;
return_dict["ga_session_id"] = row.fullVisitorId.concat("-".concat(row.visitNumber));
for(i=0;i<row.hits.length;i++) {
hit_dict = {};
hit_dict["page"] = {};
hit_dict["time"] = row.hits[i].time;
hit_dict["type"] = row.hits[i].type;
hit_dict["page"]["engaged_10s"] = false;
hit_dict["page"]["engaged_30s"] = false;
hit_dict["page"]["engaged_60s"] = false;
add_hit = true;
for(j=0;j<row.hits[i].customMetrics.length;j++) {
if(row.hits[i].customDimensions[j] != null) {
if(row.hits[i].customMetrics[j]["index"] == 3) {
metrics = {"video_play_time": row.hits[i].customMetrics[j]["value"]};
hit_dict["metrics"] = metrics;
metrics = null;
row.hits[i].customDimensions[j] = null;
}
}
}
hit_dict["topic"] = {};
hit_dict["doctor"] = {};
hit_dict["doctor_location"] = {};
hit_dict["content"] = {};
if(row.hits[i].customDimensions != null) {
for(j=0;j<row.hits[i].customDimensions.length;j++) {
if(row.hits[i].customDimensions[j] != null) {
if(row.hits[i].customDimensions[j]["index"] == 1) {
hit_dict["topic"] = {"name": row.hits[i].customDimensions[j]["value"]};
row.hits[i].customDimensions[j] = null;
continue;
}
if(row.hits[i].customDimensions[j]["index"] == 3) {
if(row.hits[i].customDimensions[j]["value"].search("doctor") > -1) {
return_dict["logged_in_as_doctor"] = true;
}
}
// and so on...
}
}
}
if(row.hits[i]["eventInfo"]["eventCategory"] == "page load time" && row.hits[i]["eventInfo"]["eventLabel"].search("OUTLIER") == -1) {
elre = /(?:onLoad|pl|page):(\d+)/.exec(row.hits[i]["eventInfo"]["eventLabel"]);
if(elre != null) {
if(parseInt(elre[0].split(":")[1]) <= 60000) {
first_page_load_hits.push(parseFloat(row.hits[i].hitNumber));
if(hit_dict["page"]["page_load"] == null) {
hit_dict["page"]["page_load"] = {};
}
hit_dict["page"]["page_load"]["sample"] = 1;
page_load_time_re = /(?:onLoad|pl|page):(\d+)/.exec(row.hits[i]["eventInfo"]["eventLabel"]);
if(page_load_time_re != null) {
hit_dict["page"]["page_load"]["page_load_time"] = parseFloat(page_load_time_re[0].split(':')[1])/1000;
}
}
// and so on...
}
}
}
row = null;
emit(return_dict);
}
The job ID is realself-main:bquijob_4c30bd3d_152fbfcd7fd
Update Aug 2016: We have pushed out an update that will allow the JavaScript worker to use twice as much RAM. We will continue to monitor jobs that have failed with JS OOM to see if more increases are necessary; in the meantime, please let us know if you have further jobs failing with OOM. Thanks!
Update: this issue was related to limits we had on the size of the UDF code. It looks like V8's optimize+recompile pass of the UDF code generates a data segment that was bigger than our limits, but this was only happening when the UDF runs over a "sufficient" number of rows. I'm meeting with the V8 team this week to dig into the details further.
@Grayson - I was able to run your job over the entire 20160201 table successfully; the query takes 1-2 minutes to execute. Could you please verify that this works on your side?
We've gotten a few reports of similar issues that seem related to the number of rows processed. I'm sorry for the trouble; I'll be doing some profiling on our JavaScript runtime to try to find if and where memory is being leaked. Stay tuned for the analysis.
In the meantime, if you're able to isolate any specific rows that cause the error, that would also be very helpful.
A UDF will fail on anything but very small datasets if it has a lot of if/then levels, such as:
if () {
  if () {
    if () {
      // etc.
We had to track down and remove the deepest if/then statement.
But that is not enough. In addition, when you pass the data into the UDF, run a "GROUP EACH BY" on all the variables. This will force BQ to send the output to multiple "workers". Otherwise it will also fail.
I've wasted 3 days of my life on this annoying bug. Argh.
I love the concept of parsing my logs in BigQuery, but I've got the same problem; I get:
Error: Resources exceeded during query execution.
The Job Id is bigquery-looker:bquijob_260be029_153dd96cfdb, if that at all helps.
I wrote a very basic parser that does a simple match and returns rows. It works just fine on a 10K-row data set, but I get out of resources when trying to run it against a 3M-row logfile.
Any suggestions for a workaround?
Here is the JavaScript code.
function parseLogRow(row, emit) {
  r = (row.logrow ? row.logrow : "") + (typeof row.l2 !== "undefined" ? " " + row.l2 : "") + (row.l3 ? " " + row.l3 : "")
  ts = null
  category = null
  user = null
  message = null
  db = null
  found = false
  if (r) {
    m = r.match(/^(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d\d\d (\+|\-)\d\d\d\d) \[([^|]*)\|([^|]*)\|([^\]]*)\] :: (.*)/)
    if (m) {
      ts = new Date(m[1]) / 1000
      category = m[3] || null
      user = m[4] || null
      db = m[5] || null
      message = m[6] || null
      found = true
    } else {
      message = r
      found = false
    }
  }
  emit({
    ts: ts,
    category: category,
    user: user,
    db: db,
    message: message,
    found: found
  });
}
bigquery.defineFunction(
  'parseLogRow',          // Name of the function exported to SQL
  ['logrow', "l2", "l3"], // Names of input columns
  [
    {'name': 'ts', 'type': 'timestamp'}, // Output schema
    {'name': 'category', 'type': 'string'},
    {'name': 'user', 'type': 'string'},
    {'name': 'db', 'type': 'string'},
    {'name': 'message', 'type': 'string'},
    {'name': 'found', 'type': 'boolean'},
  ],
  parseLogRow             // Reference to JavaScript UDF
);