AWS EventBridge Rule Ignoring Key Prefix & Suffix Matching - amazon-s3

I have an EventBridge rule set up so that when I drop a file into an S3 bucket it triggers a Step Function.
I only want to trigger this rule when:
A file is in a folder called files/ (prefix: "files/")
The file is a CSV (suffix: ".csv")
However, this rule is being triggered for any file, regardless of its prefix and suffix. For instance, I dropped a .pdf file in and it triggered the Step Function.
{
  "detail-type": ["Object Created"],
  "source": ["aws.s3"],
  "detail": {
    "bucket": {
      "name": ["my-files-bucket"]
    },
    "object": {
      "key": [{
        "prefix": "files/"
      }, {
        "suffix": ".csv"
      }]
    }
  }
}

This is the expected behaviour. EventBridge treats multiple values in an array as an OR condition, so events will match your pattern if the object key begins with files/ OR ends with .csv.
As far as I know, it's not possible to apply an AND condition to a single field.
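If wildcard matching is available in your account (a newer EventBridge pattern operator; I'm treating its availability and exact syntax as an assumption, so check the current event pattern documentation), a single wildcard filter could express both conditions at once. A sketch:
{
  "detail-type": ["Object Created"],
  "source": ["aws.s3"],
  "detail": {
    "bucket": {
      "name": ["my-files-bucket"]
    },
    "object": {
      "key": [{
        "wildcard": "files/*.csv"
      }]
    }
  }
}
Otherwise, a common fallback is to keep the broader rule and have the target (the Step Function itself, or a small Lambda in front of it) ignore any key that does not satisfy both conditions.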

Related

CloudWatch rule to match SSM hierarchy

I'd like to create a CloudWatch rule that triggers an action whenever an SSM parameter in a given hierarchy is updated (in my example, anything in the /config hierarchy).
If I put a rule matching the full name of the parameter, the action gets triggered correctly.
This is what I have tried so far:
{
  "source": [
    "aws.ssm"
  ],
  "detail-type": [
    "Parameter Store Change"
  ],
  "detail": {
    "name": [
      "/config/",
      "/config/*",
      "/config/%"
    ],
    "operation": [
      "Update"
    ]
  }
}
Is there any way to achieve this behaviour?
Not exactly what you want, but you can leave off the "name" array entirely. You will then get notifications for all parameters and can filter on the receiving side.
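For example, the rule pattern would shrink to something like this (a sketch of that suggestion, with the "name" filter dropped):
{
  "source": [
    "aws.ssm"
  ],
  "detail-type": [
    "Parameter Store Change"
  ],
  "detail": {
    "operation": [
      "Update"
    ]
  }
}
The target (for instance a Lambda function) would then inspect detail.name on each event and only act when it starts with /config/.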

GCP BigQuery: Can't query Stackdriver access logs exported to Cloud Storage because of invalid JSON field "#type"

I store the access log of a pixel image in a Cloud Storage bucket dev-access-log-bucket using the standard "sink",
so the files look like this: requests/2019/05/08/15:00:00_15:59:59_S1.json
and one line looks like this (I formatted the JSON, but it's on one line normally):
{
  "httpRequest": {
    "cacheLookup": true,
    "remoteIp": "93.24.25.190",
    "requestMethod": "GET",
    "requestSize": "224",
    "requestUrl": "https://dev-snowplow.legalstart.fr/one_pixel_image.png?user_id=0&action=purchase&product_id=0&money=10",
    "responseSize": "779",
    "status": 200,
    "userAgent": "python-requests/2.21.0"
  },
  "insertId": "w6wyz1g2jckjn6",
  "jsonPayload": {
    "#type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
    "statusDetails": "response_sent_by_backend"
  },
  "logName": "projects/tracking-pixel-239909/logs/requests",
  "receiveTimestamp": "2019-05-08T15:34:24.126095758Z",
  "resource": {
    "labels": {
      "backend_service_name": "",
      "forwarding_rule_name": "dev-yolaw-pixel-forwarding-rule",
      "project_id": "tracking-pixel-239909",
      "target_proxy_name": "dev-yolaw-pixel-proxy",
      "url_map_name": "dev-urlmap",
      "zone": "global"
    },
    "type": "http_load_balancer"
  },
  "severity": "INFO",
  "spanId": "7d8823509c2dc94f",
  "timestamp": "2019-05-08T15:34:23.140747307Z",
  "trace": "projects/tracking-pixel-239909/traces/bb55577eedd5797db2867931f8de9162"
}
All of this, once again, is standard GCP output; I did not customize anything here.
Now I want to run some queries on it from BigQuery, so I created a dataset and an external table configured like this:
External Data Configuration
Source URI(s): gs://dev-access-log-bucket/requests/*
Auto-detect schema: true (note: I don't know why it shows true even though I've manually defined the schema)
Ignore unknown values: true
Source format: NEWLINE_DELIMITED_JSON
Max bad records: 0
and the following manual schema:
timestamp DATETIME REQUIRED
httpRequest RECORD REQUIRED
httpRequest.requestUrl STRING REQUIRED
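(For reference, I believe the same configuration expressed as a bq table definition file would look roughly like this:)
{
  "sourceFormat": "NEWLINE_DELIMITED_JSON",
  "sourceUris": ["gs://dev-access-log-bucket/requests/*"],
  "ignoreUnknownValues": true,
  "maxBadRecords": 0,
  "schema": {
    "fields": [
      {"name": "timestamp", "type": "DATETIME", "mode": "REQUIRED"},
      {"name": "httpRequest", "type": "RECORD", "mode": "REQUIRED", "fields": [
        {"name": "requestUrl", "type": "STRING", "mode": "REQUIRED"}
      ]}
    ]
  }
}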
and when I run a query
SELECT
timestamp
FROM
`path.to.my.table`
LIMIT
1000
I got
Invalid field name "#type". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.
How can I work around this without having to pre-process the logs to remove the "#type" field?

Validating correctness of $ref in a JSON schema

The requirement is to validate a given JSON schema and make sure there are no dangling $refs pointing to definitions within the file.
{
  "$schema": "http://json-schema.org/draft-6/schema#",
  "definitions": {
    "date": {
      "type": "string",
      "pattern": "^(0?[1-9]|[12][0-9]|3[01])\\-(0?[1-9]|1[012])\\-\\d{4}$"
    }
  },
  "properties": {
    "my_date": {"$ref": "#/definitions/dat"}
  }
}
Here, there is a typo in the reference (dat instead of date). I want to catch such instances rather than having a runtime failure.
Library being used: https://github.com/java-json-tools/json-schema-validator
You could validate that each use of $ref resolves by digesting the JSON, recursively extracting the value of every $ref, splitting it on slashes, and checking that the path exists.
This could get more complicated, as you might also have external references that target URLs.
I can't give you any code as I don't know Java. It doesn't seem like what you want is specifically available in that library.
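As a rough sketch of that approach in Java using Jackson rather than the json-schema-validator library itself (it handles local "#/..." references only and ignores JSON Pointer escape sequences such as ~0 and ~1):
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DanglingRefChecker {

    // Walk the whole document and record every local "$ref" that does not resolve.
    static void collect(JsonNode root, JsonNode node, List<String> dangling) {
        if (node.isObject()) {
            JsonNode ref = node.get("$ref");
            if (ref != null && ref.isTextual() && ref.asText().startsWith("#")) {
                // "#/definitions/date" becomes the JSON Pointer "/definitions/date"
                String pointer = ref.asText().substring(1);
                if (root.at(pointer).isMissingNode()) {
                    dangling.add(ref.asText());
                }
            }
            Iterator<Map.Entry<String, JsonNode>> fields = node.fields();
            while (fields.hasNext()) {
                collect(root, fields.next().getValue(), dangling);
            }
        } else if (node.isArray()) {
            for (JsonNode child : node) {
                collect(root, child, dangling);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        JsonNode schema = new ObjectMapper().readTree(new File(args[0]));
        List<String> dangling = new ArrayList<>();
        collect(schema, schema, dangling);
        dangling.forEach(r -> System.out.println("Dangling $ref: " + r));
    }
}
Run against the schema above, this should report #/definitions/dat as dangling, which is exactly the class of typo you want to catch before runtime.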

Sublime Text 3 - apply shortcut only to specific file types

I am using Sublime Text 3, and I installed JSFormat to format my .js files and configured the key binding like this:
{ "keys": ["ctrl+shift+f"], "command": "js_format" }
Now, I also want to be able to format my .css and .html files, so I found this shortcut:
{ "keys": ["ctrl+shift+f"], "command": "reindent" , "args": { "single_line": false } }
I want to use js_format for my .js files and use reindent for my .css and .html files.
Is it possible to specify a file type per shortcut?
Update
This apparently no longer works in Sublime Text 4.
Update
I've since discovered that this is a duplicate of Sublime Text 3: how to bind a shortcut to a specific file extension?
Original Answer
Add a context:
{
"keys": ["ctrl+shift+f"],
"command": "js_format",
"context": [
{
"key": "selector",
"operator": "equal",
"operand": "source.js"
}
]
}
The important part is setting operand to source.js. You can replace js with whatever file extension you want. You can also specify additional sources with commas. For example, this would cause a command to apply to all .html and .css files:
{ "key": "selector", "operator": "equal", "operand": "source.html, source.css" }
See the unofficial documentation on key bindings.
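For the reindent half of the question, the same context mechanism should work. Here is a sketch that reuses the answer's source.html, source.css operand (note that in some setups HTML is scoped as text.html rather than source.html, so adjust the operand if the binding doesn't fire):
{
  "keys": ["ctrl+shift+f"],
  "command": "reindent",
  "args": { "single_line": false },
  "context": [
    {
      "key": "selector",
      "operator": "equal",
      "operand": "source.html, source.css"
    }
  ]
}
With both bindings in your user key bindings file, the same shortcut should run js_format in JavaScript files and reindent in CSS/HTML files.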

AWS data pipeline activity with multiple inputs

As part of an AWS Data Pipeline, I have a Hive activity that uses two unstaged S3 data nodes as input. What I want is to be able to set two script variables on the activity, each pointing to one of the input data nodes, but I can't get the syntax right. With a single input, I could write the following and it would work just fine:
INPUT_FOO=#{input.directoryPath}
When I add the second input, I run into a problem of how to reference them since they are now an array of inputs, as you can see in the pipeline definition below. Essentially, I want to achieve the following, but can't figure out the correct syntax:
INPUT_FOO=#{input[1].directoryPath}
INPUT_BAR=#{input[2].directoryPath}
Here's the activity portion of the pipeline definition:
{
  "id": "ActivityId_7u1sR",
  "input": [
    {
      "ref": "DataNodeId_iYnxf"
    },
    {
      "ref": "DataNodeId_162Ka"
    }
  ],
  "schedule": {
    "ref": "DefaultSchedule"
  },
  "scriptUri": "#{myS3ScriptLocation}calculate-results.q",
  "name": "Perform Calculations",
  "runsOn": {
    "ref": "EmrClusterId_jHeiV"
  },
  "scriptVariable": [
    "INPUT_SOURCE1=#{input[1].directoryPath}",
    "OUTPUT=#{output.directoryPath}Results/",
    "INPUT_SOURCE2=#{input[2].directoryPath}"
  ],
  "output": {
    "ref": "DataNodeId_2jY6v"
  },
  "type": "HiveActivity",
  "stage": "false"
}
I plan to keep the tables unstaged and take care of table creation in the hive script so that it's easier to run each Hive activity in isolation as well as in the pipeline itself.
Here's the error I see when using array syntax:
Unable to resolve input[1].directoryPath for object ActivityId_7u1sR
As it stands now, this scenario is not supported, but a feature request was added to support it in the future.