What I want to do is set a GPIO pin on my Raspberry Pi whenever an S3 bucket adds or deletes a file. I currently have a Lambda function set to trigger whenever this occurs. The problem now is getting the function to set the flag on the device shadow. What I currently have in my Lambda function is below, but nothing is coming through on my device shadow. My end goal is to have a folder on my Raspberry Pi stay in sync with the bucket whenever a file is added or deleted, without any user input or a cron job.
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('iot-data', region_name='us-west-2')
    # Change topic, qos and payload as needed
    response = client.publish(
        topic='$aws/things/MyThing/shadow/update',
        qos=1,
        payload=json.dumps({"state": {"desired": {"switch": "on"}}})
    )
Go to the CloudWatch log for your Lambda function. What does it say there?
Since you are intending to update the shadow document, have you tried the function "update_thing_shadow"?
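For example, a rough sketch of that approach, reusing the thing name and region from the question (untested, so treat it as a starting point), is below. It is also worth confirming that the Lambda execution role allows iot:UpdateThingShadow (or iot:Publish if you stick with the topic-based publish).
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('iot-data', region_name='us-west-2')
    # update_thing_shadow takes the thing name directly rather than a topic
    response = client.update_thing_shadow(
        thingName='MyThing',
        payload=json.dumps({"state": {"desired": {"switch": "on"}}})
    )
    # the response payload is a streaming body containing the accepted shadow document
    return response['payload'].read().decode('utf-8')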
When creating a Bigquery Data Transfer Service Job Manually through the UI, I can select an option to delete source files after transfer. When I try to use the CLI or the Python Client to create on-demand Data Transfer Service Jobs, I do not see an option to delete the source files after transfer. Do you know if there is another way to do so? Right now, my Source URI is gs://<bucket_path>/*, so it's not trivial to delete the files myself.
This snippet works for me (replace the YOUR-... placeholders with your own values):
from google.cloud import bigquery_datatransfer
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "YOUR-CRED-FILE-PATH"

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

destination_project_id = "YOUR-PROJECT-ID"
destination_dataset_id = "YOUR-DATASET-ID"

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id=destination_dataset_id,
    display_name="YOUR-TRANSFER-NAME",
    data_source_id="google_cloud_storage",
    params={
        "data_path_template": "gs://PATH-TO-YOUR-DATA/*.csv",
        "destination_table_name_template": "YOUR-TABLE-NAME",
        "file_format": "CSV",
        "skip_leading_rows": "1",
        "delete_source_files": True
    },
)

transfer_config = transfer_client.create_transfer_config(
    parent=transfer_client.common_project_path(destination_project_id),
    transfer_config=transfer_config,
)
print(f"Created transfer config: {transfer_config.name}")
In this example, the table YOUR-TABLE-NAME must already exist in BigQuery; otherwise the transfer fails with the error "Not found: Table YOUR-TABLE-NAME".
I used these packages:
google-cloud-bigquery-datatransfer>=3.4.1
google-cloud-bigquery>=2.31.0
Pay attention to the delete_source_files attribute in params. From the docs:
Optional param delete_source_files will delete the source files after each successful transfer. (Delete jobs do not retry if the first effort to delete the source files fails.) The default value for the delete_source_files is false.
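Since the question is about on-demand jobs: once the transfer config exists, you should be able to kick off a run manually with start_manual_transfer_runs. A minimal sketch, assuming the transfer_client and transfer_config from the snippet above:
import time

from google.protobuf import timestamp_pb2
from google.cloud import bigquery_datatransfer

# request a run for "now"; requested_run_time must not be in the future
run_time = timestamp_pb2.Timestamp(seconds=int(time.time()))
response = transfer_client.start_manual_transfer_runs(
    bigquery_datatransfer.StartManualTransferRunsRequest(
        parent=transfer_config.name,
        requested_run_time=run_time,
    )
)
for run in response.runs:
    print(f"Started run: {run.name}")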
I am using CloudWatch dashboards to have an aggregated view of the various services running for the application. I have a number widget for the EnvironmentHealth metric, which displays the enumerated values but not the health statuses such as OK or WARN. In the Beanstalk monitoring view I can see the service health, but I want the same on the dashboard. Please help.
This is what I expect
This is what I see
The CloudWatch EnvironmentHealth metric for Elastic Beanstalk only returns numbers: 0 is OK, 1 is Info, 15 is Warning, etc.
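For reference, the documented mapping from numeric values to health statuses is roughly the following (worth double-checking against the current AWS docs):
# EnvironmentHealth metric values per the Elastic Beanstalk enhanced health docs
ENVIRONMENT_HEALTH = {
    0: "Ok",
    1: "Info",
    5: "Unknown",
    10: "No data",
    15: "Warning",
    20: "Degraded",
    25: "Severe",
}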
Not sure this is the best way, but a custom widget backed by a Lambda function works fine. The Lambda function gets the EB health status and returns it to the CloudWatch dashboard.
Create Lambda function
I used Python 3.9. The default settings are fine, but note that you need to attach the IAM policy AWSElasticBeanstalkReadOnly to the Lambda role so it can read the EB status.
The Lambda Python code is below:
Set EnvironmentName to your EB environment name.
The font size="56 px" color="#444444" values match the CloudWatch dashboard widget style.
Create the function in the same region as the EB environment.
import boto3

def lambda_handler(event, context):
    client = boto3.client("elasticbeanstalk")
    response = client.describe_environment_health(
        EnvironmentName="<your-env-name>", AttributeNames=["HealthStatus"]
    )
    output = (
        """<font size="56 px" color="#444444">"""
        + response["HealthStatus"]
        + """</font>"""
    )
    return output
Create Custom widget on CloudWatch dashboard
Add widget -> Custom widget -> Next -> Select the Lambda function above -> Create widget
This is my dashboard.
How do I activate a Data Pipeline when new files arrive on S3? For EMR scheduling I am currently triggering via SNS when new files arrive on S3.
You can activate a Data Pipeline without using SNS when files arrive in an S3 location:
Create an S3 event notification that invokes a Lambda function (a scripted way to set this up is sketched after the code below).
Create the Lambda function (make sure the role you assign to it has S3, Lambda, and Data Pipeline permissions).
Paste the code below into the Lambda function to activate the Data Pipeline (substitute your own pipeline ID):
import boto3

def lambda_handler(event, context):
    try:
        client = boto3.client('datapipeline', region_name='ap-southeast-2')
        data_pipeline_id = "df-09312983K28XXXXXXXX"
        # make sure the pipeline exists before activating it
        response_pipeline = client.describe_pipelines(pipelineIds=[data_pipeline_id])
        activate = client.activate_pipeline(pipelineId=data_pipeline_id, parameterValues=[])
    except Exception as e:
        raise Exception("Pipeline is not found or not active") from e
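If you'd rather set up the S3 event notification from a script instead of the console, a rough boto3 sketch is below. The bucket name and Lambda ARN are placeholders, and the Lambda must already grant s3.amazonaws.com permission to invoke it (via lambda add-permission) before this call will succeed.
import boto3

# placeholders -- replace with your bucket and the ARN of the Lambda above
bucket = "my-input-bucket"
lambda_arn = "arn:aws:lambda:ap-southeast-2:123456789012:function:activate-pipeline"

s3 = boto3.client("s3")
s3.put_bucket_notification_configuration(
    Bucket=bucket,
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)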
We had a third party create a Python-based image thumbnail script that we set up to trigger on an S3 ObjectCreated event. We then imported a collection of close to 5,000 images after testing the script, but the sheer volume of image files ended up filling up the Lambda temp space during the import, and only about 12% of the images ended up having thumbnails created for them.
We need to manually create thumbnails for the other 88%. While I have a PHP-based script I can run from EC2, it's somewhat slow. It occurs to me that I could create them 'on demand' and avoid having to create thumbnails for all of the files that didn't auto-create during the import.
Some of the files may never be accessed again by a customer. The existing Lambda thumbnailer already has a slight delay that I account for in a JavaScript setTimeout retry loop. Before invoking this loop, whenever a thumbnail is not found, I could conceivably check whether it's a recent upload (e.g. within the last 10 seconds) and then trigger the Lambda manually before starting the retry loop.
But to do this, I need to be able to trigger the Lambda script with parameters similar to the event trigger. It appears their script only accesses the bucket name and key from the event values:
bucket = event['Records'][0]['s3']['bucket']['name']
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
Being unfamiliar with Lambda and still somewhat new to the SDK, I am not sure how to do a Lambda invocation that would include those values for the Python script.
I can use either the php sdk or the javascript sdk. (or even the cli)
Any help is appreciated.
I think I figured it out: copying the data structure from the Python references to create a bare-bones payload and invoking it as an Event:
$lambda = $awsSvc->getAwsSdkCached()->createLambda();

// bucket = event['Records'][0]['s3']['bucket']['name']
// key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
$bucket = "mybucket";
$key = "somefolder/someimage.jpg";

$payload_json = sprintf('{"Records":[{"s3":{"bucket":{"name":"%s"},"object":{"key":"%s"}}}]}', $bucket, $key);

$params = array(
    'FunctionName' => 'ThumbnailGenerator',
    'InvocationType' => 'Event',
    'LogType' => 'Tail',
    'Payload' => $payload_json
);

$result = $lambda->invoke($params);
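In case it helps anyone doing the same from Python, the equivalent invocation with boto3 looks roughly like this (same hypothetical bucket and key as above). Since the thumbnailer runs the key through unquote_plus, keys containing spaces or special characters should be URL-encoded in the payload, just as S3 does in real event notifications.
import json
import boto3

# same bare-bones payload as the PHP example above
payload = {
    "Records": [
        {"s3": {"bucket": {"name": "mybucket"},
                "object": {"key": "somefolder/someimage.jpg"}}}
    ]
}

client = boto3.client("lambda")
client.invoke(
    FunctionName="ThumbnailGenerator",
    InvocationType="Event",  # asynchronous, like the PHP 'Event' invocation type
    Payload=json.dumps(payload),
)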
I'm trying to organize a large number of CloudWatch alarms for maintainability, and the web console grays out the name field on an edit. Is there another method (preferably something scriptable) for updating the name of CloudWatch alarms? I would prefer a solution that does not require any programming beyond simple executable scripts.
Here's a script we use to do this for the time being:
import sys

import boto

def rename_alarm(alarm_name, new_alarm_name):
    conn = boto.connect_cloudwatch()

    def get_alarm():
        alarms = conn.describe_alarms(alarm_names=[alarm_name])
        if not alarms:
            raise Exception("Alarm '%s' not found" % alarm_name)
        return alarms[0]

    alarm = get_alarm()

    # work around boto comparison serialization issue
    # https://github.com/boto/boto/issues/1311
    alarm.comparison = alarm._cmp_map.get(alarm.comparison)

    alarm.name = new_alarm_name
    conn.update_alarm(alarm)
    # update actually creates a new alarm because the name has changed, so
    # we have to manually delete the old one
    get_alarm().delete()

if __name__ == '__main__':
    alarm_name, new_alarm_name = sys.argv[1:3]
    rename_alarm(alarm_name, new_alarm_name)
It assumes you're either on an EC2 instance with a role that allows this, or you've got a ~/.boto file with your credentials. It's easy enough to add yours manually.
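If you're on boto3 rather than the legacy boto, a similar approach is to copy the alarm definition, put it under the new name, and delete the original. A rough sketch (the field list may need adjusting for newer alarm types such as metric-math or composite alarms):
import sys

import boto3

# describe_alarms fields that put_metric_alarm also accepts
COPYABLE_KEYS = [
    "AlarmDescription", "ActionsEnabled", "OKActions", "AlarmActions",
    "InsufficientDataActions", "MetricName", "Namespace", "Statistic",
    "ExtendedStatistic", "Dimensions", "Period", "Unit",
    "EvaluationPeriods", "DatapointsToAlarm", "Threshold",
    "ComparisonOperator", "TreatMissingData",
]

def rename_alarm(alarm_name, new_alarm_name):
    cw = boto3.client("cloudwatch")
    alarms = cw.describe_alarms(AlarmNames=[alarm_name])["MetricAlarms"]
    if not alarms:
        raise Exception("Alarm '%s' not found" % alarm_name)
    # rebuild the alarm from its copyable fields under the new name
    new_alarm = {k: v for k, v in alarms[0].items() if k in COPYABLE_KEYS}
    new_alarm["AlarmName"] = new_alarm_name
    cw.put_metric_alarm(**new_alarm)
    # put_metric_alarm created a second alarm, so remove the original
    cw.delete_alarms(AlarmNames=[alarm_name])

if __name__ == "__main__":
    rename_alarm(sys.argv[1], sys.argv[2])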
Unfortunately it looks like this is not currently possible.
I looked around for the same solution, but it seems neither the console nor the CloudWatch API provides that feature.
Note: you can, however, copy the existing alarm with the same parameters and save it under a new name.