Amazon SNS Notifications on S3 objects - amazon-s3

We would like to publish a message to SNS Topic subscribers, and the notification can happen in different forms. One is if a new object is placed there will be one notification and if a object is updated then there will be another notification and the same way if an object deleted.
Example:
SNS Topic: MyTopic
S3 Path: s3://mybucket/myfolder1/myfolder2/myfolder3/
Task:
1. So if I add a file1 to this S3 path, then a notifications should be sent out to all subscribers of MyTopic.
2. If file1 has updated, then a notification should be sent out
3. If file1 has deleted, then a notification should be sent out.
Is there a IAM policy we can define that matches to all these criteria.

Related

Download all objects from S3 Bucket and send content to SQS

I am using python boto3 to get all the objects in a bucket but it returns the keys and not the content
I have a service which reads messages from SQS (a duplicate message is also present in s3 bucket) and does some operations. I have lost some sqs messages because of some failures and sqs 14 day policy.
The files have json data, with each file ranging from 4-8kb.
Now I want to re-drive all the objects from s3 to SQS.
Is there a way to get content of all files and then transfer them to SQS ?
Turning John Rotenstein's comment into an answer:
Is there a way to get content of all files and then transfer them to SQS ?
No. You will have to write something yourself, probably in the same way that you stored it in the first place. There is no automated method to move data from S3 to SQS

redis pub sub only for a certain set of keys?

I have a use case in which I want to enable notification only for a certain set of keys, so that when those keys expire I can get a notification from redis.
I have followed this answer to implement this.
I have set parameter notify-keyspace-events to "Ex"
To accomplish this I am adding keys that I want notification for in DB-0 and the other keys in DB-1. But I am recieveing notification for both the DBs. Is there any way to just get notification from a particular DB?
According to redis documentation :
"Redis can notify Pub/Sub clients about events happening in the key space.
This feature is documented at http://redis.io/topics/notifications
For instance if keyspace events notification is enabled, and a client
performs a DEL operation on key "foo" stored in the Database 0, two
messages will be published via Pub/Sub:
PUBLISH keyspace#0:foo del
PUBLISH keyevent#0:del foo
"
But I am receiving notification from both DB-0 and DB-1.
PS : I know I can filter keys in my application, but I store too many expiring keys in redis and sending notification for all the expiring will increase load on my redis server.
I think you subscribed to a pattern that matches all DBs' notification message, e.g. PSUBSCRIBE __key*__:*.
In fact, you can specify the db index in the subscribed pattern: PSUBSCRIBE __keyspace#0__:* and PSUBSCRIBE __keyevent#0__:*. This way, you'll only received notification of DB0.

Publish a list of "objects" to SNS in one message, then subscribe with a master lambda, feed to slaves

Wondering if this is recommended.
With a master lambda, subscribe to an SNS topic where I publish a message which contains a list of source/destination pairs for s3. Say 100 per message.
The master lambda with then loop though those pairs and call a slave worker for each item in the list which will do the copy of S3 objects from source to destination.
I was originally trying to use SQS but SQS is not an event source for lambda. Cloudwatch Events are too murky as how you pass actual data as payload.
So wondering if my approach above is valid and will hold up, or if there's a better alternative.
Thanks

Get notified when user uploads to an S3 bucket? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Notification of new S3 objects
We've got an app that stores user data on S3. The part of our app that handles the uploads is decoupled from the part that processes the data. In some cases, the user will be able to upload data directly to S3 without going through our app at all (this may happen if they have their own S3 account and supply us with credentials).
Is it possible to get notified whenever the contents of an S3 bucket change? It would be cool if somehow a message could get sent that says "this file was added/updated/deleted: foo".
Short of that, is there some timestamp somewhere I could poll that would tell the last time the bucket was updated?
If I can't do either of these things, then the only alternative is the crawl the entire bucket and look for changes. This will be slow and expensive.
Update 2014-11:
As Alan Illing points out in the comments, AWS now supports notifications from S3 to SNS, which can be forwarded automatically to SQS: http://aws.amazon.com/blogs/aws/s3-event-notification/
S3 can also send notifications to AWS Lambda to run your own code directly.
Original response that predicted S3->SNS notifications:
If Amazon supported this, they would use SNS to send out notifications that an object has been added to a bucket. However, at the moment, the only bucket event supported by S3 and SNS is to notify you when Amazon S3 detects that it has lost all replicas of a Reduced Redundancy Storage (RRS) object and can no longer service requests for that object.
Here's the documentation on the SNS events supported by S3:
http://docs.amazonwebservices.com/AmazonS3/latest/dev/NotificationHowTo.html
Based on the way that the documentation is written, it looks like Amazon has ideas for other notification events to add (like perhaps your idea for finding out when new keys have been added).
Given that it isn't supported directly by Amazon, the S3 client that uploads the object to S3 will need to trigger the notification, or you will need to do some sort of polling.
Custom event notification for uploads to S3 could be done using SNS if you like to get near-real-time updates for processing, or it can be done through SQS if you like to let the notifications pile up and process them out of a queue at your own pace.
If you are polling, you could reduce the number of keys you need to request by having the client upload with a prefix of, say, "unprocessed/..." followed by the unique key. Your polling software can then query just S3 keys starting with that prefix. When it is ready to process, it could change the key to "processing/..." and then later to "processed/..." or whatever. Objects in S3 are currently renamed by copy+delete operations performed by S3.

How can I get notification about new S3 objects?

I have a scenario where we have many clients uploading to s3.
What is the best approach to knowing that there is a new file?
Is it realistic/good idea, for me to poll the bucket ever few seconds?
UPDATE:
Since November 2014, S3 supports the following event notifications:
s3:ObjectCreated:Put – An object was created by an HTTP PUT operation.
s3:ObjectCreated:Post – An object was created by HTTP POST operation.
s3:ObjectCreated:Copy – An object was created an S3 copy operation.
s3:ObjectCreated:CompleteMultipartUpload – An object was created by the completion of a S3 multi-part upload.
s3:ObjectCreated:* – An object was created by one of the event types listed above or by a similar object creation event added in the future.
s3:ReducedRedundancyObjectLost – An S3 object stored with Reduced Redundancy has been lost.
These notifications can be issued to Amazon SNS, SQS or Lambda. Check out the blog post that's linked in Alan's answer for more information on these new notifications.
Original Answer:
Although Amazon S3 has a bucket notifications system in place it does not support notifications for anything but the s3:ReducedRedundancyLostObject event (see the GET Bucket notification section in their API).
Currently the only way to check for new objects is to poll the bucket at a preset time interval or build your own notification logic in the upload clients (possibly based on Amazon SNS).
Push notifications are now built into S3:
http://aws.amazon.com/blogs/aws/s3-event-notification/
You can send notifications to SQS or SNS when an object is created via PUT or POST or a multi-part upload is finished.
Your best option nowadays is using the AWS Lambda service. You can write a Lambda using either node.js javascript, java or Python (probably more options will be added in time).
The lambda service allows you to write functions that respond to events from S3 such as file upload. Cost effective, scalable and easy to use.
You can implement a pub-sub mechanism relatively simply by using SNS, SQS, and AWS Lambda. Please see the below steps. So, whenever a new file is added to the bucket a notification can be raised and acted upon that (everything is automated)
Please see attached diagram explaining the basic pub-sub mechanism
Step 1
Simply configure the S3 bucket event notification to notify an SNS topic. You can do this from the S3 console (Properties tab)
Step 2
Make an SQS Queue subscribed to this topic. So whenever an object is uploaded to the S3 bucket a message will be added to the queue.
Step 3
Create an AWS Lambda function to read messages from the SQS queue. AWS Lambda supports SQS events as a trigger. Therefore, whenever a message appears in the SQS queue, Lambda will trigger and read the message. Once successfully read the message it will be automatically deleted from the queue. For the messages that can't be processed by Lambda (erroneous messages) will not be deleted. So those messages will pile up in the queue. To prevent this behavior using a Dead Letter Queue (DLQ) is a good idea.
In your Lambda function, add your logic to handle what to do when users upload files to the bucket
Note: DLQ is nothing more than a normal queue.
Step 4
Debugging and analyzing the process
Make use of AWS Cloud watch to log details. Each Lambda function creates a log under log groups. This is a good place to check if something went wrong.