Kinesis Data Firehose log data is in encrypted form - amazon-s3

I created an AWS Cross-Account Log Data Sharing with Subscriptions setup
by following this link.
After creating the Kinesis stream, I created a Kinesis Data Firehose delivery stream to save the logs in an S3 bucket.
Log files are being created in the S3 bucket, but their contents appear to be encrypted.
On the sender side there is no KMS key ID.
How can I view the logs?
I am also not able to decode them manually as Base64.
Update:
I found that the log files stored in the S3 bucket have the content type "application/octet-stream", and I have been updating the content type to "text/plain" manually.
Is there a way to set the content type at the bucket level, or to configure it in the Kinesis data stream or Firehose? In other words, can I set the content type on the stream, or set a default content type for the S3 folder?

The data you posted appears to be compressed (I'd need a short file sample to say that for certain). If I were you, I'd look into the compression settings for the log stream.
This page describes the different compression formats available for the data: https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
In a pinch, if you save that data to a file and give it the ".gz" extension, does the file become readable? (I'm not too hopeful, since that page says the default compression scheme is Snappy rather than GZIP, and I might be mistaken, but I think I see a ZIP preamble after some kind of header in your screenshot.)
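If it helps, here is a minimal sketch of that check in Python with boto3; the bucket and key names are placeholders. It downloads one delivered object and tries to gunzip it, falling back to the raw bytes if it isn't GZIP:

import gzip
import boto3

s3 = boto3.client("s3")

# Placeholder bucket/key: substitute the Firehose delivery bucket and one delivered object.
obj = s3.get_object(Bucket="my-log-bucket", Key="2019/07/23/21/delivery-stream-output")
raw = obj["Body"].read()

try:
    text = gzip.decompress(raw).decode("utf-8")
    print("Object was gzip-compressed:")
except OSError:
    # Not GZIP; show the raw bytes (they may be Snappy-framed or plain text instead).
    text = raw.decode("utf-8", errors="replace")
    print("Object was not gzip-compressed; raw contents:")

print(text[:2000])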

Related

S3 : Put Vs Multipart Notification clarification

I wanted to confirm the behavior I am seeing. I am using Amplify.Storage.uploadFile to upload a file. The file being uploaded can be of any size, and the Amplify SDK seems to decide the upload mechanism based on the file size. I am listening to SQS notifications on upload. This is the behavior I see:
Enable only the multipart upload complete notification:
Smaller file size: receive nothing.
Larger file size: receive a single ObjectCreated:CompleteMultipartUpload event.
Problem: I miss out on the smaller file.
Enable both the PUT and multipart upload complete notifications:
Smaller file size: receive a PUT event.
Larger file size: get multiple ObjectCreated:CompleteMultipartUpload events.
Problem: I don't know which of the notifications to listen to for the larger file. I don't know whether anything is guaranteed about the timing of the multiple notifications. Can I assume that if I try to read the file and the multipart upload has not truly finished, the download will fail, and I can therefore ignore that notification?
Thoughts?
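For reference, a notification setup like the one described above could be declared with boto3 roughly as follows; the bucket name and queue ARN are placeholders. (S3 also offers the broader s3:ObjectCreated:* event, which covers PUT, POST, COPY, and completed multipart uploads in a single rule.)

import boto3

s3 = boto3.client("s3")

# Placeholder bucket and queue ARN.
s3.put_bucket_notification_configuration(
    Bucket="my-upload-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:upload-events",
                # Subscribes to both single PUT uploads and completed multipart uploads.
                "Events": ["s3:ObjectCreated:Put", "s3:ObjectCreated:CompleteMultipartUpload"],
            }
        ]
    },
)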

Mismatch between my kinesis data and s3 data. Why is this?

The data when I run get-records on my Kinesis stream:
aws kinesis get-shard-iterator --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --stream-name <kinesis_stream> --profile sandbox
aws kinesis get-records --shard-iterator <some long iterator> --profile sandbox
looks like this:
{
    "SequenceNumber": "49597879057469488670276149632780729413492497034093002754",
    "ApproximateArrivalTimestamp": 1563920035.139,
    "Data": "<some very long data encoded/encrypted/",
    "PartitionKey": "84b15621-f823-43f6-acc7-069a2acfdea1"
}
This Kinesis stream is linked to a Kinesis Firehose delivery stream, which is linked to S3, but my bucket objects look like this:
{"type":"DatabaseActivityMonitoringRecords","version":"1.0","databaseActivityEvents":"<some long event encrypted/encoded>"}
Why is there this mismatch? Where does the transformation from Kinesis to S3 take place? What is get-records actually getting me? What does the Kinesis data represent? What do my S3 events represent?
For context, I am using an Aurora database with a database activity stream connected to Kinesis -> Kinesis Firehose -> S3.
Please see the answer I posted here:
How is data in kinesis decrypted before hitting s3
These 2 questions are very similar.
Why is there this mismatch?
All data in the Kinesis stream is Base64-encoded, so depending on the client you use to view it, you may see either the encoded or the decoded version. For example, the Node library will decode it for you; the AWS CLI does not decode the message.
Where is the transformation from kinesis to s3 taking place?
The internal AWS event handlers perform the decoding before storing the data in S3. You can't see it, but essentially it's just a lambda moving data from Kinesis -> S3 for you.
What is get-records actually getting me?
It gives you your data, plus information on "where" it is in the Kinesis stream.
What does the kinesis data represent?
What do my S3 events represent?
They should both represent your data. Kinesis responses come with additional decoration to identify where each record is in the stream, so you can go back and find it later. S3 stores the raw decoded data.
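As a quick illustration of the Base64 point above, here is a small Python sketch that decodes the Data field from the get-records output shown in the question; the payload string below is just a placeholder:

import base64

# A record shaped like the `aws kinesis get-records` output in the question;
# the Data value here is a placeholder, not real stream data.
record = {
    "SequenceNumber": "49597879057469488670276149632780729413492497034093002754",
    "Data": "eyJleGFtcGxlIjogInBsYWNlaG9sZGVyIn0=",
    "PartitionKey": "84b15621-f823-43f6-acc7-069a2acfdea1",
}

# The AWS CLI leaves Data Base64-encoded; decoding gives the raw payload bytes.
raw = base64.b64decode(record["Data"])
print(raw)

# For a database activity stream the decoded payload may itself still be
# encrypted, which is consistent with the S3 object shown above.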

Download all objects from S3 Bucket and send content to SQS

I am using Python boto3 to get all the objects in a bucket, but it returns the keys and not the content.
I have a service that reads messages from SQS (a duplicate of each message is also present in the S3 bucket) and does some operations. I have lost some SQS messages because of some failures and the SQS 14-day retention policy.
The files contain JSON data, with each file ranging from 4-8 KB.
Now I want to re-drive all the objects from S3 to SQS.
Is there a way to get the content of all the files and then transfer them to SQS?
Turning John Rotenstein's comment into an answer:
Is there a way to get the content of all the files and then transfer them to SQS?
No. You will have to write something yourself, probably in the same way that you stored the data in the first place. There is no automated method to move data from S3 to SQS.
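A minimal sketch of that re-drive in Python with boto3, assuming placeholder bucket and queue names: it pages through the bucket, reads each JSON object, and sends the body as an SQS message.

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

# Placeholders: substitute your bucket name and queue URL.
bucket = "my-message-archive-bucket"
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for item in page.get("Contents", []):
        # Each object is a small (4-8 KB) JSON file, well under the 256 KB SQS message limit.
        body = s3.get_object(Bucket=bucket, Key=item["Key"])["Body"].read().decode("utf-8")
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)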

Sending Files through Kafka

We are using Kafka streams to send image files from one web service to another to maintain persistence. During the process we convert the image into an array of bytes, then convert that into a Base64-encoded string, and send it as a message along with other file metadata. Due to the large image file size, our Kafka cluster is going down.
Is there any better option to stream these images through Kafka along with the file metadata?
You might want to have the Kafka producer compress the data. The Spring Cloud Stream Kafka binder lets you specify the compression type with the property: spring.cloud.stream.kafka.bindings.<channelName>.producer.compressionType
This property accepts snappy and gzip as compression values, and none for no compression (the default).
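For illustration, in application.properties this could look like the line below, where output is a hypothetical channel name:

# Hypothetical channel name "output"; enables gzip compression on the producer side.
spring.cloud.stream.kafka.bindings.output.producer.compressionType=gzip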

Get notified when user uploads to an S3 bucket? [duplicate]

Possible Duplicate:
Notification of new S3 objects
We've got an app that stores user data on S3. The part of our app that handles the uploads is decoupled from the part that processes the data. In some cases, the user will be able to upload data directly to S3 without going through our app at all (this may happen if they have their own S3 account and supply us with credentials).
Is it possible to get notified whenever the contents of an S3 bucket change? It would be cool if somehow a message could get sent that says "this file was added/updated/deleted: foo".
Short of that, is there some timestamp somewhere that I could poll to tell me the last time the bucket was updated?
If I can't do either of these things, then the only alternative is to crawl the entire bucket and look for changes. This will be slow and expensive.
Update 2014-11:
As Alan Illing points out in the comments, AWS now supports notifications from S3 to SNS, which can be forwarded automatically to SQS: http://aws.amazon.com/blogs/aws/s3-event-notification/
S3 can also send notifications to AWS Lambda to run your own code directly.
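For what it's worth, hooking a bucket up to an SNS topic (or an SQS queue or Lambda function) for new-object events can be done with boto3 along these lines; the bucket name and topic ARN are placeholders, and the topic policy must already allow S3 to publish to it:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket and SNS topic ARN; the topic policy must permit S3 to publish.
s3.put_bucket_notification_configuration(
    Bucket="my-user-data-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:s3-object-created",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)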
Original response that predicted S3->SNS notifications:
If Amazon supported this, they would use SNS to send out notifications that an object has been added to a bucket. However, at the moment, the only bucket event supported by S3 and SNS is to notify you when Amazon S3 detects that it has lost all replicas of a Reduced Redundancy Storage (RRS) object and can no longer service requests for that object.
Here's the documentation on the SNS events supported by S3:
http://docs.amazonwebservices.com/AmazonS3/latest/dev/NotificationHowTo.html
Based on the way that the documentation is written, it looks like Amazon has ideas for other notification events to add (like perhaps your idea for finding out when new keys have been added).
Given that it isn't supported directly by Amazon, the S3 client that uploads the object to S3 will need to trigger the notification, or you will need to do some sort of polling.
Custom event notification for uploads to S3 could be done using SNS if you like to get near-real-time updates for processing, or it can be done through SQS if you like to let the notifications pile up and process them out of a queue at your own pace.
If you are polling, you could reduce the number of keys you need to request by having the client upload with a prefix of, say, "unprocessed/..." followed by the unique key. Your polling software can then query just the S3 keys starting with that prefix. When it is ready to process an object, it can change the key to "processing/..." and later to "processed/..." or whatever; renaming an object in S3 is currently performed as a copy plus a delete.
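A rough sketch of that polling/renaming pattern in Python with boto3, assuming a placeholder bucket name and the prefixes suggested above:

import boto3

s3 = boto3.client("s3")
bucket = "my-user-data-bucket"  # placeholder

# Poll only the keys that clients uploaded under the "unprocessed/" prefix.
resp = s3.list_objects_v2(Bucket=bucket, Prefix="unprocessed/")
for item in resp.get("Contents", []):
    key = item["Key"]
    processing_key = "processing/" + key[len("unprocessed/"):]

    # "Renaming" in S3 is a copy followed by a delete.
    s3.copy_object(Bucket=bucket, Key=processing_key, CopySource={"Bucket": bucket, "Key": key})
    s3.delete_object(Bucket=bucket, Key=key)

    # ... process the object under its new key, then move it to "processed/" the same way.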