Amazon S3: streaming video while it is still uploading

I wanted to know whether it is possible to stream a video while it is still being uploaded.
For example, I have a 100MB video uploading to S3 and the first 50MB are uploaded. Can a client start playing the video through CloudFront even though it's not yet fully uploaded?
Or does S3 wait for the upload to finish completely, then assemble the video file, and only then publish it?
Thanks!

S3 provides read-after-write consistency for PUTs of new objects. The data cannot be read until the write is complete.
Amazon S3 provides read-after-write consistency for PUTS of new
objects in your S3 bucket in all regions with one caveat. The caveat
is that if you make a HEAD or GET request to the key name (to find if
the object exists) before creating the object, Amazon S3 provides
eventual consistency for read-after-write.
S3 consistency model

Related

Is there another way around than using Amazon s3 storage for Amazon Transcribe?

I need to transcribe a series of videos to text for one of my applications. Is there any way of using local storage to run a transcription job in Amazon Transcribe?
At the moment, you need to download the transcription from S3 after the transcription job has finished.

aws s3 sync cli ignoring multipart upload config when syncing between buckets

I'm trying to sync a large number of files from one bucket to another; some of the files are up to 2GB in size. After using the AWS CLI's s3 sync command like so
aws s3 sync s3://bucket/folder/folder s3://destination-bucket/folder/folder
and verifying the files that had been transferred, it became clear that the large files had lost the metadata that was present on the original files in the original bucket.
This is a "known" issue with larger files, where s3 switches to multipart upload to handle the transfer.
This multipart handling can be configured via the .aws/config file, which has been done like so
[default]
s3 =
  multipart_threshold = 4500MB
However, when testing the transfer again, the metadata on the larger files is still not present. It is present on all of the smaller files, so it's clear that I'm hitting the multipart upload issue.
Given this is an S3-to-S3 transfer, is the local s3 configuration taken into consideration at all?
As an alternative to this is there a way to just sync the metadata now that all the files have been transferred?
Have also tried doing aws s3 cp with no luck either.
You could use Cross/Same-Region Replication to copy the objects to another Amazon S3 bucket.
However, only newly added objects will copy between the buckets. You can, however, trigger the copy by copying the objects onto themselves. I'd recommend you test this on a separate bucket first, to make sure you don't accidentally lose any of the metadata.
The method AWS suggests for pre-existing objects seems rather complex: Trigger cross-region replication of pre-existing objects using Amazon S3 inventory, Amazon EMR, and Amazon Athena | AWS Big Data Blog
The final option would be to write your own code to copy the objects, and copy the metadata at the same time.
Or, you could write a script that compares the two buckets to see which objects did not get their correct metadata, and have it just update the metadata on the target object. This actually involves copying the object to itself, while specifying the metadata. This is probably easier than copying ALL objects yourself, since it only needs to 'fix' the ones that didn't get their metadata.
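The "copy the object to itself while specifying the metadata" step could look roughly like the sketch below. It is a minimal illustration, not a tested migration tool: the function name `restore_metadata` is made up here, `client` is assumed to be a boto3 S3 client (for example `boto3.client("s3")`), and it assumes each object is small enough for a single `copy_object` call (S3 limits that call to 5 GB, which covers the 2GB files in the question).

```python
def restore_metadata(client, src_bucket, src_key, dst_bucket, dst_key):
    """Copy user metadata from the source object onto the destination object.

    Hypothetical helper. Returns True if the destination was rewritten,
    False if its metadata already matched the source.
    """
    src = client.head_object(Bucket=src_bucket, Key=src_key)
    dst = client.head_object(Bucket=dst_bucket, Key=dst_key)

    wanted = src.get("Metadata", {})
    if dst.get("Metadata", {}) == wanted:
        return False  # nothing to fix for this object

    # Copying an object onto itself is only accepted when something changes,
    # so MetadataDirective="REPLACE" is required. With REPLACE, S3 does not
    # carry over the old headers, so Content-Type is passed again explicitly.
    client.copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": dst_bucket, "Key": dst_key},
        Metadata=wanted,
        ContentType=src.get("ContentType", "binary/octet-stream"),
        MetadataDirective="REPLACE",
    )
    return True
```

Calling this for every key that exists in both buckets only rewrites the objects whose metadata differs. Other headers such as Cache-Control or Content-Encoding are not handled in this sketch and would need to be passed the same way if they matter.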
Finally managed to implement a solution for this and took the opportunity to play around with the Serverless framework and Step Functions.
The general flow I went with was:
Step Function triggered using a CloudWatch Event Rule targeting S3 events of the type 'CompleteMultipartUpload', as the metadata is only ever missing on S3 objects that had to be transferred using a multipart process
The initial Task on the Step Function checks if all the required metadata is present on the object that raised the event.
If it is present then the Step Function is finished
If it is not present then the second Lambda task is fired, which copies all metadata from the source object to the destination object.
This could be achieved without Step Functions, but it was a good, simple exercise for trying them out. The first 'Check Meta' task is actually redundant, as the metadata is never present if multipart transfer is used. I was originally also triggering off PutObject and CopyObject, which is why I had the Check Meta task.
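As a rough idea of what the 'Check Meta' task might do, here is a small sketch. It is not the code from the answer: the function name, the `required_keys` parameter, and the returned dictionary are all invented for illustration. It also assumes the event is a CloudTrail-style "AWS API Call via CloudTrail" event, where the bucket and key sit under `detail.requestParameters`, and that `client` is a boto3 S3 client.

```python
def check_meta(client, event, required_keys):
    """Report which required user-metadata keys are missing on the object
    named in the event. Hypothetical sketch; the event shape is an assumption.
    """
    params = event["detail"]["requestParameters"]
    bucket, key = params["bucketName"], params["key"]

    head = client.head_object(Bucket=bucket, Key=key)
    present = head.get("Metadata", {})

    # Compare in lowercase on both sides so the check does not depend on
    # how the keys happen to be cased in the response.
    present_lower = {k.lower() for k in present}
    missing = [k for k in required_keys if k.lower() not in present_lower]

    # A Choice state in the Step Function could branch on "needsFix".
    return {"bucket": bucket, "key": key, "missing": missing, "needsFix": bool(missing)}
```

In a Lambda handler this would be called with the incoming event, and the result passed on as the task output for the next state.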

Streaming data to an object in S3

We have an input stream which needs to be written to S3. This stream carries a large amount of data and I cannot keep it in memory. We don't want to write to local disk and then transfer to S3, for security reasons.
Is there a way to stream data to an S3 object?
I think our problem can be solved using S3 multipart upload. But that is meant for a different purpose: uploading large files. Is there instead an out-of-the-box way to stream data to S3?
This stream has large data and I cannot keep it in memory.
So multipart upload is the correct way to solve this.
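A minimal sketch of how that could look, keeping only one part in memory at a time. The names `stream_to_s3` and `read_exactly` are made up for this example, `client` is assumed to be a boto3 S3 client, and the 8 MiB default part size is an arbitrary choice (S3 requires every part except the last to be at least 5 MiB).

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last


def read_exactly(stream, size):
    """Read up to `size` bytes, looping because read() may return short chunks."""
    buf = bytearray()
    while len(buf) < size:
        chunk = stream.read(size - len(buf))
        if not chunk:
            break
        buf.extend(chunk)
    return bytes(buf)


def stream_to_s3(client, stream, bucket, key, part_size=8 * 1024 * 1024):
    """Upload a binary stream as a multipart upload; returns the number of parts."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size must be at least 5 MiB")

    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        while True:
            data = read_exactly(stream, part_size)
            if not data:
                break  # end of stream
            number = len(parts) + 1  # part numbers start at 1
            resp = client.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=number, Body=data,
            )
            parts.append({"ETag": resp["ETag"], "PartNumber": number})

        if not parts:
            raise ValueError("stream was empty; nothing to upload")

        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the already-uploaded parts do not linger in the bucket.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

Nothing is written to local disk, and memory use is bounded by `part_size`. This sketch treats an empty stream as an error rather than creating an empty object, which is a design choice made here, not something the question specifies.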

S3 bucket does not append new data objects

I'm trying to send all my incoming AWS IoT sensor value messages to the same S3 bucket, but despite turning on versioning in my bucket, the file keeps getting overwritten and shows only the last sensor value rather than all of them. I'm using "Store messages in an Amazon S3 bucket" directly from the AWS IoT console. Is there an easy way to solve this problem?
So, after further research and speaking with Amazon Dev support: you actually can't append records to the same file in S3 from the IoT console directly. I mentioned this was a feature most IoT developers would want as a default, and he said it would likely be possible soon, but there is no way to do it now. Anyway, the simplest workaround I tested is to set up a Kinesis stream with a Firehose delivering to an S3 bucket. This is constrained by an adjustable data size and stream duration, but it works well otherwise. It also allows you to insert a Lambda function for data transformation if needed.

AWS S3. Multipart Upload. Can I start downloading a file before it's 100% uploaded?

Actually, the title is the question :)
Does AWS S3 support file streaming when the file is not yet 100% uploaded? Client #1 splits a file into small chunks and starts uploading them using Multipart Upload. Client #2 starts downloading them from S3. As a result, client #2 doesn't need to wait until client #1 has uploaded the whole file.
Is it possible to do this without an additional streaming server?
This is not natively supported by S3.
S3 allows the individual parts of a multipart upload to be uploaded sequentially, or in parallel, or even out of their logical order, over an essentially unlimited period of time.
Only when you send the CompleteMultipartUpload request does S3 verify that all the parts are present and have the correct checksums. If they do, the final object is assembled from the parts and created in the bucket (or overwrites the former object with the same key, if there was one). Until then, the object, as an object at the designated key, does not technically exist, so it can't be downloaded.
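In practice this means client #2 can only wait for the object to appear and then download it. A minimal sketch of that, assuming `client` is a boto3 S3 client and using its built-in `object_exists` waiter; the function name and the default delay and attempt counts are choices made for this example:

```python
def download_when_complete(client, bucket, key, delay=5, max_attempts=20):
    """Block until the object exists at `key`, then return its bytes.

    Hypothetical helper for client #2. The waiter polls with HEAD requests
    and raises if the object has not appeared after `max_attempts` polls.
    """
    waiter = client.get_waiter("object_exists")
    waiter.wait(
        Bucket=bucket,
        Key=key,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
    # The object only becomes visible after CompleteMultipartUpload succeeds,
    # so at this point the whole file is available.
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()
```

Note that if an older object already exists under the same key, the waiter returns immediately and the download yields that older object, since an in-progress multipart upload does not replace it until it completes.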