Nuxeo Chunked Document Upload Returns Incorrect Status - file-upload

We have set up Nuxeo in 3-tier cluster mode. When uploading large files, we break each file into smaller chunks and upload the chunks one by one.
Generally, when we upload the chunks, Nuxeo returns "Resume Incomplete (308)" as the status code for every chunk except the last, and "Created (201)" ONLY for the last chunk, once the upload completes.
However, in one of our environments, we get 201 (Created) from the Nuxeo server for every chunk during the upload. Can anyone help us identify what we are missing here?
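For context, here is a rough sketch (TypeScript, using fetch) of the kind of chunk loop we run; the batch-creation call and the X-Upload-* headers reflect our understanding of the standard Nuxeo batch upload API, while the server URL and chunk size are placeholders and authentication is omitted:

// Rough sketch of a chunked upload against the Nuxeo batch upload endpoint.
// Placeholders: server URL, chunk size; authentication omitted for brevity.
const NUXEO = "http://localhost:8080/nuxeo";

async function uploadInChunks(file: Blob, fileName: string, chunkSize = 5 * 1024 * 1024): Promise<string> {
  // 1. Create a batch; the response carries a batchId.
  const batchRes = await fetch(`${NUXEO}/api/v1/upload`, { method: "POST" });
  const { batchId } = await batchRes.json();

  // 2. Send the chunks one by one against file index 0 of the batch.
  const chunkCount = Math.ceil(file.size / chunkSize);
  for (let i = 0; i < chunkCount; i++) {
    const chunk = file.slice(i * chunkSize, (i + 1) * chunkSize);
    const res = await fetch(`${NUXEO}/api/v1/upload/${batchId}/0`, {
      method: "POST",
      headers: {
        "Content-Type": "application/octet-stream",
        "X-Upload-Type": "chunked",
        "X-Upload-Chunk-Index": String(i),
        "X-Upload-Chunk-Count": String(chunkCount),
        "X-File-Name": fileName,
        "X-File-Size": String(file.size),
      },
      body: chunk,
    });
    // Expected: 308 (Resume Incomplete) for every chunk except the last,
    // 201 (Created) once the final chunk has been received.
    console.log(`chunk ${i + 1}/${chunkCount} -> ${res.status}`);
  }
  return batchId;
}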

Related

html MediaRecorder save last n seconds

I'm using MediaRecorder and receive video in chunks via dataavailable. I would like to do two things with these chunks:
1. Save only the last n seconds of the video. I noticed that concatenating all the chunks produces a valid video file. However, as soon as I remove some chunks from the beginning (say, keeping only the last 10 chunks to save the last 10 seconds), the resulting file is no longer valid. I expect this is due to missing metadata or something similar.
2. Send these chunks over Socket.IO and read them in another browser using MediaSource. This kind of works as long as the receiver reads the chunks from the beginning. If it starts reading after the server has already begun sending chunks, it receives chunks, but not from the beginning, and the video fails to display.
Are there ways to avoid these problems?
Thank you!
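To make the setup concrete, here is roughly what I am doing (a TypeScript sketch; the one-second timeslice and the 10-chunk window are just illustrative):

// Sketch of the current approach (browser TypeScript).
const chunks: Blob[] = [];

async function startRecording(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  recorder.ondataavailable = (e: BlobEvent) => {
    if (e.data.size > 0) chunks.push(e.data);
  };
  recorder.start(1000); // roughly one chunk per second
}

// Works: concatenating every chunk from the start yields a valid file.
function saveAll(): Blob {
  return new Blob(chunks, { type: "video/webm" });
}

// Fails: the first chunk appears to carry the container header / init segment,
// so a file built only from the last 10 chunks is not a valid video.
function saveLastTenSeconds(): Blob {
  return new Blob(chunks.slice(-10), { type: "video/webm" });
}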

apache nifi S3 PutObject stuck

Sorry if this is a dumb question, very new to nifi.
I have set up a process group that dumps SQL query results to CSV and then uploads them to S3. It worked fine with small queries, but it appears to be stuck with larger files.
The input queue to the PutS3Object processor has a limit of 1 GB, but the file it is trying to put is almost 2 GB. I have set the multipart parameters in the S3 processor to 100 MB, but it is still stuck.
So my theory is that PutS3Object needs the complete file before it starts uploading. Is this correct? Is there no way to get it uploading in a "streaming" manner, or do I just have to increase the input queue size?
Or am I on the wrong track and something else is holding this up?
The screenshot suggests that the large file is in PutS3Object's input queue and that PutS3Object is actively working on it (per the "1 thread" indicator in the top-right corner of the processor box).
As it turns out, there were no errors, just a delay from processing a large file.

Any storage service like Amazon S3 which allows upload/download at the same time on a large file

My requirement is to upload a large file (35 GB) and, while the upload is still in progress, start downloading the same file. Is there any storage service that allows this and that I can develop a .NET application against?
Amazon S3 will not allow simultaneous upload and download of the same file.
You could use Microsoft Azure Storage Page or Append Blobs to solve this:
1) Begin uploading the large data
2) Concurrently download small ranges of data (no greater than 4 MB, so the client library can read each one in a single chunk) that have already been written.
Page Blobs need to be 512-byte aligned and can be read and written in a random-access pattern, whereas Append Blobs must be written sequentially in an append-only pattern.
As long as you're reading sections that have already been written to, you should have no problems (a rough sketch of the pattern follows below). Check out the Blob getting-started doc: https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-blobs/ and some info about blob types: https://msdn.microsoft.com/library/azure/ee691964.aspx
And feel free to contact us with any follow-up questions.
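To illustrate the pattern, here is a sketch in TypeScript with the @azure/storage-blob package rather than .NET (the .NET client library exposes equivalent append-blob and ranged-download operations); the connection string, container, blob name, and block source are placeholders:

import { BlobServiceClient } from "@azure/storage-blob";

// Placeholders: substitute your own connection string, container, and blob name.
const service = BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING!);
const container = service.getContainerClient("uploads");
const appendBlob = container.getAppendBlobClient("large-file.bin");

// Writer: append the file sequentially in blocks (at most 4 MB per block).
async function uploadSequentially(nextBlock: () => Promise<Buffer | null>): Promise<void> {
  await appendBlob.createIfNotExists();
  for (let block = await nextBlock(); block !== null; block = await nextBlock()) {
    await appendBlob.appendBlock(block, block.length);
  }
}

// Reader (can run concurrently): download a range that has already been written.
async function downloadRange(offset: number, count: number): Promise<Buffer> {
  const buffer = Buffer.alloc(count);
  await appendBlob.downloadToBuffer(buffer, offset, count);
  return buffer;
}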

Uploading large 800gb json file from remote server to elasticsearch

I'm trying to upload an 800 GB JSON file from a remote server to my local server, but Elasticsearch keeps getting killed.
I'm using this command to upload the data:
curl -XPOST http://localhost:9200/carrier/data/ -d @carrier.json
Is this because a POST request can't handle 800 GB, or is it a configuration I've missed somewhere? I've also mapped everything appropriately, as smaller files upload easily.
In order to index a document, Elasticsearch needs to hold the document in memory first and then buffer it again in analyzed form. So you are typically looking at roughly double the document size in memory for the documents you are indexing (it's more complex than that, but 2x is a good approximation). So, unless you have 1.6 TB of memory on your machine, I wouldn't try to index an 800 GB document. If this JSON file contains several documents, you need to split them into chunks and send them to Elasticsearch using multiple bulk requests.
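For example, here is a rough sketch (TypeScript on Node 18+, using the built-in fetch) that streams the file line by line and sends it in batches to the _bulk endpoint. It assumes carrier.json contains one JSON document per line; the batch size is arbitrary, and the _type in the action line matches the older carrier/data URL above and should be dropped on recent Elasticsearch versions:

import * as fs from "fs";
import * as readline from "readline";

const ES = "http://localhost:9200";
const BATCH = 5000; // documents per bulk request; tune to your heap

async function sendBulk(lines: string[]): Promise<void> {
  // Each document needs an action line followed by its source line (NDJSON).
  const body = lines
    .map((doc) => `{"index":{"_index":"carrier","_type":"data"}}\n${doc}`)
    .join("\n") + "\n";
  const res = await fetch(`${ES}/_bulk`, {
    method: "POST",
    headers: { "Content-Type": "application/x-ndjson" },
    body,
  });
  if (!res.ok) throw new Error(`bulk request failed: ${res.status}`);
}

async function indexFile(path: string): Promise<void> {
  const rl = readline.createInterface({ input: fs.createReadStream(path) });
  let batch: string[] = [];
  for await (const line of rl) {
    if (!line.trim()) continue;
    batch.push(line);
    if (batch.length >= BATCH) {
      await sendBulk(batch);
      batch = [];
    }
  }
  if (batch.length > 0) await sendBulk(batch);
}

indexFile("carrier.json").catch(console.error);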

AWS S3. Multipart Upload. Can I start downloading a file before it's 100% uploaded?

Actually, the title is the question :)
Does AWS S3 support streaming a file that is not yet 100% uploaded? Client #1 splits a file into small chunks and starts uploading them using Multipart Upload. Client #2 starts downloading them from S3, so client #2 doesn't need to wait until client #1 has uploaded the whole file.
Is it possible to do it without additional streaming server?
This is not natively supported by S3.
S3 allows the individual parts of a multipart upload to be uploaded sequentially, or in parallel, or even out of their logical order, over an essentially unlimited period of time.
Only when you send the CompleteMultipartUpload request does S3 verify that all the parts are present and have the correct checksums; if they are, the final object is assembled from the parts and created in the bucket (or overwrites the former object with the same key, if there was one). Until then, the object at the designated key does not technically exist, so it can't be downloaded.
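To illustrate that lifecycle, here is a minimal sketch with the AWS SDK for JavaScript v3 in TypeScript (region, bucket, key, and the in-memory parts are placeholders): nothing is downloadable at the key until the CompleteMultipartUpload call succeeds.

import {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // placeholder region
const Bucket = "my-bucket";                        // placeholder bucket
const Key = "big-file.bin";                        // placeholder key

async function multipartUpload(parts: Buffer[]): Promise<void> {
  // 1. Start the upload; S3 hands back an UploadId, but no object exists yet.
  const { UploadId } = await s3.send(new CreateMultipartUploadCommand({ Bucket, Key }));

  // 2. Upload parts in any order (sequentially here for simplicity).
  //    Every part except the last must be at least 5 MB.
  const completed: { ETag?: string; PartNumber: number }[] = [];
  for (let i = 0; i < parts.length; i++) {
    const { ETag } = await s3.send(new UploadPartCommand({
      Bucket, Key, UploadId,
      PartNumber: i + 1, // part numbers start at 1
      Body: parts[i],
    }));
    completed.push({ ETag, PartNumber: i + 1 });
  }

  // 3. Only now is the object assembled and visible at the key; a GET on the
  //    key before this call returns NoSuchKey.
  await s3.send(new CompleteMultipartUploadCommand({
    Bucket, Key, UploadId,
    MultipartUpload: { Parts: completed },
  }));
}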