Can Plupload bypass Cloudflare's 100 MB upload limit?

I am trying to upload files that are larger than 100 MB through Cloudflare's network.
I want everything to run through Cloudflare's network because I don't want my website's IP to be known to the world.
Plupload can be used to chunk files before uploading them to the server.
This is what it says on Plupload's home page.
Upload in Chunks
Files that have to be uploaded can be small or huge - about several
gigabytes in size. In such cases standard upload may fail, since
browsers still cannot handle it properly. We slice the files in chunks
and send them out one by one. You can then safely collect them on the
server and combine into original file.
As a bonus this way you can overcome a server's constraints on
uploaded file sizes, if any.
The last part is what catches my eye.
So can I use Plupload to bypass the 100 MB limit set by Cloudflare?

I've tested this out, and you can get past Cloudflare's limit by using Plupload's chunking. Cloudflare limits a single upload request to 100 MB, so if we chunk the file into, say, 90 MB pieces, each request passing through Cloudflare stays under the limit and isn't an issue.
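To make the mechanics concrete, here is a minimal sketch of a chunked Plupload setup (not the poster's actual code; the element id, endpoint, and chunk size are placeholders, and it assumes the Plupload script is already loaded on the page):

// Plupload client configuration with chunking enabled (TypeScript).
// Assumes plupload.full.min.js is loaded globally, so it is typed loosely here.
declare const plupload: any;

const uploader = new plupload.Uploader({
  browse_button: "browse",   // id of the file-select element (placeholder)
  url: "/upload.php",        // server endpoint that receives each chunk (placeholder)
  chunk_size: "90mb",        // keeps every request under Cloudflare's 100 MB cap
  max_retries: 2,            // retry a failed chunk a couple of times
});

uploader.init();

Each chunked request carries chunk (index) and chunks (total) parameters, so the server can append the pieces in order and reassemble the original file once the last one arrives.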

Yes, chunking your uploads can work; I used ResumableJS to get around the upload limit.

Related

Which file is consuming most of the bandwidth?

My website is consuming much more bandwidth than it's supposed to. From Webalizer or AWStats in WHM/cPanel I can monitor the bandwidth usage and see which types of files (jpg, png, php, css, etc.) are consuming the bandwidth, but I can't get any specific file names. My assumption is that the bandwidth is being used by referral spamming, but from the "Visitors" page of cPanel I can only see the last 1000 hits. Is there any way to see which image or CSS file is consuming the bandwidth?
If there is a particular file you think is consuming the most bandwidth, you can use the apachetop tool.
yum install apachetop
then run
apachetop -f /var/log/apache2/domlogs/website_name-ssl.log
replacing website_name with whichever site you wish to inspect.
It basically picks up entries from the domlogs (which record the requests served for each website; you may read more about domlogs here).
This shows, in real time, which files are being requested the most and might give you an idea of whether a particular image, PHP file, etc. is getting the most requests.
Domlogs are also a way to find which file is being requested by which bot. Your initial investigation may start from this point.

When to use S3 Presigned URL vs Upload through Backend

I read Amazon S3: direct upload vs presigned URL and was wondering when to use a direct upload from the backend to S3 vs a presigned URL.
I understand that the direct upload requires extra bandwidth (user -> server -> S3) but I believe it's more secure. Do the bandwidth savings with the presigned URL justify the slight security drawback (e.g. with things like user messages)?
I am also checking the file types on the backend (via magic numbers), which I think is incompatible with presigned URLs. Should this reason alone rule out presigned URLs?
In addition, I have a file size limit of 5 MB (not sure if this is considered large?). Would there be a significant difference in performance and scalability (i.e. thousands to millions of files sent per hour) between presigned URLs and direct upload?
Your question sounds like you're asking for an opinion, so here is mine:
It depends on how secure you need it to be and what you consider safe. I was wondering about the same questions, and I concluded that in my case it is all secured by SSL encryption anyway (which is enough for me), so I prefer to save my servers' bandwidth and memory usage.
Once more, it depends on your own system requirements. In any case, if an upload fails, S3 returns an error cause after the request failure. If checking the file type is a must and checking it on your backend is the only way to do it, you already have your answer.
In a scenario with millions of files (close to 5 MB each) being sent every hour, I would recommend uploading directly to S3 via presigned URLs, because receiving and re-sending every file would mean a lot of RAM usage on your servers.
There are a few more advantages of uploading directly to S3, as you can read here.
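For reference, here is a minimal sketch of generating a presigned PUT URL on the backend (assuming a Node.js server and the AWS SDK for JavaScript v3; the bucket name, key, content type, and expiry below are placeholder values, not anything from the question):

// Generate a short-lived URL the browser can PUT the file to directly,
// so the bytes never pass through the backend.
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "us-east-1" });

async function createUploadUrl(key: string): Promise<string> {
  const command = new PutObjectCommand({
    Bucket: "my-upload-bucket",              // placeholder bucket name
    Key: key,
    ContentType: "application/octet-stream", // placeholder content type
  });
  return getSignedUrl(s3, command, { expiresIn: 900 }); // link valid for 15 minutes
}

Note that a plain presigned PUT does not enforce the 5 MB limit or the magic-number check by itself; those checks either move to a presigned POST policy (which can restrict content length) or to a validation step after the object lands in the bucket.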

How to ignore Cloudflare for my uploader?

On my site I upload media files. My uploader chunks these uploads into smaller files, and once the upload is complete, the original is recreated by merging the chunks.
The issue I run into is that when Cloudflare is enabled, each chunk request takes an awfully long time. An example is displayed here: http://testnow.ga
The file is chunked every 5 MB of upload. Each chunk is saved on the server, a response is then sent back to the client's AJAX request, and the next 5 MB upload request starts. The waiting time (TTFB) in this particular case ranges anywhere from 2 to 10 seconds. When the chunk size is 50 MB, for example, the wait can be up to two minutes.
How can I speed up this process while keeping Cloudflare? How can I make that specific /upload URL bypass Cloudflare?
PS: the reason I'm not asking Cloudflare directly is that I did, a week ago and again a few days ago, and haven't gotten a response yet. Thanks!
One option is to use a separate subdomain to submit the data to. In Cloudflare, grey-cloud that DNS entry (DNS only, not proxied). Data is then sent directly to your server, bypassing Cloudflare.
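A rough sketch of what the client side can look like once the grey-clouded subdomain exists (the subdomain, endpoint, and 5 MB chunk size below are placeholders, not the poster's setup):

// Slice a File into 5 MB chunks in the browser and POST each one to a
// DNS-only (grey-cloud) subdomain, so upload traffic skips Cloudflare's proxy.
const UPLOAD_ENDPOINT = "https://upload.example.com/upload"; // placeholder subdomain
const CHUNK_SIZE = 5 * 1024 * 1024;                          // 5 MB per chunk

async function uploadInChunks(file: File): Promise<void> {
  const total = Math.ceil(file.size / CHUNK_SIZE);
  for (let index = 0; index < total; index++) {
    const chunk = file.slice(index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE);
    const form = new FormData();
    form.append("chunk", chunk, file.name);
    form.append("index", String(index));
    form.append("total", String(total));
    const res = await fetch(UPLOAD_ENDPOINT, { method: "POST", body: form });
    if (!res.ok) throw new Error(`Chunk ${index} failed with HTTP ${res.status}`);
  }
}

Keep in mind that a grey-clouded record publishes your origin IP for that hostname, so this trades Cloudflare's protection on the upload path for speed.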

How to upload large files to MediaWiki in an efficient way

We have to upload a lot of VirtualBox images which are between 1 GB and 6 GB.
So I would prefer to use FTP for the upload and then include the files in MediaWiki.
Is there a way to do this?
Currently I use a jailed FTP user who can upload to a folder, and then use the UploadLocal extension to include the files.
But this only works for files smaller than around 1 GB. If we upload bigger files we get a timeout, and even after setting PHP's execution time to 3000 s, the include stops after about 60 s with a 505 gateway timeout (which is also the only thing appearing in the logs).
So is there a better way of doing this?
You can import files from the shell using maintenance/importImages.php. Alternatively, upload by URL by flipping $wgAllowCopyUploads, $wgAllowAsyncCopyUploads and friends (this requires that the job queue be run using cron jobs). Alternatively, decide whether you need to upload these files into MediaWiki at all, because just linking to them might suffice.

AWS S3 and AjaXplorer

I'm using AjaXplorer to give my clients access to a shared directory stored in Amazon S3. I installed the SDK, configured the plugin (http://ajaxplorer.info/plugins/access/s3/), and could upload and download files, but the upload size is limited by my host's PHP limit, which is 64 MB.
Is there a way I can upload directly to S3 without going through my host, to improve speed and be subject to S3's limit instead of PHP's?
Thanks
I think that is not possible (though maybe I'm wrong), because the upload will first go to the PHP script on the server and only then be transferred to the bucket.
The only way around this is to use some jQuery or JS that can bypass your server/PHP entirely and stream directly to S3. This involves enabling CORS on the bucket and creating a signed policy on the fly to allow your uploads, but it can be done!
I ran into just this issue with some inordinately large media files for our website users that I no longer wanted to host on the web servers themselves.
The best place to start, IMHO is here:
https://github.com/blueimp/jQuery-File-Upload
A demo is here:
https://blueimp.github.io/jQuery-File-Upload/
This was written to upload+write files to a variety of locations, including S3. The only tricky bits are getting your MIME type correct for each particular upload, and getting your bucket policy the way you need it.
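For the CORS piece specifically, here is a minimal sketch of the kind of rule the bucket needs (shown with the AWS SDK for JavaScript v3; the bucket name and allowed origin are placeholders, and the same rules can be entered in the S3 console instead):

// Allow the browser at a specific origin to upload straight to the bucket.
import { S3Client, PutBucketCorsCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

async function enableDirectBrowserUploads(): Promise<void> {
  await s3.send(new PutBucketCorsCommand({
    Bucket: "my-upload-bucket",                      // placeholder bucket name
    CORSConfiguration: {
      CORSRules: [
        {
          AllowedOrigins: ["https://www.example.com"], // your site's origin, ideally not "*"
          AllowedMethods: ["PUT", "POST"],             // direct uploads use PUT or POST
          AllowedHeaders: ["*"],
          ExposeHeaders: ["ETag"],                     // lets the JS read the upload's ETag
          MaxAgeSeconds: 3000,                         // cache the preflight response
        },
      ],
    },
  }));
}

The signed policy itself is generated server-side per upload (a small, short-lived document the browser attaches to its POST), which is what keeps the bucket from accepting arbitrary writes.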