Dropbox API requests performance issues

The issue we are dealing with is moving or copying files that are already on the Dropbox server into another folder on the Dropbox server.
The API requires a separate request for each file, which takes far too long.
Could you provide some kind of batch request so that I could move more than one file per request?
I also know about the ability to move an entire folder's contents, but that doesn't work in our case, because we only need to move a subset of the files.
If we try to fire many requests at once through several connections, we get 'Server Unavailable' or 'File Locked' errors and have to repeat the requests.
TL;DR:
Moving 1000 files that are already on the Dropbox server takes over 30 minutes.
What possible solutions do you have to increase the performance?

The Dropbox API now provides a batch endpoint for moving files. You can find the documentation here:
https://www.dropbox.com/developers/documentation/http/documentation#files-move_batch
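For illustration, here is a minimal sketch of such a batch move over plain HTTP, using the move_batch_v2 endpoint (the current successor to the endpoint linked above); the access token and file paths below are placeholders:

    import time
    import requests

    ACCESS_TOKEN = "YOUR_DROPBOX_ACCESS_TOKEN"   # placeholder
    API = "https://api.dropboxapi.com/2"
    HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}",
               "Content-Type": "application/json"}

    # A single request carries many relocation entries instead of one call per file.
    entries = [{"from_path": f"/incoming/file_{i}.dat",
                "to_path": f"/archive/file_{i}.dat"} for i in range(1000)]

    resp = requests.post(f"{API}/files/move_batch_v2", headers=HEADERS,
                         json={"entries": entries, "autorename": False})
    resp.raise_for_status()
    result = resp.json()

    # Large batches run asynchronously; poll the check endpoint until they finish.
    if result.get(".tag") == "async_job_id":
        while True:
            check = requests.post(f"{API}/files/move_batch/check_v2", headers=HEADERS,
                                  json={"async_job_id": result["async_job_id"]}).json()
            if check.get(".tag") != "in_progress":
                break
            time.sleep(2)
        print(check)
    else:
        print(result)

The official SDKs wrap the same endpoints if you prefer not to issue raw HTTP requests.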

Related

When to use S3 Presigned Url vs Upload through Backend

I read Amazon s3: direct upload vs presigned url and was wondering when to use a direct upload from the backend to S3 versus a presigned URL.
I understand that the direct upload requires extra bandwidth (user -> server -> S3), but I believe it's more secure. Do the savings in bandwidth with the presigned URL justify the slight drawback in security (e.g. with things like user messages)?
I am also checking the file types on the backend (via magic numbers), which I think is incompatible with presigned URLs. Should this reason alone rule out using presigned URLs?
In addition, I have a file size limit of 5 MB (not sure if this is considered large). Would there be a significant difference in terms of performance and scalability (i.e. thousands to millions of files sent per hour) between using presigned URLs and direct upload?
Your question sounds like you're asking for opinions, so mine is as follows:
It depends on how secure you need it to be and what you consider safe. I was wondering about the same questions, and I believe that in my case, in the end, it is all secured by SSL encryption anyway (which is enough for me), so I prefer to save my server's bandwidth and memory usage.
Once more, it depends on your own system requirements. Either way, if an upload fails, S3 will return the cause of the error after the failed request. If checking the file type is a must and checking it on your backend is the only way to do it, you already have your answer.
In a scenario with millions of files (close to 5 MB each) being sent every hour, I would recommend having clients upload directly to S3 with presigned URLs, because receiving and re-sending every file through your backend would use a lot of RAM.
There are a few more advantages of uploading directly to S3, as you can read here.
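To make the trade-off concrete, here is a rough sketch of the presigned-URL flow with boto3; the bucket, key, and file names are made up for the example:

    import boto3
    import requests

    s3 = boto3.client("s3")

    # Backend: hand the client a short-lived URL it can PUT the file to directly,
    # so the bytes never pass through your server.
    url = s3.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": "my-upload-bucket",          # placeholder bucket
            "Key": "user-uploads/avatar.png",      # placeholder key
            "ContentType": "image/png",            # client must send the same header
        },
        ExpiresIn=300,                             # URL is valid for 5 minutes
    )

    # Client: upload straight to S3 (shown with requests for brevity).
    with open("avatar.png", "rb") as f:
        requests.put(url, data=f, headers={"Content-Type": "image/png"})

Note that with this flow your backend never sees the bytes, so a magic-number check would have to happen after the fact (for example with a post-upload job that inspects the object), which may or may not be acceptable in your case.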

How can I list all uploads for a project?

I would like to access the list of all uploads that have been added to a given project on my company's GitLab server.
I don't mean versioned files; I mean attached files: binaries and other types of files that have been attached to issues, merge requests, etc.
It's OK if I have to use the API for that.
What I've tried
My first approach was through GET /projects/:id/repository/files/:file_path, but that's for versioned files.
Then, I found out about POST /projects/:id/uploads, but that's only for uploading and not for listing already uploaded files.
Is there a way to list all those uploaded files?
I believe this is not possible.
There is an open issue for retrieving specific files which has not received much attention:
https://gitlab.com/gitlab-org/gitlab-ce/issues/55520
Hopefully there will eventually be an endpoint like
GET /projects/:id/uploads
I had the same question, and after getting in touch with GitLab support they confirmed that this is not currently implemented (as of November 2021) and forwarded me the following three feature requests:
API list all files on a project: https://gitlab.com/gitlab-org/gitlab/-/issues/197361
Attachment Manager: https://gitlab.com/gitlab-org/gitlab/-/issues/16229
Retrieve uploaded files using API: https://gitlab.com/gitlab-org/gitlab/-/issues/25838
A workaround seems to be to export the whole project; you'll find the uploads in that archive and will be able to list them.
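A rough sketch of driving that export workaround through the API (the host, project id, and token below are placeholders):

    import time
    import requests

    GITLAB = "https://gitlab.example.com/api/v4"      # your instance
    PROJECT_ID = 123                                  # placeholder project id
    HEADERS = {"PRIVATE-TOKEN": "YOUR_ACCESS_TOKEN"}

    # 1. Schedule a project export.
    requests.post(f"{GITLAB}/projects/{PROJECT_ID}/export", headers=HEADERS).raise_for_status()

    # 2. Poll until the export is ready.
    while True:
        status = requests.get(f"{GITLAB}/projects/{PROJECT_ID}/export", headers=HEADERS).json()
        if status.get("export_status") == "finished":
            break
        time.sleep(5)

    # 3. Download the archive; the attached files sit in an uploads directory inside it.
    archive = requests.get(f"{GITLAB}/projects/{PROJECT_ID}/export/download", headers=HEADERS)
    with open("project_export.tar.gz", "wb") as f:
        f.write(archive.content)

Listing the archive contents (e.g. tar -tzf project_export.tar.gz) should then show the uploaded attachments.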

How to ignore Cloudflare for my uploader?

On my site I upload media files. My uploader splits these uploads into smaller chunks, and once the upload is completed, the original file is recreated by merging the chunks.
The issue I run into is that when Cloudflare is enabled, the chunking requests take an awfully long time. An example is displayed here: http://testnow.ga
The file is chunked every 5 MB of upload. Each chunk is saved on the server, then an AJAX response is sent to the client and the next 5 MB upload request starts. The waiting time (TTFB) in this particular case ranges anywhere from 2 to 10 seconds. When the chunk size is 50 MB, for example, the wait can be up to two minutes.
How can I speed up this process with Cloudflare? How can I make that specific /upload URL bypass Cloudflare?
PS: the reason I'm not asking Cloudflare directly is that I did a week ago, and again a few days ago, and haven't gotten a response yet. Thanks!
One option is to use a separate subdomain to submit the data to. In Cloudflare, grey-cloud that DNS entry; the data is then sent directly to your server, bypassing Cloudflare.
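If you prefer to script it, the same grey-cloud toggle can be flipped through Cloudflare's DNS records API by setting "proxied" to false on the upload subdomain's record; a sketch, assuming an API token with DNS edit permission and placeholder zone/record IDs:

    import requests

    ZONE_ID = "your_zone_id"                       # placeholder
    RECORD_ID = "upload_subdomain_record_id"       # placeholder
    HEADERS = {"Authorization": "Bearer YOUR_CLOUDFLARE_API_TOKEN",
               "Content-Type": "application/json"}

    # proxied=False ("grey cloud") makes the record DNS-only, so upload traffic
    # goes straight to the origin server instead of through Cloudflare's proxy.
    resp = requests.patch(
        f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records/{RECORD_ID}",
        headers=HEADERS,
        json={"proxied": False},
    )
    resp.raise_for_status()
    print(resp.json()["result"]["proxied"])        # expect False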

How to upload large files to mediawiki in an efficient way

We have to upload a lot of VirtualBox images which are between 1 GB and 6 GB.
So I would prefer to use FTP for the upload and then include the files in MediaWiki.
Is there a way to do this?
Currently I use a jailed FTP user who can upload to a folder and then use the UploadLocal extension to include the files.
But this only works for files smaller than around 1 GB. If we upload bigger files we get a timeout, and even after setting PHP's max_execution_time to 3000 s the import stops after about 60 s with a 504 Gateway Timeout (which is also the only thing appearing in the logs).
So is there a better way of doing this?
You can import files from the shell using maintenance/importImages.php. Alternatively, enable upload by URL by flipping $wgAllowCopyUploads, $wgAllowAsyncCopyUploads and friends (this requires that the job queue be run using cron jobs). Or decide whether you need to upload these files into MediaWiki at all, because just linking to them might suffice.
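As a sketch of the upload-by-URL route (with $wgAllowCopyUploads enabled and an account that has the upload_by_url right), the action API can be asked to fetch the file itself instead of pushing it through a PHP form upload; the wiki endpoint, filename, and source URL below are placeholders, and authentication is assumed to be done already:

    import requests

    API = "https://wiki.example.org/w/api.php"     # placeholder wiki endpoint
    session = requests.Session()
    # ... log the session in here (e.g. a bot password via action=login) ...

    # Fetch a CSRF token for the upload.
    token = session.get(API, params={
        "action": "query", "meta": "tokens", "type": "csrf", "format": "json",
    }).json()["query"]["tokens"]["csrftoken"]

    # Ask MediaWiki to pull the file from the given URL, which avoids the
    # HTTP form upload that keeps hitting the gateway timeout.
    resp = session.post(API, data={
        "action": "upload",
        "filename": "debian-12-base.ova",                        # placeholder
        "url": "https://files.example.org/debian-12-base.ova",   # placeholder
        "ignorewarnings": 1,
        "token": token,
        "format": "json",
    })
    print(resp.json())

The target extensions (ova, vdi, ...) also need to be listed in $wgFileExtensions for the upload to be accepted.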

How do I get a status report of all files currently being uploaded via a HTTP form on an Apache Server?

How do I get a status report of all files currently being uploaded via HTTP form based file upload on an Apache Server?
I don't believe you can do this with Apache itself. The upload looks like nothing more than a POST as far as Apache is concerned. There are modules and other servers that do special processing for uploads, so you may have some luck there. It would probably be easier to keep track of it in your application.
Check out SWFUpload; it uses Flash (in a nice way) to assist with managing multiple uploads.
There are events you can monitor for how many files of a set have been uploaded.