Why do we have to set responseType when using XMLHttpRequest? - xmlhttprequest

I implemented an HTML page that uploads a file and then downloads another file from the server.
While handling the download part, I noticed that if I download a binary file, I have to set responseType to blob or the file will be broken.
What confuses me is that the HTTP response header contains Content-Type, which could tell XMLHttpRequest what type of file the server is sending. Why do I have to set it manually? I don't understand the logic, because it should be the server's job to say what the file type is, rather than the client's job to predict it.
const xhr = new XMLHttpRequest();
xhr.responseType = 'blob'
.......
xhr.onload = function(e) {
if (this.status == 200) {
var blob = new Blob([this.response]); // if i don't set responseType, this.response will be broken
let a = document.createElement("a");

Your question made me realise I'm not entirely sure of the following, but it is how I have always looked at it.
responseType sets the type of xhr.response, so setting it to 'blob' lets you retrieve the result of the xhr request as a Blob. If you don't set it, responseType defaults to an empty string, which behaves like 'text', and xhr.response will be a string decoded from the received bytes.
A server may have sent the data the right way based on a mime type, but it still only sends a stream of bytes with a mime type; the interpretation of the received bytes lies on your end, and the received data won't automatically be of the type Blob based on the mime type.
Blob is not a file type on the server; the server may know and send the mime types of files, but Blob isn't one of them, and xhr.response won't be a Blob just because the mime type suggests that a Blob would be the right type.
Also, you may want to process the xhr.response differently from what can be inferred from the mime type, and in that sense, it is a kind of override (though not with the same functionality as xhr.overrideMimeType()).
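As a rough illustration of why a binary download breaks when it is read as text, the sketch below decodes a few arbitrary bytes as UTF-8 and then encodes the result back. It runs in Node.js or a browser console rather than inside XMLHttpRequest, and the byte values and variable names are made up for the example; the assumption is that the default text handling performs a similar lossy decode.

```javascript
// Hypothetical sample: a few bytes that are not valid UTF-8 on their own.
const original = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0xff, 0x00]);

// Roughly what happens when a binary body is treated as text.
const asText = new TextDecoder('utf-8').decode(original);

// Converting the string back to bytes does not restore the input:
// each invalid byte became U+FFFD, which encodes to three bytes.
const roundTripped = new TextEncoder().encode(asText);

console.log(original.length);     // 6
console.log(roundTripped.length); // 10
```

With responseType set to 'blob' or 'arraybuffer' no such decoding step takes place, which is why the bytes survive.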

Related

Error: Network Connection Lost - saving form data (file) to R2 bucket

I have this handler in my worker:
const data = await event.request.formData();
const key = data.get('filename');
const file = data.get('file');
if (typeof key !== 'string' || !file) {
return res.send(
{ message: 'Post body is not valid.' },
undefined,
400
);
}
await BUCKET.put(key, file);
return new Response(file);
If I comment out the await BUCKET.put(key, file); line, then I get the response of the file as expected. But with that line in the function, I get the error:
Uncaught (in promise) Error: Network connection lost.
I have confirmed that by changing the put to a get, I can retrieve files from that bucket, so there doesn't seem to be a problem with the connection itself.
Are you still having this problem? I'll need your account ID to figure out what's going on. If you DM me (Vitali) on Discord your account ID (& this SO link for context) I can probably help you out (or email me directly at cloudflare.com using vlovich as the account if you don't have/don't want to sign up on Discord). I'm the tech lead for R2.
EDIT 2022-09-07.
I just noticed that you're calling formData on the request. This is causing you to read the object into RAM. Workers has a 128 MiB limit so what's likely happening is that you're exceeding that limit (probably egregiously since we do give some buffer) and thus Cloudflare is terminating your Worker.
What you'll want to do is make sure you upload the file raw (not as a form) and access the raw ReadableStream. Alternatively, you can try writing a TransformStream to parse out the payload in a streaming fashion if you're confident the file payload (& any metadata you need) will come after the name. Usually it's easier to change your upload mechanism.
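A minimal sketch of the "upload the file raw" approach might look like the following. The function name handleUpload, the filename query parameter, and the bucket argument (standing in for the BUCKET binding from the question) are assumptions for illustration, and it assumes put accepts a ReadableStream as its value.

```javascript
// Sketch: pass the raw request body stream to the bucket
// instead of buffering the whole file with request.formData().
async function handleUpload(request, bucket) {
  // Hypothetical convention: the object key arrives as ?filename=...
  const key = new URL(request.url).searchParams.get('filename');
  if (!key || !request.body) {
    return new Response(
      JSON.stringify({ message: 'Request is not valid.' }),
      { status: 400, headers: { 'content-type': 'application/json' } }
    );
  }
  // request.body is a ReadableStream, so the file is never fully held in memory here.
  await bucket.put(key, request.body);
  return new Response(null, { status: 204 });
}
```

The client would then send the file itself as the request body (for example with PUT) rather than wrapping it in multipart form data.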

Sails Skipper: how to read and validate a csv file and exclude the invalid file types during upload?

I'm trying to write a controller that uploads a file to an S3 location. However, before the upload I need to validate whether the incoming file is a CSV or not, and then I need to read the file to check for header columns and so on. I got the type of the file as in the snippet below:
req.file('foo')._files[0].stream
But how do I read the entire file stream and check the headers, data, etc.? There are other similar questions (such as Sails.js Skipper: How to read the uploaded file stream during upload?), but the solution mentioned there is to use the skipper-csv adapter, which I cannot use because I already use skipper-s3 to upload to S3.
Can someone please post an example on how to read the upstreams and perform any validations before the upload?
Here is how my problem got solved: I make a copy of the stream before the actual upload. I then run my validations on the original stream and, once they pass, upload the copied stream to my desired location.
For reading the CSV stream, I found an npm package, csv-parser (https://github.com/mafintosh/csv-parser), whose events such as headers and data I found easy to handle.
For creating the copy of the stream, I used the following logic:
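const { PassThrough } = require('stream');
// _ is lodash, which Sails exposes as a global by default.
const upstream = req.file('file');
const fileStreamMap = {};
const fileStreamMapCopy = {};
_.each(upstream._files, (file) => {
  // Assumption: Skipper exposes the uploaded file's name as file.stream.filename.
  const fileName = file.stream.filename;
  // Pipe the incoming file into two independent PassThrough streams.
  const stream = new PassThrough();
  const streamCopy = new PassThrough();
  file.stream.pipe(stream);
  file.stream.pipe(streamCopy);
  fileStreamMap[fileName] = stream;
  fileStreamMapCopy[fileName] = streamCopy;
});
// validate and upload files to S3, if Valid.
validateAndUploadFile(fileStreamMap, fileStreamMapCopy);
}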
validateAndUploadFile() contains my custom validation logic for my csv upload.
Also, we can use aws-sdk (https://www.npmjs.com/package/aws-sdk) for the S3 upload.
Hope this helps someone.
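The copy-then-validate idea can be tried outside Sails with plain Node streams. The sketch below is only an illustration: tee, readAll, and hasExpectedHeaders are hypothetical helpers, the header check is a naive split on commas rather than csv-parser, and no Skipper or S3 API is involved.

```javascript
const { PassThrough, Readable } = require('stream');

// Pipe one source into two PassThrough streams so it can be read twice.
function tee(source) {
  const first = new PassThrough();
  const second = new PassThrough();
  source.pipe(first);
  source.pipe(second);
  return [first, second];
}

// Collect a whole stream into a UTF-8 string.
async function readAll(stream) {
  const chunks = [];
  for await (const chunk of stream) chunks.push(Buffer.from(chunk));
  return Buffer.concat(chunks).toString('utf8');
}

// Naive header check: every expected column name appears in the first line.
function hasExpectedHeaders(csvText, expected) {
  const firstLine = csvText.split(/\r?\n/, 1)[0];
  const headers = firstLine.split(',').map((h) => h.trim());
  return expected.every((name) => headers.includes(name));
}

// Usage: validate one branch while the other stays available for the upload.
const [forValidation, forUpload] = tee(Readable.from(['id,name\n', '1,Ada\n']));
Promise.all([readAll(forValidation), readAll(forUpload)]).then(([checked, copy]) => {
  console.log(hasExpectedHeaders(checked, ['id', 'name']), checked === copy);
});
```

Both branches are consumed at the same time here because pipe respects backpressure: if one branch is never read, a large source will stall once that branch's buffer fills up.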

Why do some invalid MIME types trigger a "TypeError," and other invalid MIME types bypass the error and trigger an unprompted download?

I'm making a fairly simple Express app with only a few routes. My question isn't about the app's functionality but about a strange bit of behavior of an Express route.
When I start the server and use the /search/* route, or any route that takes in a parameter, and I apply one of these four content-types to the response:
res.setHeader('content-type', 'plain/text');
res.setHeader('content-type', 'plain/html');
res.setHeader('content-type', 'html/plain');
res.setHeader('content-type', 'html/text');
the parameter is downloaded as a file, without any prompting. So requesting search/foobar downloads a file named "foobar" with a size of 6 bytes and an unsupported file type. I understand that none of these four are actual MIME types and that I should be using either text/plain or text/html, but why the download? Those two MIME types behave as they should, and the following values, which have a type but no subtype, all fail as they should, returning TypeError: invalid media type:
res.setHeader('content-type', 'text');
res.setHeader('content-type', 'plain');
res.setHeader('content-type', 'html');
Why do some invalid types trigger an error, and other invalid types bypass the error and trigger a download?
What I've found out so far:
I found in Express 4.x docs that res.download(path [, filename]) transfers the file at path as an “attachment,” and will typically prompt the user for the download, but this download is neither prompted nor intentional.
I wasn't able to find any situation like this in the Express docs (or here on SO) where running a route caused a file to automatically download to your computer.
At first I thought the line res.send(typeof(res)); was causing the download, but after commenting out lines one at a time and rerunning the server, I was able to figure out that only when the content-type is set to 'plain/text' does the download happen. It doesn't matter what goes inside res.send(), when the content-type is plain/text, the text after /search/ is downloaded to my machine.
Rearranging the routes reached the same result (everything worked as it should except for the download.)
The app just hangs at whatever route was reached before /search/foo, but the download still comes through.
My code:
'use strict';
var express = require('express');
var path = require('path');
var app = express();
app.get('/', function (req, res) {
res.sendFile(path.join(__dirname+'/index.html'));
});
app.get('/search', function(req,res){
res.send('search route');
});
app.get('/search/*', function(req, res, next) {
res.setHeader('content-type', 'plain/text');
var type = typeof(res);
var reqParams = req.params;
res.send(type);
});
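var server = app.listen(process.env.PORT || 3000, function(){
// Log the port actually in use; process.env.PORT is undefined when the fallback 3000 applies.
console.log('app listening on port ' + server.address().port + '!');
});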
module.exports = server;
Other Details
Express version 4.15.2
Node version 4.7.3
using Cloud9
am Express newbie
my repo is here, under the branch "so_question"
Why do some invalid types trigger an error...
Because a MIME-type has a format it should adhere to (documented in RFC 2045), and the ones triggering the error don't match that format.
The format looks like this:
type "/" subtype *(";" parameter)
So there's a mandatory type, a mandatory slash, a mandatory subtype, and optional parameters prefixed by a semicolon.
However, when a MIME type matches that format, it's only syntactically valid, not necessarily semantically, which brings us to the second part of your question:
...and other invalid types bypass the error and trigger a download?
That follows from what is written in RFC 2049:
Upon encountering any unrecognized Content-Type field, an implementation must treat it as if it had a media type of "application/octet-stream" with no parameter sub-arguments. How such data are handled is up to an implementation, but likely options for handling such unrecognized data include offering the user to write it into a file (decoded from its mail transport format) or offering the user to name a program to which the decoded data should be passed as input.
(emphasis mine)
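To make the syntactic rule concrete, here is a simplified check of the type "/" subtype shape. It is only a sketch: isSyntacticallyValidMediaType is a hypothetical helper, it ignores parameter syntax entirely, and it is not the parser Express itself uses.

```javascript
// Characters allowed in an HTTP token.
const TOKEN = "[!#$%&'*+.^_`|~0-9A-Za-z-]+";
const MEDIA_TYPE = new RegExp('^' + TOKEN + '/' + TOKEN + '$');

// True when the value has the form type "/" subtype; anything after ";" is ignored.
function isSyntacticallyValidMediaType(value) {
  const withoutParams = value.split(';')[0].trim();
  return MEDIA_TYPE.test(withoutParams);
}

console.log(isSyntacticallyValidMediaType('plain/text')); // true: well-formed, though not a registered type
console.log(isSyntacticallyValidMediaType('text'));       // false: no slash and no subtype
```

This matches the behaviour described in the question: plain/text passes the syntax check and is then treated as an unknown type, while text fails the check outright.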
The order in which you define your routes matters a lot in Express; you probably need to move your default '/' route so that it comes after the '/search/*' route.

ASP.NET Web API - Reading querystring/formdata before each request

For reasons outlined here I need to review a set of values from the querystring or formdata before each request (so I can perform some authentication). The keys are the same each time and should be present in each request; however, they will be located in the querystring for GET requests, and in the formdata for POST and others.
As this is for authentication purposes, this needs to run before the request; at the moment I am using a MessageHandler.
I can work out whether I should be reading the querystring or formdata based on the method, and when it's a GET I can read the querystring OK using Request.GetQueryNameValuePairs(); however the problem is reading the formdata when it's a POST.
I can get the formdata using Request.Content.ReadAsFormDataAsync(), however formdata can only be read once, and when I read it here it is no longer available for the request (i.e. my controller actions get null models)
What is the most appropriate way to consistently and non-intrusively read querystring and/or formdata from a request before it gets to the request logic?
Regarding your question of which place would be better: in this case I believe AuthorizationFilters are a better fit than a message handler, but either way I see that the problem is related to reading the body multiple times.
After doing "Request.Content.ReadAsFormDataAsync()" in your message handler, Can you try doing the following?
// Reset the position to 0: ReadAsFormDataAsync might have read the entire stream, leaving the
// position at the end, so no bytes are read during parameter binding and you see null values.
Stream requestBufferedStream = Request.Content.ReadAsStreamAsync().Result;
requestBufferedStream.Position = 0;
Note: whether a request's content can be read only once or multiple times depends on the host's buffer policy. By default, the host's buffer policy is always Buffered, in which case you can reset the position back to 0. However, if you explicitly set the policy to Streamed, you cannot reset it back to 0.
What about using ActionFilterAttributes?
This code worked well for me:
public HttpResponseMessage AddEditCheck(Check check)
{
var request= ((System.Web.HttpContextWrapper)Request.Properties.ToList<KeyValuePair<string, object>>().First().Value).Request;
var i = request.Form["txtCheckDate"];
return Request.CreateResponse(HttpStatusCode.OK);
}

Upload file to Solr with HttpClient and MultipartEntity

httpclient, httpmime 4.1.3
I am trying to upload a file through http to a remote server with no success.
Here's my code:
HttpPost method;
method = new HttpPost(solrUrl + "/extract");
method.getParams().setParameter("literal.id", fileId);
method.getParams().setBooleanParameter("commit", true);
MultipartEntity me = new MultipartEntity();
me.addPart("myfile", new InputStreamBody(doubleInput, contentType, fileId));
method.setEntity(me);
//method.setHeader("Content-Type", "multipart/form-data");
HttpClient httpClient = new DefaultHttpClient();
HttpResponse hr = httpClient.execute(method);
The server is Solr.
This is to replace a working bash script that calls curl like this,
curl "http://localhost:8080/solr/update/extract?literal.id=bububu&commit=true" -F myfile=@bububu.doc
If I try to set "Content-Type" "multipart/form-data", the receiving part says that there's no boundary (which is true):
HTTP Status 500 - the request was rejected because no multipart boundary was found
If I omit this header setting, the server issues an error description that, as far as I discovered, indicates that the content type was not multipart [2]:
HTTP Status 400. The request sent by the client was syntactically incorrect ([doc=null] missing required field: id).
This is related to [1] but I couldn't determine the answer from it. I was wondering,
I am in the same situation but didn't understand what to do. I was hoping that the MultipartEntity would tell the HttpPost object that it is multipart form data with some boundary, so that I wouldn't have to set the content type myself. I didn't quite get how to provide boundaries to the entities: the MultipartEntity doesn't have a method like setBoundary. Nor do I see how to get that randomly generated boundary so I can specify it in addHeader myself, as there is no getBoundary method either...
[1] Problem with setting header "Content-Type" in uploading file with HttpClient4
[2] http://lucene.472066.n3.nabble.com/Updating-the-index-with-a-csv-file-td490013.html
I am suspicious of
method.getParams().setParameter("literal.id", fileId);
method.getParams().setBooleanParameter("commit", true);
In the first line, is fileId a string or a file pointer (or something else)? I hope it is a string. As for the second line, you could set it as an ordinary request parameter instead, the way the curl command passes commit=true in the query string.
I am trying to tackle the HTTP Status 400. I don't know much Java (or is that .NET?)
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_Error