Best way to serve a local file via an Express.js server

Our server serves a single JSON file that is 1 GB in size. When a request comes in, the server reads that file; if a certain element is included (matched), then we serve the file, but if not, we just throw an error.
There are two pain points:
every time a request comes in, the same file is read again, even though we only want to read a certain line
the file must be served to thousands of users concurrently.
How do we design a non-blocking, asynchronous, efficient way of serving this file?
// pseudo code
const fs = require('fs');

(req, res) => {
  // fs.readFile's callback receives (err, data); throwing inside an async
  // callback can't be caught by Express's default error handling.
  fs.readFile('/path/to/file.json', (err, result) => {
    if (err || !someValidationLogic(result, req.body.someParameter)) {
      return res.status(400).end();
    }
    res.send(result);
  });
};
This is bad because it does not utilize the 'stream' functionality of Node.js.
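A minimal sketch of a streaming alternative, assuming the validation can be answered from a small index file loaded once at startup (the index file, its shape, and the /file route are all hypothetical):

const fs = require('fs');
const express = require('express');

const app = express();
app.use(express.json());

// Hypothetical: a small index loaded once at startup stands in for the
// validation data, so the 1 GB file is never re-read per request.
const index = JSON.parse(fs.readFileSync('/path/to/index.json', 'utf8'));

app.post('/file', (req, res) => {
  if (!index.includes(req.body.someParameter)) { // hypothetical check
    return res.status(404).json({ error: 'not found' });
  }
  res.type('application/json');
  // Stream the large file; pipe() handles backpressure, so thousands of
  // concurrent downloads never buffer the whole file in memory.
  fs.createReadStream('/path/to/file.json')
    .on('error', () => res.destroy())
    .pipe(res);
});

app.listen(3000);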

Related

Error: Network Connection Lost - saving form data (file) to R2 bucket

I have this handler in my worker:
const data = await event.request.formData();
const key = data.get('filename');
const file = data.get('file');
if (typeof key !== 'string' || !file) {
  return res.send(
    { message: 'Post body is not valid.' },
    undefined,
    400
  );
}
await BUCKET.put(key, file);
return new Response(file);
If I comment out the await BUCKET.put(key, file); line, then I get the response of the file as expected. But with that line in the function, I get the error:
Uncaught (in promise) Error: Network connection lost.
I have confirmed that by changing the put to a get, I can retrieve files from that bucket, so there doesn't seem to be a problem with the connection itself.
Are you still having this problem? I'll need your account ID to figure out what's going on. If you DM me (Vitali) on Discord your account ID (& this SO link for context) I can probably help you out (or email me directly at cloudflare.com using vlovich as the account if you don't have/don't want to sign up on Discord). I'm the tech lead for R2.
EDIT 2022-09-07.
I just noticed that you're calling formData on the request. This is causing you to read the object into RAM. Workers has a 128 MiB limit, so what's likely happening is that you're exceeding that limit (probably egregiously, since we do give some buffer) and thus Cloudflare is terminating your Worker.
What you'll want to do is make sure you upload the file raw (not as a form) and access the raw ReadableStream. Alternatively, you can try writing a TransformStream to parse out the payload in a streaming fashion if you're confident the file payload (& any metadata you need) will come after the name. Usually it's easier to change your upload mechanism.
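A minimal sketch of the raw-upload approach, assuming the client sends the file as the request body of a PUT and passes the key out-of-band (the filename query parameter here is hypothetical):

addEventListener('fetch', event => {
  event.respondWith(handleUpload(event.request));
});

async function handleUpload(request) {
  // Hypothetical: the key arrives as a query parameter instead of form data.
  const key = new URL(request.url).searchParams.get('filename');
  if (request.method !== 'PUT' || !key) {
    return new Response('Post body is not valid.', { status: 400 });
  }
  // request.body is a ReadableStream; R2 consumes it without buffering
  // the whole file into the Worker's limited memory.
  await BUCKET.put(key, request.body);
  return new Response(`Stored ${key}`);
}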

What is the order of HTTP responses, data, and error, and are they guaranteed to be in that order?

I am debugging/testing the part of my app that sends an HTTP POST, with a file to upload, to a 3rd-party server, and I need to be sure of the order of the info that server sends back.
Below is a very trimmed down sample of how I handle what is returned by the server.
At the moment, I am debugging for files sent that exceed the LimitRequestBody size set in Apache. Yes, I do check file size in my app before sending, but I am trying to debug for anything possible, e.g. a malicious bot sending data outside of my app.
What I can't seem to find online is the lifecycle of what a server will send back in terms of the response, data, and error; I need to be sure I will get them back in this order:
Response
Data
and if there's an error:
Error (and then nothing else)
uploadSession.dataTask(with: upFile.toUrl!) { (data, response, error) in
    if let response = response {
        upLoadInvClass.upResp(resp: response)
    }
    if let error = error {
        upLoadInvClass.upErr(error: error)
    }
    if let data = data {
        upLoadInvClass.upData(data: data)
    }
}.resume()

Node.js: how to pass parameters into an exported route from another route?

Suppose I module export "/route1" in route1.js, how would I pass parameters into this route from "/route2" defined in route2.js?
route1.js
module.exports = (app) => {
  app.post('/route1', (req, res) => {
    console.log(req.body);
  });
};
route2.js
const express = require('express');
const app = express();
// import route1 from route1.js
const r1 = require('./route1')(app);

app.post('/route2', (req, res) => {
  // how to pass parameters?
  app.use(???, r1) ?
});
In short, route 1's output depends on the input from route 2.
You don't pass parameters from one route to another. Each route is a separate client request. HTTP, by itself, is stateless: each request stands on its own.
If you describe what the actual real-world problem you're trying to solve is, we can help you with some of the various tools there are for managing state from one request to the next in HTTP servers. But we really need to know what the REAL-world problem is to know what best to suggest.
The general tools available are:
Set a cookie as part of the first response with some data in the cookie. On the next request sent from that client, the prior cookie will be sent with it, so the server can see what that data is.
Create a server-side session object (using express-session, probably) and set some data in the session object. In the 2nd request, you can then access the session object to get that previously set data (see the sketch after this list).
Return the data to the client in the first request and have the client send that data back in the 2nd request, either in query string, form fields, or custom headers. This would be the truly stateless way of doing things on the server: any required state is kept in the client.
Which option results in the best design depends entirely upon what problem you're actually trying to solve.
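A minimal sketch of the session option, assuming express-session; the field name reuses someParameter from the question and is otherwise illustrative:

const express = require('express');
const session = require('express-session');

const app = express();
app.use(express.json());
app.use(session({ secret: 'change-me', resave: false, saveUninitialized: false }));

// First request: stash data in the server-side session.
app.post('/route2', (req, res) => {
  req.session.someParameter = req.body.someParameter;
  res.send('saved');
});

// Later request from the same client: read it back.
app.post('/route1', (req, res) => {
  res.json({ someParameter: req.session.someParameter });
});

app.listen(3000);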
FYI, you NEVER embed one route in another like you showed in your question:
app.post('/route2', (req, res) => {
  // how to pass parameters?
  app.use(???, r1) ?
})
What that would do is install a new, permanent copy of app.use() that's in force for all incoming requests, every time your app.post() route was hit. They would accumulate forever.
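For contrast, a sketch of the usual structure, assuming route2.js is refactored to export a registration function like route1.js does (the entry-point file name is hypothetical): every route is registered exactly once at startup, and nothing is mounted per-request.

// server.js - hypothetical entry point
const express = require('express');
const app = express();
app.use(express.json());

// Each module registers its routes once; nothing is mounted per-request.
require('./route1')(app); // defines app.post('/route1', ...)
require('./route2')(app); // defines app.post('/route2', ...)

app.listen(3000);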

Soundcloud API /stream endpoint giving 401 error

I'm trying to write a react native app which will stream some tracks from Soundcloud. As a test, I've been playing with the API using python, and I'm able to make requests to resolve the url, pull the playlists/tracks, and everything else I need.
With that said, when making a request to the stream_url of any given track, I get a 401 error.
The current url in question is:
https://api.soundcloud.com/tracks/699691660/stream?client_id=PGBAyVqBYXvDBjeaz3kSsHAMnr1fndq1
I've tried it without the ?client_id..., I have tried replacing the ? with &, I've tried getting another client_id, I've tried it with allow_redirects as both true and false, but nothing seems to work. Any help would be greatly appreciated.
The streamable property of every track is True, so it shouldn't be a permissions issue.
Edit:
After doing a bit of research, I've found a semi-successful workaround. The /stream endpoint of the API is still not working, but if you change your destination endpoint to http://feeds.soundcloud.com/users/soundcloud:users:/sounds.rss, it'll give you an RSS feed that's (mostly) the same as what you'd get by using the tracks or playlists API endpoint.
The link contained therein can be streamed.
Okay, I think I have found a generalized solution that will work for most people. I wish it were easier, but it's the simplest thing I've found yet.
1. Use the API to pull tracks from the user. You can use linked_partitioning and the next_href property to gather everything, because there's a maximum limit of 200 tracks per call.
2. Using the data pulled down in the JSON, you can use the permalink_url key to get the same thing you would type into the browser.
3. Make a request to the permalink_url and access the HTML. You'll need to do some parsing, but the url you'll want will be something to the effect of:
"https://api-v2.soundcloud.com/media/soundcloud:tracks:488625309/c0d9b93d-4a34-4ccf-8e16-7a87cfaa9f79/stream/progressive"
You could probably use a regex to parse this out simply (see the sketch after these steps).
4. Make a request to this url, adding ?client_id=..., and it'll give you YET ANOTHER url in its return JSON.
5. Using the url returned from the previous step, you can link directly to that in the browser, and it'll take you to your track content. I checked on VLC by inputting the link, and it streams correctly.
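A hypothetical sketch of the regex idea from step 3, matching the api-v2 media URL shape shown above (the html variable is assumed to hold the fetched track page):

// html is assumed to contain the track page's HTML
const match = html.match(
  /https:\/\/api-v2\.soundcloud\.com\/media\/soundcloud:tracks:\d+\/[0-9a-f-]+\/stream\/progressive/
);
const mediaUrl = match ? match[0] : null;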
Hopefully this helps some of you out with your developing.
Since I have the same problem, the answer from @Default motivated me to look for a solution. But I did not understand the workaround with the permalink_url in steps 2 and 3. An easier solution could be:
Fetch, for example, a user's track likes using the api-v2 endpoint, like this:
https://api-v2.soundcloud.com/users/<user_id>/track_likes?client_id=<client_id>
In the response we can find the needed URL, as mentioned by @Default in his answer:
collection: [
  {
    track: {
      media: {
        transcodings: [
          ...
          {
            url: "https://api-v2.soundcloud.com/media/soundcloud:tracks:713339251/0ab1d60e-e417-4918-b10f-81d572b862dd/stream/progressive"
            ...
          }
        ]
      }
    }
  }
  ...
]
Make a request to this URL with client_id as a query param and you get another URL with which you can stream/download the track.
Note that api-v2 is still not public, and requests from your client will probably be blocked by CORS.
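A hedged sketch of that flow in Node (node-fetch assumed; the progressive-protocol filter mirrors the response shape shown above, and the function name is illustrative):

const fetch = require('node-fetch');

async function getStreamUrlFromLikes(userId, clientId) {
  let res = await fetch(`https://api-v2.soundcloud.com/users/${userId}/track_likes?client_id=${clientId}`);
  const likes = await res.json();
  // Take the first liked track's progressive transcoding (shape as shown above).
  const transcoding = likes.collection[0].track.media.transcodings.find(
    t => t.format.protocol === 'progressive'
  );
  res = await fetch(`${transcoding.url}?client_id=${clientId}`);
  const { url } = await res.json(); // streamable/downloadable URL
  return url;
}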
As mentioned by @user208685, the solution can be a bit simpler by using the SoundCloud API v2:
1. Obtain the track ID (e.g. using the public API at https://developers.soundcloud.com/docs)
2. Get JSON from https://api-v2.soundcloud.com/tracks/TRACK_ID?client_id=CLIENT_ID
3. From the JSON, parse the progressive MP3 stream URL
4. From the stream URL, get the MP3 file URL
5. Play media from the MP3 file URL
Note: this link is only valid for a limited amount of time and can be regenerated by repeating steps 3 to 5.
Example in node (with node-fetch):
const fetch = require('node-fetch');

const clientId = 'YOUR_CLIENT_ID';
(async () => {
  let response = await fetch(`https://api.soundcloud.com/resolve?url=https://soundcloud.com/d-o-lestrade/gabriel-ananda-maceo-plex-solitary-daze-original-mix&client_id=${clientId}`);
  const track = await response.json();
  const trackId = track.id;
  response = await fetch(`https://api-v2.soundcloud.com/tracks/${trackId}?client_id=${clientId}`);
  const trackV2 = await response.json();
  const streamUrl = trackV2.media.transcodings.filter(
    transcoding => transcoding.format.protocol === 'progressive'
  )[0].url;
  response = await fetch(`${streamUrl}?client_id=${clientId}`);
  const stream = await response.json();
  const mp3Url = stream.url;
  console.log(mp3Url);
})();
For a similar solution in Python, check this GitHub issue: https://github.com/soundcloud/soundcloud-python/issues/87

Caching JSON with Cloudflare

I am developing a backend system for my application on Google App Engine.
My application and backend server communicate with JSON, like http://server.example.com/api/check_status/3838373.json or just http://server.example.com/api/check_status/3838373/
And I am planning to use Cloudflare for caching the JSON pages.
Which one should I use in the header?
Content-type: application/json
Content-type: text/html
Does Cloudflare cache my server's responses so as to reduce my costs? I ask because I won't be using CSS, images, etc.
The standard Cloudflare cache level (under your domain's Performance Settings) is Standard/Aggressive, meaning it caches only certain types by default: scripts, stylesheets, images. Aggressive caching won't cache normal web pages (i.e. at a directory location or *.html) and won't cache JSON. All of this is based on the URL pattern (e.g. does it end in .jpg?), regardless of the Content-Type header.
The global setting can only be made less aggressive, not more, so you'll need to set up one or more Page Rules to match those URLs, using Cache Everything as the custom cache rule.
http://blog.cloudflare.com/introducing-pagerules-advanced-caching
BTW I wouldn't recommend using an HTML Content-Type for a JSON response.
By default, Cloudflare does not cache JSON files. I ended up configuring a new page rule:
https://example.com/sub-directiory/*.json*
Cache level: Cache Everything
Browser Cache TTL: set a timeout
Edge Cache TTL: set a timeout
Hope it saves someone's day.
The new workers feature ($5 extra) can facilitate this:
Important point:
Cloudflare normally treats normal static files as pretty much never expiring (or maybe it was a month - I forget exactly).
So at first you might think "I just want to add .json to the list of static extensions". This is likely NOT what you want with JSON, unless it really rarely changes or is versioned by filename. You probably want something like 60 seconds or 5 minutes, so that if you update a file it'll update within that time, but your server won't get bombarded with individual requests from every end user.
Here's how I did this with a worker to intercept all .json extension files:
// Note: there could be tiny cut-and-paste bugs in here - please fix if you find any!
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;
  const cache = caches.default;
  const url = new URL(request.url);
  let ttl = undefined;
  let shouldCache = false;

  // cache JSON files with a custom max age
  if (url.pathname.endsWith('.json')) {
    shouldCache = true;
    ttl = 60;
  }

  // look in the cache for an existing item
  let response = await cache.match(request);
  if (!response) {
    // fetch from the origin
    response = await fetch(request);
    // if the resource should be cached, put it in the cache using the cache key
    if (shouldCache) {
      // clone the response to be able to edit headers
      response = new Response(response.body, response);
      if (ttl) {
        // https://developers.cloudflare.com/workers/recipes/vcl-conversion/controlling-the-cache/
        response.headers.append('Cache-Control', 'max-age=' + ttl);
      }
      // put into cache (need to clone again)
      event.waitUntil(cache.put(request, response.clone()));
    }
  }
  return response;
}
You could do this with mime-type instead of extension - but it'd be very dangerous because you'd probably end up over-caching API responses.
Also if you're versioning by filename - eg. products-1.json / products-2.json then you don't need to set the header for max-age expiration.
You can cache your JSON responses on Cloudflare much as you'd cache any other page: by setting Cache-Control headers. So if you want to cache your JSON for 60 seconds on the edge (s-maxage) and in the browser (max-age), just set the following header in your response:
Cache-Control: max-age=60, s-maxage=60
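For instance, a minimal Express sketch (the route and payload are hypothetical, echoing the check_status URLs from the question):

const express = require('express');
const app = express();

app.get('/api/check_status/:id', (req, res) => {
  // 60 seconds in the browser (max-age) and on Cloudflare's edge (s-maxage)
  res.set('Cache-Control', 'max-age=60, s-maxage=60');
  res.json({ id: req.params.id, status: 'ok' }); // hypothetical payload
});

app.listen(3000);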
You can read more about different cache control header options here:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
Please note that different Cloudflare plans have different minimum edge cache TTLs (the Enterprise plan allows as low as 1 second). If your headers have a value lower than that, then I guess they might be ignored. You can see the limits here:
https://support.cloudflare.com/hc/en-us/articles/218411427-What-does-edge-cache-expire-TTL-mean-#summary-of-page-rules-settings