express appending trailing slash to my static files - express

Somehow I got my express app into a broken state where it's appending a trailing slash to all static file requests. I'm using this:
app.use([/^\/api*/, '/*'], express.static(path.resolve('./public/dist'), { maxAge: 86400000 }));
Whenever a request comes in for a resource that exists (e.g. a JS file), Express appends a slash, causing the browser to treat it as an HTML file.

Related

Url with trailing slash is redirected back to the host url

I have a sales channel with the domain "http://tld.de/staging" and only this domain; the only other sales channel is the default headless sales channel.
I created a custom route like that:
@Route("/path", name="frontend.path.index", options={"seo"="false"}, methods={"GET"})
And it works just fine when using the url http://tld.de/staging/path but when I use the url http://tld.de/staging/path/ (with trailing slash) it redirects me to http://tld.de/path
I already tried to add a second route for that method but that didn't work
@Route("/path/", name="frontend.path.index.trailing_slash", options={"seo"="false"}, methods={"GET"})
Did I miss something or is that just the default behaviour?
Additional information: The custom controller class extends the StorefrontController and there are no redirects happening inside the custom controller; I even added a dd() at the beginning of the method for testing. If I add the trailing slash to the first route, the URL with the trailing slash works, but the one without gets redirected instead.
The redirect of the url with the trailing slash to the one without is the default behavior of Symfony. If that redirect doesn't respect your base path, then it's probably an issue with your configuration.
However if you absolutely must serve both URLs without redirecting, then you could use wildcards to do so. Obviously this is unconventional but should work:
/**
* @Route("/path{trailingSlash}", name="frontend.path.index", methods={"GET"}, defaults={"trailingSlash"="/"}, requirements={"trailingSlash"="[/]{0,1}"})
*/
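To see what the requirement `[/]{0,1}` accepts, here is a rough JavaScript approximation of the pattern that route would match. It is only a sketch of the matching step: `routePattern` and `matchPath` are hypothetical names, and it does not model how Symfony fills in the default value or compiles routes internally.

```javascript
// Hypothetical approximation of "/path{trailingSlash}" with the
// requirement [/]{0,1}: the placeholder may be empty or a single slash.
const routePattern = /^\/path(?<trailingSlash>[/]{0,1})$/;

// Returns the captured placeholder value, or null when the path does not match.
function matchPath(pathname) {
  const match = routePattern.exec(pathname);
  return match ? { trailingSlash: match.groups.trailingSlash } : null;
}

console.log(matchPath("/path"));   // matches, placeholder is empty
console.log(matchPath("/path/"));  // matches, placeholder is "/"
console.log(matchPath("/path//")); // no match
```

Both /path and /path/ end up on the same route, so no redirect is needed to serve either form.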

Gatsby public folder structure doesn't play nicely with S3 static website hosting

So, let's say I have a simple Gatsby website with no backend, just plain text with two pages, index and about-us
Now when I run gatsby build, the public folder structure is something like this:
├── index.html
├── about-us
│ ├── index.html
The problem is that this kind of structure does not play nicely with S3. If I make a request to mywebsite.com/about-us, it actually returns a 404. S3 with static hosting enabled does not automatically route to mywebsite.com/about-us/index.html. If I manually browse to that page it works, but having my routes like that is a nightmare.
The question is: is there some configuration in Gatsby to make it not generate subfolders like this, and instead just create an about-us.html in the root folder?
So, I want to achieve the following :
├── index.html
├── about-us.html
My src/pages structure is the following :
├── index.tsx
├── about-us.tsx
There is a gatsby plugin that kinda does what you are asking but make sure you are setting your canonical urls to prevent duplicate content being recorded by search engines.
This is common with gatsby and next js (with trailing slash enabled and a static export). They statically build sites as you are describing with /path/index.html and route to the path via /path/ (notice the trailing slash).
The simplest complete, non-plugin solution is to put your S3 site behind a CloudFront distribution and add a Lambda function to redirect all traffic to the right route. I've run this in production on some small sites, and my bill is less than 1 USD per month.
Here is the Gatsby docs for this but they use CLI, there are also other references at the bottom of that page.
Here is the AWS docs on how to create a static site with CloudFront distribution.
After you have CloudFront in place, you will need to create a Lambda function that rewrites /route/ to /route/index.html behind the scenes while still showing /route/ in the URL, and set the appropriate permissions.
Go to lambda and create a new function from scratch using node 14.x.x
Click on icon next to index.js
Cut and paste the code below and save
Lambda function - The function redirects requests without a trailing slash (/route) to the trailing-slash form (/route/) with a 301, which tells search engines that only /route/ is valid. Requests that end in a slash are rewritten to /route/index.html behind the scenes, so the user-friendly URL stays in the address bar. Requests for files (anything with an extension) pass through unchanged.
"use strict";
exports.handler = (event, _, callback) => {
  // Extract the request from the CloudFront event that is sent to Lambda@Edge
  let request = event.Records[0].cf.request;
  // Extract the URI from the request
  let oldUri = request.uri;
  // If URI is a file
  const isFile = /\/[^/]+\.[^/]+$/.test(oldUri);
  // If not a file request and does not end with / redirect to /
  if (!isFile && !oldUri.endsWith("/")) {
    return callback(null, {
      body: "",
      status: "301",
      statusDescription: "Moved Permanently",
      querystring: request.querystring,
      headers: {
        location: [
          {
            key: "Location",
            value: `${oldUri}/`,
          },
        ],
      },
    });
  }
  // Match any '/' that occurs at the end of a URI. Replace it with a default index
  request.uri = oldUri.replace(/\/$/, "/index.html");
  // Return to CloudFront
  return callback(null, request);
};
Note - I took this Lambda function from somewhere and modified it a while back for this case. I don't remember where it came from; I just cut and pasted it from my Lambda.
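If you want to check the routing rules locally before deploying, the URI handling can be pulled out into a pure function. This is a minimal sketch: `classifyUri` and its return shape are hypothetical names chosen for illustration and are not part of the Lambda@Edge API. It only mirrors the file test and trailing-slash rules of the handler.

```javascript
"use strict";

// Hypothetical helper that mirrors the URI rules of the handler.
function classifyUri(uri) {
  // Same file test as the handler: the last path segment contains a dot.
  const isFile = /\/[^/]+\.[^/]+$/.test(uri);

  // Not a file and no trailing slash: the handler answers with a 301.
  if (!isFile && !uri.endsWith("/")) {
    return { action: "redirect", location: `${uri}/` };
  }

  // Otherwise the request is forwarded to the origin, with a trailing
  // slash replaced by the default index document.
  return { action: "forward", uri: uri.replace(/\/$/, "/index.html") };
}

console.log(classifyUri("/about-us"));
console.log(classifyUri("/about-us/"));
console.log(classifyUri("/styles/app.css"));
console.log(classifyUri("/"));
```

Note that the file test treats any last segment containing a dot as a file, so a page route such as /v1.2 would be forwarded as-is rather than redirected.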
After you do the above, you will need to edit your CloudFront distribution to accept the lambda function.
Go to CloudFront and select your distribution
Select the Behaviors tab, select the default behavior, and click the edit button
Under Lambda Function Associations, set CloudFront Event to Origin Request, then set the Lambda function ARN to the function you just created.
Lastly, you need to set the IAM permissions so your distribution can access your lambda function.
You can configure an index document. Then if you visit mywebsite.com/about-us, S3 will first look for an object about-us. If the about-us object is not found, it searches for an index document, about-us/index.html.
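The lookup order described above can be modelled in a few lines. This is only a sketch of the behaviour as I understand it from the AWS documentation for S3 website endpoints (it does not apply to the REST endpoint), including the 302 to the trailing-slash form. `resolveWebsiteRequest` and the sample keys are hypothetical; nothing here calls AWS.

```javascript
// Sketch of the website-endpoint lookup: `keys` is a Set of object keys
// in the bucket, `path` is the request path, `indexDocument` the configured suffix.
function resolveWebsiteRequest(keys, path, indexDocument = "index.html") {
  const key = path.replace(/^\//, "");

  // Root or folder-style request: serve the index document under that prefix.
  if (key === "" || key.endsWith("/")) {
    const indexKey = key + indexDocument;
    return keys.has(indexKey) ? { status: 200, key: indexKey } : { status: 404 };
  }

  // First look for an object with exactly this key.
  if (keys.has(key)) {
    return { status: 200, key };
  }

  // Then look for an index document; if found, redirect to the slash form.
  if (keys.has(`${key}/${indexDocument}`)) {
    return { status: 302, location: `/${key}/` };
  }

  return { status: 404 };
}

const bucket = new Set(["index.html", "about-us/index.html"]);
console.log(resolveWebsiteRequest(bucket, "/about-us"));
console.log(resolveWebsiteRequest(bucket, "/about-us/"));
```

With this model, /about-us is redirected to /about-us/, which then serves about-us/index.html.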

How does ExpressJS determine if a requested resource is a "static" file?

I'm using JSPM to manage my client side dependencies and serving files using ExpressJS
My Directory structure is
node_modules
routes
views
app.js
public
css
images
js
main.js
jspm_packages
system.js
npm
angular2@2.0.0-beta.7.js
I have static route setup in my Express app.js as follows:
app.use(express.static(path.join(__dirname, 'public')));
As expected when I request for GET /jspm_packages/system.js
it serves the file correctly
however when I request GET /jspm_packages/npm/angular2@2.0.0-beta.7.js
It gives me a 404 - not found.
I suspect some of those special characters in the file name are messing up express from resolving the request as a "static" file and using the correct static route.
How can I test if express is marking the request as "static"?
How can I overwrite the express regex (or whatever mechanism) express is using to mark a request as "static"?
How can I write a custom middleware using my own regex and forward the request to static instead?
thanks.
When working with Express, you must make sure the right middleware is registered in the right order.
As for your question of how Express identifies a resource as static: Express does not actually determine whether a resource is static or not. It does not even understand request types. What Express does is execute the matching middleware for a given request.
When Express receives a request, it executes the matching middleware in sequence until one of them ends the response or it runs out of them.
So in practice, you'll always register your static middleware first (just after the request parsers), before your dynamic routes, as shown below:
var express = require('express');
var path = require('path');
var cookieParser = require('cookie-parser');
var bodyParser = require('body-parser');
var routes = require('./routes/index');
var app = express();
// view engine setup
app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'jade');
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: false }));
app.use(cookieParser());
//Register your static middleware
app.use(express.static(path.join(__dirname, 'public')));
//Other cool code
app.use('/', routes);
Now a request will first go through the express.static middleware. If it does not find a matching file, it calls next internally and passes the request to the next middleware in the chain.
For example, suppose you have a static file named users in the public directory and a route registered with routes.get('/users', ...). When a user requests /users, the request first goes to express.static; because it finds the users file, the route registered with routes.get is never called. If you remove the users file from the public folder, express.static won't find a matching file and passes the request to the next middleware in the chain, i.e. routes.get('/users', ...).
Express does not assume or identify a resource as static by matching or using regular-expressions. If express.static middleware finds it, it will serve it else it will pass on the request to the next middleware in queue.
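The fall-through behaviour can be illustrated without Express at all. The following is a minimal model of a middleware chain, not Express's actual implementation: `runChain`, `fakeStatic` and `usersRoute` are hypothetical stand-ins for the dispatcher, express.static and a routes.get('/users', ...) handler.

```javascript
// Minimal model of a middleware chain: each middleware either handles
// the request or calls next() to hand it to the following one.
function runChain(middlewares, req) {
  const res = { handledBy: null, body: null };
  let index = 0;
  function next() {
    const middleware = middlewares[index++];
    if (middleware) middleware(req, res, next);
  }
  next();
  return res;
}

// Stand-in for express.static: serves a "file" if present, else calls next().
function fakeStatic(files) {
  return (req, res, next) => {
    if (files.has(req.url)) {
      res.handledBy = "static";
      res.body = files.get(req.url);
      return;
    }
    next();
  };
}

// Stand-in for a dynamic route such as routes.get('/users', ...).
function usersRoute(req, res, next) {
  if (req.url === "/users") {
    res.handledBy = "route";
    res.body = "dynamic users";
    return;
  }
  next();
}

// With a static file named /users, the route is never reached.
const withFile = runChain(
  [fakeStatic(new Map([["/users", "static users file"]])), usersRoute],
  { url: "/users" }
);
// Without the file, the static stand-in falls through to the route.
const withoutFile = runChain([fakeStatic(new Map()), usersRoute], { url: "/users" });

console.log(withFile.handledBy, withoutFile.handledBy);
```

No regular expression decides what is "static" here; the only question is whether the first middleware finds something to serve.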
Though I was pretty sure, I tried to reproduce your issue, and the file was served correctly with all the special characters in its name.
Please reconfirm the following:
Your express.static middleware gets registered first.
The resource you are trying to access exists, i.e. the physical path exists. (Whether /USERS and /users both match a file /public/users depends on the file system; on a case-insensitive file system both will match.)
Make sure you don't have a typo.
If this does not resolve your issue, please share your app.js file, or at least part of it if you can't share all of it.
Hope this helps! Let me know if you need further assistance ... :)

SEO issue and remove a trailing slash from URL

We have a web site written in ASP.NET. When you open the following page:
http://concert.local/elki/
You can see the slash "/" at the end. We need to remove it in order to have:
http://concert.local/elki
I've tried some things to make it work, but it doesn't help. For example, when I add the following code in Global.asax.cs file:
protected void Application_BeginRequest(Object sender, EventArgs e)
{
if (HttpContext.Current.Request.Url.ToString().Contains("http://concert.local/elki/"))
{
HttpContext.Current.Response.Status = "301 Moved Permanently";
HttpContext.Current.Response.AddHeader("Location", Request.Url.ToString().ToLower().Replace("http://concert.local/elki/", "http://concert.local/elki"));
}
}
The following error comes up:
Firefox has detected that the server is redirecting the request for this address in a way that will never complete.
This problem can sometimes be caused by disabling or refusing to accept cookies.
There is also the following code:
<asp:Content ID="Content1" runat="server" ContentPlaceHolderID="ContentHead">
<link rel="canonical" href="http://concert.local/elki" />
</asp:Content>
That puts canonical stuff in page header.
How can I get the following URL:
http://concert.local/elki
?
Check out this answer: url trailing slash and seo
It basically says that Google prefers the trailing slash. Just code consistently and you should be fine.
Here is the official answer from Google. Effectively, they don't care whether you have a trailing slash or not:
http://googlewebmastercentral.blogspot.fr/2010/04/to-slash-or-not-to-slash.html
Google treats each URL above separately (and equally) regardless of
whether it’s a file or a directory, or it contains a trailing slash or
it doesn’t contain a trailing slash.

Double slash at beginning of javascript include

I have been looking at the html5 boilerplate and noticed that the jquery include url starts with a double slash. The url is //ajax.googleapis.com/ajax/libs/jquery/1.5.1/jquery.min.js
Why is the http: missing?
I hate answering with a link but this explains it - http://paulirish.com/2010/the-protocol-relative-url/
Using a protocol-relative URL like "//mydomain/myresource" ensures that the content is served via the same scheme as the hosting page. It can make testing a bit more awkward if you open the page via file:// while referencing remote locations, because those URLs will then also resolve to file://. Nevertheless, it avoids the mixed secure/insecure content warnings you would otherwise get.
So if the .html is accessed via HTTPS, the page will not load any script over an insecure connection.
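The resolution rule can be checked with the standard URL class, which is available in browsers and in Node. This is a small sketch; the page addresses (example.com and the local file path) are made up for illustration.

```javascript
// The protocol-relative URL from the HTML5 Boilerplate include.
const scriptSrc = "//ajax.googleapis.com/ajax/libs/jquery/1.5.1/jquery.min.js";

// Resolve it against pages loaded over different schemes (hypothetical pages).
const fromHttps = new URL(scriptSrc, "https://example.com/index.html");
const fromHttp = new URL(scriptSrc, "http://example.com/index.html");
const fromFile = new URL(scriptSrc, "file:///home/user/index.html");

console.log(fromHttps.href); // scheme follows the https page
console.log(fromHttp.href);  // scheme follows the http page
console.log(fromFile.href);  // scheme follows the local file, so the script is not fetched from the network
```

The last case is the testing awkwardness mentioned above: a page opened straight from disk resolves the include with the file scheme.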