How to fix incorrect nginx s3 reverse proxy paths? - amazon-s3

I'm working on an nginx s3 reverse proxy container image to proxy frontend files (Angular apps) from s3 behind an Application Load Balancer. The frontend files are located in the specific folder of the given app name in the s3 bucket. These are angular apps which are built using standard angular commands. The dist contents are uploaded to s3 and then the ALB route paths, along with the nginx locations map to those app folders in s3. For example, here is my nginx conf file:
server {
listen 80;
listen 443 ssl;
ssl_certificate /etc/ssl/nginx-server.crt;
ssl_certificate_key /etc/ssl/nginx-server.key;
server_name timemachine.com;
sendfile on;
default_type application/octet-stream;
resolver 8.8.8.8;
server_tokens off;
location ~ ^/app1/(.*) {
proxy_http_version 1.1;
proxy_buffering off;
proxy_ignore_headers "Set-Cookie";
proxy_hide_header x-amz-id-2;
proxy_hide_header x-amz-request-id;
proxy_hide_header x-amz-meta-s3cmd-attrs;
proxy_hide_header Set-Cookie;
proxy_set_header Authorization "";
proxy_intercept_errors on;
rewrite ^/app1/?$ /app1/index.html;
proxy_pass https://<s3 bucket name here>;
break;
}
}
So there is a corresponding bucket folder /app1 in s3 which has the dist contents and is serving up the index.html. And on the ALB, there are two route paths. The first is /app1 which redirects to https:{port}//app1/ and then the second route path /app1/* which just forwards to the nginx reverse proxy container deployed via ECS Fargate.
This is not using cloudfront. The bucket is proxied internally on https and specific permissions are set on the bucket to be accessible w/in the given VPC.
The angular apps have specific modules, but the issue is since Im not saving any of this content in the container, I can't just do a try_files, or set an index to make this work, since all of this is proxied from s3 and the content is accessed differently.
I can access the app at with the given proxy configuration above, but for other paths, say when I navigate to the part of the apps where its /app1/account and then do a refresh, the page throws an access denied on the bucket and I just get the standard xml page in the browser.
How do I get this to work with all of those other relative paths without having to add each of those paths to nginx or the ALB routes? In other words, I dont want to have to add
location /app1/account {
}
and so on, or something like that. Yes, Im sort of new to nginx, so im still figuring things out.
I was expecting the above proxy to work with all paths on /app1 but im unsure what other route paths need to be added to the ALB or if the regex is off, or what else needs to be added to the nginx conf file.
All that to say, when I enter this
https://timemachine.com/app1
or this,
https://timemachine.com/app1/
both work and just rewrite to the index.html which is good.
After this, when I click on another icon in the UI that directs to another path on /app1/, I get directed to the page correctly at...
https://timemachine.com/app1/news
but then on a refresh on this path, instead of hitting url https://timemachine.com/app1/news, with all the data shown when I accessed this through UI, the url stays at https://timemachine.com/app1/news but the page defaults to s3 bucket access denied on that route(.xml).
The goal is just to be able to reload on the pages I can already access without the UI blowing up and defaulting to the access denied message. So I would like to be able to just enter https://timemachine.com/app1/news, which will display the content, then do a refresh and see the content again.
The are various modules within the angular apps and so these are relative paths, which may be part of the problem.
NOTE: All files, aside from assets folder, are in the base app1 bucket folder. So https://<s3_bucket_name>/app1 (with app1 being the folder).

Angular's docs indicate to use the Frontend Controller pattern for static files like so:
Use try_files, as described in Front Controller Pattern Web Apps,
modified to serve index.html:
try_files $uri $uri/ /index.html;
Obviously, that won't work here (since the files aren't local to nginx) so my understanding is we're looking for equivalent logic to that for when the files are hosted elsewhere.
Route not-assets to index.html
All assets are in the /assets/ folder - so the simplest solution is to look for anything starting with not-that and proxy those requests to the html file for the response:
server {
location ~ /app1/ {
rewrite ^/app1/(?!assets/) /index.html;
proxy_pass https://domain/bucket/app1/;
}
}
That regex means that:
/app1/assets/some.css gets proxied to https://domain/bucket/app1/assets/some.css
/app1/ gets proxied to https://domain/bucket/app1/index.html
/app1/something/else gets proxied to https://domain/bucket/app1/index.html
etc.
Do note that this is going to make your app respond HTTP 200 OK with html to almost any url - which may be confusing.
If there are any problems setting this up, enable the nginx debug log to see to what url requests are being proxied, and determine the difference from what's desired.

Related

Nginx -> Apache 2 authentication -> return to Nginix

We have a nginx and an apache2 server.
Apache2 is configured to manage Kerberos (Active Directory) authentication.
We have a website managed by nginx with a reserved area.
I would know if this is possible:
the user goes to main site managed by nginx
from main site, there is a link to "/login" mapped to apache2:
location /login/ {
proxy_pass http://apache2server/testlogin;
}
when the login is successful, apache2 is configured to go to another nginx webpage, using proxypass too:
ProxyPass /testlogin http://nginxserver/logindone.php
ProxyPassReverse /testlogin http://nginxserver/logindone.php
I wonder if this is the right solution to the problem.
The best way you can implement an external authentication to your NGiNX website is using auth_request directive.
Basically, you can protect any request doing a subrequest to any external web server. The subrequest must return HTTP code 2XX to allow proceeding to the content, and any other HTTP code returned will deny access.
To accomplish that, be sure you've NGiNX with auth_request enabled (compiled with --with-http_auth_request_module). To check that, use the following command at shell:
nginx -V 2>&1 | grep "http_auth_request_module"
Add the auth_request directive to the location you want to protect, specifying an internal location where the authorization subrequest will be forwarded to, using:
location /system/ {
auth_request /auth;
#...
}
So, when a request is made to /system/ location, the system will create a subrequest to /auth location. Now we need to create the internal /auth location. We can use the following example below:
location = /auth {
internal;
proxy_pass http://my.app.webserver/auth_endpoint;
proxy_pass_request_body off;
proxy_set_header Content-Length "";
#...
}
Here, we created the /auth internal location. We used the internal directive to disable external NGiNX access (any external request to /auth will not be processed by this location). Also, we removed the request body content and set the request length to zero, removing any original request variable. We do a subrequest to http://my.app.webserver/auth_endpoint passing all requested cookies, so your backend application could determine if user has access or not.
If you need to know the original requested URI, you can add it on an extra HTTP header at subrequest adding:
proxy_set_header X-Original-URI $request_uri;
You can learn more about NGiNX auth_request directive here.

nginx - proxy facebook/bots to a different server without changing canonical URL

TLDR;
How can I make it so that all scraper/bot requests reaching my frontend https://frontend.example.test/any/path/here are fed the data from https://backend.example.test/prerender/any/path/here without changing the canonical URL?
I have a complex situation where I have a Vue app that pulls data from a php API to render data. These are hosted in China so niceties like netlify prerender and prerender.io are not an option.
Initially I tried:
if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
rewrite ^/(.*)$ https://backend.example.test/prerender/$1 redirect;
}
which workd but Facebook used backend.example.text the canonical URL frontend.example.test.
Setting the og:url to the frontend app caused problems due to a redirect loop. I tried then setting the og:url to the frontend with a query param that skipped the nginx forward, but for some reason this wasn't working properly on the live server and I imagine facebook would still end up pulling the data from the final url anyhow.
Thus I imagine the only solution is to use proxy_pass but it is not permitted with a URI inside an if statement (and I have read the if is evil article).
I feel like all I need is something like a functioning version of:
location / {
if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
proxy_pass https://backend.example.test/prerender;
}
...
}
(I am of course aware of the contradiction of having to have Facebook sharing work in China, but the client is requesting this for their international users as well).
Here is the solution for your problem:
https://www.claudiokuenzler.com/blog/934/nginx-proper-way-different-upstream-user-agent-based-reverse-proxying-rule-without-if
I'm copying here the main parts in case the link breaks:
Create a dynamic target upstream with the map directive:
map "$http_user_agent" $targetupstream {
default http://127.0.0.1:8080;
"~^mybot" http://127.0.0.1:8090;
}
Here "~^mybot" is a regular expression, if the user agent matches that expression it will use that upstream server.
If the user-agent does not match any entries, Nginx will use the "default" entry (saving http://127.0.0.1:8080 as $targetupstream variable).
Then you just have to use a that upstream in a proxy pass setting:
location / {
include /etc/nginx/proxy.conf;
proxy_set_header X-Forwarded-Proto https;
proxy_pass $targetupstream;
}
Now, you could use one upstream pointing to locahost at a port that is being used by nginx to serve static files (for client only) and another port for the server renderer.

Nginx reverse proxy didnt load site correctly

I have an nginx configuration which listens to any subdomain *.mydomain.com,
and I want to use subdomain as variable, to proxy request to other site.
Here is my nginx configuration
server {
listen 80;
server_name "~^(?<subdomain>.*).mydomain.com";
location / {
resolver 1.1.1.1 1.0.0.1 ipv6=off;
proxy_pass http://hosting.mydomain.com/$subdomain/;
proxy_redirect off;
access_log /var/log/nginx/proxy.log;
}
}
As I request the site directly and it loads perfectly
Site placed on AWS S3, and bucket static website address cnamed to mydomain
However, when I try to access via user1.mydomain.com, the page didn't load images, and css
This is the same site
And in browser network panel shows
Difference between direct and proxy access
This issue is made, because I have many sites stored in S3 bucket and located in different folders (the folder name is used as subdomain).
And I want to use a single domain to access all of them via subdomains.
Thanks in advance
You forgot to proxy pass the URI, you're serving user1/index.html for every request, including for JS and CSS requests, it's why all of responses are the same size (2kb, the size of user1/index.html), and it's also why you're getting Uncaught SyntaxError: Unexpected token < in the first line of Enterprise_skeleton.bundle.js because it's returning an HTML document that starts with <!doctype html> instead of the actual JS bundle.
Change
location / {
proxy_pass http://hosting.mydomain.com/$subdomain/;
}
to
location / {
proxy_pass http://hosting.mydomain.com/$subdomain$uri;
}

nginx different authentication for different directories

finding alternative of htaccess in apache for nginx, want to add different authentication rules for different directories. Different .htpasswd files for different directories. how to do it? Just like bitbucket does, it runs on nginx too.
I'm pretty sure that the authentication within BitBucket is handled by the application, there's no .htpasswd files anywhere to be found, authentication isn't handled by the web server.
However you can configure nginx to use a different .htpasswd for different paths using different location blocks for each path
location / {
root /var/www/app/;
index index.html index.php;
auth_basic_user_file /var/www/app/.htpasswd;
}
location /pathA {
root /var/www/app/pathA;
index index.html index.php;
auth_basic_user_file /var/www/app/pathA/.htpasswd;
}
or something to that effect

How to setup mass dynamic virtual hosts in nginx?

Been playing with nginx for about an hour trying to setup mass dynamic virtual hosts.
If you ever done it in apache you know what I mean.
Goal is to have dynamic subdomains for few people in the office (more than 50)
Perhaps doing this will get you where you want to be:
server {
root /sites/$http_host;
server_name $http_host;
...
}
I like this as I can literally create sites on the fly, just create new directory named after the domain and point the DNS to the server ip.
You will need some scripting knowledge to put this together. I would use PHP, but if you are good in bash scripting use that. I would do it like this:
First create some folder (/usr/local/etc/nginx/domain.com/).
In main nginx.conf add command : include /usr/local/etc/nginx/domain.com/*.conf;
Every file in this folder should be different vhost names subdomain.conf.
You do not need to restart nginx server for config to take action, you only need to reload it : /usr/local/etc/rc.d/nginx reload
OR you can make only one conf file, where all vhosts should be set. This is probably better so that nginx doesn't need to load up 50 files, but only one....
IF you have problems with scripting, then ask question about that...
Based on user2001260's answer, later edited by partlov, here's my outcome.
Bear in mind this is for a dev server located on a local virtual machine, where the .dev prefix is used at the end of each domain. If you want to remove it, or use something else, the \.dev part in the server_name directive could be edited or altogether removed.
server {
listen 80 default_server;
listen [::]:80 default_server;
# Match any server name with the format [subdomain.[.subdomain...]].domain.tld.dev
server_name ~^(?<subdomain>([\w-]+\.)*)?(?<domain>[\w-]+\.[\w-]+)\.dev$;
# Map by default to (projects_root_path)/(domain.tld)/www;
set $rootdir "/var/www/$domain/www";
# Check if a (projects_root_path)/(subdomain.)(domain.tld)/www directory exists
if (-f "/var/www/$subdomain.$domain/www"){
# in which case, set that directory as the root
set $rootdir "/var/www/$subdomain.$domain/www";
}
root $rootdir;
index index.php index.html index.htm index.nginx-debian.html;
# Front-controller pattern as recommended by the nginx docs
location / {
try_files $uri $uri/ /index.php;
}
# Standard php-fpm based on the default config below this point
location ~ \.php$ {
include snippets/fastcgi-php.conf;
fastcgi_pass unix:/run/php/php7.0-fpm.sock;
}
location ~ /\.ht {
deny all;
}
}
The regex in server_name captures the variables subdomain and domain. The subdomain part is optional and can be empty. I have set it so that by default, if you have a subdomain, say admin.mysite.com the root is set to the same root as mysite.com. This way, the same front-controller (in my case index.php) can route based on the subdomain. But if you want to keep an altogether different application in a subdomain, you can have a admin.mysite.com dir and it will use that directory for calls to admin.mysite.com.
Careful: The use of if is discouraged in the current nginx version, since it adds extra processing overhead for each request, but it should be fine for use in a dev environment, which is what this configuration is good for. In a production environment, I would recommend not using a mass virtual host configuration and configuring each site separately, for more control and better security.
server_name ~^(?<vhost>[^.]*)\.domain\.com$;
set $rootdir "/var/www/whatever/$vhost";
root $rootdir;
As #Samuurai suggested here is a short version Angular 5 with nginx build integration:
server {
server_name ~^(?<branch>.*)\.staging\.yourdomain\.com$;
access_log /var/log/nginx/branch-access.log;
error_log /var/log/nginx/branch-error.log;
index index.html;
try_files $uri$args $uri$args/ $uri $uri/ /index.html =404;
root /usr/share/nginx/html/www/theft/$branch/dist;
}
Another alternative is to have includes a few levels deep so that directories can be categorized as you see fit. For example:
include sites-enabled/*.conf;
include sites-enabled/*/*.conf;
include sites-enabled/*/*/*.conf;
include sites-enabled/*/*/*/*.conf;
As long as you are comfortable with scripting, it is not very hard to put together some scripts that will quickly set up vhosts in nginx. This slicehost article goes through setting up a couple of vhosts and does it in a way that is easily scriptable and keeps the configurations separate. The only downside is having to restart the server, but that's to be expected with config changes.
Update: If you don't want to do any of the config maintaining yourself, then your only 2 options (the safe ones anyways) would be to either find a program that will let your users manage their own chunk of their nginx config (which will let them create all the subdomains they want), or to create such a user-facing management console yourself.
Doing this yourself would not be too hard, especially if you already have the scripts to do the work of setting things up. The web-based interface can call out to the scripts to do the actual work so that all the web interface has to deal with is managing who has access to what things.