AWS S3 integration with MediaWiki

I'm working with MediaWiki.
I want to change the upload directory path to AWS S3. I tried these two extensions, but I get warning messages and I don't know whether they are working correctly:
https://www.mediawiki.org/wiki/Extension:LocalS3Repo and
https://www.mediawiki.org/wiki/Extension:AWS
If anybody is working with these extensions, or has achieved this some other way, please explain.

I have been successfully using the method described here, though in step 6, rather than using an Apache rewrite, I changed the image paths in LocalSettings.php.
(It was quite a lot of work though, and I never figured out a way to set the Cache-Control and Expires headers on the files, which was the real reason I wanted to do this in the first place.)
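For reference, the LocalSettings.php change amounts to pointing the wiki's public upload URL at the bucket. A minimal sketch, assuming a placeholder bucket named example-wiki-media:

# LocalSettings.php: serve uploaded files from the S3 bucket
# instead of the local /images path ("example-wiki-media" is
# a placeholder bucket name)
$wgUploadPath = "https://example-wiki-media.s3.amazonaws.com/images";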

Related

Archiving an old PHP website: will any webhost let me totally disable query string support?

I want to archive an old website which was built with PHP. Its URLs are full of .phps and query strings.
I don't want anything to actually change from the perspective of the visitor -- the URLs should remain the same. The only actual difference is that it will no longer be interactive or dynamic.
I ran wget --recursive to spider the site and grab all the static content. So now I have thousands of files such as page.php?param1=a&param2=b. I want to serve them up as they were before, so that means they'll mostly have Content-Type: text/html, and the webserver needs to treat ? and & in the URL as literal ? and & in the files it looks up on disk -- in other words it needs to not support query strings.
And ideally I'd like to host it for free.
My first thought was Netlify, but deployment on Netlify fails if any files have ? in their filename. I'm also concerned that I may not be able to tell it that most of these files are to be served as text/html (and one as application/rss+xml) even though there's no clue about that in their filenames.
I then considered https://surge.sh/, but hit exactly the same problems.
I then tried AWS S3. It's not free but it's pretty close. I got further here: I was able to attach metadata to the files I was uploading so each would have the correct content type, and it doesn't mind the files having ? and & in their filenames. However, its webserver interprets ?... as a query string, and it looks up and serves the file without that suffix. I can't find any way to disable query strings.
Did I miss anything -- is there a way to make any of the above hosts act the way I want them to?
Is there another host which will fit the bill?
If all else fails, I'll find a way to transform all the filenames and all the links between the files. I found how to get wget to transform ? to #, which may be good enough. It would be a shame to go this route, however, since then the URLs are all changing.
I found a solution with Netlify.
I added the wget options --adjust-extension and --restrict-file-names=windows.
The --adjust-extension part adds .html at the end of filenames which were served as HTML but didn't already have that extension, so now we have for example index.php.html. This was the simplest way to get Netlify to serve these files as HTML. It may be possible to skip this and manually specify the content types of these files.
The --restrict-file-names=windows alters filenames in a few ways, the most important of which is that it replaces ? with #. This is needed since Netlify doesn't let us deploy files with ? in the name. It's a bit of a hack; this is not really what this option is meant for.
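Put together, the crawl command looked something like this (the site URL is a placeholder):

wget --recursive --adjust-extension --restrict-file-names=windows https://old-site.example.com/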
This gives static files with names like myfile.php#param1=value1&param2=value2.html and myfile.php.html.
I did some cleanup. For example, I needed to adjust a few link and resource paths to be absolute rather than relative due to how Netlify manages presence or lack of trailing slashes.
I wrote a _redirects file to define URL rewriting rules. As the Netlify redirect options documentation shows, we can test for specific query parameters and capture their values. We can use those values in the destinations, and we can specify a 200 code, which makes Netlify handle it as a rewrite rather than a redirection (i.e. the visitor still sees the original URL). An exclamation mark is needed after the 200 code if a "query-string-less" version (such as mypage.php.html) exists, to tell Netlify we are intentionally shadowing.
/mypage.php param1=:param1 param2=:param2 /mypage.php#param1=:param1&param2=:param2.html 200!
/mypage.php param1=:param1 /mypage.php#param1=:param1.html 200!
/mypage.php param2=:param2 /mypage.php#param2=:param2.html 200!
Of course, if not all query parameter combinations actually occur in the dumped files, the unused redirect lines can be omitted.
There's no need for a final /mypage.php /mypage.php.html 200 line, since Netlify automatically looks for a file with a .html extension added to the requested URL and serves it if found.
I wrote a _headers file to set the content type of my RSS file:
/rss.php
Content-Type: application/rss+xml
I hope this helps somebody.

Overriding an httpd/apache upstream proxy HTTP code with another

I have some React code (written by someone else) that needs to be served. The preferred method is via a Google Cloud Storage bucket, fronted by Google's Cloud CDN, and this works. However, due to some quirks in the code, there is a requirement to override 404s with 200s and serve content from the homepage instead (i.e. if there is a 404, don't serve a 404; serve the content of the homepage and return a 200 instead).
(If anyone is interested, this override is currently implemented in CloudFront on AWS. Google Cloud CDN does not provide this functionality yet.)
So, if the code is served at "www.mysite.com/app/" and someone hits "www.mysite.com/app/not-here" (which would return a 404), what should happen is that the response should NOT be 404, but a 200 with the content being served from index.html instead.
I was able to get this working by bundling all the code inside a docker container and then using the solution here. However, this setup means if we have a code change, all the running containers need to be restarted, and the customer expects zero downtime, hence the bucket solution.
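For context, that local-file solution is presumably the usual mod_rewrite fallback, roughly:

RewriteEngine On
# Requests that don't match a real file or directory on disk
# are internally rewritten to index.html, which is then
# served with a 200
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /index.html [L]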
So I now need to do the same thing but with the files being proxied in (with the upstream being the CDN).
I cannot use the original solution since the files are no longer local, and httpd can't check for existence of something that is not local.
I've tried things like ProxyErrorOverride and ErrorDocument, and managed to get it to redirect, but that is not what is needed.
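What I tried looks roughly like this (the CDN hostname is a placeholder):

ProxyPass / https://cdn.example.com/
ProxyPassReverse / https://cdn.example.com/
# Intercept error responses coming back from the upstream
ProxyErrorOverride On
# A full URL here makes Apache send the client a 302 redirect
# rather than rewriting the response body and status
ErrorDocument 404 https://www.mysite.com/app/index.html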
Does anyone know how/if this can be done?
If the question is: how do you catch the 404 error returned by Cloud Storage for a missing file with httpd/apache? I don't know.
However, I don't think that is the best solution. Serving files directly from Cloud Storage is convenient, but not industrial-grade.
Imagine you deploy several broken files in succession: how do you roll back to a stable state?
It's best to package each code release as an atomic unit, a container for instance. Each version lives in a different container, and performing a rollback is easier and more consistent.
Now, your "container restart" issue. I don't know which platform you run your containers on. If you run them on Compute Engine (a VM), that's probably the worst option. Today there are container orchestration systems that let you deploy, scale containers up and down, and perform progressive rollouts, replacing the running containers with a newer version without downtime.
Cloud Run is a wonderful serverless solution for that; you also have Kubernetes (GKE on Google Cloud), which you can use with Knative for a better developer experience.
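As a sketch, rolling out a new version on Cloud Run is a single command (service, image, and region names are placeholders), and traffic shifts to the new revision without downtime:

# Deploy a new revision; Cloud Run routes traffic to it once healthy
gcloud run deploy my-app --image gcr.io/my-project/my-app:v2 --region us-central1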

Amazon S3 static website doesn't serve css or js files

I've been trying to set up a static website on Amazon S3. I've got things set up to use my personal domain, and so far I've been able to access the content just fine. All the links work, both for pages in the "root" directory and pages in subfolders, so it seems that S3 can follow the paths I'm using.
The problem is that none of the CSS styling is being applied to the pages (it works fine on the development server on my local machine). I've tried using relative and absolute paths, but this doesn't seem to be the problem. I can see the content just as it should be, and I can navigate the site normally, but there's just no styling.
I've tried messing with permissions on the folders, but I'm clearly not getting something right. Am I missing something obvious? Surely S3 can use individual stylesheets?
Thanks in advance for any thoughts.
The reason is that Amazon S3 sets the Content-Type of CSS files to binary/octet-stream by default; follow this tutorial to solve the issue.
You need to select your CSS file and then, from the Properties tab, click on Metadata. This lets you assign optional metadata to the object as name-value pairs. The Content-Type key must have the value text/css.
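The same fix can be applied from the command line by copying the object onto itself with replaced metadata; a sketch with placeholder bucket and key names:

# Rewrite the object's metadata in place via a self-copy
aws s3 cp s3://my-bucket/css/styles.css s3://my-bucket/css/styles.css --content-type text/css --metadata-directive REPLACE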

Why is my favicon appearing on the Amazon S3 endpoint but not on the forwarded domain?

I have tried everything possible and am out of ideas as to why my favicon is still not appearing. If I told you how much time I've spent trying to figure this out, you'd understand why I'm on the verge of losing my mind.
Here's the rundown (I'm not technical, just starting to learn, so please bear with me):
I'm using Amazon S3 as my host. GoDaddy is the DNS, and I have forwarding with masking set up so that the Amazon endpoint is directed to the actual domain.
Here's the strange thing: the favicon appears on the Amazon endpoint but doesn't on the forwarded domain, which is where I want it to appear. The favicon also appears when I do some testing using Dreamweaver.
I can assure you that it isn't a matter of clearing the cache, as I've done that numerous times and have run tests to make sure it's working. I've tried all the possible variations of the code and nothing works. I'm led to believe that it's not an issue with the code, cache, or file, but rather something else that is out of my realm of knowledge.
So I come to Stackoverflow.
Please-- any help will be GREATLY appreciated!
For anyone having this problem: making the favicon public and using the direct link found in the file's properties on S3 did the trick.
That means using a full URL that is always going to work from everywhere. Depending on how things are set up, a hostname could resolve to something like localhost on multiple machines, so you want to make sure that the hostname you're using always has the resource at that location. CORS shouldn't have anything to do with it, as this is a standard full GET request.
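In practice that means referencing the favicon by its full S3 URL in the page head; a sketch with a placeholder bucket name:

<!-- absolute URL to the public S3 object, not a relative path -->
<link rel="icon" type="image/x-icon" href="https://my-bucket.s3.amazonaws.com/favicon.ico">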

Updating Files on Apache

I'm having trouble with my Apache Web Server. I have a folder (htdocs\images) where I have a number of images already in place. I can browse them and see them on my web server (and access them via HTML). I added a new image in there today, and went to browse to it, and it can't be found. I double and triple checked the path and everything. I even restarted Apache and that didn't seem to help.
I'm really confused as to what's going on here. Anybody have any suggestions?
Thank you.
Edit: I just turned on the ability for the images directory to be listed, browsed to it (http://127.0.0.1/images/), and I was able to see all the previous images that were in the folder, but not the new one.
Turn directory indexes on for htdocs\images, remove (or move out of the way) any index.* files, and point your browser at http://yoursite/images/
That should give you a full listing of files in that directory. If the file you're looking for isn't there, then Apache is looking at a different directory than you think it is. You'll have to search your httpd.conf for clues -- DocumentRoot, Alias, AliasMatch, Redirect, RedirectMatch, RewriteRule -- there are probably dozens of apache directives that could be causing the web server to get its documents from somewhere other than where you think it's looking.
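Enabling the listing looks something like this in httpd.conf (the directory path is a placeholder for your actual htdocs location):

# Allow directory listings for the images folder
<Directory "C:/Apache24/htdocs/images">
    Options +Indexes
</Directory>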
Make sure the case and spelling are 100% correct.
There is no magic in programming (some may disagree :), so look for silly errors. Wrong server? Wrong case? Wrong extension?
There's a chance it could be due to the cookies stored on your device. I would delete all cookies for the website you're working on before refreshing again.