Static site hosting on S3 - 404 errors on files that show in S3 management console

I'm hosting a site using S3. Until recently, no problems. Last week, I uploaded some new .html files, and all of these new files result in a 404 error, while older .html files load with no problems.
I can see the new files in the S3 bucket using the web interface. When I compare permissions and other settings for the new vs. old files, I can't find any difference. When I view the properties of a new file, the provided link works (although none of my images or CSS loads). I can also get the file to load using the bucket's endpoint. But when using our custom domain name, all new files fail to load with the following error:
404 Not Found
Code: NoSuchKey
Message: The specified key does not exist.
Key: test1/s3a-debug.html
RequestId: 2573FF0356xxxxxxxxxx
HostId: qYnv8alWnV/xxxxxxxxx/xxxxxxxxxxxx+NPkHLO8arfTVizUds=
I'm at a loss to explain why recently uploaded files throw a 404, and yet older files that look identical load just fine. I've seen other people report similar problems, but I've yet to find a thread with a solution.
Any help would be greatly appreciated. Thanks!

Go to S3 bucket properties -> Static website hosting and add index.html as the Error document.
S3 treats anything appended to the host URL as an object key to look up. Once index.html is set as the Error document, any request whose path cannot be resolved as a key (or folder) falls back to index.html, which can then handle the redirect.
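For reference, the same configuration can be applied from the command line (a sketch, assuming the AWS CLI is set up and the bucket is named my-bucket):
aws s3 website s3://my-bucket --index-document index.html --error-document index.html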

Related

How to disable the download of files in an Apache2 webserver?

I took over a website which I'm supposed to admin, and somebody brought to my attention that certain indexes and files are accessible which shouldn't be. I will be using dummy names.
You were able to access example.com/intern before, but I changed a line in /etc/apache2/apache2.conf according to this https://stackoverflow.com/a/31445273 . This partly worked: I now get a 403 Forbidden when I navigate to example.com/intern, and that's basically what I want.
However, the directory intern contains a file called file.php.bak as well as file.php. When I navigate to example.com/intern/file.php I get a blank page. I am not sure, however, whether file.php can be accessed some other way, because the page does load and I don't get a 403 like before. What is far worse, and the reason I am struggling with this: if I go to example.com/intern/file.php.bak, my browser (Firefox) offers to download file.php.bak, which I can then read in plaintext. I want none of the files in intern to be accessible via the website, but I have no idea how to do this. Can anybody help?
Things I've tried:
1. Removing Indexes from the apache2.conf file as mentioned above. This only puts the 403 on the directory itself, not recursively on all the files in it.
2. Writing a .htaccess file as described here: https://fedingo.com/how-to-prevent-direct-file-download-in-apache-server/ and putting it in intern, with the same result as in 1.
3. Putting an empty index.html file in the intern directory. This removes the 403 on example.com/intern, but the download at example.com/intern/file.php.bak is still possible. I've also tried index.php, with the same result.
File System:
The application runs from /var/www/application, which also contains the /var/www/application/index.php I want to use. The /var/www/application/intern directory is there as well. While it is no longer browsable, the files in it are still accessible: /var/www/application/intern/file.php can be reached via example.com/intern/file.php, but it apparently can't be downloaded or read, since it only yields a blank page. /var/www/application/intern/file.php.bak, however, can be downloaded via example.com/intern/file.php.bak.
Let's say Apache document root is set to DocumentRoot "/folder_one/folder_two"
Placing files in folder_one will prevent people from browsing your Apache server and requesting those files directly.
Place an index file in folder_two and include some code, such as PHP, to tell Apache to include whatever files you want from folder_one.
In this manner Apache will still be able to serve whatever files you want from folder_one, and people will not be able to request the files directly, as they are located in a directory above the Apache document root.
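A minimal sketch of that approach, assuming PHP (the file name secret.php is hypothetical):
<?php
// /folder_one/folder_two/index.php
// secret.php sits one level above the document root, so Apache will
// never serve it directly, but PHP can still read and include it.
require '/folder_one/secret.php';
Applied to the question, that would mean moving the contents of intern above /var/www/application and including them from index.php.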

Browser cannot download .txt file if the project is .Net Core Restful API

I am trying to activate SSL for my project, but one of the URLs doesn't respond.
This is the website; when you paste the URL into the browser, it downloads the .txt file.
http://www.xxx.co.uk/.well-known/pki-validation/83CB00D29E282E1FFD6DFB220F030EF4.txt
This is the RESTful .NET Core API domain; when you paste the URL into the browser, it returns a 404 error.
http://api.xxx.co.uk/.well-known/pki-validation/83CB00D29E282E1FFD6DFB220F030EF4.txt
Under IIS I have compared the physical paths and permissions, and everything seems to be the same. I believe something in the REST API blocks the .txt file download. Should I check the web.config, or is there something I need to add to the C# source code? Should I update ConfigureServices(IServiceCollection services)? In IIS I have checked MIME Types, and .txt is already defined.
Any suggestions?
I believe my problem is related to this subject, but I am not sure how to enable/define .txt files in startup.cs. Any code snippets?
How to Serve Static File
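For anyone who wants the startup.cs route, a minimal sketch using ASP.NET Core's standard static-files middleware (ASP.NET Core 3.x+ signatures; the physical .well-known folder under the content root is an assumption):
// Startup.cs - requires:
//   using System.IO;
//   using Microsoft.Extensions.FileProviders;
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseStaticFiles(); // serve wwwroot as usual
    // Map /.well-known requests to a physical .well-known folder
    app.UseStaticFiles(new StaticFileOptions
    {
        FileProvider = new PhysicalFileProvider(
            Path.Combine(env.ContentRootPath, ".well-known")),
        RequestPath = "/.well-known",
        ServeUnknownFileTypes = true // also serve files with unmapped MIME types
    });
}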
EDIT: Finally I found a solution, and the browser can now download the .txt files. What I did was create a virtual directory in IIS. Here are the steps:
1. Go to the C: drive
2. Create a new folder called well-known
3. Inside the well-known folder, create another folder named pki-validation
4. So far, your folders should look like this: C:\well-known\pki-validation
5. Upload the TXT file to the pki-validation folder
6. Open the IIS Manager on your server
7. Right-click your website and select Add Virtual Directory
8. In the Alias field, enter .well-known
9. In the Physical Path field, enter the path to the well-known folder, for example: C:\well-known
10. Press OK to create the alias
The URLs are now serving the .txt files. I hope these steps save other developers time one day.
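The same virtual directory can also be created from the command line with appcmd (a sketch; the site name "Default Web Site" is an assumption):
%windir%\system32\inetsrv\appcmd.exe add vdir /app.name:"Default Web Site/" /path:"/.well-known" /physicalPath:"C:\well-known"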

Akamai CDN Issue with URL Query Parameter

I am working on a client project where the Akamai CDN is configured. They use Amazon S3 for hosting.
Problem:
I've committed the code to a branch and can see the changes deployed in the codebase on the server
Now I am hitting the server URL in a browser to verify my code change
I couldn't see the UI change from my commit
I observed that the CSS file URL comes with query parameters (i.e.: server.com/css/filename.css??browserId=other&themeId=AbcTheme_WAR_abctheme&?t=125786954258&languageId=en_US&b=8569&t=1259648753695)
Now I open the same URL in the browser, but with the query parameters removed from the file URL
This time I can see my changes in the same file
Questions:
Is this an issue related to the CDN?
Is the CDN managing different versions of the same file to be served?
If so, shouldn't my changes be merged into the latest version of the file that the webpage points to, i.e. the URL with the query parameters?
I know the CDN will take time to refresh pages, but I am trying to verify my changes 48 hours after the deployment.
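One diagnostic I know of: Akamai can report its cache key and cacheability for a URL via special Pragma request headers, if debug headers are enabled on the property (a sketch; availability depends on the Akamai configuration):
curl -I -H "Pragma: akamai-x-cache-on, akamai-x-get-cache-key, akamai-x-check-cacheable" "https://server.com/css/filename.css?browserId=other&languageId=en_US"
The X-Cache-Key in the response shows whether the query string is part of the cache key, i.e. whether each parameter combination is cached as a separate object.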
Any help would be appreciated.
Thanks.

Make Indexed File Downloadable In Apache Solr

I am trying to index a PDF file in Solr, which I have done successfully using the command
curl "http://localhost:8983/solr/update/extract?literal.id=id&commit=true" -F "myfile=@filename.pdf"
I am able to see the file contents and search them, but when I click on the file name it shows
HTTP ERROR 404
Problem accessing /solr/collection1/id. Reason:
not found
What I want is a link that allows downloading the file. I know Solr merely indexes the file and doesn't host it for download. I was wondering if there is a way to add a location attribute like you have done and proceed from there. Can you please share what you have done? If you need any more clarity regarding my problem, do ask.
We have the actual files hosted through a separate web application, which they are downloaded from with auditing and additional security.
You can always host these files directly through an HTTP server.
If the file names include the id, it is as easy as appending id.extension to the fixed HTTP host URL.
Otherwise, index the path of the file with an additional parameter, e.g. literal.url.
The URL will then be a Solr field available in the Solr response.
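A sketch of what that indexing call could look like (the hosting URL files.example.com is hypothetical, and literal.url assumes a url field exists in the schema):
curl "http://localhost:8983/solr/update/extract?literal.id=id&literal.url=http://files.example.com/filename.pdf&commit=true" -F "myfile=@filename.pdf"
Search results will then carry the url field, which the UI can render as a download link.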

Directory Listing in S3 Static Website

I have set up an S3 bucket to host static files.
When using the website endpoint (http://.s3-website-us-east-1.amazonaws.com/), it forces me to set an index file, and when a file isn't found it throws an error instead of listing the directory contents.
When using the S3 endpoint (.s3.amazonaws.com), I get an XML listing of the files, but I need an HTML listing with links users can click to reach each file.
I have tried setting the permissions of all files and the bucket itself to "List" for "Everyone" in the AWS Console, but still no luck.
I have also tried some of the JavaScript alternatives, but they either don't work under the website URL (which redirects to the index file) or just don't work at all. As a last resort, a collapsible JavaScript listing would be better than nothing, but I haven't found a good one.
Is this possible? If so, do I need to change permissions, ACL or something else?
I've created a simple bit of JS that creates a directory index in the HTML style you are looking for: https://github.com/rgrp/s3-bucket-listing
The README has specific instructions for handling Amazon S3 "website" buckets: https://github.com/rgrp/s3-bucket-listing#website-buckets
You can see a live example of the script in action on this s3 bucket (in website mode): http://data.openspending.org/
There is also this solution: https://github.com/caussourd/aws-s3-bucket-listing
Similar to https://github.com/rgrp/s3-bucket-listing, but I couldn't make that one work with Internet Explorer. https://github.com/caussourd/aws-s3-bucket-listing works with IE and also adds the ability to order the files by name, size and date. On the downside, it doesn't follow folders: only the files at one level are displayed.
This might solve your problem. Security settings for Everyone group:
(you need the bucketexplorer.com software for this)
If you are sharing files over HTTP, you may or may not want people to be able to list the contents of a bucket (folder). If you want the bucket contents to be listed when someone enters the bucket name (http://s3.amazonaws.com/bucket_name/), edit the Access Control List and give the Everyone group the access level of Read (and do likewise with the contents of the bucket). If you don't want the bucket contents listable but do want to share the files within it, disable Read access for the Everyone group on the bucket itself, and then enable Read access on the individual files within the bucket.
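The same ACLs can also be set with the AWS CLI instead of Bucket Explorer (a sketch; bucket_name and the object key are placeholders):
aws s3api put-bucket-acl --bucket bucket_name --acl public-read
aws s3api put-object-acl --bucket bucket_name --key path/to/file --acl public-read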
I created a much simpler solution. Just place the index.html file in the root of your bucket and it will do the job. No configuration required. https://github.com/prabhatsharma/s3-directorylisting
I had a similar problem and created a JavaScript-and-iframe solution that works pretty well for listing directories on S3 websites. You just have to drop a couple of .html files into the directory you want to list. You can find it here:
https://github.com/adam-p/s3-file-list-page
I found s3browser, which allowed me to set up a directory on the main website that allows browsing of the S3 bucket. It worked very well and was very easy to set up.
Here is another approach, based on pure JavaScript and the AWS SDK for JavaScript: no need for PHP or any other engine, just a plain website (Apache or even IIS).
https://github.com/juvs/s3-bucket-browser
It is not intended to be deployed in the bucket itself (to me, that makes no sense).
Using the new IAM users from AWS, you can provide more specific and secure access to your buckets. There is no need to publish your bucket as a website and make everything public.
If you want to secure the access, you can use conventional methods to authenticate users on your existing website.
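For example, an IAM policy along these lines grants listing and read access to a single bucket without making it public (a sketch; my-bucket is a placeholder):
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": "s3:ListBucket", "Resource": "arn:aws:s3:::my-bucket" },
    { "Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::my-bucket/*" }
  ]
}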
Hope this helps too!