Liferay robots.txt new line disappearing - seo

I am trying to exclude my entire Liferay testing environment from search engines.
The line break in robots.txt disappears when the file is served, and using \r\n or \n as separators does not help either.
This is my robots file:
User-agent: *
Disallow: /
This is my web.xml snippet:
<filter>
<filter-name>RobotKiller</filter-name>
<filter-class>com.robot.kill.KillARobot</filter-class>
</filter>
<filter-mapping>
<filter-name>RobotKiller</filter-name>
<url-pattern>/robots.txt</url-pattern>
</filter-mapping>
domain/robots.txt:
User-agent: *Disallow: /

I think I know what the problem is. The Content-Type HTTP header is set incorrectly on this file. You have the content type set to text/html when it should be set to text/plain.
When you view the file in your browser, it interprets it as HTML, which treats new lines as spaces. You should be able to use your browser's view source feature to see it formatted correctly.
The robots.txt file may work for the search bots, even with an incorrect Content-Type header, but it would be better not to take any chances.
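To illustrate the point about the Content-Type header, here is a minimal sketch of serving the same two rules as text/plain. It is written in Python as a plain WSGI app purely for illustration and is not how Liferay or the filter in the question works; the names robots_app and call_app are hypothetical.

```python
from wsgiref.util import setup_testing_defaults

# The two rules from the question, separated by real newlines.
ROBOTS_BODY = "User-agent: *\nDisallow: /\n"


def robots_app(environ, start_response):
    """Serve the robots rules as text/plain so line breaks are shown as-is."""
    body = ROBOTS_BODY.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/robots.txt"):
    """Call a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

With text/plain, a browser shows the two lines separately instead of collapsing the newline into a space.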

Related

apache server serves some pages with upper case UTF-8 and others with lower case utf-8 charset header

I am using MAMP to host PHP pages on an Apache server. In my root directory I have a .htaccess file to do some rewrites and other things, including adding charsets to the headers of certain files like so:
AddCharset UTF-8 .css .js
Now this code is working fine but I have an issue that is really annoying.
The page test.php returns a bunch of html which contains links to some .css and .js files. Now when I call up the test.php file over my browser it returns the following in the response header for each file:
text/html; charset=UTF-8
text/css; charset=utf-8
application/javascript; charset=utf-8
I really don't understand why it returns the charset in upper case for HTML but in lower case for all other content types. This is not a severe functional problem, but it really bothers me. To solve this I tried to include the extension .html in the AddCharset directive like so:
AddCharset UTF-8 .css .js .html
But this does not change anything; it still returns the charset for the text/html file in upper case. Could anyone tell me how to make the server return charset headers consistently in either upper or lower case?
After some experimenting to see where the charset header was actually set, I discovered that PHP adds a charset to its output automatically. To change UTF-8 to utf-8 you need to change the default_charset setting in the php.ini file. This fixed the problem.
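As a side note, the charset value is compared case-insensitively by clients, so the two spellings are equivalent. A minimal Python sketch (the helper name charset_of is hypothetical) that parses the header values from the question with the standard library:

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, lower-cased."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() coerces the charset to lower case.
    return msg.get_content_charset()


html_charset = charset_of("text/html; charset=UTF-8")
css_charset = charset_of("text/css; charset=utf-8")
```

Both calls yield the same value, which is why the difference is cosmetic rather than functional.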

adding x-content-type-options nosniff results in blank page being displayed

This is my first question. For reference, I support middleware and am not a developer. I’m trying to get the directive Header always set X-Content-Type-Options nosniff working in the IHS httpd.conf file. Without the nosniff header the page displays correctly, but when I add the header a blank page is displayed.
Below are some of the response header values. The content-length is non-empty and the return code is 200 so the content is there.
I believe the nosniff header is doing its job: the Content-Type is text/html, but the application server is returning text/javascript content (see the extract from the response below), so the browser doesn’t display the page. Is that right?
The Response header sets the following:
x-Content-Type-Options nosniff
x-Frame-Options DENY
content-type: text/html; type=SSA; charset=UTF-8
content-length: 20000
Extract from the response:
<script type="text/javascript" src="/ab24/contenthandler/!xx/q/blahblah....."></script>
I don’t believe it matters where I locate the header as I’ve tried it in both the ‘Main’ server section as well as the section as I can see the header being set either way.
I believe the correct solution would be to have the back end application server change its code to respond with the content-type header set to ‘text/javascript’? However since I don’t have access to the code I’m trying to figure out if there is a way I can handle it in the httpd.conf file?
Here is what I’ve tried so far. The config file has TypesConfig conf/mime.types and the mod_mime.so module loaded, and I noticed there wasn’t a text/javascript entry in that file, so I added one and restarted IHS. No luck: it still displays a blank page. Does anyone know whether this can be handled in the configuration file via some other directive(s), or is a new code deploy the only way?
I did look at many of the other questions related to x-content-type-options but couldn’t find an answer to my question.

Apache 2.4 set mime type of file without extension

I have upgraded from Apache 2.2 to 2.4 on a RedHat 6.4 server and came across an issue with mime types.
Apache has a DefaultType Directive. In Apache 2.2 I set this to "text/plain". I have a webpage that lists all files in a given directory and the user can click to view the files. This directory contains all types of different file extensions and some files with no extensions. When a file was clicked, it would open up in a new window nicely formatted. There is not any code doing this. It is strictly the browser opening the file and deciding what to do based on its content type.
This directive has been disabled in Apache 2.4. The Apache documentation instructs the user to use the mime.types configuration file and the AddType directive to configure media types.
My question is how do I assign the "text/plain" mime type to files with no extension? In Apache 2.2 those files would be given the "text/plain" content type by default through the DefaultType Directive. In Apache 2.4 I cannot figure out how to do this since I can't use this directive anymore. I do not want to use the ForceType Directive because it would override other already defined mime types.
I could create a php wrapper that loads the file and assign a content type but I'd prefer to keep the logic within apache where all other mime type definitions are located.
Any help would be appreciated. If additional information is needed please let me know.
Extensionless files only
This solution affects only extensionless, statically served files: (credit Eugene Kerner)
<FilesMatch "^[^.]+$">
ForceType text/plain
</FilesMatch>
Any unknown content
This one affects any response that would otherwise be transmitted without a Content-Type header. In other words, it mimics the behaviour of the old DefaultType directive:
Header set Content-Type "text/plain" "expr=-z %{CONTENT_TYPE}"
It should be possible to use setifempty here instead of the -z expression. But it fails and overwrites the header in every response, empty or not. I don’t know why. Eric Covener says it’s because the Content-Type header isn’t added “until the very last second”.
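To make the first variant concrete, here is a small Python sketch of what the pattern ^[^.]+$ selects. It assumes the pattern is tested against the file name only, not the full path; the function name forced_type is hypothetical and this only imitates the matching, not Apache itself.

```python
import posixpath
import re

# Same pattern as in the FilesMatch block: one or more characters, none of them a dot.
EXTENSIONLESS = re.compile(r"^[^.]+$")


def forced_type(path):
    """Return "text/plain" for extensionless file names, else None (type left alone)."""
    name = posixpath.basename(path)
    return "text/plain" if EXTENSIONLESS.match(name) else None
```

Note that a name such as .htaccess contains a dot, so the pattern does not match it, and files with known extensions keep whatever type is already configured.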

How to determine a webpage is compressed with Live HTTP Headers?

When I look at every page in live http headers, the page contains the below parts in header:
Accept Encoding: gzip, deflate
Content Encoding: Gzip
When I use websites to check whether it is compressed or not, it says it's not compressed. How can we be sure that a page is compressed?
For example I tested this site in Gzip tester and it says it's not compressed, but I see Content Encoding in live http headers.
Your headers are wrong, it should be:
Content-Encoding: gzip
So basically: dashes, not spaces, between the words in the header name.
It's your webserver that needs to add those and do the compression, see https://httpd.apache.org/docs/2.0/mod/mod_deflate.html
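One way to double-check a response yourself is to look at both the Content-Encoding header and the first bytes of the raw (still encoded) body, since gzip data starts with the bytes 1f 8b. A minimal Python sketch; the function name is_gzip_response is hypothetical, and the headers are assumed to be available as a plain dict:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def is_gzip_response(headers, raw_body):
    """True if the headers declare gzip and the undecoded body looks like gzip."""
    encoding = ""
    for name, value in headers.items():
        # Header names are case-insensitive; the name is hyphenated.
        if name.lower() == "content-encoding":
            encoding = value.strip().lower()
    return encoding == "gzip" and raw_body[:2] == GZIP_MAGIC


# Example input: a body compressed locally, standing in for a server response.
sample_body = gzip.compress(b"<html>hello</html>")
```

Keep in mind that many HTTP clients decompress the body transparently, so the check has to run on the bytes as they arrived on the wire.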

How to add an Access-Control-Allow-Origin header

I am designing a website (e.g. mywebsite.example) and this site loads font-face fonts from another site (say anothersite.example). I was having problems with the font face font loading in Firefox and I read on this blog:
Firefox (which supports @font-face from v3.5) does not allow cross-domain fonts by default. This means the font must be served up from the same domain (and sub-domain) unless you can add an “Access-Control-Allow-Origin” header to the font.
How can I set the Access-Control-Allow-Origin header to the font?
In the folder that holds the font files, put an .htaccess file with the following in it:
<FilesMatch "\.(ttf|otf|eot|woff|woff2)$">
<IfModule mod_headers.c>
Header set Access-Control-Allow-Origin "*"
</IfModule>
</FilesMatch>
Also, in your remote CSS file, the @font-face declaration needs the full absolute URL of the font file (not needed in local CSS files), e.g.:
@font-face {
    font-family: 'LeagueGothicRegular';
    src: url('http://www.example.com/css/fonts/League_Gothic.eot?') format('eot'),
         url('http://www.example.com/css/fonts/League_Gothic.woff') format('woff'),
         url('http://www.example.com/css/fonts/League_Gothic.ttf') format('truetype'),
         url('http://www.example.com/css/fonts/League_Gothic.svg');
}
That will fix the issue. Note that you can specify exactly which origins are allowed to access your font. In the .htaccess above I have allowed everyone to access my font with "*", but you can limit it to:
A single URL:
Header set Access-Control-Allow-Origin http://example.com
Or a comma-delimited list of URLs
Access-Control-Allow-Origin: http://site1.com,http://site2.com
(Multiple values are not supported in current implementations)
According to the official docs, browsers do not like it when you use the
Access-Control-Allow-Origin: "*"
header if you're also using the
Access-Control-Allow-Credentials: "true"
header. Instead, they want you to allow their origin specifically. If you still want to allow all origins, you can do some simple Apache magic to get it to work (make sure you have mod_headers enabled):
Header set Access-Control-Allow-Origin "%{HTTP_ORIGIN}e" env=HTTP_ORIGIN
Browsers are required to send the Origin header on all cross-domain requests. The docs specifically state that you need to echo this header back in the Access-Control-Allow-Origin header if you are accepting/planning on accepting the request. That's what this Header directive is doing.
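The echo logic can be sketched outside Apache as well. The Python below is a minimal illustration, not the Header directive's implementation. Unlike the directive above, which echoes any origin, it checks the origin against an allow-list, which is an assumption added here to show how several origins can be supported when the header itself carries only one value. The names cors_headers and ALLOWED_ORIGINS are hypothetical.

```python
# Hypothetical allow-list; the header can carry only one origin per response.
ALLOWED_ORIGINS = {"http://site1.com", "http://site2.com"}


def cors_headers(origin, allowed_origins=ALLOWED_ORIGINS, allow_credentials=False):
    """Echo the request's Origin back if it is allowed; otherwise add nothing."""
    if origin is None or origin not in allowed_origins:
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,
        # The response now depends on the Origin request header, so caches must key on it.
        "Vary": "Origin",
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

Because the specific origin is echoed rather than "*", this form can be combined with Access-Control-Allow-Credentials.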
The accepted answer doesn't work for me unfortunately, since my site CSS files @import the font CSS files, and these are all stored on a Rackspace Cloud Files CDN.
Since the Apache headers are never generated (since my CSS is not on Apache), I had to do several things:
1. Go to the Cloud Files UI and add a custom header (Access-Control-Allow-Origin with value *) for each font-awesome file
2. Change the Content-Type of the woff and ttf files to font/woff and font/ttf respectively
See if you can get away with just #1, since the second requires a bit of command line work.
To add the custom header in #1:
view the cloud files container for the file
scroll down to the file
click the cog icon
click Edit Headers
select Access-Control-Allow-Origin
add the single character '*' (without the quotes)
hit enter
repeat for the other files
If you need to continue and do #2, then you'll need a command line with curl. First authenticate:
curl -D - --header "X-Auth-Key: your-auth-key-from-rackspace-cloud-control-panel" --header "X-Auth-User: your-cloud-username" https://auth.api.rackspacecloud.com/v1.0
From the results returned, extract the values for X-Auth-Token and X-Storage-Url
curl -X POST \
-H "Content-Type: font/woff" \
--header "X-Auth-Token: returned-x-auth-token" returned-x-storage-url/name-of-your-container/fonts/fontawesome-webfont.woff
curl -X POST \
-H "Content-Type: font/ttf" \
--header "X-Auth-Token: returned-x-auth-token" returned-x-storage-url/name-of-your-container/fonts/fontawesome-webfont.ttf
Of course, this process only works if you're using the Rackspace CDN. Other CDNs may offer similar facilities to edit object headers and change content types, so maybe you'll get lucky (and post some extra info here).
For a Java-based application, add this to your web.xml file:
<servlet-mapping>
<servlet-name>default</servlet-name>
<url-pattern>*.ttf</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>default</servlet-name>
<url-pattern>*.otf</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>default</servlet-name>
<url-pattern>*.eot</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>default</servlet-name>
<url-pattern>*.woff</url-pattern>
</servlet-mapping>
In the PHP file that handles the AJAX request, you can set the header yourself:
<?php header('Access-Control-Allow-Origin: *'); //for all ?>