How can I test my website online (before it goes public) so that no one can see it except me?
I know I can add a password, but I don't want Google to index it (before it's really public).
To prevent Google from indexing it, use this meta tag in your page's <head>:
<meta name="robots" content="noindex,nofollow" />
This tells search engines you do not wish for your page to show up in the search results.
Add a robots.txt file to your website.
User-agent: *
Disallow: /
Save the text above into a text file called robots.txt, and upload it to your website root.
By using this, any well-behaved crawler will not read your website, and any well-behaved search engine will not index it.
Depending on how your website is built (PHP, Python, Ruby), you will need to set up a web server. The most typical configuration is AMP, which stands for Apache, MySQL, and PHP/Python. (Typically this runs on top of Linux, but it can be run under Windows too.)
Ruby on Rails comes with its own built-in web server. Most Python web frameworks (Django, CherryPy, Pylons) do too.
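If you go the framework route, the built-in development servers are usually enough for private testing. A minimal sketch, assuming a Rails or Django project and that you are working from the project directory (older Rails versions use script/server instead):

# Ruby on Rails: serves the app on http://localhost:3000 by default
rails server

# Django: serves the app on http://127.0.0.1:8000 by default
python manage.py runserver

Django's runserver binds to 127.0.0.1 by default, so it is only reachable from your own machine while you test.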
I have seen two popular options for forcing IE to render an HTML page in a particular mode:
1) <meta http-equiv="X-UA-Compatible" content="IE=edge" />
2) Specify it as a Header in httpd.conf
What are the advantages of either of these options? Is there a recommended approach to do this?
Most applications I have seen use Apache as a load balancer: it handles a request to www.url.com and sends it to one of several possible application servers. In that setup, accessing an application server's IP directly would not get the benefit of the emulation, because no headers are set. The meta tag solves the problem closer to the page than Apache does, so isn't that the better way to set a specific emulation mode, or does the Apache approach have other benefits?
Neither to be honest.
X-UA-Compatible is no longer supported as of IE11 and above, and Microsoft recommends not using it, relying instead on the HTML5 doctype.
That said, to answer your question (in case you're interested in other headers like this): it depends. There are benefits to both.
Benefits of setting HTTP Headers
Can be set once at the server level, so you don't need to remember to include it on every page.
Useful if you don't have control over all the pages (e.g. many developers/contributors upload content to the site).
HTTP Header usually takes precedence (though not with X-UA-Compatible).
Benefits of setting at page level:
Doesn't require access to the server (e.g. if the page is hosted on a server where you don't have access to the server config, or is served across a CDN).
Will be copied when the page is served over a CDN or other caching solution.
Can be set by the page author (e.g. if the page requires a specific header and the author knows this).
It's usually easier to override per page if you need different settings for different pages, rather than loading all that config into Apache.
When an individual page contains an x-ua-compatible header, it overrides headers provided by the server. There are times this is useful (for serving legacy websites that do not have DOCTYPE directives) and times when it's not. Usually, you know which situation you're in by the problems you're trying to resolve.
The recommended practice is to use an HTML5 doctype (<!DOCTYPE html>) for most situations and to only use x-ua-compatible for legacy sites that rely on legacy markup. Ideally, this would be a temporary solution used only until a new version of the site has been developed so that it no longer relies on the legacy behavior.
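For reference, if you do still need the header for a legacy site, the server-level equivalent of option 1 is a single directive in httpd.conf or .htaccess. A minimal sketch, assuming Apache with mod_headers enabled:

# Requires mod_headers to be loaded
Header set X-UA-Compatible "IE=edge"

Whichever route you take, the HTML5 doctype should still be the first line of every page.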
I have a Domino site that is getting high-severity hits for cross-site scripting in AppScan.
We don't have a license to run AppScan ourselves; another group needs to do that (yeah, big corporations :) ). But I have noticed that IE will complain too with a URL such as:
http://myserver.com/cld/cldg.nsf/vwTOC?OpenView&Start=28
(IE will warn you about cross-site scripting with such a URL.)
I noticed that the notes.net forum site does not come up with such an error in IE when I try to inject script tags. I guess it must scrub the URL before the page is rendered? How is this done on the notes.net forum? Is it done at the server level or the database level?
I did find this thread:
How to avoid a XSP/Domino Cross-Site Scripting Vulnerability?
where Steve mentions his blog and web rules, but the blog says they are not needed in 8.5.4 and above. Am I understanding that right? If so, we are at 8.5.4. Is there something I still need to do to scrub my URL?
Edit: We are at 8.5.3, not 8.5.4. I was mistaken. Our admin is going to try Steve's suggestions.
I have made sure that all the proper indexing options are set, but my dev install of SP2010 is still not searching the content of Word docs, only the titles. Any suggestions?
Does your crawl account have sufficient permissions to access the files attached to the list items? Are you crawling your site as a SharePoint site or as a web site (in the latter case, you need to make sure that you have links pointing to the documents)?
Do you have a robots.txt file at the root of your web application with exclusion rules that might be preventing the content from being properly crawled?
If you really want to know what's happening while the crawler does its job, you can install Fiddler on your dev machine and point the proxy settings of your Search Service Application at the proxy created by Fiddler. Doing so will let you watch in real time which URLs and content are being crawled and the HTTP status codes being returned, to diagnose permission or content issues.
Hope it helps.
My server has been compromised recently. This morning I discovered that the intruder is injecting an iframe into each of my HTML pages. After testing, I found out that the way he does it is by getting Apache (?) to replace every instance of
</body>
by
<iframe link to malware></iframe></body>
For example if I browse a file residing on the server consisting of:
</body>
</body>
Then my browser sees a file consisting of:
<iframe link to malware></iframe></body>
<iframe link to malware></iframe></body>
I have immediately stopped Apache to protect my visitors, but so far I have not been able to find what the intruder has changed on the server to perform the attack. I presume he has modified an Apache config file, but I have no idea which one. In particular, I have looked for recently modified files by time-stamp, but did not find anything noteworthy.
Thanks for any help.
Tuan.
PS: I am in the process of rebuilding a new server from scratch, but in the meantime I would like to keep the old one running, since this is a business site.
I don't know the details of your compromised server. This looks like a fairly standard drive-by attack against Apache, which you can ideally resolve by rolling back to a known-good version of your web content and server configuration (if you have a colo, contact the technical team responsible for your backups). But let's presume you're entirely on your own and need to fix the problem yourself.
Pulling from StopBadware.org's documentation on the most common drive-by scenarios and resolution cases:
Malicious scripts
Malicious scripts are often used to redirect site visitors to a
different website and/or load badware from another source. These
scripts will often be injected by an attacker into the content of your
web pages, or sometimes into other files on your server, such as
images and PDFs. Sometimes, instead of injecting the entire script
into your web pages, the attacker will only inject a pointer to a .js
or other file that the attacker saves in a directory on your web
server.
Many malicious scripts use obfuscation to make them more difficult for
anti-virus scanners to detect. Some malicious scripts also use names that
look like they're coming from legitimate sites (for example, a script URL
with a subtle misspelling of "analytics").
.htaccess redirects
The Apache web server, which is used by many hosting providers, uses a
hidden server file called .htaccess to configure certain access
settings for directories on the website. Attackers will sometimes
modify an existing .htaccess file on your web server or upload new
.htaccess files to your web server containing instructions to redirect
users to other websites, often ones that lead to badware downloads or
fraudulent product sales.
Hidden iframes
An iframe is a section of a web page that loads content from another
page or site. Attackers will often inject malicious iframes into a web
page or other file on your server. Often, these iframes will be
configured so they don’t show up on the web page when someone visits
the page, but the malicious content they are loading will still load,
hidden from the visitor’s view.
How to look for it
If your site was reported as a badware site by Google, you can use
Google’s Webmaster Tools to get more information about what was
detected. This includes a sampling of pages on which the badware was
detected and, using a Labs feature, possibly even a sample of the bad
code that was found on your site. Certain information can also be
found on the Google Diagnostics page, which can be found by replacing
example.com in the following URL with your own site’s URL:
www.google.com/safebrowsing/diagnostic?site=example.com
There exist several free and paid website scanning services on the
Internet that can help you zero in on specific badware on your site.
There are also tools that you can use on your web server and/or on a
downloaded copy of the files from your website to search for specific
text. StopBadware does not list or recommend such services, but the
volunteers in our online community will be glad to point you to their
favorites.
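To make the ".htaccess redirects" scenario above concrete, the injected rules often look something like this (an illustrative sketch only; the domain is a placeholder and the exact conditions vary from attack to attack):

# Attacker-injected .htaccess fragment: visitors arriving from search
# engines are silently redirected to a malicious site.
RewriteEngine On
RewriteCond %{HTTP_REFERER} (google|bing|yahoo) [NC]
RewriteRule ^.*$ http://malicious.example.com/ [R=302,L]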
In short, use the stock-standard tools and scanners provided by Google first. If the threat can't be identified that way, you'll need to work back through the code of your CMS, your Apache configuration, your SQL setup, and the remaining content of your website to determine where you were compromised and what the right remediation steps should be.
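As a starting point for that manual sweep, a few standard commands can help narrow things down. This is a rough sketch that assumes web content under /var/www and Apache config under /etc/apache2; adjust the paths to match your server:

# Look for injected iframes or unexpected script references in web content
grep -rn "<iframe" /var/www

# List .htaccess files, and anything changed in the last 30 days
find /var/www -name ".htaccess" -print
find /var/www /etc/apache2 -type f -mtime -30 -print

# Review which modules Apache is loading; attackers sometimes add a rogue
# module or an extra LoadModule line that rewrites responses on the fly
apachectl -M    # or apache2ctl -M on Debian/Ubuntu
grep -rn "LoadModule" /etc/apache2

Since your timestamp search turned up nothing, something that rewrites responses as they are served (a rogue module or a tampered binary) is worth ruling out before you trust the old server again.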
Best of luck handling your issue!
We have a FogBugz 6 installation, with a good deal of wiki content in place. We're transitioning to use Atlassian products (JIRA and Confluence), so we'd like to get that wiki content into Confluence. How would you approach this?
Unfortunately, FogBugz doesn't appear to provide any kind of wiki export functionality, and Confluence doesn't provide any FogBugz wiki import.
FogBugz does have an API, but it's a little light on details with respect to accessing wiki content. We don't really care about past revisions of pages (just content, links, and images/attachments), so it's not clear that the API gets us any further than scraping the FogBugz wikis with wget or something and working with the HTML and images/attachments from there.
Confluence has a pretty full-featured content import utility that supports a number of source wikis:
TWiki
PmWiki
DokuWiki
MediaWiki
MoinMoin
JotSpot
TikiWiki
JSPWiki
SharePoint
SWiki
VQWiki
XWiki
Trac
No FogBugz option there, but if we could export the FogBugz wiki content into one of the above wikis, then we could likely use the Confluence multi-wiki importer from there.
Alternatively, we could use wget to scrape the FogBugz wiki content, and then find a way to get static HTML + images + attachments into either Confluence or into one of the above other wikis as a stepping stone to Confluence.
Thoughts?
A colleague ended up figuring this one out, and the process turned out to be generally applicable to other web content we wanted to pull into Confluence as well. In broad strokes, it involved:
Using wget to suck all of the content out of FogBugz (configured so that images and attachments were downloaded properly, and links to them and to other pages were properly relativized).
Using a simple XSLT transform to strip away the "template" content (e.g. logos, control/navigation links, etc) that surrounded the body of each page.
(optionally) Using a perl module to convert the resulting HTML fragments into Confluence's markup format
Using the Confluence command line interface to push up all of the page, image, and attachment data.
Note that I said "optionally" in #3 above. That is because the Confluence CLI has two relevant options: it can be used to create new pages directly, in which case it's expecting Confluence markup already, or it can be used to create new pages using HTML, which it converts to Confluence markup itself. In some cases, the Confluence CLI converted the HTML just fine; for other data sources, we needed to use the perl module.
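To give a concrete flavor of step 1, the wget invocation looks roughly like this (a sketch only: the URL is a placeholder for your FogBugz wiki root, and you may need extra options for authentication depending on how your install is secured):

# Mirror the wiki, pulling in images/attachments (--page-requisites) and
# rewriting links to be relative so they keep working locally.
# Older wget versions call --adjust-extension --html-extension.
wget --mirror \
     --page-requisites \
     --convert-links \
     --adjust-extension \
     --no-parent \
     "https://fogbugz.example.com/default.asp?W1"

The --adjust-extension flag saves HTML pages with an .html suffix, which makes the later XSLT and conversion steps a little easier to script.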