Where to put robots.txt file? [closed] - seo

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Where should I put robots.txt?
domainname.com/robots.txt
or
domainname/public_html/robots.txt
I placed the file at domainname.com/robots.txt, but it doesn't open when I type this into the browser.

Where the file goes in your filesystem depends on what host you're using, so it's hard for us to give a specific answer about that.
The best description is: put it wherever the index.html (or index.php or whatever) file is that represents your homepage. If that's domainname/public_html/index.html, for example, put it in domainname/public_html/robots.txt.

I think the better way to describe it is to put it in the root web folder of your domain, so that it is served at http://example.com/robots.txt. You can also put your sitemap.xml in the root, or refer to it with a Sitemap: http://example.com/fldr/smap.xml line in your robots.txt.
Don't forget: you can use Google Webmaster Tools to check that you haven't restricted anything you didn't mean to (you also get to see queries and links, woohoo!).
Suggestion: I'd consider using <META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW"> if possible, because the page still earns link juice and passes it on through its links, but it won't show up in Google's index. A robots.txt directive, by contrast, can leave a plain URL with no description in the SERPs (it ranks because of the anchor text pointing at it), while the page loses all the value of the links pointed at it because it is blocked by robots.txt.

In the root of your web directory (where you put the files that show up on your website)

In this case you should put it in domainname/public_html/robots.txt, as the public_html folder is where your index file will be.
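
To make the "root of the domain" point concrete, here is a minimal Python sketch that derives the robots.txt URL a crawler would request for any page on a site. The function name `robots_url` and the example page URL are made up for illustration; only the standard library is used.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: crawlers look for robots.txt at the
    # root path, never inside a subfolder.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/blog/post.html?id=3"))
# prints: http://example.com/robots.txt
```

If that URL does not open in the browser, the file is most likely not in the folder your host serves as the web root (public_html in the question above).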

Related

Best way to prevent Google from indexing a directory [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
I've researched many methods on how to prevent Google/other search engines from crawling a specific directory. The two most popular ones I've seen are:
Adding it into the robots.txt file: Disallow: /directory/
Adding a meta tag: <meta name="robots" content="noindex, nofollow">
Which method would work the best? I want this directory to remain "invisible" from search engines so it does not affect any of my site's ranking.
In other words, I want this directory to be neutral/invisible and "just there." I don't want it to affect any ranking. Which method would be the best to achieve this?
Robots.txt is the way to go for this.
According to Google, you only use the meta tag if you don't have rights to create/edit the robots.txt file.
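
As a rough way to check what a Disallow: /directory/ rule covers, the sketch below feeds the rule to Python's standard urllib.robotparser module. The example.com URLs and the helper name `allowed` are placeholders, and the check only says whether a crawler that honours robots.txt may fetch a URL; it says nothing about whether the page ends up in an index.

```python
from urllib.robotparser import RobotFileParser

# The rule from the question, applied to every crawler.
rules = """\
User-agent: *
Disallow: /directory/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(url: str, agent: str = "Googlebot") -> bool:
    """True if the rules above let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


print(allowed("http://example.com/directory/page.html"))  # prints: False
print(allowed("http://example.com/index.html"))           # prints: True
```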

Robots.txt in my project root [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I've seen tutorials/articles discussing using Robots.txt. Is this still a necessary practice? Do we still need to use this technique?
A robots.txt file is not necessary, but it is recommended for those who want to block a few pages or folders on their website from being crawled by search engine crawlers.
I agree with the above answer. A robots.txt file is used to block pages and folders from being crawled by search engines. For example, you can block search engines from crawling and indexing URLs that contain session IDs, which in rare cases could become a security threat! Other than this, I don't see much importance.
The way that a lot of the robots crawl through your site and rank your page has changed recently as well.
I believe that for a short period of time the use of robots.txt may have helped quite a bit, but nowadays most other things you do for SEO will have more of a positive impact than this little .txt file ever will.
The same goes for backlinks: they used to be far more important for getting ranked than they are now.
Robots.txt is not for indexing. It's used to block the things that you don't want search engines to index.
Robots.txt can help with indexation on large sites, if you use it to reveal an XML sitemap file.
Like this:
Sitemap: http://www.domain.com/sitemap.xml
Within the XML file, you can list up to 50,000 URLs for search engines to index. There are plugins for many content management systems that can generate and update these files automatically.
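
If a site has more URLs than the 50,000 mentioned above, the list has to be split across several sitemap files. A minimal sketch of that split, assuming the URL list is already in memory (the function name `chunk_urls` and the generated URLs are illustrative only):

```python
from typing import Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # the per-file limit quoted in the answer above


def chunk_urls(urls: List[str],
               limit: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each at most `limit` long."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


# Made-up URLs, only to show the sizes of the resulting chunks.
urls = [f"http://www.domain.com/page/{i}" for i in range(120_000)]
print([len(chunk) for chunk in chunk_urls(urls)])
# prints: [50000, 50000, 20000]
```

Each chunk would then be written to its own sitemap file, and each file listed on its own Sitemap: line in robots.txt.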

GET vs POST in SEO [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
My web application retrieves a page for every request generated by a form submission. That form submits to the same URL of the page.
Each time the page loads with a different title tag. Does it indicate different pages with the same URL?
How does it affect SEO? How can I manage this situation?
Edit
This question is not purely SEO-related, nor does it require SEO-specific reasoning or answers; it can also be explained technically, in terms of how search engine robots work. If it still seems off-topic to the moderators, I request that they explain why.
Try using a rewrite rule to format your URLs so that each page has a unique URL. If you're always loading the same URL, Google (or other search engines) will only index that single page.
http://www.seomoz.org/img/upload/anatomy-of-a-url.jpg
In addition to loading the page with a different title tag each time, you need to append some unique text to the URL, such as your GET variable data.
To get crawled by spiders, don't forget to submit your sitemap, with the relevant URLs, to the search engines.
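
One way to picture the "unique URL per page" advice: build the URL from the form data itself, so that every distinct submission maps to a distinct, stable address. This is only a sketch; `page_url`, the base URL and the parameter names are invented for the example.

```python
from urllib.parse import urlencode


def page_url(base: str, params: dict) -> str:
    """Build a GET URL whose query string identifies the page."""
    # Sort the keys so the same form data always yields the same URL,
    # regardless of the order in which the fields were filled in.
    query = urlencode(sorted(params.items()))
    return f"{base}?{query}" if query else base


print(page_url("http://example.com/search", {"q": "pony rides", "page": 2}))
# prints: http://example.com/search?page=2&q=pony+rides
```

A form submitted with method="get" produces this kind of URL on its own; with POST the data travels in the request body, so every result shares the one URL.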

CANONICAL - Duplicate page issue [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
Can someone help me with this problem?
Currently Google reports that these two links are duplicates:
http://www.ozkidsactivities.com/n/jules-pony-rides-&-mobile-animal-farm/ozkids-36?activityId=1218
http://www.ozkidsactivities.com/n/jules-pony-rides-and-mobile-animal-farm/ozkids-36?activityId=1218
But we already include the canonical tag:
<link rel="canonical" href="/n/jules-pony-rides-and-mobile-animal-farm/ozkids-36?activityId=1218" />
Is there a problem with the relative path?
Thanks in advance!
red,
Canonical URL tags can reference the relative path (see Google's guidelines here - http://googlewebmastercentral.blogspot.co.uk/2009/02/specify-your-canonical.html). However, I'd suggest that it's better and safer to use the absolute URL (i.e., including the protocol and fully-formed hostname). Many websites are accessible by numerous hostnames (alternative domains, test/development environments with exposed URLs, etc.), so it's best to reference the correct absolute URL in order to avoid any incorrect canonicalisation if/when search engines discover these URLs.
It looks like you've already fixed this, though, and also solved the problem another way by redirecting the ampersand to the 'and'. Good work!
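
To see why the hostname matters, the sketch below resolves the relative canonical href from the question against the URL of the page that contains it, much as a browser or crawler resolves any relative link. The staging.example.com host is hypothetical and stands in for the "test/development environment" case.

```python
from urllib.parse import urljoin

# The relative href from the question's canonical tag.
CANONICAL_HREF = ("/n/jules-pony-rides-and-mobile-animal-farm/"
                  "ozkids-36?activityId=1218")


def resolve_canonical(page_url: str, href: str = CANONICAL_HREF) -> str:
    """Resolve a canonical href against the URL of the page it sits on."""
    return urljoin(page_url, href)


live = ("http://www.ozkidsactivities.com/n/"
        "jules-pony-rides-&-mobile-animal-farm/ozkids-36?activityId=1218")
# Hypothetical staging copy of the same page.
staging = ("http://staging.example.com/n/"
           "jules-pony-rides-&-mobile-animal-farm/ozkids-36?activityId=1218")

print(resolve_canonical(live))
print(resolve_canonical(staging))
```

The first call gives the intended URL on www.ozkidsactivities.com; the second gives the same path on the staging host, which is the wrong canonical. An absolute href would resolve to the same URL in both cases.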

How should google crawl my blog? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 13 years ago.
I was wondering how (or if) I should guide Googlebot through my blog. Should I only allow visiting pages with single entries, or should it also crawl the main page (which also has full entries)? My concern is that the main page changes when I add a new post and Google keeps the old version for some time. I also find directing people to the main page annoying - you have to look through all the posts before you find the one you're interested in. So what is the proper way to solve this issue?
Why not submit a sitemap with the appropriate <changefreq> tags -- if you set that to "always" for the homepage, the crawler will know that your homepage is very volatile (and you can have accurate change freq for other URLs too, of course). You can also give a lower priority to your homepage and a higher one to the pages you prefer to see higher in the index.
I do not recommend telling crawlers to avoid indexing your homepage completely, as that would throw away any link juice you might be getting from links to it from other sites -- tweaking change freq and priority seems preferable.
Make a sitemap.xml and regenerate it periodically. Check out Google Webmaster Tools.
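
A minimal sketch of generating such a sitemap.xml with per-URL <changefreq> and <priority> values, using only Python's standard library. The blog URLs and the chosen values are invented for illustration, and crawlers treat both fields as hints they are free to ignore.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> str:
    """entries: iterable of (loc, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical blog: volatile homepage with low priority, stable posts
# with high priority, as suggested in the answers above.
sitemap_xml = build_sitemap([
    ("http://example.com/", "always", 0.3),
    ("http://example.com/posts/first-post", "monthly", 0.9),
])
print(sitemap_xml)
```

Regenerating the file periodically then just means calling build_sitemap again with the current list of posts and writing the result to sitemap.xml.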