Saturday, September 9

Webpage-level Google Search control?

I know that with a "robots.txt" file I can decide which pages search engines may visit and which they may not. If a search engine does not visit a page, people can't find that page through Google.

Now I want to know how to set up the same thing for a single page. I don't have access to the whole website, so I can't add a robots.txt, but I can control the individual pages. For example, on this website, benincampus.blogspot.com, I can change every page. How do I set up a page so that search engines know they are NOT welcome to put it into their database?
Of course, some search engines have no netiquette, so they might ignore my request. But thank god Google is not that kind of search engine.

I think the method is something related to the META tags of a page. If you know the answer, please reply. Thanks.


xj: Sorry, I haven't read your comment yet. I have a long flight today. I will get back to you as soon as I can.

2 Comments:

At September 09, 2006 10:27 PM, Anonymous said...

<META NAME="ROBOTS" CONTENT="ALL">

<META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">

<META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">

<META NAME="ROBOTS" CONTENT="NONE">

default = empty = "ALL"
"NONE" = "NOINDEX, NOFOLLOW"

The CONTENT field is a comma-separated list:
INDEX: search engine robots should include this page.
FOLLOW: robots should follow links from this page to other pages.
NOINDEX: links can be explored, although the page is not indexed.
NOFOLLOW: the page can be indexed, but no links are explored.
NONE: robots can ignore the page.
NOARCHIVE: Google uses this to prevent archiving of the page. See http://www.google.com/bot.html
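
To make the rules above concrete, here is a minimal Python sketch of how a robot might read the CONTENT value. The function name `interpret_robots` is made up for illustration, and the code only follows the rules listed in this comment (empty means "ALL", "NONE" means "NOINDEX, NOFOLLOW"); real crawlers may behave differently.

```python
from typing import Optional, Tuple


def interpret_robots(content: Optional[str]) -> Tuple[bool, bool]:
    """Return (index, follow) for a robots META CONTENT value.

    Hypothetical helper: it applies only the rules listed above.
    """
    index, follow = True, True  # default = empty = "ALL"
    if not content:
        return index, follow

    # The CONTENT field is a comma-separated list; case is ignored here.
    for token in content.split(","):
        token = token.strip().upper()
        if token == "ALL":
            index, follow = True, True
        elif token == "NONE":
            index, follow = False, False  # same as "NOINDEX, NOFOLLOW"
        elif token == "INDEX":
            index = True
        elif token == "NOINDEX":
            index = False
        elif token == "FOLLOW":
            follow = True
        elif token == "NOFOLLOW":
            follow = False
        # Other tokens (for example NOARCHIVE) do not change index/follow.
    return index, follow
```

For the case asked about in the post, `interpret_robots("NOINDEX,FOLLOW")` gives an index flag of False and a follow flag of True: the page stays out of the database, but its links may still be explored.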

 
At September 28, 2006 8:36 PM, Blogger Unknown said...

Thanks. I've used it on one website, but I don't know how to verify whether the page is indexed or not :)
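
One partial check can be sketched in Python: confirm that the robots META tag really is in the HTML the page serves. This does not tell you whether a search engine has indexed the page; for that, the usual manual check is a Google search for site: followed by the page address. The names `RobotsMetaFinder` and `find_robots_meta` are made up for illustration, and the sketch assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from typing import List


class RobotsMetaFinder(HTMLParser):
    """Collect the CONTENT value of every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.contents: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            self.contents.append(attr.get("content") or "")


def find_robots_meta(html: str) -> List[str]:
    """Return the robots CONTENT values found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.contents


# Example with a hand-written page; a real check would use the served HTML.
sample = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW"></head></html>'
found = find_robots_meta(sample)
```

An empty list means no robots META tag was found, so by the rules above the page is treated as "ALL".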

 
