MAY 16,  ISSUE #636
How To Control Search Engine Robots

By Michael Rock

Wouldn't it be nice to be able to leave some code in your web site to tell the search engine crawlers to make your site number one? Unfortunately, a robots.txt file or robots meta tag won't do that, but they can help the crawlers index your site better and keep the unwanted ones out.

First, a few definitions:

Search Engine Spiders or Crawlers - A web crawler (also known as a web spider) is a program that browses the World Wide Web in a methodical, automated manner. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which indexes the downloaded pages to provide fast searches.

A web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit. As it visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, recursively browsing the Web according to a set of policies.
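The visiting loop described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the PAGES dictionary is a made-up, in-memory stand-in for the Web so the example runs without a network connection, and the names LinkExtractor and crawl are hypothetical. It also uses a queue rather than literal recursion, which gives the same set of visited pages.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/news/">News</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/news/": '<a href="item1.html">Item</a> <a href="/about.html">About</a>',
    "http://example.com/news/item1.html": "No links here.",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls):
    """Visit pages breadth-first, starting from the list of seed URLs."""
    to_visit = deque(seed_urls)   # the list of URLs still to visit
    seen = set(seed_urls)         # URLs already queued, to avoid repeats
    visited = []                  # URLs in the order they were fetched
    while to_visit:
        url = to_visit.popleft()
        html = PAGES.get(url)
        if html is None:
            continue              # page does not exist in the toy web
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)   # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                to_visit.append(absolute)
    return visited


if __name__ == "__main__":
    for page in crawl(["http://example.com/"]):
        print(page)
```

Starting from the home page alone, the loop reaches every page in the toy site because each one is linked from a page already visited.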

Robots.txt - The robots exclusion standard, or robots.txt protocol, is a convention to prevent well-behaved web spiders and other web robots from accessing all or part of a website. The parts that should not be accessed are listed in a file called robots.txt in the top-level directory of the website.

The robots.txt protocol is purely advisory and relies on the cooperation of the web robot, so marking an area of your site out of bounds with robots.txt does not guarantee privacy. Many web site administrators have been caught out trying to use the robots file to make private parts of a website invisible to the rest of the world. However, the file is necessarily publicly available and is easily checked by anyone with a web browser.

The robots.txt patterns are matched by simple comparison against the start of the URL path, so care should be taken to make sure that patterns matching directories have the final '/' character appended. Otherwise all files with names starting with that string will match, rather than just those in the directory intended.
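The effect of the final '/' can be checked with Python's standard urllib.robotparser module. This is a small sketch, not a full validator: the directory name /duplicate, the example.com URLs, the crawler name ExampleBot, and the helper is_allowed are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, agent="ExampleBot"):
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The same rule written without and with the trailing slash.
WITHOUT_SLASH = "User-agent: *\nDisallow: /duplicate\n"
WITH_SLASH = "User-agent: *\nDisallow: /duplicate/\n"

# A file whose name merely starts with the same characters as the directory.
lookalike = "http://example.com/duplicate-offers.html"
# A file that really lives inside the directory.
inside = "http://example.com/duplicate/page.html"

if __name__ == "__main__":
    for label, rules in (("without slash", WITHOUT_SLASH), ("with slash", WITH_SLASH)):
        print(label, "| lookalike allowed:", is_allowed(rules, lookalike))
        print(label, "| inside allowed:   ", is_allowed(rules, inside))
```

Without the slash, the lookalike file is blocked along with the directory. With the slash, only the files inside the directory are blocked.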

Meta Tag - Meta tags are used to provide structured data about data.

In the early 2000s, search engines veered away from reliance on Meta tags, as many web sites used inappropriate keywords, or were keyword stuffing to obtain any and all traffic possible.

Some search engines, however, still take Meta tags into some consideration when delivering results. In recent years, search engines have become smarter, penalizing websites that cheat (by repeating the same keyword several times to get a boost in the search ranking). Instead of going up in the rankings, these websites will go down or, on some search engines, will be removed from the search engine completely.

Index a site - The act of crawling your site and gathering information.

How can the robots.txt file and meta tag help you?

In the robots.txt file you can tell the harmful 'web crawlers' to leave your web site alone, and give helpful hints to the ones you want to crawl your site. Here is an example of how to block a web crawler from crawling your site:

# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

ia_archiver is the crawler name for the Wayback Machine that you may have heard of, and the / after Disallow tells ia_archiver not to index any of your site. The # allows you to write comments to yourself so you can keep track of what you typed.

Type the above three lines into Notepad on your computer and save the file to the root directory of your web site as robots.txt. Well-behaved web crawlers look for this file first at a web site before doing anything else. This helps the crawler to do its job, and helps the web site owner tell the spider what to do. Say, for instance, you have some data that you don't want the crawlers to see (like duplicate content for other browser referrer pages).

If you would like to have the robots.txt file created for you, visit Rietta.com, and to validate your robots.txt file to make sure it works properly you can visit SearchEngineWorld.com. To deter crawlers from indexing the 'duplicate' directory yourself, type this into your robots.txt file:

User-agent: *
Disallow: /duplicate/

The * after User-agent says that this rule applies to all crawlers, and /duplicate/ after Disallow tells all crawlers to ignore this directory and not crawl it. Each group of User-agent and Disallow lines must be separated from the next group by a blank line in order to function correctly. So this is how you would combine the above two rules in one robots.txt file:

# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /duplicate/
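One way to see how a crawler would read this combined file is to feed it to Python's standard urllib.robotparser module. This is a quick sketch under assumptions: www.example.com, the page paths, the crawler name ExampleBot, and the helper may_fetch are invented for illustration, and real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The combined robots.txt file from the article.
ROBOTS_TXT = """\
# this identifies the wayback machine
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /duplicate/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, path):
    """Return True if `agent` is allowed to fetch `path` on the example site."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for agent in ("ia_archiver", "ExampleBot"):
        for path in ("/index.html", "/duplicate/copy.html"):
            print(agent, path, "allowed" if may_fetch(agent, path) else "blocked")
```

The ia_archiver group blocks that crawler from the whole site, while every other crawler falls through to the * group and is only kept out of /duplicate/.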

One thing to note that is very important: anyone can access the robots.txt file of a site. So if you have information that you don't want anyone to see, don't list it in the robots.txt file. If the directory that you don't want anyone to see is not linked to from your web site, the crawlers generally won't find it anyway.

An alternative way to block indexing is to put a robots meta tag into the page itself. It looks like this:

<meta name="robots" content="noindex,nofollow">

You put this into the <head> section of your web page. This line tells the robot crawlers not to index the page and not to follow any of the hyperlinks on the page. By contrast, <meta name="robots" content="noindex,follow"> tells the robot crawlers not to index the page, but to follow the hyperlinks on it.
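To make the meaning of the two directives concrete, here is a rough Python sketch of how a crawler might read the robots meta tag from a page. The class RobotsMetaParser and the function page_policy are hypothetical names, and the sketch assumes that a page with no robots meta tag may be both indexed and followed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def page_policy(html):
    """Return (may_index, may_follow) for a page's HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    may_index = "noindex" not in parser.directives
    may_follow = "nofollow" not in parser.directives
    return may_index, may_follow


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(page_policy(page))
```

For the noindex,follow page in the usage example, the function reports that the page should not be indexed but its links may be followed.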

Did You Know That Google Has Its Own Meta Tag?

It looks like this:
<meta name="googlebot" content="noindex,nofollow,noarchive">
This tells the Google robot crawler not to index the page, not to follow any of the links, and not to store cached versions of your web page. You will want the noarchive setting if you update the content on your site frequently. It prevents web users from seeing outdated content served from the cache.
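A crawler-specific tag can sit alongside the general robots tag on the same page. The Python sketch below shows one possible way a crawler could gather the directives addressed to it. The names CrawlerMetaParser and directives_for are hypothetical, and combining the general and crawler-specific directives into one set is an assumption made for illustration, not a statement of how Google actually resolves them.

```python
from html.parser import HTMLParser


class CrawlerMetaParser(HTMLParser):
    """Groups meta tag directives by the tag's name attribute."""

    def __init__(self):
        super().__init__()
        self.by_name = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content") or ""
        directives = {d.strip().lower() for d in content.split(",") if d.strip()}
        if name and directives:
            self.by_name.setdefault(name, set()).update(directives)


def directives_for(html, crawler):
    """Return the directives a crawler named `crawler` would see on the page.

    Assumption: general "robots" directives and crawler-specific ones
    are simply merged.
    """
    parser = CrawlerMetaParser()
    parser.feed(html)
    general = parser.by_name.get("robots", set())
    specific = parser.by_name.get(crawler.lower(), set())
    return general | specific


if __name__ == "__main__":
    page = (
        '<head><meta name="robots" content="noindex">'
        '<meta name="googlebot" content="noarchive"></head>'
    )
    print(sorted(directives_for(page, "googlebot")))
    print(sorted(directives_for(page, "otherbot")))
```

In this sketch a crawler that is not named in any tag sees only the general robots directives.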

You can use the meta tag to specifically talk to Google's robots to avoid complications or if you are optimizing your site for Google's search engine. This concludes this month's article.

Until the next article have a great day!

Copyright (c) Michael Rock Web development contractor (Web Design and Hosting) Internet Presence


About The Author
The owner of this registered company has over twenty years of experience with DOS, Windows business applications, numerous programming languages, artistic development, and web design. Other areas of interest include web marketing, web promotion, and business marketing and development. After being persuaded by those praising his work, he decided to go into business for himself and strongly recommends that everyone else do the same.



  SiteProNews - The Net's most widely read Webmaster newsletter


(c) Copyright 2005 All rights reserved. Jayde Online, Inc.