

What is the difference between crawling and indexing?

With regard to search engines.

Posted: 21-10-2008

mattotude

User Level: SEO Trainee

Member Since: 10-10-2008



Top Answer

Rating: 4.0/5 (1 vote )

yoms

User Level: SEO Pro

Member Since: 09-10-2008

Crawling is when a spider, or robot, visits your site and the pages within it, usually with the intent of downloading your content. Indexing is when the search engine stores that content so it can be returned in search results. However, a spider/robot may choose not to store the content (and therefore not index it), or the website owner may keep it out: a robots.txt file can disallow crawling, and a robots meta tag can disallow indexing.

Posted: 23-10-2008



Rating: 2.0/5 (1 vote )

theonion

User Level: SEO Trainee

Member Since: 10-10-2008

Something can be crawled and not indexed, but if you're indexed, you've been crawled.

Posted: 21-10-2008


Rating: 0.0/5 (0 votes )

decabbit

User Level: SEO Pro

Member Since: 30-10-2008

I agree with yoms: being crawled does not guarantee indexing, and being indexed does not guarantee that the crawl was completely successful.

To be crawled means you have not put any roadblocks in front of the "spider", the computer program employed by the various search engines and by various scrapers. This program goes around the web and traverses billions of links in an attempt to uncover new or fresh content. Crawlers are also employed by spammers to follow links and steal content. These often do not self-identify, so they can be blocked.
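To make the link traversal concrete, here is a rough sketch of a crawl as a breadth-first walk over links. It is only an illustration, not how any particular search engine works: the `LINKS` site map, the `crawl` function and the `blocked_prefixes` parameter are all made up for this example, and no real pages are fetched.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/private/draft"],
    "/blog/post-1": ["/blog"],
    "/private/draft": [],
    "/orphan": [],  # nothing links here, so a crawler never finds it
}


def crawl(start, links, blocked_prefixes=()):
    """Follow links breadth-first from `start`, skipping blocked paths."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            # A "roadblock": paths the site owner has closed to spiders.
            if target in seen or target.startswith(tuple(blocked_prefixes)):
                continue
            seen.add(target)
            queue.append(target)
    return order


print(crawl("/", LINKS))
print(crawl("/", LINKS, blocked_prefixes=("/private/",)))
```

The first call reaches every linked page, including the draft. The second skips the blocked path. In both cases `/orphan` is never visited because no link leads to it.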

Spidering therefore does not always result in indexing. It only means a program successfully found at least one page of your site. You can stop this from happening, either by allowing only search engine spiders or by blocking all spiders.
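As a sketch of "allow one spider, block the rest", the rules below are checked with Python's standard `urllib.robotparser`. The robots.txt content, the bot names `ExampleSearchBot` and `SomeScraper`, and the `can_crawl` helper are assumptions for illustration. Keep in mind robots.txt only stops crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named spider may crawl everything,
# every other user agent is disallowed from the whole site.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parsed from memory, no network
    return parser.can_fetch(user_agent, url)


url = "https://example.com/blog/post-1"
print(can_crawl(ROBOTS_TXT, "ExampleSearchBot", url))
print(can_crawl(ROBOTS_TXT, "SomeScraper", url))
```

An empty `Disallow:` line means nothing is disallowed for that agent, while `Disallow: /` under `User-agent: *` closes the site to everyone else.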

Indexing is something only a search engine can do visibly, though I know of a number of businesses with an index of websites. So while any scraper can spider (and scrape) your site, only a search engine can store that information in a database for serving up to a searcher in response to a request. Indexing a site is part scrape, part filter. A search engine will take the spidered information from your site and determine what to do with it. You may block indexation through a robots.txt or meta robots entry. A search engine may also reduce the number of indexed pages because of duplicate content, bad encoding, malware, or any of a number of other reasons.
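For the meta robots side, here is a small sketch of how a crawler might decide whether a fetched page may be indexed. It uses the standard `html.parser` module. The `RobotsMetaParser` class and `is_indexable` function are hypothetical names for this example, and real search engines apply many more filters than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_indexable(html):
    """A page was crawled; this says whether it asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives


blocked = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
plain = "<html><head><title>Hello</title></head></html>"
print(is_indexable(blocked))
print(is_indexable(plain))
```

Both pages here were crawled, since the HTML had to be downloaded to read the tag. Only the second one would be eligible for the index.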

So, as has been said, a spider crawls the site, but to be indexed means actually showing up in the search engine results in response to a query on the stored database of scraped information.
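A toy version of that "stored database" is an inverted index: words mapped to the pages that contain them. This is a simplified sketch under my own assumptions. The `PAGES` data, the `build_index` and `search` functions, and the `excluded` set (standing in for pages the engine filtered out) are invented for illustration.

```python
from collections import defaultdict

# Hypothetical crawl results: URL mapped to the text that was downloaded.
PAGES = {
    "/": "welcome to our seo tips site",
    "/blog": "crawling and indexing explained",
    "/thin": "crawling notes",
}


def build_index(pages, excluded=()):
    """Map each word to the set of URLs containing it, skipping excluded pages."""
    index = defaultdict(set)
    for url, text in pages.items():
        if url in excluded:
            continue  # crawled, but deliberately not indexed
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES, excluded={"/thin"})
print(search(index, "crawling"))
```

All three pages were crawled, but `/thin` was left out of the index, so a query for "crawling" only returns `/blog`.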

*phew* that was long winded.

Posted: 23-12-2008

