Search Engines: The Technology Behind Searching

The Internet and its most visible component, the World Wide Web, hold a treasure of information on an amazing variety of topics. Hundreds of millions of pages are sitting on servers, waiting to present their information. Whether a given page gets its turn depends on its title, the quality of its contents, and its popularity.

When you need to learn about a subject, how do you proceed? Either you already know a web site related to it or, like most people, you visit an Internet search engine. Internet search engines are sites on the Web specially designed to help people find information stored on other sites. Many types of search engine are available on the World Wide Web.

Some general-purpose ones are Google, Yahoo, MSN Search, and Snap. There are many answer-based search engines, such as Answerbag.com, Answers.com, Ask.com, BrainBoost.com, and Google Answers. There are job search engines, such as Hotjobs.com, Monster.com, Indeed.com, and SimplyHired.com. Different search engines work in different ways, but they all perform three basic tasks:

• They search the Internet based on important words.
• They keep an index of the words they find, and where they find them.
• They allow users to look for words or combinations of words found in the index.
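As a rough illustration of the second and third tasks, the sketch below builds a tiny index that maps each word to the pages where it occurs, then looks up pages containing every word of a query. The page names, the sample texts, and the function names build_index and lookup are invented for this example; real engines use far more elaborate structures.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page names where it occurs."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first word's page set, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample pages, standing in for documents found on the Web.
pages = {
    "a.html": "search engines index the web",
    "b.html": "spiders crawl the web",
    "c.html": "engines rank pages",
}
index = build_index(pages)
print(lookup(index, "the web"))
print(lookup(index, "engines web"))
```

Storing sets of page names per word makes a combination of words a simple set intersection.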

The question now is how a search engine performs these tasks.

Before a search engine can tell you where a file or document is, that file or document must be found. To find information on the hundreds of millions of web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on web sites. When a spider is building its lists, the process is called Web crawling. To build a useful list of words, a search engine's spiders have to look at a lot of pages.

The spider usually starts from lists of heavily used servers and very popular pages. It begins with a popular site, indexing the words on its pages and following every link found within the site.
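The crawling process can be sketched as a breadth-first walk over links. The example below is only an illustration under stated assumptions: instead of fetching real pages over the network, it reads from FAKE_WEB, a made-up dictionary of three tiny pages, and the names LinkParser and crawl are hypothetical. Link and word extraction uses Python's standard html.parser module.

```python
from collections import deque
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect the href targets of <a> tags and the words of a page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())

def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl from the seed pages.

    fetch(url) returns the page's HTML, or None if it is unavailable.
    Returns a dict mapping each visited URL to its list of words.
    """
    queue = deque(seeds)
    seen = set(seeds)
    word_lists = {}
    while queue and len(word_lists) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        parser.close()
        word_lists[url] = parser.words
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return word_lists

# Hypothetical in-memory "web" so the sketch runs without a network.
FAKE_WEB = {
    "popular.example/": ('<p>Welcome home</p>'
                         '<a href="popular.example/news">News</a>'
                         '<a href="other.example/">Other</a>'),
    "popular.example/news": ('<p>Latest news</p>'
                             '<a href="popular.example/">Home</a>'),
    "other.example/": "<p>Another site</p>",
}

word_lists = crawl(["popular.example/"], FAKE_WEB.get)
for url, words in word_lists.items():
    print(url, words)
```

The seen set keeps the spider from revisiting a page when two pages link to each other, and max_pages is an arbitrary safety limit chosen for this sketch.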

The search engine stores the information its spiders find on the Web in a form that is useful for later searching. Most engines let you type in a few words and then search their database for occurrences of those words. Each engine has its own way of handling approximate spellings, plural variations, and truncation. Instead of storing only words and URLs, most search engines also store a brief description showing how the word is used on the page, so that you can decide which sites are worth visiting.
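To make plural variations and truncation concrete, here is one simple way an engine might treat them. This is an assumption for illustration, not how any particular engine works: the plural rule just strips a trailing "s", and a trailing "*" in a query term is taken to mean "match any word starting with this prefix". The function names are hypothetical.

```python
def normalize(word):
    """Lowercase a word and strip a trailing 's' as a crude plural rule."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def matches(term, word):
    """Return True if a query term matches a word from a page.

    A trailing '*' requests truncation: any word with that prefix matches.
    Otherwise the two words are compared after plural normalization.
    """
    if term.endswith("*"):
        return word.lower().startswith(term[:-1].lower())
    return normalize(term) == normalize(word)

def search(term, words):
    """Return the words that match the query term, in their original order."""
    return [w for w in words if matches(term, w)]

# Hypothetical words taken from indexed pages.
words = ["Engine", "engines", "engineering", "class", "spiders"]
print(search("engine", words))
print(search("engin*", words))
print(search("spider", words))
```

A rule this crude will mishandle many English words, which is one reason each engine ends up with its own approach.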

When you submit a query, the search engine may return millions of pages that match your topic. The matches are ranked so that the most relevant ones come first. Sometimes irrelevant pages are returned as well, but they usually appear near the end of the list. One of the factors that determines a page's rank is the frequency of the keywords on that page. Some web sites also pay the search engine to get a high ranking in the list.
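Ranking by keyword frequency can be sketched in a few lines. The example below scores each page by how many times the query words occur in it and sorts the matches from highest to lowest score. The sample pages and the names score and rank are invented for illustration, and real engines combine frequency with many other factors.

```python
from collections import Counter

def score(text, query):
    """Count how often the query words occur in the page text."""
    counts = Counter(text.lower().split())
    return sum(counts[word] for word in query.lower().split())

def rank(pages, query):
    """Return (page, score) pairs for matching pages, best score first.

    Pages with equal scores are ordered by name so the result is stable.
    """
    scored = [(name, score(text, query)) for name, text in pages.items()]
    matched = [item for item in scored if item[1] > 0]
    return sorted(matched, key=lambda item: (-item[1], item[0]))

# Hypothetical sample pages.
pages = {
    "a.html": "search engines help you search the web",
    "b.html": "a spider crawls the web",
    "c.html": "search search search tips",
}
results = rank(pages, "search web")
for name, points in results:
    print(name, points)
```

Note that c.html ties with a.html simply by repeating one keyword, which shows why raw frequency alone is a weak signal of relevance.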

Different engines have different strong points; use the engine and features that best fit the job you need to do. One thing is clear: the engine with the most pages in its database is not necessarily the best. Not surprisingly, you get the most out of an engine by choosing your search words carefully, knowing your engine well enough to avoid mistakes with spelling and truncation, and using the special tools available, such as specifiers for titles, images, and links. The hardware power for rapid searches and for databases covering a large fraction of the net is yesterday's accomplishment. We, as users, are living in a special time when search engines are undergoing a more profound evolution: the refinement of their special tools. So go ahead and make the most of search engines to get information on different topics.

17-Sep-2006

More by : Ruchi Gupta
