What you see and what you get when you use a search engine is the result of a fairly complex set of factors. These factors vary according to the search engine or directory you use. I did some sleuthing into how search engines in general operate and how Yahoo in particular works.
The spider software that each search engine uses is the first part of the process. No search engine site has the capacity to have its spider examine every page on the Internet; the number of pages is simply too vast. What a spider looks at varies from spider to spider, and it would seem that where on the Internet these spiders work also varies. The strategy, agenda or market that search index and engine developers follow is a mystery, but I would think that popular sites, and sites that could provide marketing opportunities to the search engine service, would be the ones indexed first. Already the ranking of information has been influenced.
It is not always clear exactly how each service's spider reads web pages. Yahoo says that it gathers information from the "web page text, title and description as well as its source, associated links, and other unique document characteristics." From what I have read elsewhere on the Yahoo site, these unique characteristics would include any metadata or metatag information included in the site. This determines what goes into the index or database, and that index is what you actually search when you use a search engine. You are not searching the World Wide Web directly.
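To make the idea of searching an index rather than the web itself more concrete, here is a minimal sketch of a toy index. It is only an illustration under my own assumptions and not how Yahoo or any other real engine works: the page data, the URLs, and the function names `tokenize`, `build_index` and `search` are all made up for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose stored fields contain it."""
    index = defaultdict(set)
    for url, fields in pages.items():
        for field_text in fields.values():
            for word in tokenize(field_text):
                index[word].add(url)
    return index

def search(index, query):
    """Return the URLs in the index that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages that a spider has already fetched (invented data).
pages = {
    "http://example.com/roses": {
        "title": "Growing Roses",
        "description": "A guide to rose care",
        "keywords": "roses, gardening, pruning",
        "text": "Prune roses in late winter.",
    },
    "http://example.com/tulips": {
        "title": "Tulip Basics",
        "description": "Planting tulip bulbs",
        "keywords": "tulips, gardening",
        "text": "Plant bulbs in autumn.",
    },
}

index = build_index(pages)
print(search(index, "gardening"))
print(search(index, "pruning roses"))
print(search(index, "orchids"))
```

A page the spider never visited simply is not in `index`, so no query can ever return it, however relevant the page might be.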
How information is retrieved is the product of a complex algorithm that each web service uses (and constantly tweaks). How a given search engine works is, of course, a trade secret: if the workings were known, the engine would be vulnerable to manipulation by webpage creators who wish to have their pages ranked at the top without paying for the privilege. Sponsoring a link is the surest way to get it moved to a higher ranking position when the search engine is used to retrieve information from the index (a little more about that later).
Website designers can improve the ranking of their sites by titling their webpages carefully and by including the terms they expect people to search for in the text of their webpages and in their internal links. Sites that use keyword and description metatags relevant to each particular page will in general get a better relevancy ranking than sites that use the same general keywords across the entire site.
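As a rough illustration of why page-specific titles and metatags help, here is a small sketch of a relevancy score. The real algorithms are trade secrets, so everything here is an assumption on my part: the field weights in `FIELD_WEIGHTS`, the `relevancy` function and both sample pages are invented purely to show the effect.

```python
import re

# Assumed weights: a match in the title counts more than a match in the body.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "description": 2.0, "text": 1.0}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def relevancy(page, query):
    """Add up, for each field, its weight times the number of query-term hits."""
    terms = set(tokenize(query))
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        score += weight * sum(1 for word in words if word in terms)
    return score

# Same body text, but only the first page has a title and metatags
# written for this particular page.
specific_page = {
    "title": "Pruning Roses in Winter",
    "keywords": "pruning, roses, winter",
    "description": "How to prune roses",
    "text": "Cut back roses before spring growth.",
}
generic_page = {
    "title": "Garden Centre Home",
    "keywords": "garden, plants, shop",
    "description": "Everything for your garden",
    "text": "Cut back roses before spring growth.",
}

query = "pruning roses"
print(relevancy(specific_page, query))
print(relevancy(generic_page, query))
```

Under these assumed weights, the page with its own title and metatags scores far higher than the page that reuses site-wide keywords, even though the visible text of the two pages is identical.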
It would seem that the savviness of a website designer has a lot to do with whether a website shows near the top of the relevancy rankings. If titles, metatags and links are not well planned, a site may still get a low relevancy score no matter how important or relevant its content. Although this may lead search engine users to overlook good, relevant information, which is disappointing, it is fairly easy to accept: a well-crafted webpage will most likely have better content on average than one that is amateurishly made. The fact that results are greatly influenced by sponsorship shows the greatest need for search engine users to educate themselves. In Yahoo, a website owner can sponsor their site by bidding on keywords. The more they bid on those words, the higher their site will be ranked when those terms are searched. The search engine user is not necessarily getting the most relevant site, just the site that the sponsor most wants them to see. What is the agenda of that sponsor? Who knows? They certainly want either to attract you so that they can advertise to you and sell you a service, or to push their point of view to the exclusion of competing websites on the same topic.
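The effect of keyword bidding on what the user sees can be sketched in a few lines. This is not Yahoo's actual method, which I do not know; it only assumes the one thing stated above, that a higher bid on a keyword means a higher position. The function `rank_results`, the URLs, the relevancy scores and the bid amounts are all hypothetical.

```python
def rank_results(pages, bids, keyword):
    """Order pages so that higher bids on the keyword come first.

    Pages with equal bids (including pages with no bid at all) fall back
    to their relevancy score. Both rules are assumptions for illustration.
    """
    def sort_key(page):
        bid = bids.get((page["url"], keyword), 0.0)
        return (bid, page["relevancy"])
    return sorted(pages, key=sort_key, reverse=True)

# Invented pages with invented relevancy scores.
pages = [
    {"url": "http://example.org/independent-review", "relevancy": 9.0},
    {"url": "http://example.com/shop", "relevancy": 4.0},
    {"url": "http://example.net/big-brand", "relevancy": 2.0},
]

# Invented bids, keyed by (url, keyword).
bids = {
    ("http://example.net/big-brand", "laptops"): 0.50,
    ("http://example.com/shop", "laptops"): 0.20,
}

for page in rank_results(pages, bids, "laptops"):
    print(page["url"])
```

In this made-up case the least relevant page comes first for "laptops" because its owner bid the most, while the most relevant page, whose owner bid nothing, comes last.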