Subject Research
This is the third in a series of articles where the focus is on sourcing 'content' on the
Internet. The full sequence of articles is as follows:
- Copyright Issues
- Help and Guidance Resources
- Subject Research
- Content Providers
Database Without Structure
Making the most of the Internet is a matter of making use of the (often) huge amount of
data available on practically any subject you care to mention. The drawback is that it is
sometimes harder and more time-consuming than expected. There is a lot of information 'out
there', some of which is neatly sorted and categorized in databases (the owners of which may
or may not charge you for access) but, because the Internet is not organized hierarchically,
there is no central authority directing 'where' things should 'go'. The data, then, is only
of use to you if you know how to find it.
Logic dictates that the easiest way to do this is to use Internet Search Sites (search
'engines' and directories), which also have the advantage of being free (to users). After
all, that's what they're for, isn't it? The answer, in theory, is yes: it is just
a matter of entering keywords and bingo! Up pops a list of places where documents
containing those words can be found. Unfortunately, though, it just isn't that simple!
Volume Of Data
The main problem is the sheer volume of data available on many subjects. When your search
result consists of literally thousands of links, you might get the (understandable) feeling
that you've gone from a paucity of facts to a state of information overload. In other words,
you may be only a little better off because you now face a further search, this time of
the search results! Not only that but, because of the volume of data and the number
of sites, it's practically impossible for the Search Sites to keep up with the changes
constantly occurring on the millions of pages they index, not to mention the thousands of
new sites coming online every day.
'Spamdexing'
Sometimes the 'volume of data' problem manifests itself in unexpected ways. It is common,
for example, to get totally irrelevant results that do not seem to have any of the keywords
you are searching for. This is often caused by 'spamdexing', which is an attempt on the part
of some website owners to artificially 'load' their pages with common, sometimes hidden,
keywords even though they might have little or nothing to do with the content of their site.
They do this in order to get a high ranking (in other words, for their site to appear within
the first 10 or 20 links returned) in all search results, knowing that any links lower down
the list, even if they are more relevant than theirs, will invariably get overlooked. People
get very annoyed when their searches turn up a load of rubbish and they usually blame the
search engines even though the real culprits are more likely to be the owners of those
rubbish sites trying to cheat the system.
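To see why keyword loading works, consider a deliberately naive ranking scheme that simply
counts keyword occurrences. This is only an illustrative sketch: the page names, the page
text and the scoring rule are all invented for the example, and real search engines rank
pages in far more elaborate ways.

```python
def naive_score(page_text: str, query: str) -> int:
    """Count how many times the query's keywords occur in the page text."""
    words = page_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# Hypothetical pages: one genuinely about the topic, one loaded with keywords.
pages = {
    "honest-guide": "a short guide to growing roses in a small garden",
    "stuffed-page": "roses roses roses garden garden cheap watches roses garden",
}


def rank(query: str) -> list[str]:
    """Return page names ordered by naive score, highest first."""
    return sorted(pages, key=lambda name: naive_score(pages[name], query), reverse=True)


print(rank("roses garden"))  # the keyword-loaded page is listed first
```

Under this assumed scoring rule the loaded page outranks the relevant one purely through
repetition, which is exactly the effect the spamdexer is counting on.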
Search Options
Engines
The problem of irrelevant search results is most common with search 'engines'. The reason
is that their entries are gathered automatically by Internet robots (called spiders) and
processed into their databases electronically, so no human checks whether a page's keywords
actually reflect its content.
Nevertheless, they have certain advantages over manually processed databases:
- They can 'spider' the web 24 hours a day for 365 days a year, which results in many
more sites being indexed than would be possible otherwise.
- They can index individual pages on sites rather than just the 'root URL'.
Directories
Does this mean that search directories, or 'portals', as they are now sometimes called,
are better because they use real people who are not so easily duped? Some people think so,
but the problem is once again one of volume. Because of the time involved, a database whose
entries are checked manually necessitates some fairly strict rules and this can lead to a
tendency towards bureaucracy and a lack of flexibility. The end result is often that fewer
sites get indexed overall and amongst the missing might be some perfectly good ones - and
even some veritable goldmines of information. Meanwhile, other sites with little or no
worthwhile content manage to mysteriously achieve positions of prominence, presumably by
appealing to a sense of what constitutes a 'good' site on the part of the reviewer. The
best-known example of a search directory is Yahoo! (http://www.yahoo.com/)
Topic-Specific Indexes
There are hundreds of very specific databases on the Internet covering a wide variety of
subjects. These can sometimes be queried using specialized search engines like Pilot-Search.com,
which is a literary search engine. (http://www.pilot-search.com/)
They can also often be found by using standard searching methods and sometimes by
consulting 'megasource jumpstations' or, as they are now coming to be known, 'Vortals'
(vertical portals). These are similar to the search directories mentioned above except that
they each concentrate on a particular niche, or market segment. They are invariably much
smaller but cover their subject more comprehensively. All the resources that turned up
during the course of research for the first two articles in this series, for example, are
archived at either 101 Publishing Answers or 101 Writing Answers.
The main advantage in using this type of site is the concentrated focus, and the fact
that many of them include details of resources that, for one reason or another, simply do
not appear elsewhere. The downside is that they can often be as hard to find as the
information you hope to get from them. One of the reasons for this is that many of the most
popular Search Sites refuse to list them (presumably because they see them as competitors).
As mentioned before in this series, there is more to the Internet than the World Wide
Web. Because of this there are some 'protocol-specific' searching applications that seek out
data that is available on Usenet 'newsgroups', Gopher sites, e-mail Mailing Lists etc. Some
of these consist of 'discussion threads' from within forums and are not always suitable for
specific subject research. However, they can sometimes prove invaluable for ongoing research
and an increasing number archive their most popular discussion threads on the web too. An
example of this type of resource would be Deja.com, which is used for finding discussion
groups of all kinds. (http://www.deja.com/)
MetaSearch Tools
There are a number of meta-search applications, or meta-crawlers, on the web that submit
keywords to more than one search engine at a time. They then usually sort the results,
remove any duplicates and present you with a broader, more valid selection of resources than
you would be likely to get from one Search Site alone. Although web-based themselves, many
of these are capable of searching other protocols such as e-mail and Usenet too. An example
of this type of resource is WebInfoSearch, which searches more than 50 search engines with a
single click, along with White Page and Yellow Page listings. You can also search for people
by name or business name, or run a reverse telephone or reverse address lookup.
(http://www.webinfosearch.com/)
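The 'sort the results and remove any duplicates' step can be sketched roughly as follows.
This is not how any particular meta-search tool works: the function name, the scoring rule
(each link earns 1/position points from every engine that returns it) and the crude
trailing-slash normalization are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_results(result_lists: dict[str, list[str]]) -> list[str]:
    """Merge ranked URL lists from several engines into one de-duplicated list."""
    scores: dict[str, float] = defaultdict(float)
    for urls in result_lists.values():
        seen = set()
        for position, url in enumerate(urls, start=1):
            key = url.rstrip("/")  # assumed normalization: ignore a trailing slash
            if key in seen:
                continue  # count each link only once per engine
            seen.add(key)
            scores[key] += 1.0 / position  # higher placement earns more points
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical result lists from two imaginary engines.
results = {
    "engine_a": ["http://example.com/roses", "http://example.org/garden",
                 "http://example.net/tips"],
    "engine_b": ["http://example.org/garden/", "http://example.net/tips",
                 "http://example.com/roses"],
}
print(merge_results(results))
```

Links that several engines place near the top rise in the merged list, while a link that
only one engine returns low down sinks towards the bottom.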
Focused Searching
Probably the most effective way of getting accurate results when using many engines,
directories and databases is by using 'precision searching'. In its simplest form, this
means entering the most precise phrase you can think of to describe what you are looking
for, thus narrowing the search as much as possible from the outset. If this doesn't produce
the desired result, then widen the search term one step at a time until it does.
Most people do the opposite. They start by using a single, all-encompassing keyword,
which results in a very large list to start with; then they start to narrow the search by
qualifying that word in increasing detail. The problem with this method is that the searcher
has to wade through huge amounts of data in the process and often never gets to the point of
refinement necessary to produce worthwhile results.
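The 'start precise, then widen' approach can be sketched in a few lines. The tiny in-memory
document list, the function names and the widening rule (drop the first word each time) are
all assumptions for the sake of the example; in practice you would decide for yourself which
word to drop and type the queries into a search site by hand.

```python
def search(corpus: list[str], phrase: str) -> list[str]:
    """Return documents containing the phrase (case-insensitive substring match)."""
    needle = phrase.lower()
    return [doc for doc in corpus if needle in doc.lower()]


def precision_search(corpus: list[str], phrase: str, wanted: int = 1) -> tuple[str, list[str]]:
    """Start with the full phrase and widen one step at a time until enough hits appear."""
    words = phrase.split()
    while words:
        query = " ".join(words)
        hits = search(corpus, query)
        if len(hits) >= wanted:
            return query, hits
        words = words[1:]  # widen: drop the first word (an assumed, simplistic rule)
    return "", []


# Hypothetical document titles standing in for the web.
docs = [
    "Pruning climbing roses in early spring",
    "Climbing roses for shaded walls",
    "A history of rock climbing",
]
query, hits = precision_search(docs, "pruning old climbing roses")
print(query, len(hits))
```

Here the full phrase finds nothing, so the search is widened twice before 'climbing roses'
returns the two relevant titles, without ever pulling in the unrelated page on rock
climbing.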
Precision searching necessitates the use of search delimiters, the commonest being
"THIS PHRASE" (in other words, enclosing the search phrase within quotation
marks). Other important ones are the Boolean operators AND, OR and NOT. This article is not the
place to launch into a tutorial on refining devices for searching. Suffice it to say that you
will improve your results dramatically if you take the trouble to learn these and the many
other search operators in common use. Most of them can be found by simply using the 'Help'
feature in the search engine or other device you are using. The very popular search engine
AltaVista is particularly helpful in this respect. (http://www.altavista.com/)
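As a rough illustration of what the quoted phrase and the words AND, OR and NOT each do, the
sketch below checks a piece of text against all four. The function and its parameter names
are invented for this example, punctuation is ignored for simplicity, and each search site
has its own syntax, so consult its 'Help' pages for the real rules.

```python
def matches(text: str, phrase: str = "", all_of=(), any_of=(), none_of=()) -> bool:
    """Check text against a quoted phrase plus AND / OR / NOT keyword lists."""
    lowered = text.lower()
    words = set(lowered.split())
    if phrase and phrase.lower() not in lowered:
        return False  # "THIS PHRASE": the words must appear together, in order
    if any(term.lower() not in words for term in all_of):
        return False  # AND: every listed term is required
    if any_of and not any(term.lower() in words for term in any_of):
        return False  # OR: at least one listed term is required
    if any(term.lower() in words for term in none_of):
        return False  # NOT: none of the listed terms may appear
    return True


# Hypothetical document titles.
docs = [
    "Pruning climbing roses in early spring",
    "Climbing roses for shaded walls",
    "A history of rock climbing",
]
# Roughly: climbing AND NOT history
print([d for d in docs if matches(d, all_of=["climbing"], none_of=["history"])])
```

Combining a single keyword with NOT is enough here to exclude the rock-climbing page, which
a search on 'climbing' alone would have returned.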
Resources
The URLs listed below are only meant to guide you, to give you some ideas and to
illustrate the range of useful sites available to people wishing to research subject
material on the Internet. They are only a fraction of a huge range of sites available and
are not meant to be in any way representative of what is on the Internet as a whole.
However, you will find pointers within them to every type of resource mentioned in this
article.
- SquirrelNet
- A simple and concise guide to searching the Internet and finding cool sites.
SquirrelNet ranks the top search engines and tells you which is best for different types
of searching needs. This site also links to free online games, hidden jobs and links
about squirrels(!).
http://www.squirrelnet.com
- direct search
- A huge compilation of search tools, directories and resources that allows a lot of
material that is normally hidden or invisible to general search engines to be accessible
and searchable.
http://gwis2.circ.gwu.edu/~gprice/direct.htm
- Internets Search Engines, Databases and Newswires
- Search the Internet's collection of search engines and databases in every useful
category. Glowing reviews from press and universities about the 1000's of reference
search engines.
http://www.internets.com/
- InfoSpace
- Real world information where you'll find yellow pages, white pages, classifieds,
shopping sites, finance information, government data, chat rooms, and much more.
http://www.infospace.com/
- Sookoo Strategy Searches
- What if you are searching for information about a concept, rather than a specific
company or person? Try Sookoo, the business strategy search specialist. At this site you
can drill through categories such as big thinkers, leadership, trends or change
management—or search on just about any term you can think of.
http://www.sookoo.com/
- IMG Network Search Resources
- The official IMG Network Search page where all the tools and help you need can be
found.
http://www.img.net/search/
- ePilot.com
- Search the Web with the ePilot Desktop Application!
http://www.epilot.com/
- AffirmNet Search Resources
- Whether looking for information on a subject of your interest or trying to see whether
the Internet is aware of the existence of your web site, these search engines and
directories are some of the most useful ways to make sense of the overwhelming size and
complexity of the Internet.
http://www.affirmnet.com/search.html
- WorldPages.com
- The world's premier Internet Yellow Pages and White Pages, Email directory, featuring
117 Million U.S. & Canadian white and yellow pages listings, 30 million URLs,
125,000 web sites hosted for local businesses, government listings, email and web
search, maps, classifieds, ecommerce, and links to over 350 international online
government, business and email directories worldwide.
http://www.worldpages.com/
- The Ultimates
- A new type of index with twenty-five net services at your fingertips.
http://www.theultimates.com/
- theDigitalDetective.com
- Your master index to the world: find ancestors, audio, businesses, domain name,
driving directions, e-mail, how-to, laws, location, maps, news, people, phone numbers,
pictures, places, software, and zip codes.
http://www.thedigitaldetective.com/
- Search Search Sites
- Use SSS to find appropriate search engines and search directories for whatever you are
looking for.
http://www.motherofallsearches.com/search.htm
- Liszt, The Mailing List Directory
- A really big directory of mailing lists (and newsgroups, too). If you like anything,
you're bound to find something you like here.
http://www.liszt.com/
- CNET Search
- Search hundreds of sites in one place.
http://new.search.com/
- The Invisible Web
- For those hard-to-find resources.
http://www.invisibleweb.com/
- Windweaver
- Top search engines, directories, libraries and metasearch pages reviewed plus links to
recommended search tools and an online search skills course. Other resources at
Windweaver include searching tutorials, search resources, recommended Web sites, hints
for email communication and mailing lists, etc.
http://www.windweaver.com/
© 2000 Mike Alexander (Revised 2003)
Mike Alexander is the creator and owner of '101 Newsletter Answers', the 'How-To'
place where the focus is on 'Power Communicating' with newsletters. http://www.101newsletteranswers.com
Permission to publish this article is granted at no charge
provided it remains unaltered including the author's 'bio' (shown immediately above
this). To receive a plain text version, send a message from your regular
email address to <archives> (without the brackets) at the domain
<101newsletteranswers.com>. In the Subject field, put <arciv25> if
you would like it formatted to 60 characters a line plus underlined headings etc, or
<uarciv25> if you would prefer a 'ready-to-format' version with no line breaks.
Whenever possible, the author would also appreciate an
electronic copy of the publication in which it appeared. Please send any such messages
to <articleinc> at the domain <101newsletteranswers.com> with the ezine or
website title as the message Subject.
Important Email Information
To lessen the chances of email addresses falling prey to address harvesting software (as
used by spammers) we avoid showing them in full. Instead, we only show prefixes (what
comes before '@'). Please be sure to add the @ and (usually) 101newsletteranswers.com.