December 17, 2005

Google power: Unleash the full potential of Google / Chris Sherman

If you want to increase your search productivity using Google, read this book. Experienced searchers might also find it a good refresher.

The book is obviously about using Google, but where relevant, other search features and tools (like those from Yahoo!, Altavista and others) are also introduced. Later chapters include interviews with expert searchers, where they explain their favoured search techniques and resources.

NLB Call No.: 025.04 SHE - [COM]


Some notes that I took:
  • Chapter 1 - mainly on how Google works (good diagram showing how the crawler, indexer, and search engine work)
  • Faster to retrieve the cached copy (than to click through to the website) if the required info isn't time-sensitive
  • Some "metrics" related to how Google retrieves webpages (these could be important clues when designing blogs and websites):
  1. terms appearing in the title of the webpage
  2. terms appearing in unique font elements (bold, italics)
  3. terms in other "prominent" areas of the page, e.g. a bulleted list
  4. frequency with which the terms appear on a page
  5. off-page metrics like the number of links to that page
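
To make these signals concrete when planning a blog or webpage, here is a toy Python sketch. It is not Google's algorithm: the `Page` fields, the `toy_score` function, and every weight are made up purely for illustration, and the "prominent areas" signal is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    emphasized: str      # text set in bold/italics (hypothetical field)
    body: str
    inbound_links: int   # off-page signal: number of links to the page


def toy_score(page: Page, term: str) -> float:
    """Toy relevance score; the weights are arbitrary assumptions."""
    term = term.lower()
    score = 0.0
    if term in page.title.lower():
        score += 3.0                              # term appears in the title
    if term in page.emphasized.lower():
        score += 2.0                              # term appears in bold/italics
    score += 0.5 * page.body.lower().count(term)  # term frequency on the page
    score += 0.1 * page.inbound_links             # links pointing to the page
    return score


page = Page(
    title="Qi gong for beginners",
    emphasized="qi gong",
    body="Qi gong is a practice. Many people learn qi gong in class.",
    inbound_links=10,
)
print(toy_score(page, "qi gong"))
```

The only point of the sketch is that a term in the title or in emphasised text can count for more than the same term buried in body text.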

  • Page 25 - Google isn't case sensitive. E.g. NLB and nlb are the same to Google
  • Recommends using multiple search terms to make your search more specific. E.g. qi gong gives very broad results vs. qi gong chinese medicine (internal exercises asthma)
  • Use 32 words or fewer in your search (Google will ignore anything beyond 32 words)
  • Use quotes, i.e. phrase searching. E.g. "qi gong" vs. qi gong (phrase searching seems to be a favoured technique of the experts cited in the book).
  • You can also use quotation marks to ensure stop words (e.g. the, a, is) are included in your search terms. E.g. "to be or not to be".
  • p. 32 - Google translation tool
  • p. 36 - about Boolean searching (a common mistake is confusing the OR and AND operators)
  • p. 38 - other operators: +, -, * (wildcard), ~ (fuzzy)
  • "Brave new *"
  • "Dinosaur ~facts" will return "dino info"
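
A minimal Python sketch of the tips above (several terms, quoted phrases, the minus operator, and the 32-word limit). The helper name `build_query` and its parameters are my own invention, and the limit is simply the figure quoted in the book.

```python
MAX_WORDS = 32  # limit quoted in the book; Google's actual behaviour may differ


def build_query(terms, phrases=(), exclude=()):
    """Combine plain terms, quoted phrases, and -excluded terms into one query."""
    parts = list(terms)
    parts += [f'"{p}"' for p in phrases]   # quotes keep the words together
    parts += [f"-{w}" for w in exclude]    # minus sign excludes a term
    query = " ".join(parts)
    words = query.split()
    if len(words) > MAX_WORDS:
        raise ValueError(f"query has {len(words)} words; limit is {MAX_WORDS}")
    return query


print(build_query(["chinese", "medicine"], phrases=["qi gong"], exclude=["video"]))
# prints: chinese medicine "qi gong" -video
```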

Chpt 3 is about understanding Google's search results page. Intuitive to most users, but worth a read if you're planning how to maximise the retrieval of your blog post or page (e.g. importance of page title & link, even image alt text)

Chpt 4 - using the advanced search page.
From p.105 on google operators:
  • cache:www.archive.org (shows most recently crawled version of stored page in google)
  • related:www.northpole.com (shows similar pages)
  • link:www.searchengineguide.com (shows all pages linking to a specific URL)
  • info:www.sina.com (shows info about a specific URL)
  • "burning man" filetype:ppt (restrict results to a file type)
  • "employee memo" ext:doc (restricts to pages of specific file type)
  • goliath frog site:www.ssarherps.org (limit to particular site)
  • allintext:dromedary arabia water (show pages where all search terms appear in the body portion of the page)
  • allintitle:combinatorial mathematics (show pages where search terms appear in title)
  • intitle:funicular (show pages where single search term appears in title)
  • allinurl:crazy eights (show pages where all search terms appear in URL)
  • inurl:ouagadougou (show pages where a single search term appears in URL)
  • allinanchor:issaquena county (show pages where all search terms appear in the text of links pointing to the page)
  • inanchor:neanderthal (show pages where a single search term appears in the text of links pointing to the page)
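
If you want to script queries like the ones above, here is a small Python sketch. The helpers `with_operator` and `google_url` are hypothetical names of mine; the only library call is the standard `urllib.parse.urlencode`, and the URL follows the familiar `google.com/search?q=` pattern.

```python
from urllib.parse import urlencode


def with_operator(operator: str, value: str, terms: str = "") -> str:
    """Attach an operator with no space after the colon, e.g. site:example.com."""
    return f"{terms} {operator}:{value}".strip()


def google_url(query: str) -> str:
    """Return a Google search URL for the given query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})


q = with_operator("site", "www.ssarherps.org", "goliath frog")
print(q)              # goliath frog site:www.ssarherps.org
print(google_url(q))  # https://www.google.com/search?q=goliath+frog+site%3Awww.ssarherps.org
```

Note that the operator and its value are joined without a space, which is how the operators are written in the list.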

Chapter 6: searching for images (how Google looks for clues about images, and therefore how you can improve retrieval of your images if you want them to be found). Results depend on the file name and the text surrounding the image. p. 151 - operators like cache:, link:, and related: have no effect on image search. Tip: to limit results to Macromedia Flash files, use the filetype:swf operator. Try the search terms site:research.microsoft.com filetype:pdf

What I like about this book is how it shows tools beyond google. Examples listed (for images):
  • search.yahoo.com/images
  • www.altavista.com/image
  • pictures.ask.com
  • www.ctr.columbia.edu/webseek
  • www.davidrumsey.com/collections
  • flickr.com

Chpt 7 on google groups, various ways of searching (useful if you want to go beyond webpages and into information from discussions). Tip: combine with terms like "forums", "message boards", "discussion groups", "mailing lists".

Chpt 8 on using google tool bar

Chpt 9: google labs (try google scholar)
P. 195 - something called google sets (www.langreiter.com/space/google-set-vista) where it generates a visual image of related terms

p. 198 - On Google WebAlerts (google.com/webalerts)
p. 200 - google.com/webalerts or newsalerts

p. 204 - www.google.com/press/zeitgeist.html - snapshot of most popular queries Google received during previous month

Chpt 11 talks about web research managers, collaborative bookmark managers like furl.net, www.onfolio.com

p. 224 - monitoring fav websites:
  • product update pages: www.mozilla.org
  • movie trailers: www.movie-list.com
  • fav blogs: www.researchbuzz.com
  • radio prog schedules: www.npr.org/about/schedules
  • best selling books: www.nytimes.com/pages/books/bestseller

chapt 12, p.230 - the art of googling people;
p. 231 rules of thumb:
  • always put person's name in double quotes
  • Google ignores most punctuation symbols, so "john smith" also returns "st. john's, smith square". The book advises putting a plus sign in front of search terms to force an exact match
  • google isn't case sensitive
  • use boolean OR to expand results

p. 232 on strategies to search for people (create your own biography, basically build a list of terms; start with a few terms; eliminate negative words; don't overlook Google Groups)

p. 241 - finding email addresses is tricky because Google ignores the "@" symbol. Try spelling out AT and DOT (e.g. anselATadmasDOTcom) or inserting spaces (e.g. ivanchew @ nlb . gov . sg)
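
A quick Python sketch of that trick: given a known or guessed address, generate the two obfuscated spellings mentioned above to use as search terms. The function name `email_search_variants` is my own.

```python
def email_search_variants(address: str) -> list[str]:
    """Return obfuscated spellings of an address to try as search terms."""
    user, domain = address.split("@", 1)
    spaced = f"{user} @ {domain.replace('.', ' . ')}"   # spaces around @ and dots
    at_dot = f"{user}AT{domain.replace('.', 'DOT')}"    # AT / DOT spelled out
    return [spaced, at_dot]


for variant in email_search_variants("ivanchew@nlb.gov.sg"):
    print(variant)
# prints:
# ivanchew @ nlb . gov . sg
# ivanchewATnlbDOTgovDOTsg
```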

p. 243 - 245 finding personal webpages and blogs (but google just came up with blogger search)
Tips like "barbara smith" intitle:"home page" or allintitle:"barbara smith" "home page"
"charles miller" site:geocities.com OR site:angelfire.com

p. 250 - Examples listed:
  • "charles miller" site:blogger.com OR site:blogspot.com
  • "name" intitle:"user profile" site:blogger.com (find blogger's profile)
  • "name" "my web page" site:blogger.com (find blogger's homepage)
  • "name" "recent posts" site:blogger.com (find recent posts mentioning other people)
Mentions tools like feedster.com, technorati.com, bloglines.com, www.daypop.com
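
The people-search patterns above all have the same shape (a quoted name plus an operator), so they are easy to generate. A minimal Python sketch, with helper names that are mine rather than the book's:

```python
def quoted(name: str) -> str:
    """Wrap a name in double quotes for phrase searching."""
    return f'"{name}"'


def homepage_query(name: str) -> str:
    """Quoted name plus a title restricted to the phrase "home page"."""
    return f'{quoted(name)} intitle:"home page"'


def site_or_query(name: str, sites: list[str]) -> str:
    """Quoted name restricted to any of several sites, joined with OR."""
    return quoted(name) + " " + " OR ".join(f"site:{s}" for s in sites)


print(homepage_query("barbara smith"))
# prints: "barbara smith" intitle:"home page"
print(site_or_query("charles miller", ["blogger.com", "blogspot.com"]))
# prints: "charles miller" site:blogger.com OR site:blogspot.com
```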


p. 261 - finding reliable health information. good coverage on how to evaluate information

p. 286 - searching Google News using the advanced news search interface. Other tools include:
  • www.findory.com
  • newsbot.msnbc.msn.com
  • newsnow.co.uk
  • www.topix.net
  • news.yahoo.com/fc?
  • newsblaster.cs.columbia.edu


p. 295 - on "weblogs and nontraditional news sources"... "bloggers have broken numerous news stories before the mainstream media picked them up." [Aside: Librarians need to ask & answer the question "why search for content in blogs?" ~ Ivan]

p. 296 brief mention of RSS (alternative name for RSS - "rich site summary")
Recommends this site for list of news - http://directory.google.com/Top/Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/News_Readers/

p. 300 - on Froogle.com

p. 348 - about discovering relationships with linkage maps: TouchGraph GoogleBrowser (www.touchgraph.com/TGGoogleBrowser.html)

p. 351 - 4 : finding out what companies don't want found; "googledorking" - googledork is "an inept or foolish person as revealed by google".
Googledorking - process of trolling the internet for confidential information (that has been placed there by mistake). Some common key words in googledorking - ("total", "profit margin", "salary", "marketing plan", "confidential", "secret", "do not circulate") with "filetype:" operator.
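
One practical use of this is checking whether your own organisation has left such files exposed. The Python sketch below pairs a few of the key words above with the filetype: operator. The site: restriction, the `audit_queries` name, the choice of file types, and the `example.org` domain are all my assumptions, not something from the book.

```python
KEYWORDS = ["confidential", "do not circulate", "salary"]  # subset of the book's list
FILETYPES = ["doc", "xls", "ppt", "pdf"]                   # assumed file types


def audit_queries(domain, keywords=KEYWORDS, filetypes=FILETYPES):
    """Build one query per keyword/file-type pair, limited to your own domain."""
    return [
        f'site:{domain} "{kw}" filetype:{ft}'
        for kw in keywords
        for ft in filetypes
    ]


for q in audit_queries("example.org"):  # example.org is a placeholder domain
    print(q)
```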
