December 2008 - Archive


Merry Christmas and Happy New Year from Thunderstone Software LLC


Results Ranking Options

Thunderstone uses an internal algorithm to determine how "good" a search hit is, and this algorithm is highly customizable in both Webinator and the Thunderstone Search Appliances. Near the bottom of the "Search Settings" page are several options for adjusting the ranking algorithm:

  • Word Ordering -- Controls how important word order is for results ranking: hits with terms in the same order as the query are considered better. For example, if searching for "bear arms", then the hit "arm bears", while matching both terms, is probably not as good as an in-order match. The default weight is Medium (500).
  • Word Proximity -- Controls how important proximity of terms is for results ranking. The closer the hit's terms are grouped together, the better the rank. The default weight is Medium (500).
  • Database Frequency -- Controls how important frequency in the table is for results ranking. The more a term occurs in the table being searched, the worse its rank. Terms that occur in many documents are usually less relevant than rare terms. For example, in a web-walk database the word "HTML" is likely to occur in most documents: it thus has little use in finding a specific document. The default weight is Medium (500).
  • Document Frequency -- Controls how important frequency within a document is for results ranking. The more often a term occurs in a document, the better that document's rank, up to a point. The default weight is Medium (500).
  • Position in Text -- Controls how important closeness to the start of the document is for results ranking. Hits closer to the top of the document are considered better. The default weight is Medium (500).
  • Clicks from Home -- Controls how important closeness to a Base URL is for results ranking. The more links the walk had to follow to reach a page, the lower that page's weight. The default weight is off.
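
To make the effect of these weights concrete, here is a minimal sketch in Python. It is not Thunderstone's actual algorithm, which is internal: the weighted-average formula, the function name combined_score, the per-factor scores between 0 and 1, and the raised weight of 1000 are all assumptions made for illustration. Only the default values (500 for Medium, and off for Clicks from Home) come from the list above.

```python
# Illustrative only: the real Thunderstone ranking algorithm is internal.
# The formula and every name below are assumptions made for this sketch.

# 500 mirrors the "Medium (500)" defaults; 0 stands in for "off".
DEFAULT_WEIGHTS = {
    "word_ordering": 500,
    "word_proximity": 500,
    "database_frequency": 500,
    "document_frequency": 500,
    "position_in_text": 500,
    "clicks_from_home": 0,
}


def combined_score(factor_scores, weights=None):
    """Weighted average of per-factor scores, each assumed to lie in [0, 1].

    A factor whose weight is 0 contributes nothing to the result.
    """
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * factor_scores.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


# Two hypothetical hits for the query "bear arms" that differ only in
# whether the terms appear in query order.
in_order = {
    "word_ordering": 1.0,
    "word_proximity": 0.5,
    "database_frequency": 0.5,
    "document_frequency": 0.5,
    "position_in_text": 0.5,
    "clicks_from_home": 0.5,
}
out_of_order = dict(in_order, word_ordering=0.0)

default_gap = combined_score(in_order) - combined_score(out_of_order)

# Raising the Word Ordering weight (1000 is an assumed value) widens the gap
# between the in-order hit and the out-of-order hit.
heavier = dict(DEFAULT_WEIGHTS, word_ordering=1000)
heavier_gap = combined_score(in_order, heavier) - combined_score(out_of_order, heavier)

print("gap with default weights:", round(default_gap, 3))
print("gap with heavier word ordering:", round(heavier_gap, 3))
```

In this sketch, a factor that is switched off (weight 0) has no influence on the ordering of results, and raising one weight relative to the others makes that factor count for more when two hits are otherwise similar.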


"What I really like about Webinator, still, is the fact that it's relatively easy to configure. It's much easier to configure than it was back when we bought the original product, when everything was run through command lines. I like the notion of relevance in terms of returned hits. It seems to make a lot more sense to me than, for example, Google page ranking - which places a much higher priority on popularity than it does on the actual content of the pages where text matches. Another thing that has been nice is the fact there is support for synonym matching within the server. And I think Vortex as a scripting language is very powerful. Even though I haven't used it to its fullest ability, it's proven to be quite flexible when we've needed to make modifications."

Mark J. Weixel
Director of Informatics
University Center for International Studies (UCIS)
University of Pittsburgh


"A Christmas Carol in Prose, Being a Ghost Story of Christmas"
(commonly known as A Christmas Carol -- with illustrations by John Leech)

When Charles Dickens self-published what he referred to as his "little Christmas Book" on December 19, 1843, he found himself in debt -- and he desperately needed to raise money.

He wrote the novella over a six-week period in October and November. Consistent with the musical form (after all, he did call it a carol), Dickens presented the work in five staves rather than chapters.

Within one week of having it printed, he had sold every copy of the first run -- more than 6,000 books. The work's instant popularity (eight stage adaptations went into production within two months of its initial publication) has continued to this day.

The circumstances of Bob Cratchit at work in Scrooge's counting-house and at home with his poor -- but loving -- family (including Tiny Tim) bring to mind timely thoughts of social justice and transformative generosity that deserve to remain with us throughout the holiday season and, hopefully, well beyond it.

You can find an online version of the classic book at A Christmas Carol by Charles Dickens.

Feedback, suggestions and questions are welcome. Send your email to

Copyright © 2020 Thunderstone Software LLC. All rights reserved.