All posts in the topic Search, Keywords, and Captions
I was thinking about search results when I read the following quote from Scott Prevost, in an article about the Microsoft Bing search engine. He was discussing the captions that appear in search results.

“One of the challenges in developing captions is finding the right pieces of text on a page to represent that link, so semantic processing really helps. It helps pick the right sentences, sentences that may have the right concepts but not necessarily the keywords from [the user's query]. It helps us pick the piece of the sentence that's most relevant and not chop it off in places that makes it unreadable…”
http://www.theregister.co.uk/2009/07/01/powerset_and_bing/

GroupServer displays captions in the search results for posts:
http://groupserver.org/s?t=0&p=1

As Scott Prevost was saying, the sample caption is one that contains the search term:
http://groupserver.org/s?s=search&t=0&p=1

However, that is only for posts. Topic searches return keywords instead:
http://groupserver.org/s

The keywords are words that appear frequently in a topic but not frequently in other topics (the tried and true TF-IDF algorithm). Like the captions, the keywords augment the subject, providing more “information scent”. Unlike the captions, they reveal some of the deeper meaning behind the document.

I was thinking that we could do something similar to Microsoft Bing: calculate the keywords for a *post* and use those to select a *sentence* that characterises the document.

Currently we do not have the infrastructure to do this, but Richard has plans for eventually using a full-text retrieval system to support our search system. It would be worth looking at this idea when that is integrated with GroupServer.
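
To make the idea concrete, here is a rough sketch in Python of how keywords for a post might be used to pick a caption sentence. It is not GroupServer code: the function names (tokenize, keyword_weights, caption_sentence), the simple regular-expression sentence splitter, and the plain TF-IDF weighting (term frequency times log of inverse document frequency) are all assumptions made for illustration. A real version would most likely sit on top of the full-text retrieval system.

```python
import math
import re
from collections import Counter

# Hypothetical helpers for illustration only; not part of GroupServer.
WORD_RE = re.compile(r"[a-z0-9']+")
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return WORD_RE.findall(text.lower())


def keyword_weights(post, other_posts, top_n=5):
    """Return the top_n terms of the post, mapped to their TF-IDF weight."""
    docs = [post] + list(other_posts)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    term_freq = Counter(tokenize(post))
    total = sum(term_freq.values()) or 1
    weights = {
        term: (count / total) * math.log(len(docs) / doc_freq[term])
        for term, count in term_freq.items()
    }
    # Sort by weight, breaking ties alphabetically so the result is stable.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:top_n])


def caption_sentence(post, other_posts, top_n=5):
    """Pick the sentence of the post that covers the most keyword weight."""
    weights = keyword_weights(post, other_posts, top_n)
    sentences = [s.strip() for s in SENTENCE_RE.split(post) if s.strip()]
    if not sentences:
        return ""

    def score(sentence):
        # Count each keyword once per sentence.
        return sum(weights.get(term, 0.0) for term in set(tokenize(sentence)))

    return max(sentences, key=score)
```

Calling caption_sentence(post, other_posts) with the text of one post and the text of the other posts would return a whole sentence, so the caption is never chopped off mid-sentence. Words that appear in every post (greetings, sign-offs) get a weight of zero, so sentences made up of them should lose out to sentences that carry the post's distinctive terms.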