From: Sengly.Heng@gmail.com   
      
   On Jun 15, 9:13 am, Ted Dunning wrote:   
   > On Jun 13, 7:43 pm, Sengly wrote:   
   >   
   > > Dear all,   
   >   
   > > I am implementing a search system and would like your suggestions
   > > on a re-ranking methodology. For each query I have a set of result
   > > documents, each with a matching score, together with a relatedness
   > > score for every pair of them. I would like to re-rank the results
   > > by lowering the scores of near-duplicate documents, so that the
   > > results stay precise while also being diverse. Is there a standard
   > > approach for doing this?
   >   
   > > Any suggestions or pointers would be highly appreciated.   
   >   
   > > Best regards,   
   >   
   > > Sengly   
   >   
   > There are a number of ad hoc methods for doing this. I think that a   
   > better approach is to fold highly similar documents together.   
   > Clustered results that have high uniformity can be reviewed quickly by   
   > searchers since they can focus on or eliminate many documents in a single
   > step. If the documents in the cluster are sufficiently similar, then   
   > little error is introduced by these mass decisions.   
   >   
   > Another alternative is to not re-order your results list at all, at   
   > least initially. Then allow users to rate the documents that they see   
   > and apply their ratings not only to the document rated, but also to   
   > very similar documents. This allows the result list to be re-ordered   
   > on the fly based on user input and should provide much the same   
   > efficiencies as clustered presentations. It also preserves what speed   
   > you have in presenting the original results so that the user remains   
   > well disposed toward your search system. Re-ranking as users mark
   > documents that they particularly like or dislike, or that they save
   > for later consideration, costs less than clustering, and some cost
   > incurred in response to a user action is not so bad anyway.
   > You can do the re-ranking using any sort of document classifier that
   > produces reasonably conservative results with small numbers of
   > training examples.
   >   
      
   Thank you very much for your suggestion, but my system is not an
   interactive one. I therefore have to re-rank the results using only
   each document's matching score to the query and the pairwise
   similarities between the items in the result list.
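
   For concreteness, the kind of non-interactive re-ranking I am after
   could look like the rough sketch below. It is only an illustration,
   not a claim about what the standard approach is: it greedily picks
   the next document by its matching score minus a penalty for its
   highest similarity to anything already picked, which is similar in
   spirit to maximal marginal relevance (MMR). The function name
   rerank, the penalty weight of 0.5, and the toy scores are all
   made up for the example, and similarities are assumed to lie
   in [0, 1].

```python
def rerank(scores, sim, penalty=0.5):
    """Greedy diversity re-ranking (illustrative sketch, MMR-like).

    scores:  dict mapping doc id -> matching score to the query
    sim:     dict mapping frozenset({a, b}) -> pairwise similarity in [0, 1]
    penalty: hypothetical weight on redundancy; 0 keeps the original order
    """
    # Start from the original ranking so ties keep their original order.
    remaining = sorted(scores, key=lambda d: -scores[d])
    ranked = []

    def adjusted(doc):
        # Highest similarity to any document already selected (0 if none yet).
        redundancy = max(
            (sim.get(frozenset((doc, kept)), 0.0) for kept in ranked),
            default=0.0,
        )
        return scores[doc] - penalty * redundancy

    while remaining:
        best = max(remaining, key=adjusted)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Toy data (invented): "a" and "b" are near duplicates, "c" is different.
scores = {"a": 0.9, "b": 0.85, "c": 0.6}
sim = {
    frozenset(("a", "b")): 0.95,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.1,
}

print(rerank(scores, sim))               # ['a', 'c', 'b']
print(rerank(scores, sim, penalty=0.0))  # ['a', 'b', 'c']
```

   With the penalty on, the near duplicate "b" drops below "c"; with
   the penalty at zero the list stays in plain matching-score order.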
      
   I welcome any other suggestions.
      
   Best regards,   
      
   Sengly   
      
      