comp.ai
Message 1,773 of 1,954
Sengly to Ted Dunning
Re: Seeking suggestion on results re-ran
16 Jun 08 00:57:33
   From: Sengly.Heng@gmail.com   
      
   On Jun 15, 9:13 am, Ted Dunning wrote:   
   > On Jun 13, 7:43 pm, Sengly wrote:   
   >   
   > > Dear all,   
   >   
   > > I am implementing a search system and would like your suggestions   
   > > on a re-ranking methodology. For each query I have a set of result   
   > > documents, each with a matching score, plus a relatedness score for   
   > > every pair of them. I would like to re-rank the results by lowering   
   > > the scores of near-duplicate documents, so that the list stays   
   > > precise but also becomes more diverse. Is there a standard approach   
   > > for doing this?   
   >   
   > > Any suggestions or pointers would be highly appreciated.   
   >   
   > > Best regards,   
   >   
   > > Sengly   
   >   
   > There are a number of ad hoc methods for doing this.  I think that a   
   > better approach is to fold highly similar documents together.   
   > Clustered results that have high uniformity can be reviewed quickly by   
   > searchers since they can focus on or eliminate many documents in a   
   > single step.  If the documents in the cluster are sufficiently similar,   
   > then little error is introduced by these mass decisions.   
   >   
   > Another alternative is to not re-order your results list at all, at   
   > least initially.  Then allow users to rate the documents that they see   
   > and apply their ratings not only to the document rated, but also to   
   > very similar documents.  This allows the result list to be re-ordered   
   > on the fly based on user input and should provide much the same   
   > efficiencies as clustered presentations.  It also preserves what speed   
   > you have in presenting the original results so that the user remains   
   > well disposed toward your search system.  Re-ranking as users mark   
   > documents that they particularly like or dislike, or that they save   
   > for later consideration, costs less than clustering, and some delay   
   > in response to a user action is usually acceptable.   
   > You can do the re-ranking using any sort of document classifier that   
   > produces reasonably conservative results with small numbers of   
   > training examples.   
   >   
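For illustration only, the folding idea in the quoted reply could look roughly like the sketch below. It is not code from the thread. The function name `fold_near_duplicates`, the 0.8 threshold, and the assumption that relatedness scores lie in [0, 1] are all hypothetical choices made for the example.

```python
def fold_near_duplicates(scores, sim, threshold=0.8):
    """Fold each document under the first higher-scoring leader it closely resembles.

    scores: matching score per document (higher is better)
    sim: sim[i][j] is the pairwise relatedness, assumed to lie in [0, 1]
    threshold: hypothetical cutoff above which two documents count as near duplicates
    """
    # Visit documents from best to worst matching score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    groups = []  # each group is [leader, folded member, ...]
    for i in order:
        for group in groups:
            if sim[i][group[0]] >= threshold:
                group.append(i)  # near duplicate of this group's leader
                break
        else:
            groups.append([i])  # no similar leader found, so start a new group
    return groups


# Made-up data: documents 0 and 1 are near duplicates of each other.
example_scores = [0.9, 0.85, 0.6]
example_sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(fold_near_duplicates(example_scores, example_sim))
```

Only the first entry of each group would be shown in the main list, with the folded members available underneath it.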
      
   Thank you very much for your suggestions, but my system is not   
   interactive. I have to re-rank the results using only each document's   
   matching score for the query and the pairwise similarity between   
   items in the result list.   
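To make the constraint concrete, here is a minimal sketch of one possible non-interactive scheme that uses only those two inputs. It follows the general shape of the maximal marginal relevance (MMR) heuristic, which is a technique brought in for illustration and not something proposed in this thread. The name `rerank_greedy`, the weight `lam`, and the assumption that both scores are scaled to [0, 1] are hypothetical.

```python
def rerank_greedy(scores, sim, lam=0.7):
    """Greedy diversity-aware re-ranking in the style of MMR.

    scores: matching score per document, assumed scaled to [0, 1]
    sim: sim[i][j] is the pairwise similarity, assumed scaled to [0, 1]
    lam: hypothetical trade-off weight; 1.0 reproduces the plain score order
    """
    remaining = set(range(len(scores)))
    ranked = []

    def adjusted(i):
        # Penalize a candidate by its similarity to the closest document already picked.
        penalty = max((sim[i][j] for j in ranked), default=0.0)
        return lam * scores[i] - (1.0 - lam) * penalty

    while remaining:
        # Iterate in index order so that ties go to the lowest document index.
        best = max(sorted(remaining), key=adjusted)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Made-up data: documents 0 and 1 are near duplicates of each other.
example_scores = [0.9, 0.85, 0.6]
example_sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(rerank_greedy(example_scores, example_sim))  # [0, 2, 1]
```

With this made-up data the near duplicate (document 1) drops below the less similar document 2. The loop is quadratic in the number of results, which should be acceptable for a typical result page but would need care for long lists.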
      
   I welcome any other suggestions.   
      
   Best regards,   
      
   Sengly   
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      