comp.ai
Dmitry A. Kazakov to Sebastian Stern
Re: Data Mining of Preference Orderings
01 Apr 05 22:33:46
   From: mailbox@dmitry-kazakov.de   
      
   On Fri, 01 Apr 2005 02:26:05 GMT, Sebastian Stern wrote:   
      
   > For a project of mine, I have recently become interested in Data Mining. One   
   > common technique in Data Mining (used by on line book stores for example) is   
   > the mining of Association Rules.  Each rule has the form   
   >   
   >   A => B   
   >   
   > where A and B are sets of objects (e.g., books), and each rule can be   
   > interpreted as stating "The possession of A implies the possession of B".   
      
   Shouldn't it be possession of A implies interest in B? Clearly, possession   
   /= interest.   
      
   > The 'degree of confidence' in a rule is defined as the conditional
   > probability that a subject (user) is interested in objects B under the
   > condition that the subject already possesses objects A.  This confidence is
   > thus computed using the familiar formula for conditional probabilities:
   >
   > confidence(A => B) :=  P(B | A) := P(A and B) / P(A)
      
   If A is a set of books, then P(A) is a probability of what? Are books   
   random? Maybe P(A) = P(User has A)? Is this random?   
      
   > Only those rules with a confidence above a certain threshold are then used   
   > to recommend objects B to a subject that is already in the possession of   
   > objects A.   
   >   
   > So far so good.  However, the purpose of such a system is always to   
   > recommend ever more and more objects, i.e., to increase an aggregate of   
   > objects (the goal of an on line book store is to sell as many books as
   > possible, and buyers want to own more than one book).   
   >   
   > Such a system does _not_ distinguish between _degrees_ of preference, i.e.,
   > it does not produce an _ordering_ of preference between different objects;
   > and this is the crux of my post.   
      
   Yes, it is the difference between possessing something and having an
   interest in it, which the model above does not capture.
      
   > (Note that if confidence(A => B) > confidence(A => C), this does not imply   
   > that predicted_preference(B) > predicted_preference(C).  The ordering of   
   > association rules (with the same condition) by confidence is not the same as   
   > the ordering of objects by predicted preference.)   
   >   
   > So, for my project I am looking for a way   
   > (1) to infer or predict the preferences for objects somehow, based on user
   > input (ideally, the subject would have to input as little data as possible   
   > himself).   
   > (2) and then order all objects by descending (predicted) preference, so that   
   > (hopefully) the user would only have to look at the top one.   
   >   
   > For input, the system could present two objects at a time and let the   
   > subject choose which he prefers.  The choice of the subject would reflect   
   > his relative preference for one of the two objects.  The preference relation   
   > is a strict ordering relation between objects, parametrized on the subject   
   > and time (but let us assume that the subject's preferences do not change   
   > over time).   
   >   
   >   O1 <    O2   
   >       S,t   
      
   I vaguely remember a study which showed that the preference relation is
   not transitive. So a person can give answers of the sort: O1 < O2 < O3 <
   O1. The problem is that the relation is of course fuzzy, and forcing the
   answer into a crisp Boolean value heavily distorts the result.
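
   Whether a given subject's answers actually contain such cycles can be
   checked directly. The sketch below assumes the answers are stored as
   (worse, better) pairs; the function name `intransitive_triples` and the
   sample data are hypothetical.

```python
from itertools import permutations

def intransitive_triples(prefers):
    """Return the sets of three objects whose answers form a cycle.

    `prefers` is a set of (worse, better) pairs as given by the subject.
    A cycle means a < b, b < c and c < a were all answered.
    """
    items = {x for pair in prefers for x in pair}
    cycles = set()
    for a, b, c in permutations(items, 3):
        if (a, b) in prefers and (b, c) in prefers and (c, a) in prefers:
            # A frozenset stores each cycle once, regardless of rotation.
            cycles.add(frozenset((a, b, c)))
    return cycles

# The answers O1 < O2 < O3 < O1 from the text above.
answers = {("O1", "O2"), ("O2", "O3"), ("O3", "O1")}
print(intransitive_triples(answers))
```

   The check is cubic in the number of objects, which is acceptable for the
   small sets a single user can realistically compare by hand.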
      
   > Alternatively, the input could consist of assigning a grade, or a monetary   
   > amount to each object in some set of objects.  (This is actually just a way   
   > of monotonically mapping the ordering relation between objects on the   
   > ordering relation between numbers: value(O1) < value(O2) implies O1 < O2.)   
      
   That changes nothing. I think that such a measure (value : object ->
   ordered) simply does not exist, because preferences are not ordered in the
   strict sense.
      
   > Furthermore, if possible, I would like the subject to be able to input that   
   > he prefers some object (a 'maximum object') above all possible objects, and   
   > that he prefers some object below no object (i.e. the 'absent object').   
   >   
   > So my questions are, roughly, these:  Has such a thing been done before?  If   
   > so, could you provide me with some references to e.g. books and/or articles?   
   > (I have looked into 'fuzzy association rules', but as far as I can tell
   > these do not meet my needs.)  How should I input and represent the
   > preference relation?
      
   I think that preference has a fine structure which gets lost when mapped to   
   a numeric value. Minimally one should distinguish: O1 > O2 and not (O1 >   
   O2). This immediately leads to intuitionistic fuzzy preference values:   
      
   Pos(O1 > O2), Nec(O1 > O2)   
      
   [ Nec(O1 > O2) = 1 - Pos(not (O1 > O2)) ]   
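
   As a rough illustration of what one such answer would carry, the sketch
   below stores the two possibility degrees and derives the necessity from the
   identity above. The class name `FuzzyPreference` and the sample degrees are
   assumptions made for the example, not part of any existing library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FuzzyPreference:
    """Fuzzy answer to the question 'is O1 preferred to O2?'."""
    pos: float      # Pos(O1 > O2): how possible the preference is
    pos_not: float  # Pos(not (O1 > O2)): how possible its negation is

    @property
    def nec(self) -> float:
        # Nec(O1 > O2) = 1 - Pos(not (O1 > O2))
        return 1.0 - self.pos_not

# A subject who is sure, and one who is torn between the two objects.
clear = FuzzyPreference(pos=1.0, pos_not=0.0)
torn = FuzzyPreference(pos=1.0, pos_not=0.75)
print(clear.nec, torn.nec)
```

   Both subjects would give the same forced Boolean answer, yet the second
   one holds the preference with much lower necessity.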
      
   Maybe it should be split even finer. Say you have some set of basic
   properties or features (typical ones are genre, volume, price, cover
   artwork ..., but you could invent something less evident). Users classify
   objects by these features, and the objects are then compared in the
   feature space.
      
   [ Basically, all of this amounts to finding a space of object images in
   which the components (of the image vector) become clearly ordered. The
   vectors themselves will still be incomparable, but at least this would
   more realistically model what happens in someone's head. ]
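
   A small sketch of that idea: each object becomes a vector of per-feature
   scores, each component is ordered, and two vectors are comparable only when
   one is at least as good in every component. The feature names and scores
   are invented, and the dominance test is one possible choice of comparison,
   not something prescribed above.

```python
def dominates(u, v):
    """True if u is at least as good as v everywhere and better somewhere."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    pairs = list(zip(u, v))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

def compare(u, v):
    """Compare two feature vectors under the component-wise partial order."""
    if dominates(u, v):
        return ">"
    if dominates(v, u):
        return "<"
    if tuple(u) == tuple(v):
        return "="
    return "incomparable"

# Hypothetical scores per book: (genre fit, cover artwork, affordability).
book_a = (0.9, 0.4, 0.7)
book_b = (0.6, 0.4, 0.5)
book_c = (0.3, 0.8, 0.9)

print(compare(book_a, book_b))
print(compare(book_a, book_c))
```

   Here book_a beats book_b on every feature, while book_a and book_c each
   win on different features and so stay incomparable.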
      
   > What algorithms can be used to predict preferences?  Where can I   
   > find out more?  Am I making any sense? ;-)   
      
   I would try to formulate it in fuzzy terms and keep it fuzzy all the way   
   until the final stage.   
      
   --   
   Regards,   
   Dmitry A. Kazakov   
   http://www.dmitry-kazakov.de   
      