comp.ai | Awaiting the gospel from Sarah Connor | 1,954 messages
Message 679 of 1,954
Dmitry A. Kazakov to Sebastian Stern
Re: Data Mining of Preference Orderings
01 Apr 05 22:33:46
From: mailbox@dmitry-kazakov.de

On Fri, 01 Apr 2005 02:26:05 GMT, Sebastian Stern wrote:

> For a project of mine, I have recently become interested in Data Mining. One
> common technique in Data Mining (used by online book stores for example) is
> the mining of Association Rules. Each rule has the form
>
> A => B
>
> where A and B are sets of objects (e.g., books), and each rule can be
> interpreted as stating "The possession of A implies the possession of B".

Shouldn't it be possession of A implies interest in B? Clearly, possession
/= interest.

> The 'degree of confidence' in a rule is defined as the conditional
> probability that a subject (user) is interested in objects B under the
> condition that the subject already possesses objects A. This confidence is
> thus computed using the familiar formula for conditional probabilities:
>
> confidence(A => B) := P(B | A) := P(A and B) / P(A)

If A is a set of books, then P(A) is a probability of what? Are books
random? Maybe P(A) = P(User has A)? Is this random?

> Only those rules with a confidence above a certain threshold are then used
> to recommend objects B to a subject that is already in the possession of
> objects A.
>
> So far so good. However, the purpose of such a system is always to
> recommend ever more and more objects, i.e., to increase an aggregate of
> objects (the goal of an online book store is to sell as many books as
> possible, and buyers want to own more than one book).
>
> Such a system does _not_ distinguish between _degrees_ of preference, i.e.,
> it does not produce an _ordering_ of preference between different objects;
> and this is the crux of my post.

Yes, it is the difference between possessing and having an interest in
something, which the model above does not account for.

> (Note that if confidence(A => B) > confidence(A => C), this does not imply
> that predicted_preference(B) > predicted_preference(C). The ordering of
> association rules (with the same condition) by confidence is not the same as
> the ordering of objects by predicted preference.)
>
> So, for my project I am looking for a way
> (1) to infer or predict the preferences of objects somehow, based on user
> input (ideally, the subject would have to input as little data as possible
> himself).
> (2) and then order all objects by descending (predicted) preference, so that
> (hopefully) the user would only have to look at the top one.
>
> For input, the system could present two objects at a time and let the
> subject choose which he prefers. The choice of the subject would reflect
> his relative preference for one of the two objects. The preference relation
> is a strict ordering relation between objects, parametrized on the subject
> and time (but let us assume that the subject's preferences do not change
> over time).
>
> O1 < O2
>    S,t

I vaguely remember a study that showed that the preference relation is not
transitive. So a person can give answers of the sort O1 < O2 < O3 < O1. The
problem is that the relation is of course fuzzy, and when the answer is
forced into a crisp Boolean, that heavily distorts the result.

> Alternatively, the input could consist of assigning a grade, or a monetary
> amount to each object in some set of objects. (This is actually just a way
> of monotonically mapping the ordering relation between objects on the
> ordering relation between numbers: value(O1) < value(O2) implies O1 < O2.)

That changes nothing.
I think that such a measure (value : object -> ordered)
simply does not exist, because preferences are not ordered in the strict
sense.

> Furthermore, if possible, I would like the subject to be able to input that
> he prefers some object (a 'maximum object') above all possible objects, and
> that he prefers some object below no object (i.e. the 'absent object').
>
> So my questions are, roughly, these: Has such a thing been done before? If
> so, could you provide me with some references to e.g. books and/or articles?
> (I have looked into 'fuzzy association rules', but as far as I can tell
> these do not meet my needs.) How should I input and represent the preference
> relation?

I think that preference has a fine structure which gets lost when mapped to
a numeric value. Minimally one should distinguish: O1 > O2 and not (O1 >
O2). This immediately leads to intuitionistic fuzzy preference values:

Pos(O1 > O2), Nec(O1 > O2)

[ Nec(O1 > O2) = 1 - Pos(not (O1 > O2)) ]

Maybe it should be split even finer. Say you have some set of basic
properties, features (typical are genre, volume, price, artwork on the
cover ..., but you could invent something less evident). Users classify
objects by features, and the objects are compared in the feature space.

[ Basically all that is no more than to find a space of object's images
where components (of the image vector) would become clearly ordered, though
the vectors themselves will still be incomparable. But at least it would more
realistically model what happens in someone's head. ]

> What algorithms can be used to predict preferences? Where can I
> find out more? Am I making any sense? ;-)

I would try to formulate it in fuzzy terms and keep it fuzzy all the way
until the final stage.

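[ A minimal sketch of the feature-space comparison described above, in
Python. The feature names, the scores in [0, 1] and the function name
compare are made up for illustration; this is just one way to read
"components ordered, vectors incomparable" (component-wise dominance),
not a worked-out method. ]

```python
from typing import Dict

# Hypothetical per-user feature scores in [0, 1]; higher means "liked more".
Features = Dict[str, float]


def compare(a: Features, b: Features) -> str:
    """Compare two objects component-wise in the feature space.

    Each single component is clearly ordered (plain numbers), but the
    vectors as a whole may be incomparable. Assumes both objects are
    scored on the same set of features.
    """
    a_better = any(a[k] > b[k] for k in a)  # a wins on at least one feature
    b_better = any(b[k] > a[k] for k in a)  # b wins on at least one feature
    if a_better and b_better:
        return "incomparable"
    if a_better:
        return ">"
    if b_better:
        return "<"
    return "="


# Invented example objects (books) with invented scores.
book1 = {"genre": 0.9, "price": 0.4, "cover": 0.7}
book2 = {"genre": 0.6, "price": 0.4, "cover": 0.5}
book3 = {"genre": 0.3, "price": 0.8, "cover": 0.5}

print(compare(book1, book2))  # book1 is at least as good everywhere
print(compare(book1, book3))  # each one wins on some feature
```

Forcing the second pair into a single "<" or ">" is exactly the step that
would lose information, so a system along these lines would report it as
incomparable and postpone any crisp decision.
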
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to