Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
|    Message 683 of 1,954    |
|    Sebastian Stern to All    |
|    Re: Data Mining of Preference Orderings     |
|    03 Apr 05 02:48:27    |
From: sebastianstern@wanadoo.nl

Dmitry A. Kazakov:
| Sebastian Stern:
| > For a project of mine, I have recently become interested in Data
| > Mining. One common technique in Data Mining (used by online book
| > stores, for example) is the mining of Association Rules. Each
| > rule has the form
| >
| > A => B
| >
| > where A and B are sets of objects (e.g., books), and each rule
| > can be interpreted as stating "The possession of A implies the
| > possession of B".
|
| Shouldn't it be possession of A implies interest in B? Clearly,
| possession /= interest.

No, it really should be "the possession of A implies the possession of B".
Association rules are defined in terms of possession; it is then assumed
that possession means interest, and absence of possession means absence of
interest. This binary modelling of interest is too coarse-grained for my
purposes. Possession is indeed not the same as the _degree_ of interest;
that is the whole reason for me writing my post: I am looking for a way to
predict '_degree_ of interest', not the _probability_ of a discrete
'yes/no'-kind of interest (as ordinary association rule systems do).

| > The 'degree of confidence' in a rule is defined as the conditional
| > probability that a subject (user) is interested in objects B under
| > the condition that the subject already possesses objects A. This
| > confidence is thus computed using the familiar formula for
| > conditional probabilities:
| >
| > confidence(A => B) := P(B | A) := P(A and B) / P(A)
|
| If A is a set of books, then P(A) is a probability of what? Are books
| random? Maybe P(A) = P(User has A)? Is this random?

P(A) is the so-called 'degree of support' of an object set: it is the
frequency with which the set occurs in the database, i.e., the number of
subjects owning the object set A, i.e., P(User has A). (Note that using
absolute or relative frequencies does not change the resulting confidence,
since it is a ratio of two such frequencies.)

| > Such a system does _not_ distinguish between _degrees_ of preference,
| > i.e., it does not produce an _ordering_ of preference between
| > different objects; and this is the crux of my post.
|
| Yes, it is the difference between possessing and having an interest in
| something, which the model above does not respond to.

Precisely, and I am looking for a way to predict (degree of) interest.

| > For input, the system could present two objects at a time and let the
| > subject choose which he prefers. The choice of the subject would
| > reflect his relative preference for one of the two objects. The
| > preference relation is a strict ordering relation between objects,
| > parametrized on the subject and time (but let us assume that the
| > subject's preferences do not change over time).
| >
| > O1 < O2
| >    S,t
|
| I vaguely remember a study that showed the preference relation is not
| transitive. So a person can give this sort of answer: O1 < O2 < O3 < O1.
| The problem is that the relation is of course fuzzy, and when the answer
| is forced to a definite Boolean, that heavily distorts the result.

Yes, the preference ordering relation may not always _appear_ to be
transitive, but this is not due to fuzziness; it is due to the fact that
preference may change over time (that is why I included a time parameter,
and the parenthetical remark that it should be ignored).
Your example of inconsistent preference ordering can be resolved as follows:

The subject inputs the following preferences:

  O1 < O2
     S,t1

  O2 < O3
     S,t2

At this point everything is consistent. When the user inputs

  O3 < O1
     S,t3

an inconsistency occurs. This means the subject's preferences have changed,
so the system simply throws away or ignores the oldest inputs that are
inconsistent with the newest ones, so the remaining set becomes:

  O2 < O3
     S,t2

  O3 < O1
     S,t3

(If a subject cannot choose between two objects, what he really does is
choose the 'absent object' (see initial post), the representation of 'no
object chosen'.)

Given this method of resolving inconsistencies, the preference relation can
_always_ be made strict. That is why you should ignore the time parameter,
and assume the ordering relation is strict. This is really not important
for my request.

| > Alternatively, the input could consist of assigning a grade, or a
| > monetary amount to each object in some set of objects. (This is
| > actually just a way of monotonically mapping the ordering relation
| > between objects on the ordering relation between numbers:
| > value(O1) < value(O2) implies O1 < O2.)
|
| That changes nothing. I think that such a measure (value : object ->
| ordered) simply does not exist, because preferences are not ordered
| in the strict sense.

See above. Because the ordering relation _is_ strict, it can always be
monotonically mapped onto numbers.

| > So my questions are, roughly, these: Has such a thing been done
| > before? If so, could you provide me with some references to e.g.
| > books and/or articles?
| > (I have looked into 'fuzzy association rules',
| > but as far as I can tell these do not meet my needs.) How should I
| > input and represent the preference relation?
|
| I think that preference has a fine structure which gets lost when
| mapped to a numeric value. Minimally one should distinguish: O1 > O2
| and not (O1 > O2). This immediately leads to intuitionistic fuzzy
| preference values:
|
| Pos(O1 > O2), Nec(O1 > O2)
|
| [ Nec(O1 > O2) = 1 - Pos(not (O1 > O2)) ]
|
| Maybe it should be split even finer. Say you have some set of basic
| properties, features (typical are genre, volume, price, artwork on the
| cover ..., but you could invent something less evident). Users classify
| objects into features and they are compared in the feature space.

Again, given the fact that preference is indeed strict at any given moment
in time, the use of intuitionistic fuzzy preference values needlessly
complicates things. Please assume the ordering relation is strict.

| [ Basically all that is no more than to find a space of object's images
| where components (of the image vector) would become clearly ordered.
| Though the vectors themselves will still be incomparable. But at least it
| would more realistically model what happens in someone's head. ]

See above.

| > What algorithms can be used to predict preferences? Where can I
| > find out more? Am I making any sense? ;-)
|
| I would try to formulate it in fuzzy terms and keep it fuzzy all the way
| until the final stage.

What fuzzy association rules do is divide a continuous value into discrete
yet fuzzy steps, so e.g.
the continuous value 'length' which previously
ranged over real numbers from 1 to 10 can now assume the ternary values

[continued in next message]

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
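
A minimal sketch, in Python, of the oldest-first conflict resolution described in this message (the O1/O2/O3 example). The names `Pref`, `find_path` and `add_preference` are made up for illustration and do not come from the post. Dropping only the oldest preference on each conflicting chain is one possible reading of "throws away or ignores the oldest inputs that are inconsistent with the newest ones"; other readings are possible.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (lesser, greater, timestamp) means
# "lesser < greater" as entered by one subject at the given time.
Pref = Tuple[str, str, int]


def find_path(prefs: List[Pref], start: str, goal: str) -> Optional[List[Pref]]:
    """Return a chain of stored preferences start < ... < goal, or None."""
    stack = [(start, [])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for p in prefs:
            if p[0] == node and p[1] not in seen:
                seen.add(p[1])
                stack.append((p[1], path + [p]))
    return None


def add_preference(prefs: List[Pref], new: Pref) -> List[Pref]:
    """Add `new`, discarding older preferences that contradict it."""
    lo, hi, _ = new
    if lo == hi:
        raise ValueError("a strict preference needs two different objects")
    kept = list(prefs)
    # `new` states lo < hi. It conflicts with the stored data whenever the
    # stored data already implies hi < ... < lo. Break every such chain by
    # removing its oldest link (assumption: oldest means smallest timestamp).
    while True:
        path = find_path(kept, hi, lo)
        if path is None:
            break
        kept.remove(min(path, key=lambda p: p[2]))
    kept.append(new)
    return kept


# The example from the message: the third input contradicts the first two.
history: List[Pref] = []
for pref in [("O1", "O2", 1), ("O2", "O3", 2), ("O3", "O1", 3)]:
    history = add_preference(history, pref)
print(history)  # [('O2', 'O3', 2), ('O3', 'O1', 3)]
```

After the third input the stored set is O2 < O3 (t2) and O3 < O1 (t3), which matches the remaining set given in the message. Preferences that are not part of any conflicting chain are left alone.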
(c) 1994, bbs@darkrealms.ca