Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 719 of 1,954    |
|    Ted Dunning to All    |
|    Re: Benchmarks for relevance factor    |
|    05 May 05 05:19:29    |
From: ted.dunning@gmail.com

There are lots of ways to handle this kind of data.

One way is to convert your probabilistic data (with confidence factors) into a deterministic data set by running through your original data many times and producing new training examples whose target variable is sampled with the indicated probability. This assumes that your confidence factors really are probabilities, which may be too big a leap.

Another way to handle data with probabilistic training examples is to note that most training algorithms are maximum likelihood estimators, maximizing the probability of the training data given the model parameters (i.e. \hat \theta = argmax_\theta p(Y | \theta)). For independent training examples, this is just

    \hat \theta = argmax_\theta log p(Y | \theta)
                = argmax_\theta sum_i y_i log p(y_i = 1 | \theta)
                                + (1 - y_i) log (1 - p(y_i = 1 | \theta))

Normally the target variables y_i are binary for supervised training, and because of this various other formulations that are equivalent to the expression above are used to simplify later derivations. If you use this expression directly, however, nothing says that the y_i have to be binary. In fact, there is a fairly well-known theorem, often attributed to Gibbs, which says that for two distributions p and q, sum_i p_i log q_i attains its maximum over q iff q = p. But if p is the true probability of something, then sum_i p_i log q_i is just the expected value of log q, which explains why the maximum likelihood estimator converges asymptotically to the actual probability distribution.

If this is a bit terse, please just say so. The details are easy to provide.

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
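The resampling idea in the message can be sketched in Python. The (features, probability) input format and the function name are illustrative assumptions, not anything from the original post:

```python
import random

def expand_probabilistic(examples, n_copies=10, seed=0):
    # Each example is (features, p), where p is the stated probability
    # that the target is 1. Emit n_copies hard-labeled copies per input,
    # sampling each label with probability p.
    rng = random.Random(seed)
    out = []
    for features, p in examples:
        for _ in range(n_copies):
            out.append((features, 1 if rng.random() < p else 0))
    return out
```

Any ordinary binary classifier can then be trained on the expanded set; with enough copies the observed label frequencies approach the stated probabilities.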
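The log-likelihood expression above can also be evaluated directly with fractional targets; a minimal sketch (the function name is mine, and q_i stands for the model's predicted probability p(y_i = 1 | theta)):

```python
import math

def soft_log_likelihood(y, q):
    # sum_i y_i log q_i + (1 - y_i) log(1 - q_i),
    # where y_i in [0, 1] may be fractional (a confidence factor)
    # and q_i is the model's predicted probability that y_i = 1.
    return sum(yi * math.log(qi) + (1 - yi) * math.log(1 - qi)
               for yi, qi in zip(y, q))
```

Maximizing this per example puts the optimum at q_i = y_i, so a gradient-based trainer can consume the confidence factors unchanged.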
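The Gibbs remark can be checked numerically: for a fixed distribution p, sum_i p_i log q_i never exceeds sum_i p_i log p_i. A small sketch with made-up distributions:

```python
import math

def cross_term(p, q):
    # sum_i p_i log q_i, skipping terms where p_i == 0
    return sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
for q in ([0.4, 0.4, 0.2], [0.6, 0.2, 0.2], [1/3, 1/3, 1/3]):
    assert cross_term(p, q) <= cross_term(p, p)
```

Equality holds only at q = p, which is why maximizing the expected log-likelihood recovers the true distribution.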
(c) 1994, bbs@darkrealms.ca