|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
|    Message 722 of 1,954    |
|    Ted Dunning to All    |
|    Re: How can I analyse the similarity mea    |
|    14 May 05 05:58:30    |
From: ted.dunning@gmail.com

Depending on what you are doing, Pearson's correlation is likely to be
a *really* bad measure of similarity because it behaves very badly
when you are looking at small numbers of examples.

Better in many cases to use measures of anomalous association such as
G^2 (I recommended this in my 1993 paper in Computational Linguistics)
or Fisher's exact test (search for Ted Pedersen's work).

Much better than that, however, is to really analyze what you are
trying to do and put a solid probabilistic model underneath it. If
you do that, you can know how reliable your inferences are and avoid
the problems with small counts. See the work of David MacKay for
examples of the maximum evidence method. David Heckerman has a very
nice tutorial on Bayesian networks as well.

How about you say what you are trying to do?

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
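
To make the small-count problem in the post above concrete, here is a minimal sketch, not code from the post. It assumes the data can be summarized as a 2x2 table of co-occurrence counts (k11 = both events, k12 and k21 = one without the other, k22 = neither). The function names phi, g2 and fisher_exact_one_sided are made up for this illustration. For binary data, Pearson's correlation reduces to the phi coefficient, which is what the sketch compares against G^2 and a one-sided Fisher's exact test.

```python
import math


def phi(k11, k12, k21, k22):
    """Pearson correlation of two binary variables (phi coefficient)."""
    r1, r2 = k11 + k12, k21 + k22
    c1, c2 = k11 + k21, k12 + k22
    return (k11 * k22 - k12 * k21) / math.sqrt(r1 * r2 * c1 * c2)


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio statistic G^2 = 2 * sum(O * ln(O / E))."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (0 * ln 0 -> 0)
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def fisher_exact_one_sided(k11, k12, k21, k22):
    """P(top-left cell >= k11) with both margins fixed (hypergeometric)."""
    n = k11 + k12 + k21 + k22
    r1 = k11 + k12
    c1 = k11 + k21
    upper = min(r1, c1)
    tail = sum(
        math.comb(r1, k) * math.comb(n - r1, c1 - k)
        for k in range(k11, upper + 1)
    )
    return tail / math.comb(n, c1)


if __name__ == "__main__":
    # One co-occurrence versus one hundred, with no disagreements in either.
    for table in [(1, 0, 0, 10000), (100, 0, 0, 10000)]:
        print(table, phi(*table), g2(*table), fisher_exact_one_sided(*table))
```

Both tables give a phi of 1, so the correlation cannot tell a single co-occurrence from a hundred of them, while G^2 and the exact test scale with the amount of evidence. This only illustrates the association measures named in the post; it does not attempt the full probabilistic modeling recommended there.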
(c) 1994, bbs@darkrealms.ca