... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"

comp.ai

Awaiting the gospel from Sarah Connor

1,954 messages

[ << oldest | < older | list | newer > | newest >> ]

Message 35 of 1,954

Tom Osborn to Ke Yin

Re: Quinlan info gain theory - multi-cla

13 Aug 03 07:24:59

   From: xyz@xyz.com   
      
   "Ke Yin"  wrote in message news:bh28n3$99v$   
   @mulga.cs.mu.OZ.AU...   
   > roi_noa@yahoo.com (Dan) wrote in message news:...   
   > > Hello,   
   > > I read Quinlan's research (ID3,C4.5) and would like to ask a question:   
   > >   
   > > He writes:   
   > > "To simplify the following treatment, we will assume that there are   
   > > only two such classes denoted P and N, although the extension to any   
   > > number of classes is not difficult."   
   > >   
   > > When dealing with K classes, is the way the probability (entropy)   
   > > claculated is the only change in the equations?   
   > >   
   > > i.e. instead of   
   > > E(A)=-[p/(p+n)]*log2[p/(p+n)] - [n/(p+n)]*log2[n/(p+n)]   
   > > should be   
   > >   
   > > E(A)=-sum {[Ci/(C1+C2+...+Ck)]*log2[Ci/(C1+C2+...+Ck)]}   
   > >      i=1->k   
   > >   
   > > Should log2 be changed to logk?   
   > >   
   > > Are there any articles (or books) dealing with info-gain algorithm in   
   > > a multi-class space?   
   > >   
   > > Thanks in advance,   
   > > Dan.   
   >   
   > Dan,   
   >   
   > What you wrote was the correct maths to calculate information gain in   
   > when there are multiple classes. As to the base of the logarithm, it   
   > doesn't matter, because that only indicates the unit that information   
   > is measured in. Typically the information measured with log2 is in   
   > bits and that measured by loge is in nits.   
      
      
   Well, almost right. There needs to be a correction for the uncertainty   
   from sampling error. Ie, avoiding bias when the counts are low.   
      
   In the 2-way case, E(A) becomes:   
      
    E(A)=-[(p+1)/(p+n+2)]*log2[(p+1)/(p+n+2)] - [(n+1)/(p+n+2)]*log   
   [(n+1)/(p+n+2)]   
      
   This assumes the uninformed case (initial prior) P(p) = P(n).   
      
   For k-way,   
      
   E(A)=-sum {[(Ci+1)/(C1+C2+...+Ck+k)]*log2[(Ci+1)/(C1+C2+...+Ck+k)]}   
                   i=1->k   
      
   In some later work, Ross Quinlan set a stopping condition on smaller nodes and   
   used neural nets to estimate from there ("down").   
      
   Tom.   
      
   [ comp.ai is moderated.  To submit, just post and be patient, or if ]   
   [ that fails mail your article to , and ]   
   [ ask your news administrator to fix the problems with your system. ]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)

[ << oldest | < older | list | newer > | newest >> ]