   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   

   Message 1,783 of 1,954   
   dominicpouzin@gmail.com to All   
   Use intrinsic information for numeric sp   
   11 Jul 08 13:12:18   
   
   I am writing a C4.5-type decision tree algorithm and wondering if I   
   should account for the intrinsic information when splitting over a   
   numeric attribute.   
      
   Consider the following sample:   
   A1    A2
   0     play
   3     no play
   5     play       <- split just above this
   9     play
   15    no play
      
   To evaluate candidate splits at different values, one can calculate the
   information gain and the gain ratio (i.e. information gain / intrinsic
   information). Then the best split point must be selected.
      
   For example, when splitting just above A1 = 5, we have:
   info = 2/5 * info([1,1]) + 3/5 * info([2,1])
      // above split = (1 play + 1 no play), below split = (2 play + 1 no play)
   intrinsic_info = info([2,3])
      // above split = 2 rows, below split = 3 rows
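   The calculation above can be sketched in Python as follows. This is only
   an illustration: the helper name info and the variable names are
   assumptions, and the gain is taken to be the information of the whole
   sample, info([3,2]), minus the weighted information after the split.

```python
from math import log2

def info(counts):
    """Entropy in bits of a list of class counts, e.g. info([2, 1])."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

# Class counts on each side of the split just above the row A1 = 5.
above = [1, 1]   # rows A1 = 0, 3      -> 1 play, 1 no play
below = [2, 1]   # rows A1 = 5, 9, 15  -> 2 play, 1 no play
n = sum(above) + sum(below)

info_before = info([3, 2])   # whole sample: 3 play, 2 no play
info_after = sum(above) / n * info(above) + sum(below) / n * info(below)
gain = info_before - info_after

# Intrinsic information looks only at the branch sizes, here [2, 3].
intrinsic_info = info([sum(above), sum(below)])
gain_ratio = gain / intrinsic_info

print(f"info after split = {info_after:.4f}")
print(f"gain             = {gain:.4f}")
print(f"intrinsic info   = {intrinsic_info:.4f}")
print(f"gain ratio       = {gain_ratio:.4f}")
```

   For this split the gain is about 0.02 bits and the intrinsic information
   about 0.97 bits, so the gain ratio is also about 0.02.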
      
   Given that a numeric split always yields exactly 2 branches, I am
   wondering how important it is to incorporate the intrinsic information
   (that is, to rank candidate split points by gain ratio rather than by
   information gain alone) when deciding which split point is best.
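   One way to explore the question is to score every candidate threshold
   with both criteria and compare the rankings. The sketch below does this
   for the sample; the function names (info, class_counts, evaluate_splits)
   are hypothetical, and placing thresholds at midpoints between adjacent
   values is an assumption, not something the question fixes.

```python
from math import log2

def info(counts):
    """Entropy in bits of a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def class_counts(labels):
    """Count how often each distinct label occurs."""
    return [labels.count(v) for v in sorted(set(labels))]

def evaluate_splits(values, labels):
    """Return (threshold, gain, gain_ratio) for each midpoint between
    adjacent distinct values of the numeric attribute."""
    rows = sorted(zip(values, labels))
    n = len(rows)
    base = info(class_counts([lab for _, lab in rows]))
    results = []
    for i in range(1, n):
        if rows[i - 1][0] == rows[i][0]:
            continue  # no threshold fits between equal values
        threshold = (rows[i - 1][0] + rows[i][0]) / 2
        left = [lab for _, lab in rows[:i]]
        right = [lab for _, lab in rows[i:]]
        after = (len(left) / n * info(class_counts(left))
                 + len(right) / n * info(class_counts(right)))
        gain = base - after
        split_info = info([len(left), len(right)])  # intrinsic information
        results.append((threshold, gain, gain / split_info))
    return results

a1 = [0, 3, 5, 9, 15]
a2 = ["play", "no play", "play", "play", "no play"]

results = evaluate_splits(a1, a2)
for threshold, gain, ratio in results:
    print(f"A1 < {threshold:4.1f}  gain = {gain:.4f}  gain ratio = {ratio:.4f}")

best_by_gain = max(results, key=lambda r: r[1])[0]
best_by_ratio = max(results, key=lambda r: r[2])[0]
print("best by gain:", best_by_gain, " best by gain ratio:", best_by_ratio)
```

   On this sample both criteria pick the same threshold (12.0, between
   A1 = 9 and A1 = 15). Even with only 2 branches the intrinsic information
   is not constant: it depends on the branch sizes (about 0.72 bits for a
   1/4 split versus 0.97 bits for a 2/3 split), so on other data it could
   change the ranking.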
      
   Thoughts?   
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
