

   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   


   Message 1,080 of 1,954   
   Milind to widowss   
   Re: why my accuracy is so low   
   13 Jun 06 03:59:55   
   
   From: milind.a.joshi@gmail.com   
      
   widowss wrote:   
   > using naive bayes classifier to classify the public 20 Newsgroups.
   > I filter out the words that have fewer than 3 letters.
   > I stemmed the words.
   > The remaining words are the elements of the feature vectors.
   > when I chose 5 classes to be classified, the accuracy is about 45%.
   > In the book "Machine Learning", it says that this accuracy can be up to
   > 89%.
   > Can anybody tell me how to improve my classifier? Thanks a lot.
   >   
      
   Hi,   
      
   Take reported accuracy figures with a pinch of salt: the published
   result may come from a different setup (different classes,
   preprocessing, or train/test split), so your problem may not be
   directly comparable. Also, with classification, accuracy alone hides a
   lot. Per-class Precision and Recall are the numbers that matter, as
   are measures derived from them, such as the F-measure.
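
   A minimal sketch of how those per-class numbers can be computed from
   the true and predicted labels. This is only an illustration: the
   function name per_class_scores is made up here, and the labels in the
   usage note are toy values, not results from the 20 Newsgroups data.

```python
from collections import Counter

def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen.

    Hypothetical helper for illustration; not taken from any toolkit.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted, but wrongly
            fn[t] += 1          # a document of class t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # how often label was predicted
        actual = tp[label] + fn[label]      # how often label was the truth
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores
```

   For example, per_class_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
   gives class "a" a precision of 1.0 but a recall of only 0.5, which a
   single accuracy figure of 75% would not show.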
      
   You would recall that a Naive Bayes Classifier makes one strong
   assumption - namely, that the features (here, the words) are
   conditionally independent of each other given the class. It then
   estimates the posterior probability of each class from the class
   prior and the per-word likelihoods using Bayes' rule... That
   independence assumption does not hold for many classification
   problems out there, and your public newsgroup classification could be
   one of them.
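
   To make the estimate concrete, here is a minimal sketch of a
   multinomial Naive Bayes over word counts. It is my own illustration,
   not the method from the book or from any toolkit: the names train_nb
   and predict_nb are made up, add-one (Laplace) smoothing is an
   assumption, and words never seen in training are simply skipped. It
   sums log probabilities instead of multiplying raw ones, since missing
   smoothing and numeric underflow are two things worth ruling out in a
   hand-written implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """docs: list of token lists; labels: parallel list of class names."""
    class_docs = Counter(labels)            # documents per class (for priors)
    word_counts = defaultdict(Counter)      # word frequencies per class
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # log prior: fraction of training documents in this class
        score = math.log(class_docs[label] / n_docs)
        total = sum(word_counts[label].values())
        for w in tokens:
            if w not in vocab:
                continue                    # assumption: ignore unseen words
            # add-one smoothing keeps unseen (word, class) pairs above zero
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

   Usage on toy data: train on a handful of labelled token lists with
   model = train_nb(docs, labels), then call predict_nb(model, tokens)
   for each test document and compare against the known labels.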
      
   Also, the overlap between the 5 classes you chose might be high,   
   causing some problems... Generally, if the classes have a clear   
   boundary that distinctly separates them, our job is easier, but what if   
   the boundary is hazy, and there are elements that belong to 2 or more   
   classes?   
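
   One way to check for that kind of overlap is to count which pairs of
   classes get mixed up most often. A minimal sketch, assuming you have
   the true and predicted labels as two parallel lists; the function
   names confusion_counts and most_confused are made up for this
   illustration.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(y_true, y_pred))

def most_confused(y_true, y_pred, top=3):
    """Return the most frequent wrong (true, predicted) pairs."""
    errors = Counter({
        pair: n
        for pair, n in confusion_counts(y_true, y_pred).items()
        if pair[0] != pair[1]               # keep only misclassifications
    })
    return errors.most_common(top)
```

   If most of the errors fall on one or two pairs of classes, overlap
   between those classes is a likely culprit. If the errors are spread
   evenly over all pairs, I would suspect the features or a bug first.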
      
   Besides, as with any other computer program, your program could have   
   bugs!   
      
   Why don't you try a comparison against some existing toolkits, like

   JBNC, http://jbnc.sourceforge.net/
      
   Andrew McCallum's BOW http://www.cs.cmu.edu/~mccallum/bow/   
      
   If you have access to MATLAB, there are some toolkits in there that you   
   could use too.   
      
   Regards,   
   Milind Joshi   
   IDEA TECHNOSOFT INC.   
   http://www.ideatechnosoft.com   
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      



(c) 1994,  bbs@darkrealms.ca