

   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   


   Message 572 of 1,954   
   Ted Dunning to All   
   Re: What is a "Score" in Datamining   
   24 Jan 05 20:26:11   
   
   From: tdunning@san.rr.com   
      
   My suggestion will sound harsh and probably won't help you in time for   
   exams, but the real key is to think about data mining not as an   
   academic course, but in terms of something that might make sense in   
   your life.  At the least, you need to try to think about what is really   
   happening in the whole field.   
      
   To this end, I would recommend that you set aside all of your books for
   a moment and write down what you think data mining is trying to do.
      
   Then try to write down a statement of some classic data mining problem.
   Try to imagine how you would attack this problem if you hadn't already
   seen the solution.  For example, imagine the problem of trying to
   predict what grade you might get in your course.  Or how would you tell
   who in your class will be the next to have their car break down?  Or
   could you predict somebody's year of birth given their first name?  For
   any of these problems, you should start with a picture that shows what
   goes in and what comes out.
      
   If you think about a problem like this, especially a particularly
   simple one such as a binary decision, then presumably the output is
   something that represents your estimate.  In practice, when you are
   estimating a binary value, it is usually better to give an output that
   indicates just how strong your prediction is rather than a bare yes or
   no.  For one thing, a graded output lets you rank cases and decide
   later where to draw the line.
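
   One of those reasons can be sketched in a few lines of Python.  The
   (score, label) pairs below are invented purely for illustration, and
   precision_recall is a hypothetical helper, not anything from a library.
   The point is only that a graded output lets you choose the cutoff
   afterwards and trade false alarms against misses, which a bare yes/no
   output would not allow.

```python
# Toy data, made up for this sketch: (score, true_label) pairs,
# where label 1 means "this really is a case we are searching for".
cases = [(0.95, 1), (0.80, 1), (0.60, 0), (0.55, 1), (0.30, 0), (0.10, 0)]


def precision_recall(cases, threshold):
    """Flag every case scoring at or above the threshold, then measure
    how many flagged cases are real (precision) and how many real cases
    were flagged (recall)."""
    flagged = [label for score, label in cases if score >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(label for _, label in cases)
    precision = true_pos / len(flagged) if flagged else 1.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# The same scores support several different decisions.
for t in (0.9, 0.5, 0.2):
    p, r = precision_recall(cases, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

   A strict cutoff flags few cases but is rarely wrong; a loose one
   catches everything at the cost of more false alarms.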
      
   When you build a machine that predicts a binary outcome but produces a
   continuous value, that continuous output is often called a score.  The
   term really doesn't have much semantic or etymological significance
   other than the idea that when the score is high, you win by finding
   what you are searching for (credit-worthy applicant, fraud case,
   whatever).
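
   As a rough illustration, here is a minimal Python sketch of such a
   machine for the fraud example.  Everything in it is an assumption made
   for the sketch: the feature names, the weights, the bias, and the
   choice of a logistic function to squash a weighted sum into a value
   between 0 and 1.  Real scoring models are fit to data and may look
   quite different.

```python
import math

# Hypothetical, hand-picked weights; a real model would learn these.
WEIGHTS = {"large_amount": 1.5, "foreign_merchant": 1.0, "known_device": -2.0}
BIAS = -1.0


def score(features):
    """Return a continuous value in (0, 1); higher means more fraud-like."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Made-up transactions described by 0/1 features.
transactions = {
    "t1": {"large_amount": 1, "foreign_merchant": 1, "known_device": 0},
    "t2": {"large_amount": 0, "foreign_merchant": 0, "known_device": 1},
    "t3": {"large_amount": 1, "foreign_merchant": 0, "known_device": 1},
}

# The score orders the cases from most to least suspicious.
ranked = sorted(transactions, key=lambda t: score(transactions[t]), reverse=True)

# The binary decision is just the score compared against a cutoff.
flagged = [t for t in ranked if score(transactions[t]) >= 0.5]

print(ranked)
print(flagged)
```

   The binary answer falls out of the score at the very end, so the
   cutoff can be changed without rebuilding anything.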
      
   Does this help?   
      
      


