XPost: comp.ai.nat-lang   
   From: ted.dunning@gmail.com   
      
   On Apr 14, 5:28 am, "Dmitry A. Kazakov"    
   wrote:   
      
   > > When the information is known to be there, I can build a confidence   
   > > rating for whether the text is correct as output by the software (a   
   > > text analyzer or information extraction system).   
   >   
   > > But when I can't find the information, reporting that and reporting   
   > > confidence about that has been driving me a little crazier than usual.   
   > In short the problem you have is in the meaning of confidence and   
   > application of the excluded middle law.   
   >   
   > What is 95%? You need a model of. It cannot be probability because the   
   > document is obviously not random (neither the way of its processing is). So   
   > you turned to confidence factors which is all OK, but confidence should   
   > have a meaning too.   
      
   The statement "It cannot be probability ..." is essentially a   
   tautology. It should read, "We cannot use the word probability to   
   describe our state of knowledge because we have implicitly accepted   
   the assumption that probability cannot be used to describe our state   
   of knowledge".   
      
   An object constructed in its present state by non-random processes
   outside our ken is, as far as we can tell, no different from one
   constructed at random (note that random does not equal uniform).
      
   Take the canonical and over-worked example of the coin being flipped.
   Before the coin is flipped, a reasonable observer who knows the physics
   of the situation and who trusts the flipper would declare the
   probability of heads to be 50%. After the coin is flipped, but
   before it is revealed, the situation is actually no different. Yes,
   the coin now has a state whereas before the coin was only going to
   have a state, but the only real difference is that the physics has
   become somewhat simpler. The most important factor in our answering
   the question of the probability has not changed: we still do not
   know the outcome.
      
   Moreover, if the person flipping the coin looks at the coin, that does   
   not and cannot change our answer.   
      
   When WE look at the coin, however, we suddenly, miraculously
   declare that the probability is 100% that the coin has come up
   heads. Nothing has changed physically, but our estimate has changed
   dramatically.
      
   Moreover, if we now examine the coin and find that it has two heads,
   our previous answer of 50% is still valid in the original context. If
   we were to repeat the experiment, the correct answer would be to
   give 100% as the probability before the flip. The only difference is
   our state of knowledge.
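
   A minimal sketch of this point in Python, treating a state of
   knowledge as a set of weights over possible coin biases. The names
   (predictive_heads, update, trusting, inspected, skeptic) and the
   skeptic's 99-to-1 prior are hypothetical, chosen only for
   illustration; nothing here comes from a particular library beyond
   the standard fractions module.

```python
from fractions import Fraction

def predictive_heads(beliefs):
    """P(heads on the next flip) = sum over biases of P(bias) * bias.

    `beliefs` maps a coin bias (probability of heads) to the weight
    the observer currently gives that bias. Weights must sum to 1.
    """
    return sum(weight * bias for bias, weight in beliefs.items())

def update(beliefs, saw_heads):
    """Bayes' rule: reweight each bias by the likelihood of one observed flip."""
    unnormalized = {
        bias: weight * (bias if saw_heads else 1 - bias)
        for bias, weight in beliefs.items()
    }
    total = sum(unnormalized.values())
    return {bias: w / total for bias, w in unnormalized.items()}

half, one = Fraction(1, 2), Fraction(1)

# Observer who trusts the flipper: all weight on a fair coin.
trusting = {half: one}

# Same observer after inspecting the coin and finding two heads.
inspected = {one: one}

# Hypothetical skeptic: 99% fair coin, 1% two-headed coin (assumed numbers).
skeptic = {half: Fraction(99, 100), one: Fraction(1, 100)}

print(predictive_heads(trusting))   # 1/2
print(predictive_heads(inspected))  # 1
print(predictive_heads(update(skeptic, saw_heads=True)))
```

   The coin is the same object in all three cases. Only the weights,
   that is, the state of knowledge, differ, and the stated probability
   follows from them.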
      
   So philosophically speaking, probability is a statement of knowledge.   
      
   Moreover, by de Finetti's famous theorem, even if this philosophical
   argument is bogus, the mathematics all works out AS IF there were an
   underlying distribution on the parameters of the system. That means
   that we can profitably use this philosophical argument AS IF it were
   true.
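
   For reference, the binary form of the theorem can be written as
   follows. This is a sketch of the standard statement, and it assumes
   an infinite exchangeable sequence of 0/1 outcomes (exchangeable
   meaning that the joint probability does not depend on the order of
   the outcomes). The symbols X_i, x_i, s, theta and mu are notation
   introduced here, not taken from the discussion above.

```latex
% de Finetti's representation for an infinite exchangeable 0/1 sequence
\[
  P(X_1 = x_1, \ldots, X_n = x_n)
    = \int_0^1 \theta^{s} (1 - \theta)^{n - s} \, d\mu(\theta),
  \qquad s = \sum_{i=1}^{n} x_i ,
\]
% for every n, with \mu a probability measure on [0, 1].
```

   The measure mu plays the role of the "underlying distribution on
   the parameters": the observations behave like independent flips of
   a coin whose bias theta was drawn from mu.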
      
   The upshot is that even if you are a frequentist in your heart of   
   hearts, it will still pay to behave as if you were a Bayesian. And I,   
   as a Bayesian, will be able to behave as if you were rational because   
   I will not know your secret.   
      
   So let's just call it a probability and be done with it. You can keep
   your secret and I won't tell anybody.   
      
      