
Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 1,472 of 1,954   
   Ted Dunning to earl.dukersch...@doit.wisc.edu   
   Re: Any interest in sparse K-Map issues?   
   23 Jul 07 00:08:49   
   
   From: ted.dunning@gmail.com   
      
   On Jul 20, 7:33 pm, earl.dukersch...@doit.wisc.edu wrote:   
   > Where the two types intersect, you can seek data.  There will be   
   > cases where there are more intersections at some points than others,   
   > so you can seek those that are most likely to improve the guess.   
   >   
   > You can also detect regions of the Karnaugh map that are not covered   
   > by the possible regions of some set of data points (not necessarily   
   > those above), which is another target for data gathering.   
      
   What you are describing is what is normally called active learning.   
   It can be very effective in that the number of training cases required   
   can be decreased dramatically for a desired level of performance.  In   
   a sense, it is related to stratified sampling, but there are   
   differences in basic goals.   
      
   A more traditional statement of the basic idea in active learning is   
   that you have some space of hypothetical rules that can explain the   
   data that you have seen so far.  In order to shrink the rule set as   
   quickly as possible, you should ask the yes/no question that most   
   closely bisects the set of rules you have, because whichever answer   
   you get, the number of still-viable rules will drop by half.   
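   
   The halving argument can be made concrete with a toy sketch (the   
   threshold-rule setup below is my own illustration, not from the post):   

```python
# Toy version space: rule t (for t in 0..63) labels x positive iff x >= t.
# Each yes/no query "is x positive?" eliminates every rule that disagrees
# with the answer; querying near the median of the viable rules bisects
# the set, so ~log2(N) queries suffice to pin down the rule.

def bisect_learn(oracle, thresholds):
    """Identify the oracle's rule using ~log2(N) yes/no queries."""
    viable = sorted(thresholds)
    queries = 0
    while len(viable) > 1:
        # Query a point that splits the viable rules roughly in half:
        # rule t answers "positive" at x exactly when t <= x.
        x = viable[(len(viable) - 1) // 2]
        queries += 1
        if oracle(x):                       # answer "positive": keep t <= x
            viable = [t for t in viable if t <= x]
        else:                               # answer "negative": keep t > x
            viable = [t for t in viable if t > x]
    return viable[0], queries

# Hidden rule: x is positive iff x >= 37.  64 candidate rules fall to a
# single one in log2(64) = 6 queries.
rule, queries = bisect_learn(lambda x: x >= 37, range(64))
```

   With random queries, most questions would land where all surviving   
   rules already agree and teach you nothing, which is the exponential-   
   vs-linear gap described above.   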
      
   If you are able to do this at each step of learning, clearly you have   
   learning time that is linear in the number of bits in your rules while   
   random sampling of questions will leave you with exponential   
   convergence time because so many questions will be ones that you   
   already know the answer to.  Lots of times you can't split the space   
   of rules exactly in half on every step, though, so things aren't quite   
   as good as they might be (but it can still be dramatically better   
   than random sampling of questions).  You might also have cases where   
   asking a slightly less than optimal question now will give you a   
   better chance at a 50/50 split later.  This would make the normal   
   greedy algorithm slightly less than optimal in terms of learning time,   
   but again, the practical impact of this is dwarfed in comparison with   
   the overall gain of the method.   
      
   When the answers you get are not necessarily correct or where you have   
   probably (but not certainly) correct presuppositions about which rules   
   are likely, then you must adapt this basic framework using Bayesian   
   inference of some sort.  Essentially, what you do is ask the question   
   whose (uncertain) answer will give you the largest decrease in the   
   entropy of your rule set.  One desirable thing you can do with the   
   Bayesian framework even with always-correct answers is set a prior   
   expectation on the rule set.  For instance, in your example, you might   
   like to say that more complex formulae are less likely (or less   
   desirable) than simpler ones.  One easy way to encode this is to   
   assign a prior proportional to exp(-number of gates in formula) or   
   exp(-number of minterms in formula).   
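   
   A minimal sketch of that Bayesian variant, using a toy rule space (the   
   16 boolean functions of two inputs) and the exp(-minterms) prior from   
   the paragraph above; the noise model and all names are my own   
   assumptions, not from the post:   

```python
import math
from itertools import product

# Rules are truth tables over 4 input points; "complexity" is the number
# of minterms (1-entries) and the prior is proportional to exp(-minterms).
# Answers are noisy: the oracle flips the true output with probability EPS.
EPS = 0.1
RULES = list(product([0, 1], repeat=4))
INPUTS = range(4)

def normalize(w):
    s = sum(w.values())
    return {r: v / s for r, v in w.items()}

def entropy(p):
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

prior = normalize({r: math.exp(-sum(r)) for r in RULES})

def posterior(p, x, answer):
    # Bayes update: P(answer | rule) is 1-EPS if the rule agrees, EPS if not.
    return normalize({r: p[r] * ((1 - EPS) if r[x] == answer else EPS)
                      for r in p})

def expected_entropy_drop(p, x):
    # Expected information gain of asking about input x, averaged over
    # the two possible (noisy) answers.
    h = entropy(p)
    drop = 0.0
    for answer in (0, 1):
        pa = sum(p[r] * ((1 - EPS) if r[x] == answer else EPS) for r in p)
        drop += pa * (h - entropy(posterior(p, x, answer)))
    return drop

# Ask the question with the largest expected decrease in rule-set entropy.
best = max(INPUTS, key=lambda x: expected_entropy_drop(prior, x))
```

   The expected drop is just the mutual information between the rule and   
   the answer, so it is always nonnegative, and the greedy question is the   
   one that maximizes it.   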
      
   There are lots of approaches for building systems that more or less   
   approximate this desired goal and there are gobs of applications for   
   systems like this, especially when you are building hybrid human/   
   computer systems where you have a finite amount of human resources and   
   a much less constrained amount of computer power.  For instance, if   
   you are building a web-page classifier by yourself, you will probably   
   require human judgments to base the classifier on, but you probably   
   don't want to spend the rest of your life classifying example pages.   
   With active learning, the computer can tell you at each step which   
   classifications would help it the most.  Another example would be part   
   of speech tagging.  Tagging lots and lots and lots of sentences is   
   painfully boring and after the first hundred or so, the value of your   
   input in terms of performance of the resulting system will be   
   considerably less than at the beginning.  Letting the system identify   
   important sentences for you to tag can dramatically decrease your   
   total tagging effort (as in decrease it by three or more orders of   
   magnitude).   
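   
   The web-page-classifier scenario is often sketched as pool-based   
   uncertainty sampling: the machine picks, from a pool of unlabeled   
   examples, the one its current model is least sure about, and asks the   
   human for just that label.  The 1-D threshold setup below is a   
   hypothetical illustration of mine, not anything from the post:   

```python
# Pool-based uncertainty sampling for a 1-D threshold concept.  Assumes
# the pool brackets the true boundary (min of pool is negative, max is
# positive); the learner repeatedly asks the human "oracle" to label the
# unlabeled point nearest its current boundary estimate.

def active_threshold(pool, oracle, budget):
    labeled = {}
    lo, hi = min(pool), max(pool)
    labeled[lo] = oracle(lo)
    labeled[hi] = oracle(hi)
    for _ in range(budget):
        neg = [x for x, y in labeled.items() if y == 0]
        pos = [x for x, y in labeled.items() if y == 1]
        boundary = (max(neg) + min(pos)) / 2
        candidates = [x for x in pool if x not in labeled]
        if not candidates:
            break
        # Uncertainty sampling: the most informative point is the one
        # closest to the current decision boundary.
        x = min(candidates, key=lambda c: abs(c - boundary))
        labeled[x] = oracle(x)
    neg = [x for x, y in labeled.items() if y == 0]
    pos = [x for x, y in labeled.items() if y == 1]
    return (max(neg) + min(pos)) / 2, len(labeled)

# 101 unlabeled pages on [0, 1]; true rule: positive iff x >= 0.53.
# Nine labels (2 seeds + 7 queries) localize the boundary closely,
# where random labeling would mostly waste effort far from it.
pool = [i / 100 for i in range(101)]
boundary, n_labels = active_threshold(pool,
                                      lambda x: 1 if x >= 0.53 else 0,
                                      budget=7)
```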
      
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      



(c) 1994,  bbs@darkrealms.ca