Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 634 of 1,954    |
|    Ted Dunning to All    |
|    Re: Logistic Regression Details and Pseu    |
|    28 Feb 05 05:05:54    |
      From: ted.dunning@gmail.com

Octave has an implementation of logistic regression. R has a particularly powerful implementation. So would almost any statistical package. I have a version in Java that I could send to anybody who is interested. The dearth of code for logistic regression explicitly is probably due to the fact that all of the hot-shots like to implement everything in the framework of generalized linear models, of which logistic regression is just a special case. Thus a search for logistic regression specifically might not turn anything up.

Bare logistic regression by itself is pretty uninteresting for building models, for the reasons that Phil mentions. Where it gets useful is when you attach good regularization, either in the form of variable selection (SAS does stepwise, forward and backward selection methods) or in the form of constraints on the magnitude of coefficients (a minimum sum-squared-deviation penalty is common, but more clever schemes are very useful). Generally in my work I tend to use both methods, for different reasons.

The common reason for naive logistic regression algorithms to fail is separability. You might think that having separable training examples is good, but it actually can cause problems. In the case of logistic regression with separable inputs, the "optimum" set of weights has unbounded magnitude. Of course, the likelihood of separability goes up dramatically as the dimension goes up. A little bit of penalty for large weights cures this tendency with no noticeable side-effects. I recommend always adding a very small bit of regularization, even if you would otherwise not have any at all, since it avoids so many numerical problems and usually accelerates convergence as well.
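[Editor's note: the separability point above is easy to demonstrate numerically. Below is a minimal sketch in plain NumPy, not the Java/R/Octave code Ted mentions; the toy data, step size, and penalty strength are all made up for illustration. With perfectly separable inputs, plain gradient descent on the log-likelihood keeps pushing the weight outward, while a small sum-of-squares penalty pins it to a finite value.]

```python
import numpy as np

def fit_logistic(X, y, l2=0.0, steps=20000, lr=0.1):
    """Gradient descent on the negative log-likelihood of logistic
    regression, optionally with an L2 (sum-of-squared-weights) penalty."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
        grad = X.T @ (p - y) / len(y) + l2 * w     # NLL gradient + penalty
        w -= lr * grad
    return w

# A perfectly separable toy problem: x < 0 -> class 0, x > 0 -> class 1.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# Without a penalty the "optimum" is at infinity, so the weight just
# keeps climbing; a tiny penalty (0.01 here, an arbitrary choice)
# gives a finite, well-defined solution.
w_bare = fit_logistic(X, y, l2=0.0)
w_reg = fit_logistic(X, y, l2=0.01)
print(w_bare, w_reg)  # w_bare is larger, and still growing with more steps
```

Running the unpenalized fit longer only makes w_bare bigger, which is exactly the unbounded-magnitude behavior described above.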
The likely reason that Phil has had good luck with using a back-prop training algorithm is that there is often an inherent bias against large weights due to (possibly inadvertent) early stopping. Early stopping is essentially equivalent to a large weight penalty, especially if you start with very small weights. Since logistic regression training algorithms know that there really is an optimum (way out by infinity), they head for the stars in seven-league boots. Back-prop, on the other hand, clunks along making small steps in generally the same stellar direction, but performance on the training set begins to look pretty flat after a bit and so things get stopped early.

I tend to prefer making the regularization explicit, though, since early stopping really limits the flexibility in terms of what you can regularize.

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
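[Editor's note: the early-stopping-as-implicit-penalty observation can be seen in the same kind of toy setting. A minimal NumPy sketch, with arbitrary made-up data and step counts: starting from zero weights on separable data, halting small-step gradient descent early leaves a modest weight much like an explicitly penalized fit would, while letting it run lets the weight drift on toward the optimum "way out by infinity".]

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Separable toy data: x < 0 -> class 0, x > 0 -> class 1.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(1)          # start with very small (zero) weights
snapshots = {}
for step in range(1, 20001):
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y) / len(y)   # no explicit penalty at all
    if step in (100, 20000):
        snapshots[step] = w.copy()

# Stopping at step 100 leaves a modest weight; running 200x longer
# lets the weight keep growing toward the separable "optimum".
print(snapshots[100], snapshots[20000])
```

The weight after 100 steps plays the role of a regularized solution, which is the sense in which (possibly inadvertent) early stopping acts like a weight penalty.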
(c) 1994, bbs@darkrealms.ca