Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 634 of 1,954    |
|    Ted Dunning to All    |
|    Re: Logistic Regression Details and Pseu    |
|    28 Feb 05 05:05:54    |
      From: ted.dunning@gmail.com

Octave has an implementation of logistic regression. R has a particularly powerful implementation. So would almost any statistical package. I have a version in Java that I could send to anybody who is interested. The dearth of code for logistic regression explicitly is probably due to the fact that all of the hot-shots like to implement everything in the framework of generalized linear models, of which logistic regression is just a special case. Thus a search for logistic regression specifically might not turn anything up.

Bare logistic regression by itself is pretty uninteresting for building models, for the reasons that Phil mentions. Where it gets useful is when you attach good regularization, either in the form of variable selection (SAS does stepwise, forward and backward selection methods) or in the form of constraints on the magnitude of coefficients (a minimum sum-squared-deviation penalty is common, but more clever schemes are very useful). Generally in my work I tend to use both methods, for different reasons.

The common reason for naive logistic regression algorithms to fail is separability. You might think that having separable training examples is good, but it actually can cause problems. In the case of logistic regression with separable inputs, the "optimum" set of weights has unbounded magnitude. Of course, the likelihood of separability goes up dramatically as the dimension goes up. A little bit of penalty for large weights cures this tendency with no noticeable side-effects. I recommend always adding a very small bit of regularization, even if you would otherwise not have any at all, since it avoids so many numerical problems and usually accelerates convergence as well.
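[Editor's note: the separability point above is easy to demonstrate numerically. Below is a minimal sketch in plain NumPy, not the Java/R/Octave code Ted mentions; the toy data, step size, and penalty strength are all made up for illustration. With perfectly separable inputs, plain gradient descent on the log-likelihood keeps pushing the weight outward, while a small sum-of-squares penalty pins it to a finite value.]

```python
import numpy as np

def fit_logistic(X, y, l2=0.0, steps=20000, lr=0.1):
    """Gradient descent on the negative log-likelihood of logistic
    regression, optionally with an L2 (sum-of-squared-weights) penalty."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
        grad = X.T @ (p - y) / len(y) + l2 * w     # NLL gradient + penalty
        w -= lr * grad
    return w

# A perfectly separable toy problem: x < 0 -> class 0, x > 0 -> class 1.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# Without a penalty the "optimum" is at infinity, so the weight just
# keeps climbing; a tiny penalty (0.01 here, an arbitrary choice)
# gives a finite, well-defined solution.
w_bare = fit_logistic(X, y, l2=0.0)
w_reg = fit_logistic(X, y, l2=0.01)
print(w_bare, w_reg)  # w_bare is larger, and still growing with more steps
```

Running the unpenalized fit longer only makes w_bare bigger, which is exactly the unbounded-magnitude behavior described above.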
The likely reason that Phil has had good luck with using a back-prop training algorithm is that there is often an inherent bias against large weights due to (possibly inadvertent) early stopping. Early stopping is essentially equivalent to a large weight penalty, especially if you start with very small weights. Since logistic regression training algorithms know that there really is an optimum (way out by infinity), they head for the stars in seven-league boots. Back-prop, on the other hand, clunks along making small steps in generally the same stellar direction, but performance on the training set begins to look pretty flat after a bit and so things get stopped early.

I tend to prefer making the regularization explicit, though, since early stopping really limits the flexibility in terms of what you can regularize.

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
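[Editor's note: the early-stopping-as-implicit-penalty observation can be seen in the same kind of toy setting. A minimal NumPy sketch, with arbitrary made-up data and step counts: starting from zero weights on separable data, halting small-step gradient descent early leaves a modest weight much like an explicitly penalized fit would, while letting it run lets the weight drift on toward the optimum "way out by infinity".]

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Separable toy data: x < 0 -> class 0, x > 0 -> class 1.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(1)          # start with very small (zero) weights
snapshots = {}
for step in range(1, 20001):
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y) / len(y)   # no explicit penalty at all
    if step in (100, 20000):
        snapshots[step] = w.copy()

# Stopping at step 100 leaves a modest weight; running 200x longer
# lets the weight keep growing toward the separable "optimum".
print(snapshots[100], snapshots[20000])
```

The weight after 100 steps plays the role of a regularized solution, which is the sense in which (possibly inadvertent) early stopping acts like a weight penalty.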
(c) 1994, bbs@darkrealms.ca