Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
|    Message 561 of 1,954    |
|    Ted Dunning to All    |
|    Re: locally weighted regression    |
|    17 Jan 05 01:57:03    |
From: tdunning@san.rr.com

Actually, without more information, it is impossible to say what your
results mean.

You need to give a bit more information, such as how many data points
you have and whether you can obtain more data to test any models that
you create.

Here are a few scenarios:

a) You have tens of thousands of data points or more and can get more
any time you like. This occurs often in signal processing
applications. In such a situation, it seems likely that your quadratic
fit results really do mean something. To test this without
mathematics, look at the residuals on the training data and then look
at the residuals on data that you didn't use in the regression or in
the selection of regression models. If the average magnitude of the
residuals is about the same in both cases (or better yet, the
distribution is similar), then you probably have something.

b) You have hundreds of data points and getting more is difficult or
impossible. Here things become murkier. You should institute a strict
discipline of using only a portion of your data for trying different
regressions and reserve two other portions, one to test a number of
regressions and one for evaluating whichever model seems best. See
below for references to mathematical techniques that can help you in
cases where you can't hold data back.

c) You have a dozen to a few dozen data points. This situation is
REALLY difficult to deal with. You probably can't judge between all of
the models that you are describing, and unless you luck into a model
form (usually by deep knowledge of your system) that works incredibly
well, you are in a really difficult spot statistically speaking. You
can falsify some regressions with this much data, but it is very
difficult to derive models of any complexity that will work for
unseen data.

If you are up for some serious thinking and are willing to basically
roll your own regression code, you might take a look at David MacKay's
work on the evidence method in regression problems. Using such Bayesian
techniques with code written by some random schmoe is pretty difficult,
however.

Good luck.

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
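The hold-out discipline from scenarios (a) and (b) can be sketched roughly as follows. This is only an illustration under stated assumptions: the candidate models are low-degree polynomials, the split is 60/20/20, and every name here (fit_poly, split_three, select_and_check, make_quadratic_data) as well as the synthetic data is made up for the example rather than taken from the post. In practice a library routine such as numpy.polyfit would probably replace the hand-rolled solver.

```python
import random


def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (low degrees only)."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum x^(i+j) and b[i] = sum y * x^i.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution; coeffs[k] multiplies x**k.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def mean_abs_residual(coeffs, points):
    return sum(abs(y - predict(coeffs, x)) for x, y in points) / len(points)


def split_three(points, seed=0):
    """Shuffle and split into fitting, selection and final-evaluation portions."""
    pts = list(points)
    random.Random(seed).shuffle(pts)
    a, b = len(pts) * 6 // 10, len(pts) * 8 // 10  # assumed 60/20/20 split
    return pts[:a], pts[a:b], pts[b:]


def select_and_check(points, degrees=(1, 2, 3), seed=0):
    train, select, final = split_three(points, seed)
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    fits = {d: fit_poly(xs, ys, d) for d in degrees}
    # Choose the model on the selection portion only.
    best = min(degrees, key=lambda d: mean_abs_residual(fits[d], select))
    # Compare residuals on data used for fitting with data never touched before.
    return {
        "degree": best,
        "train": mean_abs_residual(fits[best], train),
        "held_out": mean_abs_residual(fits[best], final),
    }


def make_quadratic_data(n, seed=0):
    """Synthetic stand-in data: a quadratic plus Gaussian noise (hypothetical)."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        pts.append((x, 1.0 + 2.0 * x + 3.0 * x * x + rng.gauss(0.0, 0.1)))
    return pts


if __name__ == "__main__":
    print(select_and_check(make_quadratic_data(300, seed=1)))
```

If the "train" and "held_out" figures come out about the same, that matches the "you probably have something" case; a held-out residual that is much larger suggests the chosen model is fitting noise.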
(c) 1994, bbs@darkrealms.ca