|    comp.ai    |    Awaiting the gospel from Sarah Connor    |    1,954 messages    |
|    Message 566 of 1,954    |
|    Guybrush Treepwood to Ted Dunning    |
|    Re: locally weighted regression    |
|    17 Jan 05 22:11:57    |
From: schemer@hotmail.com

Ted Dunning wrote:

> Actually, without more information, it is impossible to say what your
> results mean.
>
> You need to give just a bit more information, such as how many data
> points you have and whether you can obtain more data to test any models
> that you create.
>
> Here are a few scenarios:
>
> a) you have tens of thousands of data points or more and can get more
> any time you like. This occurs often in signal processing
> applications. In such a situation, it seems likely that your quadratic
> fit results really do mean something. To test this without
> mathematics, look at the residuals on the training data and then look
> at the residuals on data that you didn't use in the regression or in
> the selection of regression models. If the average magnitude of the
> residuals is about the same in both cases (or better yet, the
> distribution is similar), then you probably have something.
>
> b) you have hundreds of data points and getting more is difficult or
> impossible. Here things become murkier. You should institute a strict
> discipline of using only a portion of your data for trying different
> regressions and reserve two other portions, one to test a number of
> regressions and one for evaluating whichever model seems best. See
> below for references to mathematical techniques that can help you in
> cases where you can't hold data back.
>
> c) you have a dozen to a few dozen data points. This situation is
> REALLY difficult to deal with. You probably can't judge between all of
> the models that you are describing, and unless you luck into a model
> form (usually by deep knowledge of your system) that really works
> incredibly well, you are in a really difficult spot, statistically
> speaking. You can falsify some regressions with this much data, but it
> is very difficult to derive models of any complexity that will work for
> unseen data.
>
> If you are up for some serious thinking and are willing to basically
> roll your own regression code, you might take a look at David Mackay's
> work on the evidence method in regression problems. Using such Bayesian
> techniques with code written by some random schmoe is pretty difficult,
> however.
>
> Good luck.
>

The situation is like this. It is a task for school: we get 20 data points
and must use different regression-based learners. The question is which
one performs best over the input points.

From what I see in the different plots, the quadratic regression through
3NN performs best. We are then asked whether this tells us anything about
the function with which the data were generated. But I can't find anything
about that in our textbook (Machine Learning, Mitchell, T.).

In the following question, we are asked to let Vizier calculate the best
learner. The program says global quadratic regression will be the best fit.
The question here is how this relates to your intuition in the previous
question.

But as I said, I can't find anything in the book to answer the first
question.

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
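
The residual check from scenario (a) can be tried even with 20 points if
"data you didn't use" is approximated by leaving one point out at a time.
What follows is only a rough sketch under stated assumptions, not what Vizier
does and not necessarily what Ted had in mind: the data are synthetic (a
made-up quadratic plus Gaussian noise), leave-one-out stands in for a proper
held-out set, and every function name is invented for illustration.

```python
import random

def fit_quadratic(xs, ys):
    """Least-squares coefficients (a, b, c) of y = a + b*x + c*x**2."""
    # Accumulate the 3x3 normal equations as an augmented matrix [A^T A | A^T y].
    m = [[0.0] * 4 for _ in range(3)]
    for x, y in zip(xs, ys):
        row = (1.0, x, x * x)
        for i in range(3):
            for j in range(3):
                m[i][j] += row[i] * row[j]
            m[i][3] += row[i] * y
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                for k in range(col, 4):
                    m[r][k] -= f * m[col][k]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def global_quadratic(train_x, train_y, query):
    """Predict at `query` from one quadratic fitted to all training points."""
    a, b, c = fit_quadratic(train_x, train_y)
    return a + b * query + c * query * query

def local_quadratic_knn(train_x, train_y, query, k=3):
    """Predict at `query` from a quadratic fitted to its k nearest neighbours."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    # Shift x so the query sits at 0; the prediction is then the constant term.
    shifted = [train_x[i] - query for i in nearest]
    return fit_quadratic(shifted, [train_y[i] for i in nearest])[0]

def mean_abs_residual(learner, xs, ys, leave_one_out):
    """Average |prediction - y|, on the training set or leaving each point out."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if leave_one_out:
            tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        else:
            tx, ty = xs, ys
        total += abs(learner(tx, ty, x) - y)
    return total / len(xs)

# Assumption: 20 evenly spaced inputs and an invented quadratic with noise.
rng = random.Random(0)
xs = [-2.0 + 4.0 * i / 19 for i in range(20)]
ys = [1.0 + 2.0 * x - 0.5 * x * x + rng.gauss(0.0, 0.3) for x in xs]

for name, learner in [("global quadratic", global_quadratic),
                      ("3-NN quadratic", local_quadratic_knn)]:
    train = mean_abs_residual(learner, xs, ys, leave_one_out=False)
    held_out = mean_abs_residual(learner, xs, ys, leave_one_out=True)
    print(f"{name}: training {train:.4f}, leave-one-out {held_out:.4f}")
```

Note that a quadratic through three neighbours has three coefficients for
three points, so on the training inputs it reproduces each point exactly and
its training residual is essentially zero by construction. That is why
"performs best over the input points" cannot settle the question on its own;
the leave-one-out column is the one to compare between learners.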