
Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.ai.fuzzy      Fuzzy logic... all warm and fuzzy-like      1,275 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 449 of 1,275   
   Fuzzy to Dmitry A. Kazakov   
   Re: Fuzzy Prediction from grouped data
   17 Apr 05 11:42:00   
   
   XPost: comp.soft-sys.matlab   
   From: not@here.com   
      
   "Dmitry A. Kazakov"  wrote in message   
   news:c1x9d15twfhw$.36vug05uapdd$.dlg@40tude.net...   
   > On Sat, 16 Apr 2005 17:44:34 +0100, Fuzzy wrote:   
   >   
   > > "Dmitry A. Kazakov"  wrote in message   
   > > news:1y2599uy74umw.sllstdc5wvou.dlg@40tude.net...   
   > >> On Sat, 16 Apr 2005 12:22:49 +0100, Fuzzy wrote:   
   > >>   
   > >>> The examples I've seen in the fuzzy world all relate to categorising
   > >>> one item of data at a time.
   > >>> Are there any straightforward "standard" paradigms out there to
   > >>> classify a data item based on grouped data results?
   > >>>   
   > >>> For example, say I have this training data set.
   > >>>
   > >>> X = 1,2,1,3,4,2,4,1 (interval category)
   > >>> Y = 3,6,3,6,8,3,4,1 (response)
   > >>>
   > >>> As we can see from this toy data, the highest or "best" response may
   > >>> be seen to be around 4, tailing away from this.
   > >>>   
   > >>> So, when I have a new data item to classify, say a 3, I can give a
   > >>> prediction for its response. Linguistically, I would wish to classify
   > >>> it as a "preference", say "most preferred", "neutral", "least
   > >>> preferred" - I know how to defuzz, I'm just stating this to give a
   > >>> flavour of what I'm after.
   > >>>   
   > >>> I know I can use standard distribution stats such as mean and
   > >>> standard deviation, but I wondered how the fuzzy world would view
   > >>> this problem. In fact, would a method be to formulate the fuzzy sets
   > >>> based on such distribution stats?
   > >>>   
   > >>> Any thoughts, references etc appreciated.   
   > >>   
   > >> To apply fuzzy you need to formulate the problem in fuzzy terms. The
   > >> problem of approximating a function is not automatically fuzzy. Nor
   > >> is it statistical. It first becomes statistical when Y(X) is
   > >> considered as a random variable with some distribution and the goal
   > >> is to find the parameters of the distribution which would minimize
   > >> the probability of an error. This is how we come to mean and
   > >> dispersion, regression, least squares etc. Alternatively the metric
   > >> could be a distance treated as, say, the energy of a physical
   > >> process, and again the result might be least squares etc. Nothing
   > >> changes here with fuzzy. It might become fuzzy if, for instance, the
   > >> values of X and Y are fuzzy, or the function is searched for in a
   > >> class of fuzzy-valued functions etc. Once you have a fuzzy
   > >> formulation of the problem, "preference" receives a meaning. Then you
   > >> can expect Y*(3) to yield a fuzzy number. The membership function of
   > >> this number would represent the expectations of particular numbers
   > >> being members of the true Y(3). The most preferred would be the ones
   > >> with the highest truth values.
   > >   
   > > Say X and Y are imprecise. Does this make them fuzzy?
   > > As I have historic values of X and Y, can this data not be used to
   > > derive some sort of fuzzy system - with perhaps the aid of expert
   > > knowledge to clarify the system, given the historic data?
   > >
   > > If I take a toy example off the top of my head: say different cattle
   > > are presented with different feed, and we wish to distinguish the
   > > preference a given cow has for a given feed as measured by the milk
   > > yield. We cannot compare yield across cows.
   > >
   > > The feed is just grass, and the difference is how green it is as
   > > described by an "expert". So there may be some imprecision when fresh
   > > grass is presented.
   > >
   > > We have the raw stats for previous cattle, and we have a new batch of
   > > grass that has had its greenness allocated by the expert.
   > >
   > > Given the above, on a cow-by-cow basis, I would like a metric of how
   > > much the cow will prefer and/or dislike the feed. The estimation of
   > > the magnitude of the preference/dislike will be aided by the amount of
   > > historic data we have to support any estimation.
   > >
   > > For example, a cow may only rarely have been presented with "brown"
   > > grass. Although the yield implies she may prefer it, we cannot
   > > categorically say this given the lack of prior data - or perhaps we
   > > may wish to extrapolate this preference given the cow's liking for
   > > grass close to this colour.
   > >
   > > I can see this can be done (or hope so, anyway), but what are the
   > > method(s) used to translate the historic data so that it becomes
   > > meaningful in this context?
   > >
   > > I hope that makes sense - maybe not :)
   >   
   > The point is that the metric does not come from fuzzy. Look how it
   > happens in statistics, to highlight the pattern:
   >
   > Variant 1. First you postulate that y=ax+b. Then you postulate that the
   > observed x's and y's are x+err and y+err, where err's distribution is
   > also postulated. From that you find the a and b minimizing the
   > probability of error.
   >
   > Variant 2. You again postulate y=ax+b. Then you say that the x's and
   > y's are exact, but a and b are random (they vary from cow to cow).
   > Further, you postulate a distribution of a and b and estimate the
   > distribution's parameters to minimize the probability of error.
   >
   > In both cases, and in all their mixtures, the metric is *postulated*:
   > you pretend to know/presume/surmise how the cow works (y=ax+b). This
   > knowledge does not come to you from probability theory; it comes from
   > the farm-yard.
   >
   > Nothing changes here if you replace one type of uncertainty with
   > another and switch to fuzzy. The mathematical apparatus may change;
   > instead of least squares you could expect the C-norm. But the rest will
   > be the same.
   >
   > What you have described as an example with cows looks like variant 1.
   > With variant 2 there might be difficulties with its justification. If
   > the parameters of the cow are not random but fuzzy, then what does that
   > physically mean? Note that to make a prediction you'd describe a herd
   > of all cows, not a particular cow. Should we treat it as if a cowboy
   > forgot his spectacles and is now guessing which cow it is? [ You cannot
   > get rid of randomness here, but you could mix fuzziness in. ]
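[ Editor's note: variant 1 above, applied to the toy X/Y data from earlier
in the thread, can be sketched in a few lines of Python. This is a minimal
illustration of the postulated y=ax+b fit via ordinary least squares -
nothing fuzzy yet, and the closed-form estimator is the standard one, not
anything specific to this thread. ]

```python
# Variant 1 sketch: postulate y = a*x + b, then pick a and b that
# minimize the squared error over the observed (x, y) pairs.
# Toy data from earlier in the thread; pure stdlib, no dependencies.

X = [1, 2, 1, 3, 4, 2, 4, 1]  # interval category
Y = [3, 6, 3, 6, 8, 3, 4, 1]  # response

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Closed-form least-squares estimates for slope and intercept
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
     / sum((x - mean_x) ** 2 for x in X))
b = mean_y - a * mean_x

def predict(x):
    """Crisp (non-fuzzy) prediction from the postulated line."""
    return a * x + b

print(f"a = {a:.3f}, b = {b:.3f}, predicted Y(3) = {predict(3):.3f}")
```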
      
   No, that was a red herring introduced to imply that some sort of
   "averaging" across cows "may" be possible to aid the model formulation.
   The cowboy always knows the cow, and we are interested in predicting a
   specific cow's response to a new batch of feed, given the data.
      
   Where I thought fuzzy would help is the possibility of belonging to more
   than one set. The cow would probably have some preference AND dislike for
   each batch - as the distribution perhaps may not be normal. Maybe a simple
   y=ax+b doesn't cut it; perhaps the response is more sophisticated, may
   have non-linearities etc.
      
   This is because the estimated value will serve as a parameter in a larger
   model - say a cow health model - and the dislike it has for a feed may be
   more informative than the preference it has for a feed.
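[ Editor's note: one way the "preference AND dislike" idea above could look
in code is sketched below. The ramp-shaped membership functions and the
crude "support" weighting by data volume are illustrative assumptions made
for this sketch, not a standard method from the thread. ]

```python
# Sketch: derive "preferred" and "disliked" fuzzy sets from grouped
# historic yields, so a feed category can belong to BOTH sets to some
# degree, alongside a crude measure of how much data backs the estimate.
from collections import defaultdict

# Historic (greenness category, yield) pairs for one cow -- toy data
history = [(1, 3), (2, 6), (1, 3), (3, 6), (4, 8), (2, 3), (4, 4), (1, 1)]

by_cat = defaultdict(list)
for cat, y in history:
    by_cat[cat].append(y)

all_yields = [y for _, y in history]
lo, hi = min(all_yields), max(all_yields)

def membership(y, lo, hi, rising):
    """Linear ramp membership over [lo, hi]; rising=True for 'preferred'."""
    if hi == lo:
        return 1.0
    t = (y - lo) / (hi - lo)
    return t if rising else 1.0 - t

def fuzzy_preference(cat):
    """Return (preferred, disliked, support) for a greenness category."""
    ys = by_cat.get(cat, [])
    if not ys:
        return None  # no historic data at all: nothing can be said
    mean_y = sum(ys) / len(ys)
    support = len(ys) / len(history)  # fraction of data in this category
    return (membership(mean_y, lo, hi, rising=True),
            membership(mean_y, lo, hi, rising=False),
            support)

pref, dis, sup = fuzzy_preference(3)
print(f"category 3: preferred={pref:.2f} disliked={dis:.2f} support={sup:.2f}")
```

Note how a single category carries nonzero membership in both sets at once,
and how rarely-seen categories (like the "brown grass" case) show up with
low support rather than a confident verdict.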
      
   Either way, if we *assume* that the responses are fuzzy, a moot point, I
   am looking for any techniques to be able to model the data as given to
   predict
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca